TreeMap and PriorityQueue in Java

2023 Jun 20

TreeMap

A Red-Black tree based NavigableMap implementation.

For the containsKey, get, put and remove operations, TreeMap guarantees log(n) time cost. While HashMap implements these four operation with O(1) time complexity. So when acting as a traditional hash table, TreeMap is slower than HashMap.

However, TreeMap has its own advantages. It’s not just a simple Map, but a SortedMap. It provides methods to get lowest and greatest key/entry more efficient than HashMap.

  • firstEntry() / firstKey() / pollFirstEntry(), to return/remove lowest key or entry.
  • lastKey() / lastEntry() / pollLastEntry(), for greatest key or entry.

To get the lowest or greatest, HashMap does it in O(n) time cost, while TreeMap only needs O(logn). (TreeMap keeps going left or right to retrieve the lowest or greatest.)

  // TreeMap source code
  final Entry<K,V> getFirstEntry() {
      Entry<K,V> p = root;
      if (p != null)
          while (p.left != null)
              p = p.left;
      return p;
  }

Check TreeMap Javadoc for more details.

PriorityQueue

An unbounded priority queue based on a priority heap.

The head of this queue is the least element with respect to the specified ordering.

So, for retrieving the lowest, PriorityQueue is faster than TreeMap. PriorityQueue provides O(1) time cost, while TreeMap provides log(n).

  // PriorityQueue source code
  public E peek() {
      return (E) queue[0];
  }

For removing the lowest in a PriorityQueue, it’s an O(logn) operation, which is same as doing it a TreeMap.

As its name indicates, PriorityQueue is a better choice when it’s used as a priority queue, that is, letting the lowest one be in the front of the queue. Retrieving the lowest, removing it and letting the next lowest value be in the front of the queue are fast, which are O(logn) operations.

However, if sometimes we need to remove an arbitrary entry, not the lowest entry, and want to retrieve the lowest fast, in this case, TreeMap provides a better performance than PriorityQueue. For example, we have a composite operation which has the following operations.

  1. Add a new element.
  2. Remove an existed element, not the lowest.
  3. Retrieve the lowest value

For PriorityQueue, the second step is slowest operation, which is O(n). So, the overall time cost of the composite operation using PriorityQueue is O(n). While using TreeMap, all three steps cost O(logn), so the overall time cost is O(logn).

Check PriorityQueue Javadoc for more details.

only() vs. times(1) in Mockito

2021 Dec 13

Use times(1) to verify a method is called exactly once.

  verify(orderService, times(1)).getOrder(anyLong());
  verify(orderService, times(1)).createOrder(any(CreateOrderRequest.class));

The above code verifies that both getOrder and createOrder are called exactly once on orderService.

If the times(1) in above example is replaced by only(), the test would fail.

As per Mockito Javadoc of only(),

Allows checking if given method was the only one invoked

verify(mock, only()).someMethod(); is same as

  verify(mock).someMethod();
  verifyNoMoreInteractions(mock);

i.e.

  verify(mock, times(1)).someMethod();
  verifyNoMoreInteractions(mock);

Therefore,

  verify(orderService, only()).getOrder(anyLong());
  verify(orderService, only()).createOrder(any(CreateOrderRequest.class));

is equivalent to

  verify(orderService, times(1)).getOrder(anyLong()); // L1
  verifyNoMoreInteractions(orderService); // L2
  verify(orderService, times(1)).createOrder(any(CreateOrderRequest.class)); // L3
  verifyNoMoreInteractions(orderService); // L4

It would fail at line “L2”.

Why @Mock Annotated Object Is Null

2021 May 19

Mockito provides annotations like @Mock, @Spy, @Captor to make code simpler. Since most of our code is just copied from Google search and Stack Overflow, we may use these Mockito annotations like below, after quick reading of Google search results like this article.

@RunWith(MockitoJUnitRunner.class)
class OrderTest {
    @Captor
    ArgumentCaptor<Offer> offerCaptor;

    @Test
    void test() {
      // use the `offerCaptor` object ...
    }

However, when run, the test throws NullPointerException due to the offerCaptor is null, even though it’s annotated by @Captor.

(Spending hours of Googling and debugging.)

The root cause of NPE is that the test is using JUnit 5, but @RunWith is a JUnit 4 annotation.

@RunWith no longer exists; superseded by @ExtendWith.

Therefore, @RunWith(MockitoJUnitRunner.class) is totally ignored by JUnit 5, no setup happens for objects annotated by @Captor.

To support JUnit 5 @ExtendWith, Mockito provides a MockitoExtension class.

This extension is the JUnit Jupiter equivalent of our JUnit4 MockitoJUnitRunner.

Below code has no NPE, for test cases using JUnit 5 and Mockito annotations.

@ExtendWith(MockitoExtension.class)
class OrderTest {
    @Captor
    ArgumentCaptor<Offer> offerCaptor;

Some Thoughts

It would be better if JUnit 5 warns us if it finds the obsolete JUnit 4 @RunWith, instead of failing silently.

Another solution is that, if a project is going to use JUnit 5, just exclude the JUnit 4 from the dependencies of the project. So using @RunWith would be a compile error.

CompletableFuture: join() vs. get()

2021 Mar 23

What’s the difference between join() and get() of the CompletableFuture class?

Both of them wait for the future completion. The only difference is what kind of exceptions these two methods throw. The get(), originated in the Future interface, throws “checked exceptions”. While join() throws “unchecked exceptions”.

In old days, like Java 1.5 era when Future.get() was designed, checked exceptions might be preferred. Nowadays, unchecked exceptions are more preferred, therefore join() is preferred.

As per Javadoc of CompletableFuture,

To simplify usage in most contexts, this class also defines methods join() and getNow(T) that instead throw the CompletionException directly in these cases.

In other words, join() is simpler to use, in the sense of exception handling. It’s kind of an official recommendation.

Reminder Operator Returns Negative Value in Java

2021 Mar 23

The % operator is the reminder operator in Java. In other languages, it may be called modulo operator.

”%”, which divides one operand by another and returns the remainder as its result.

It seems a simple operator, by looking at example code like this, 10 % 7 = 3.

However, the % operator may return a negative value.

  assertEquals(-1, -9 % 2, "return negative value here");
  assertEquals(1, 9 % -2, "return positive value here");

It’s because Java’s reminder implementation uses “truncated division”. See details in this Wiki page. The “truncation (integer part)” of -9 / 2, i.e. -4.5, is -4. So,

  -9 % 2 = -9 - 2 * (-4) = -9 - (-8) = -1

For the 9 % -2 case, the “truncation” is also -4. However,

  9 % -2 = 9 - (-2) * (-4) = 9 - 8 = 1

A Problem

Sometime, this % returning negative behavior may cause problems.

  var batches = 10;
  // ... some RxJava code
  .groupBy(input -> Objects.hash(input) % batches)
  //...

For example, in above code you may want to split your work into 10 batches. However, because the Objects.hash() may return negative hash code, Objects.hash(input) % 10 has 19 possible values (integers), from -9 to 9. So unexpectedly, your work is split int 19 batches.

A Rescue

Java 8 provides a Math.floorMod() method, which can be used in situations like above. According to its Javadoc,

If the signs of the arguments are the same, the results of floorMod and the % operator are the same.

If the signs of the arguments are different, the results differ from the % operator.

floorMod(+4, -3) == -2;   and (+4 % -3) == +1
floorMod(-4, +3) == +2;   and (-4 % +3) == -1
floorMod(-4, -3) == -1;   and (-4 % -3) == -1

The floorMod(x, y) is calculated using below relationship.

floorDiv(x, y) * y + floorMod(x, y) == x

And for the Math.floorDiv() method, its Javadoc says

Returns the largest (closest to positive infinity) int value that is less than or equal to the algebraic quotient.

For example, floorDiv(-4, 3) == -2, whereas (-4 / 3) == -1.

Given dividend is -4 and divisor is 3, its algebraic quotient is -1.333333. “The largest (closest to positive infinity) int value that is less than or equal to” -1.333333 is -2, not -1 which is larger than the quotient. Therefore, floorDiv(-4, 3) == -2.

  floorMod(-4, 3) = -4 - floorDiv(-4, 3) * 3 = -4 - (-2)*3 = 2

Compare Two OffsetDateTime of the Same Instant

2021 Mar 16

For two Java OffsetDateTime objects of the same instant, but with different timezone offset, to check whether they represent the same point in time, the isEqual method should be used.

var date1 = OffsetDateTime.parse("2008-12-03T11:00+01:00");
var date2 = OffsetDateTime.parse("2008-12-03T12:00+02:00");
date1.isEqual(date2); // the result is true
date1.equals(date2); // the result is false
date1.compareTo(date2); // the result < 0

In the above code, both of the OffsetDateTime objects represent “2008-12-03T10:00Z”. However, the equals or the compareTo method will tell these two object are not “equal”.

The OffsetDateTime class uses a field of LocalDateTime and a field of ZoneOffset to represent an instant. Its equals method cares about whether all these field of two objects are same or not. Obviously, fields of date1 are not equals with ones of date2.

The compareTo has same behavior as equals.

It is “consistent with equals”, as defined by Comparable.

As per Javadoc of OffsetDateTime, what isEqual does is dateTime1.toInstant().equals(dateTime2.toInstant()). That is OffsetDateTime object are convert to Instant object first, and use the equals of Instant for comparison.

From the Javadoc of Instant,

An instantaneous point on the time-line.

the class stores a long representing epoch-seconds and an int representing nanosecond-of-second, … The epoch-seconds are measured from the standard Java epoch of 1970-01-01T00:00:00Z

An Instant object has no timezone info stored, only epoch of fixed UTC timezone stored, a long and a int.

Side note: what’s the different between ZonedDateTime and OffsetDateTime

The major difference is that ZonedDateTime knows Daylight saving time (DST), while OffsetDateTime does not. It’s because OffsetDateTime only stores simple offset to UTC, while ZonedDateTime stores much richer timezone info.

Mod in Java Produces Negative Numbers

2021 Feb 1

The problem here is that in Python the % operator returns the modulus and in Java it returns the remainder. These functions give the same values for positive arguments, but the modulus always returns positive results for negative input, whereas the remainder may give negative results.

-21 mod 4 is 3 because -21 + 4 x 6 is 3.

But -21 divided by 4 gives -5 with a remainder of -1.

Jackson Mix-in Annotations

2020 May 20

Jackson has a feature called Mix-in annotations. With this feature, we can write cleaner code for the domain classes. Imagine we have domain classes like below.

// package com.example
public interface Item {
    ItemType getType();
}

@Value
public class FooItem implements Item{
    @NonNull String fooId;

    @Override
    public ItemType getType() {
        return ItemType.FOO;
    }
}

When implement these domain classes, no Jackson annotation is put on them. To support serialization and deserialization with Jackson for these classes, add “Mix-in” classes, for example in a separate package called com.example.jackson.

Add ItemMixin below to let Jackson be able to serialize and deserialize Item and its subclasses.

@JsonTypeInfo(
        use = JsonTypeInfo.Id.NAME,
        include = JsonTypeInfo.As.PROPERTY,
        property = "type")
@JsonSubTypes({
        @JsonSubTypes.Type(value = FooItem.class, name = "FOO")
        // ...
})
public abstract class ItemMixin {
}
// merge the Jackson annotations in the Mix-in class into the Item class,
// as if these annotations are in the Item class
objectMapper.addMixIn(Item.class, ItemMixin.class);

String json = "[{\"fooId\": \"1\", \"type\": \"FOO\"}]";
List<Item> items = mapper.readValue(json, new TypeReference<List<Item>>(){});

Note that FooItem is implemented as an immutable class using @Value from Lombok. With @Value annotated, FooItem has no default constructor, which makes Jackson unable to serialize it by default. Add FooItemMixin below to fix it.

@JsonIgnoreProperties(value={"type"}, allowGetters=true)
abstract class FooItemMixin {
    @JsonCreator
    public FooItemMixin(@JsonProperty("fooId") String fooId) {

    }
}

With help of Mix-in annotations, the domain classes don’t have to be compromised for Jackson support. All Jackson relevant annotations are in separate Mix-in classes in a separate package. Further, we could provide a simple Jackson module like below.

@Slf4j
public class MixinModule extends SimpleModule {
    @Override
    public void setupModule(SetupContext context) {
        context.setMixInAnnotations(Item.class, ItemMixin.class);
        context.setMixInAnnotations(FooItem.class, FooItemMixin.class);
        log.info("module set up");
    }
}

The consumers of the domain classes can simple register this module to take in all the Mix-in annotations.

objectMapper.registerModule(new MixinModule());

Jackson Mix-in helps especially if the domain classes are from a third party library. In this case, the source of the domain classes cannot be modified, using Mix-in is more elegant than writing custom serializers and deserializers.

Which Accessor Style? "Fluent" vs. "Java Bean"

2020 May 19

Basically, there are two styles of accessor naming convention in Java.

One is the traditional “Java Bean” style.

public class Item {
    ItemType getType();
}

Another is called the “fluent” style.

public class Item {
    ItemType type();
}

The fluent style saves some typing when writing code, also makes code a bit less verbose, item.type().

For example, the Lombok library supports this fluent style.

lombok.accessors.fluent = [true | false] (default: false)
If set to true, generated getters and setters will not be prefixed with the bean-standard ‘get, is or set; instead, the methods will use the same name as the field (minus prefixes).

Which style is better?

Actually, the verbose Java Bean style, is the better one. It’s because a lot of third party libraries are acknowledging the Java Bean style. For example, if Item is going to be serialized by the Jackson library as a JSON string, the fluent style wouldn’t work out of the box. Also, most of DTO mapping libraries are using the Java Bean style too. Therefore, using the standard Java Bean accessor style save effort when integrate our classes with other libraries and frameworks.

With the Lombok library and the auto-completion in modern IDEs, the Java Bean style doesn’t necessarily mean more typing.

The contains() Method in Java Collection Is Not "Type Safe"

2020 May 8
Currency currency;
//...
if(currency.getSupportedCountries().contains(country)) {
    //...
}

The Currency.getSupportedCountries() returns a Collection. Originally, the returned Collection was Collection<Country>. The country object in the above if-condition was of type Country. The program has been well tested and worked as expected.

However, due to whatever reason, the getSupportedCountries() is refactored to return a Collection<String>. The Java compiler complains nothing about the refactor. But the if-condition now is never true in any cases, since the equals() method of String has no idea about the equality with Country and vice versa. A bug! It’s hard to detect this kind of bug, if the code is not well covered by unit tests or end-to-end tests.

In this sense, the contains() method in Java Collection is not type safe.

How to Avoid

First, never change the behavior of an API when refactor. In the above case, the signature of the getSupportedCountries() API has changed. This is a breaking change, which usually causes the client code fails to compile. Unfortunately, in above case the client code doesn’t fail fast in the compile phase. It’s better to add new API like getSupportedCountryCodes() which returns a Collection<String>, and @Deprecated the old API, which can be further deleted some time later.

Second, make code fully covered by test cases as much as possible. Test cases can detect the bug earlier in the test phase.

Why contains() Is Not Generic

Why contains() is not designed as contains(E o), but as contains(Object o)? There are already some discussion on this design in StackOverflow, like this one and this one. It’s said it’s no harm to let methods like contains() in Collection be no generic. Being no generic, the contains() can accept a parameter of another type, which is seen as a “flexible” design. However, the above case shows that this design does have harm and cannot give developers enough confidence.

A method accepting a parameter of Object means it accepting any type, which is too “dynamic” for a “static” language.

Another question is why a static language needs a “root” Object?

← Previous