Remainder Operator Returns Negative Value in Java

2021 Mar 23

The % operator is the remainder operator in Java. In other languages, it may be called the modulo operator.

“%”, which divides one operand by another and returns the remainder as its result.

It looks like a simple operator when you see an example such as 10 % 7 = 3.

However, the % operator may return a negative value.

  assertEquals(-1, -9 % 2, "return negative value here");
  assertEquals(1, 9 % -2, "return positive value here");

This is because Java’s remainder implementation uses “truncated division”. See details in this Wiki page. The “truncation (integer part)” of -9 / 2, i.e. of -4.5, is -4. So,

  -9 % 2 = -9 - 2 * (-4) = -9 - (-8) = -1

For the 9 % -2 case, the “truncation” is also -4. However,

  9 % -2 = 9 - (-2) * (-4) = 9 - 8 = 1
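The two calculations above can be checked in code. The following is a minimal sketch, not anything from the JDK: the class name and the helper remainderByTruncation() are made up for illustration. It rebuilds the remainder from Java's truncating integer division and prints it next to the result of %.

```java
class TruncatedRemainderDemo {
    // Hypothetical helper: remainder derived from truncated integer division.
    static int remainderByTruncation(int a, int b) {
        int quotient = a / b;        // Java integer division truncates toward zero
        return a - b * quotient;     // same formula as the worked examples above
    }

    public static void main(String[] args) {
        int[][] cases = { {10, 7}, {-9, 2}, {9, -2}, {-9, -2} };
        for (int[] c : cases) {
            int a = c[0];
            int b = c[1];
            // Both columns should agree for every pair.
            System.out.printf("%d %% %d = %d, by truncation = %d%n",
                    a, b, a % b, remainderByTruncation(a, b));
        }
    }
}
```

In every case the result takes the sign of the dividend (the left operand), which is why -9 % 2 is negative and 9 % -2 is positive.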

A Problem

Sometimes, a negative result from % can cause problems.

  var batches = 10;
  // ... some RxJava code
  .groupBy(input -> Objects.hash(input) % batches)
  //...

For example, in the above code you may want to split your work into 10 batches. However, because Objects.hash() may return a negative hash code, Objects.hash(input) % 10 has 19 possible integer values, from -9 to 9. So, unexpectedly, your work is split into 19 batches.
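To see the 19 values concretely, here is a small sketch that counts the distinct results of h % 10. It is an illustration only: the class and method names are made up, and a plain range of ints stands in for the hash codes that Objects.hash(input) would produce.

```java
import java.util.Set;
import java.util.TreeSet;

class BatchKeyDemo {
    // Collects the distinct values of h % batches for every h in [from, to].
    // The int range is an assumed stand-in for real hash codes.
    static Set<Integer> distinctKeys(int from, int to, int batches) {
        Set<Integer> keys = new TreeSet<>();
        for (int h = from; h <= to; h++) {
            keys.add(h % batches);   // negative h gives a negative key
        }
        return keys;
    }

    public static void main(String[] args) {
        Set<Integer> keys = distinctKeys(-100, 100, 10);
        // Prints the number of distinct keys followed by the keys themselves.
        System.out.println(keys.size() + " keys: " + keys);
    }
}
```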

A Rescue

Java 8 provides the Math.floorMod() method, which can be used in situations like the one above. According to its Javadoc,

If the signs of the arguments are the same, the results of floorMod and the % operator are the same.

If the signs of the arguments are different, the results differ from the % operator.

floorMod(+4, -3) == -2;   and (+4 % -3) == +1
floorMod(-4, +3) == +2;   and (-4 % +3) == -1
floorMod(-4, -3) == -1;   and (-4 % -3) == -1

floorMod(x, y) is calculated using the relationship below.

floorDiv(x, y) * y + floorMod(x, y) == x

And for the Math.floorDiv() method, its Javadoc says

Returns the largest (closest to positive infinity) int value that is less than or equal to the algebraic quotient.

For example, floorDiv(-4, 3) == -2, whereas (-4 / 3) == -1.

Given that the dividend is -4 and the divisor is 3, the algebraic quotient is -1.333333. “The largest (closest to positive infinity) int value that is less than or equal to” -1.333333 is -2, not -1, which is larger than the quotient. Therefore, floorDiv(-4, 3) == -2.

  floorMod(-4, 3) = -4 - floorDiv(-4, 3) * 3 = -4 - (-2)*3 = 2
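A minimal sketch of how this could be applied to the batching problem is shown below. The class FloorModDemo and the helper batchIndex() are hypothetical names. Only Math.floorMod() and Math.floorDiv() come from the JDK.

```java
class FloorModDemo {
    // Hypothetical helper: maps any hash code to a batch index in [0, batches),
    // assuming batches is positive.
    static int batchIndex(int hash, int batches) {
        return Math.floorMod(hash, batches);
    }

    public static void main(String[] args) {
        int x = -4;
        int y = 3;
        // The relationship quoted from the Javadoc: prints true.
        System.out.println(Math.floorDiv(x, y) * y + Math.floorMod(x, y) == x);

        // Negative hash codes still land in a non-negative batch.
        System.out.println(batchIndex(-17, 10));
        System.out.println(batchIndex(Integer.MIN_VALUE, 10));
    }
}
```

With a positive divisor, floorMod() always returns a value from 0 to divisor - 1, so there are exactly 10 batches. It also avoids the trap in Math.abs(hash) % batches, which still yields a negative result when hash is Integer.MIN_VALUE.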

Avoid Wrong Tracking When Creating Branches in Git

2020 Sep 8

I just made the mistake of pushing commits to the wrong remote branch. Here are the details.

  1. Need to create a new branch br-x, based on the newest remote dev branch.
  2. Run git fetch to get the newest changes from the remote.
  3. Run git checkout -b br-x origin/dev to create branch br-x.
  4. Change and commit files in branch br-x.
  5. Run git push origin -u br-x to push the commits to the remote.

In step 3, origin/dev is used as the “start-point” of the new br-x branch. As per git branch --help,

When a local branch is started off a remote-tracking branch, Git sets up the branch (specifically the branch.<name>.remote and branch.<name>.merge configuration entries) so that git pull will appropriately merge from the remote-tracking branch. This behavior may be changed via the global branch.autoSetupMerge configuration flag.

In other words, git checkout -b br-x origin/dev not only creates a new br-x branch, but also makes br-x track the remote dev branch. As a result, in step 5, git push origin -u br-x doesn’t push commits into a remote branch of the same name. Instead, it pushes commits into the remote dev branch, which the local br-x has been tracking since its creation. The remote dev branch is accidentally modified. 😞

To avoid this, one method is to use the local dev branch as the “start-point” in step 3. Since the local dev may be behind the remote dev, you may have to switch to the local dev and run git pull to update it first. Another method is to use the --no-track option, i.e. git checkout --no-track -b br-x origin/dev.

A more thorough method is to run git config --global branch.autoSetupMerge false to change Git’s default behavior. When branch.autoSetupMerge is false, Git will not set up a tracking branch for a newly created branch, even if the “start-point” is a remote-tracking branch. For more details, search for “branch.autoSetupMerge” in git config --help.

For what a “remote-tracking” branch is, check this link. Simply put,

Remote-tracking branch names take the form <remote>/<branch>.

The contains() Method in Java Collection Is Not "Type Safe"

2020 May 8

Currency currency;
//...
if (currency.getSupportedCountries().contains(country)) {
    //...
}

Currency.getSupportedCountries() returns a Collection. Originally, the returned Collection was a Collection<Country>, and the country object in the above if-condition was of type Country. The program had been well tested and worked as expected.

However, for whatever reason, getSupportedCountries() was refactored to return a Collection<String>. The Java compiler does not complain about the refactoring. But the if-condition is now never true, since the equals() method of String knows nothing about equality with Country, and vice versa. A bug! This kind of bug is hard to detect if the code is not well covered by unit tests or end-to-end tests.

In this sense, the contains() method in Java Collection is not type safe.
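The situation can be reproduced in a few lines. This is a sketch under assumptions: the nested Country class and the static getSupportedCountries() below are simplified stand-ins for the real types in the post, which are not shown there.

```java
import java.util.Arrays;
import java.util.Collection;

class ContainsDemo {
    // Hypothetical stand-in for the Country type in the post.
    static final class Country {
        final String code;
        Country(String code) { this.code = code; }
    }

    // After the refactoring: the collection holds country codes, not Country objects.
    static Collection<String> getSupportedCountries() {
        return Arrays.asList("US", "CA");
    }

    static boolean isSupported(Country country) {
        // Compiles without any error, because contains() takes an Object.
        // A Country never equals a String, so this is always false.
        return getSupportedCountries().contains(country);
    }

    public static void main(String[] args) {
        System.out.println(isSupported(new Country("US")));          // prints false
        System.out.println(getSupportedCountries().contains("US"));  // prints true
    }
}
```

Some IDEs and static analysis tools can flag a contains() call whose argument type cannot match the element type, but the compiler itself does not.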

How to Avoid

First, never change the behavior of an API when refactoring. In the above case, the signature of the getSupportedCountries() API has changed. This is a breaking change, which usually causes the client code to fail to compile. Unfortunately, in the above case the client code doesn’t fail fast in the compile phase. It’s better to add a new API like getSupportedCountryCodes() that returns a Collection<String>, and mark the old API as @Deprecated so that it can be deleted some time later.

Second, cover the code with test cases as fully as possible. Test cases can catch the bug earlier, in the test phase.

Why contains() Is Not Generic

Why is contains() designed as contains(Object o) rather than contains(E o)? There is already some discussion of this design on StackOverflow, like this one and this one. It’s said that there is no harm in letting methods like contains() in Collection be non-generic. Being non-generic, contains() can accept a parameter of another type, which is seen as a “flexible” design. However, the above case shows that this design does cause harm and cannot give developers enough confidence.
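For comparison, a stricter check can be sketched as a small generic wrapper. The helper containsTyped() is hypothetical and not part of the JDK. It only illustrates what a contains(E o) style signature would buy at compile time.

```java
import java.util.Arrays;
import java.util.Collection;

class TypedContains {
    // Hypothetical helper: the element type must match the collection's type
    // parameter exactly, because Collection<E> is invariant in E.
    static <E> boolean containsTyped(Collection<E> collection, E element) {
        return collection.contains(element);
    }

    public static void main(String[] args) {
        Collection<String> codes = Arrays.asList("US", "CA");
        System.out.println(containsTyped(codes, "US"));   // prints true

        // The next line would be rejected by the compiler, since an Integer
        // is not a String:
        // containsTyped(codes, Integer.valueOf(1));
    }
}
```

Note that the parameter has to be Collection<E>. Declaring it as Collection<? extends E> would let the compiler infer a common supertype such as Object and accept the mismatched call again.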

A method that accepts a parameter of type Object accepts any type, which is too “dynamic” for a “static” language.

Another question is why a static language needs a “root” Object?

Daily Dev Log: Avoid the Pitfall of Using the Same File to Redirect Input and Output

2019 Jan 15

Pitfalls

Do Not Use the Same File to Redirect Input and Output

tr -d '\015' <DOS-file >DOS-file

The above command will delete all content in the file! The shell sets up the output redirection, truncating the file, before tr reads anything from it.

From man bash,

[n]>word, if it does exist it is truncated to zero size.

(How did I get the file back? Luckily, the working directory is managed by Dropbox, and I recovered it from Dropbox.)

CLI

Convert Line Endings from DOS/Windows Style to Unix/Linux Style

tr -d '\015' <DOS-file >UNIX-file

(For what character \015 is, see man 7 ascii or ascii '\015' if the ascii command is installed.)

More ascii Command Examples

$ ascii '\r'
ASCII 0/13 is decimal 013, hex 0d, octal 015, bits 00001101: called ^M, CR
Official name: Carriage Return
C escape: '\r'
Other names: 

Search Manuals

-k Search the short descriptions and manual page names for the keyword

$ man -k ascii
ascii (1)            - report character aliases
ascii (7)            - ASCII character set encoded in octal, decimal, and hexadecimal
...

Missing Newline Characters When Using "cat" on Text Files

2019 Jan 4

cat is often used to concatenate text files into a single file. In most cases, cat works fine, as shown below.

$ echo line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1
line 2

However, if some of the files to be concatenated don’t end with a newline character, cat may not generate the expected file.

# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1line 2

Note that in the above example, file1.txt doesn’t end with a newline, so when the two files are concatenated there is no newline between them. This may not be the expected result. For example, suppose we have multiple large text files in which every line is a user ID, and we want to concatenate them into one file to feed into a processing program at once. If some of the files do not end with a newline, cat may generate malformed user IDs like user-id-foouser-id-bar. If the input volume is huge, these problematic IDs are unlikely to be spotted by eye.

If the newlines between files are important in your case, using awk is safer.

# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ awk 1 file{1,2}.txt
line 1
line 2

See this SO answer.

Also, it’s a good idea to configure text editors to always show non-printable characters like the newline. Or use cat -e, which displays non-printing characters and prints a $ at the end of each line.

$ cat -e file1.txt | tail -1

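Show Whitespace Characters in Text Editors

2016 Oct 28

This afternoon a colleague ran into a bug: the database connection kept failing. The network checked out fine (the database client could connect normally), and the configuration file looked “normal” too (“exactly the same” as the configuration of other colleagues who could connect). It finally turned out that the database password in the configuration file had a few extra spaces at the end 😓.

So it is best to show whitespace characters in your editor/IDE. Most editors support showing whitespace characters, including Vim.

Another whitespace-related bug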