Remainder Operator Returns Negative Value in Java
The % operator is the remainder operator in Java.
In other languages, it may be called the modulo operator.
It divides one operand by another and returns the remainder as its result.
It seems a simple operator when you look at example code like 10 % 7 = 3.
However, the % operator may return a negative value.
assertEquals(-1, -9 % 2, "return negative value here");
assertEquals(1, 9 % -2, "return positive value here");
It’s because Java’s remainder implementation uses “truncated division”. See details in this Wiki page.
The “truncation (integer part)” of -9 / 2, i.e. -4.5, is -4. So,
-9 % 2 = -9 - 2 * (-4) = -9 - (-8) = -1
For the 9 % -2 case, the “truncation” is also -4. However,
9 % -2 = 9 - (-2) * (-4) = 9 - 8 = 1
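The arithmetic above can be checked with a small sketch. The class and method names here (TruncatedRemainder, remainderByTruncation) are made up for illustration; the only assumption is that Java's int division truncates toward zero, as described above.

```java
public class TruncatedRemainder {
    // Recomputes a % b from Java's truncating integer division.
    static int remainderByTruncation(int a, int b) {
        int quotient = a / b;        // int division truncates toward zero
        return a - b * quotient;
    }

    public static void main(String[] args) {
        // Each line prints the recomputed value next to the built-in % result.
        System.out.println(remainderByTruncation(-9, 2) + " " + (-9 % 2));  // -1 -1
        System.out.println(remainderByTruncation(9, -2) + " " + (9 % -2));  // 1 1
    }
}
```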
A Problem
Sometimes, this behavior of % returning a negative value may cause problems.
var batches = 10;
// ... some RxJava code
.groupBy(input -> Objects.hash(input) % batches)
//...
For example, in the code above you may want to split your work into 10 batches.
However, because Objects.hash() may return a negative hash code, Objects.hash(input) % 10 has 19 possible values (integers), from -9 to 9.
So, unexpectedly, your work is split into 19 batches.
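To see the 19 values concretely, here is a minimal sketch that feeds a range of positive and negative integers (standing in for hash codes) through %. The class name RemainderBatches and the helper remainderKeys are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

public class RemainderBatches {
    // Collects the distinct values of h % batches for h in [-limit, limit].
    static Set<Integer> remainderKeys(int limit, int batches) {
        Set<Integer> keys = new TreeSet<>();
        for (int h = -limit; h <= limit; h++) {
            keys.add(h % batches);
        }
        return keys;
    }

    public static void main(String[] args) {
        Set<Integer> keys = remainderKeys(100, 10);
        // 19 distinct keys, from -9 to 9, instead of the intended 10.
        System.out.println(keys.size() + " keys: " + keys);
    }
}
```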
A Rescue
Java 8 provides a Math.floorMod() method, which can be used in situations like the one above. According to its Javadoc,
If the signs of the arguments are the same, the results of floorMod and the % operator are the same.
If the signs of the arguments are different, the results differ from the % operator.
floorMod(+4, -3) == -2; and (+4 % -3) == +1
floorMod(-4, +3) == +2; and (-4 % +3) == -1
floorMod(-4, -3) == -1; and (-4 % -3) == -1
floorMod(x, y) is calculated using the relationship below.
floorDiv(x, y) * y + floorMod(x, y) == x
And for the Math.floorDiv() method, its Javadoc says
Returns the largest (closest to positive infinity) int value that is less than or equal to the algebraic quotient.
For example, floorDiv(-4, 3) == -2, whereas (-4 / 3) == -1.
Given that the dividend is -4 and the divisor is 3, the algebraic quotient is -1.333333.
“The largest (closest to positive infinity) int value that is less than or equal to” -1.333333 is -2, not -1, which is larger than the quotient.
Therefore, floorDiv(-4, 3) == -2.
floorMod(-4, 3) = -4 - floorDiv(-4, 3) * 3 = -4 - (-2)*3 = 2
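Applied to the batching problem, a sketch could look like the following. The helper batchOf is hypothetical and assumes the batch count is positive; Math.floorMod and Math.floorDiv are the standard methods discussed above.

```java
import java.util.Objects;

public class FloorModBatches {
    // Maps any input to a batch index in [0, batches), assuming batches > 0.
    static int batchOf(Object input, int batches) {
        return Math.floorMod(Objects.hash(input), batches);
    }

    public static void main(String[] args) {
        System.out.println(Math.floorDiv(-4, 3));  // -2
        System.out.println(Math.floorMod(-4, 3));  // 2
        // The index is never negative, even when the hash code is.
        System.out.println(batchOf("some input", 10));
    }
}
```

With a positive divisor, floorMod never returns a negative value, so the work is split into exactly as many batches as requested.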
Avoid Wrong Tracking When Creating Branches in Git
I just made the mistake of pushing commits to the wrong remote branch. The details are below.
1. Need to create a new branch br-x, which needs to be based on the newest remote dev branch.
2. Run git fetch to get the newest changes from the remote.
3. Run git checkout -b br-x origin/dev to create the branch br-x.
4. Change and commit files in branch br-x.
5. Run git push origin -u br-x to push commits to the remote.
In step 3, origin/dev is used as the “start-point” of the new br-x branch. As per git branch --help,
When a local branch is started off a remote-tracking branch, Git sets up the branch (specifically the branch.<name>.remote and branch.<name>.merge configuration entries) so that git pull will appropriately merge from the remote-tracking branch. This behavior may be changed via the global branch.autoSetupMerge configuration flag.
In other words, git checkout -b br-x origin/dev not only creates a new br-x branch, but also lets br-x track the remote dev branch. As a result, in step 5, git push origin -u br-x doesn’t push commits into a same-name remote branch.
Instead, it pushes commits into the remote dev branch, which the local br-x has been tracking since its creation.
The remote dev branch is accidentally modified. 😞
To avoid it, one method is to use the local dev branch as the “start-point” in step 3. Since the local dev may be behind the remote dev, you may have to switch to the local dev and run git pull to update it first.
Another method is using the --no-track option, i.e. git checkout --no-track -b br-x origin/dev (the option goes before -b, because -b takes the branch name as its argument).
A more thorough method is using git config --global branch.autoSetupMerge false to change the default behavior of Git.
When branch.autoSetupMerge is false, Git will not set up a tracking branch when creating a branch, even if the “start-point” is a remote-tracking branch.
For more details, search for “branch.autoSetupMerge” in git config --help.
For what a “remote-tracking” branch is, check this link. Simply put,
Remote-tracking branch names take the form <remote>/<branch>.
The contains() Method in Java Collection Is Not "Type Safe"
Currency currency;
//...
if(currency.getSupportedCountries().contains(country)) {
//...
}
Currency.getSupportedCountries() returns a Collection. Originally, the returned Collection was a Collection<Country>.
The country object in the above if-condition was of type Country. The program had been well tested and worked as expected.
However, for whatever reason, getSupportedCountries() was refactored to return a Collection<String>.
The Java compiler does not complain about the refactoring. But the if-condition is now never true, since the equals() method of String has no idea about equality with Country, and vice versa.
A bug!
It’s hard to detect this kind of bug if the code is not well covered by unit tests or end-to-end tests.
In this sense, the contains() method in Java Collection is not type safe.
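The situation can be reproduced with a minimal sketch. The Country class below is a hypothetical stand-in for the one in the original code, and supports() plays the role of the if-condition.

```java
import java.util.Arrays;
import java.util.Collection;

public class ContainsDemo {
    // Hypothetical stand-in for the Country type in the text.
    static final class Country {
        final String code;
        Country(String code) { this.code = code; }
    }

    static boolean supports(Collection<String> supportedCountries, Country country) {
        // Compiles, because contains() takes Object,
        // but a String is never equal to a Country.
        return supportedCountries.contains(country);
    }

    public static void main(String[] args) {
        Collection<String> supported = Arrays.asList("US", "JP");
        System.out.println(supports(supported, new Country("US")));  // false
    }
}
```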
How to Avoid
First, never change the behavior of an API when refactoring.
In the above case, the signature of the getSupportedCountries() API has changed.
This is a breaking change, which usually causes the client code to fail to compile.
Unfortunately, in the above case the client code doesn’t fail fast in the compile phase.
It’s better to add a new API like getSupportedCountryCodes(), which returns a Collection<String>, and mark the old API as @Deprecated so that it can be deleted some time later.
Second, cover the code with test cases as much as possible. Test cases can detect the bug earlier, in the test phase.
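The first suggestion could be sketched as below. Only the two method names come from the text; the nested Currency and Country classes and their fields are hypothetical stand-ins, not the original code.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

public class DeprecationSketch {
    // Hypothetical stand-in for the Country type in the text.
    static final class Country {
        final String code;
        Country(String code) { this.code = code; }
    }

    // Hypothetical stand-in for the Currency type in the text.
    static final class Currency {
        private final List<Country> countries;
        Currency(List<Country> countries) { this.countries = countries; }

        /** Old API: signature unchanged, so existing callers keep working. */
        @Deprecated
        Collection<Country> getSupportedCountries() {
            return countries;
        }

        /** New API: the new element type gets a new method name. */
        Collection<String> getSupportedCountryCodes() {
            return countries.stream().map(c -> c.code).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        Currency currency = new Currency(Arrays.asList(new Country("US"), new Country("JP")));
        System.out.println(currency.getSupportedCountryCodes());  // [US, JP]
    }
}
```

Callers of the old method get a deprecation warning at compile time, which is a much louder signal than an if-condition that silently stops matching.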
Why contains() Is Not Generic
Why is contains() not designed as contains(E o), but as contains(Object o)?
There is already some discussion of this design on StackOverflow, like this one and this one.
It’s said that it does no harm to let methods like contains() in Collection be non-generic.
Being non-generic, contains() can accept a parameter of another type, which is seen as a “flexible” design.
However, the above case shows that this design does do harm and cannot give developers enough confidence.
A method accepting a parameter of type Object accepts any type, which is too “dynamic” for a “static” language.
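For comparison, a stricter variant can be written as a small generic wrapper. This is only a sketch: containsTyped is a made-up helper, not part of the JDK, and it relies on Collection<E> being invariant in E so that the element type has to match.

```java
import java.util.Arrays;
import java.util.Collection;

public class TypedContains {
    // Hypothetical helper: the element must have the collection's element type E.
    static <E> boolean containsTyped(Collection<E> collection, E element) {
        return collection.contains(element);
    }

    public static void main(String[] args) {
        Collection<String> codes = Arrays.asList("US", "JP");
        System.out.println(containsTyped(codes, "US"));  // true
        // containsTyped(codes, Integer.valueOf(1));     // does not compile: no E fits both arguments
    }
}
```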
Another question is: why does a static language need a “root” Object?
Daily Dev Log: Avoid the Pitfall of Using the Same File to Redirect Input and Output
Pitfalls
Do Not Use the Same File to Redirect Input and Output
tr -d '\015' <DOS-file >DOS-file
The above command will delete all the content in the file!
From man bash,
[n]>word, if it does exist it is truncated to zero size.
(How did I get the file back? Luckily, the working directory is managed by Dropbox, and I recovered it from Dropbox.)
CLI
Convert Line Endings from DOS/Windows Style to Unix/Linux Style
tr -d '\015' <DOS-file >UNIX-file
(For what character \015 is, see man 7 ascii, or run ascii '\015' if the ascii command is installed.)
More ascii Command Examples
$ ascii '\r'
ASCII 0/13 is decimal 013, hex 0d, octal 015, bits 00001101: called ^M, CR
Official name: Carriage Return
C escape: '\r'
Other names:
Search Manuals
-k Search the short descriptions and manual page names for the keyword
$ man -k ascii
ascii (1) - report character aliases
ascii (7) - ASCII character set encoded in octal, decimal, and hexadecimal
...
Missing Newline Characters When Using "cat" on Text Files
cat is often used to concatenate text files into one single file.
In most cases, cat works fine, as shown below.
$ echo line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1
line 2
However, if some of the files to be concatenated don’t end with a newline character, using cat to concatenate them may not generate the expected file.
# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1line 2
Note that in the above example, file1.txt doesn’t end with a newline, so when the two files are concatenated there is no newline between them.
This may not be the expected result. For example, suppose we have multiple large text files.
Every line in each file is a user ID. We want to concatenate these files into one file to be fed into a processing program at once. If some of the files do not end with a newline, using cat may generate malformed user IDs like user-id-foouser-id-bar.
If the input volume is huge, these problematic IDs usually would not be spotted by human eyes.
If the newlines between files are important in your case, using awk is safer.
# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ awk 1 file{1,2}.txt
line 1
line 2
See this SO answer.
Also, it’s a good idea to configure text editors to always show non-printable characters like the newline. Or, use cat -e, which prints invisible characters and a $ for the newline.
$ cat -e file1.txt | tail -1
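Show Whitespace Characters in Text Editors
This afternoon a colleague ran into a bug: the database connection kept failing. The network checked out fine (a database client could connect normally), and the configuration file was also “normal” (“exactly the same” as the configuration of other colleagues who could connect). In the end, it turned out that the database password in the configuration file had a few extra trailing spaces 😓.
So it is best to show whitespace characters in your editor/IDE. Most editors support displaying whitespace characters, including Vim.
Another whitespace-related bug.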