Miss Newline Characters When "cat" Text Files

The cat is often used to concatenate text files into one single file. In most cases, the cat works fine like below.

$ echo line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1
line 2

However, if some of files to be concatenated don’t end with the newline character, using cat to concatenate files may not generate expected file.

# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ cat file{1,2}.txt
line 1line 2

Note that in the above example, file1.txt doesn’t end with newline, so when two files concatenated there is no newline between them. This may not be the expected result. For example, we have multiple large text files. Every line in each file is a user ID. We want to concatenate these files into one file to be fed into a processing program at once. If some of files are not ended with newline, using cat may generate ill user IDs like user-id-foouser-id-bar. If the input volume is huge, these problematic IDs usually would not be detected by human eyes.

If the newlines between files is important in your case, using awk is safer.

# -n, let echo not add the trailing newline character
$ echo -n line 1 > file1.txt
$ echo line 2 > file2.txt
$ $ awk 1 file{1,2}.txt
line 1
line 2

See this SO answer.

Also, it’s a good idea to tune text editors to always show non-printable characters like the newline. Or, use cat -e, which prints invisible characters and a $ for the newline.

$ cat -e file1.txt | tail -1

Miss Newline Characters When "cat" Text Files

Related Posts