# filepaths.txt is a file with thousands of lines
cat filepaths.txt | xargs -n 1 basename
The above command takes a while (seconds) to finish. A file with thousands of lines is not usually considered a large volume of data. Why is xargs slow here?
After reading an SO post, it turns out that xargs in the above command runs basename thousands of times, once per input line, which is why it performs so badly.
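One way to see the number of invocations is to swap basename for echo and count the output lines, since each echo call prints exactly one line. This is only a rough sketch: the 1000 sample paths are made up, and the batched count of 1 assumes the whole list fits in a single command line, which it should at this size.

```shell
# Hypothetical sample: 1000 made-up paths, standing in for filepaths.txt.
tmp=$(mktemp)
for i in $(seq 1 1000); do
  echo "/some/dir/file$i.txt"
done > "$tmp"

# Each echo invocation prints one line, so wc -l counts invocations.
one_at_a_time=$(xargs -n 1 echo < "$tmp" | wc -l | tr -d ' ')
batched=$(xargs echo < "$tmp" | wc -l | tr -d ' ')

echo "with -n 1: $one_at_a_time invocations"
echo "without:   $batched invocation(s)"
rm -f "$tmp"
```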
Can it be faster?
xargs reads items from the standard input … delimited by blanks … or newlines and executes the command … followed by items read from standard input. The command line for command is built up until it reaches a system-defined limit (unless the -n and -L options are used). … In general, there will be many fewer invocations of command than there were items in the input.
This will normally have significant performance benefits.
xargs can pass a batch of “items” to the command. The -n 1 option in the command forces xargs to take just one “item” at a time.
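A tiny illustration of how -n controls the batch size, using five throwaway items (a to e) and echo as the command:

```shell
# Five items, at most two per invocation, so echo runs three times.
batches=$(printf '%s\n' a b c d e | xargs -n 2 echo)
printf '%s\n' "$batches"
# a b
# c d
# e
```

With -n 1 the same input would produce five separate echo calls, one per item.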
To make it fast, use the -a option of basename, which lets basename handle multiple arguments at once.
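For reference, this is roughly what basename -a does when called directly with several paths. The paths are made-up examples and do not need to exist, since basename only manipulates the strings.

```shell
# One basename process handles all three (hypothetical) paths.
names=$(basename -a /var/log/syslog /home/user/notes.txt /tmp/archive.tar.gz)
printf '%s\n' "$names"
# syslog
# notes.txt
# archive.tar.gz
```

Without -a, basename treats a second argument as a suffix to strip rather than as another path, which is why the option is needed once xargs starts passing batches.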
time cat filepaths.txt | xargs -n 1 basename > /dev/null
real 0m2.409s
user 0m0.044s
sys 0m0.332s

time cat filepaths.txt | xargs basename -a > /dev/null
real 0m0.004s
user 0m0.000s
sys 0m0.000s
Roughly 600 times faster (2.409s versus 0.004s).
cat /dev/null | xargs --show-limits --no-run-if-empty
Your environment variables take up 2027 bytes
POSIX upper limit on argument length (this system): 2093077
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2091050
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647
xargs can feed a lot of bytes to the command at once: the command buffer in use here is 131072 bytes, and it could go as high as 2091050 bytes.
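The buffer size can be changed with the -s option of xargs, which caps the length of each command line. The sketch below shrinks it to 4096 bytes to show the effect on the number of invocations. The sample paths are made up (about 20 KB in total), and the exact count with -s 4096 depends on the system, so none is shown.

```shell
# Hypothetical sample: 1000 made-up paths, roughly 20 KB in total.
tmp=$(mktemp)
for i in $(seq 1 1000); do
  echo "/some/dir/file$i.txt"
done > "$tmp"

# Default buffer: each echo call prints one line, so wc -l counts invocations.
default_calls=$(xargs echo < "$tmp" | wc -l | tr -d ' ')
# Cap each command line at 4096 bytes: more, smaller batches.
small_calls=$(xargs -s 4096 echo < "$tmp" | wc -l | tr -d ' ')

echo "default buffer: $default_calls invocation(s)"
echo "-s 4096:        $small_calls invocation(s)"
rm -f "$tmp"
```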
Some commands can usefully be executed in parallel too; see the -P option.
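A minimal sketch of -P, using four artificial one-second jobs (a sleep followed by an echo, not anything from the command above). With -P 4 the jobs run concurrently, so the whole thing should take about one second instead of four. The output order is not deterministic, so it is sorted here.

```shell
# Four one-second jobs, up to four processes at a time.
start=$(date +%s)
results=$(printf '%s\n' 1 2 3 4 |
  xargs -P 4 -n 1 sh -c 'sleep 1; echo "job $1 done"' _ |
  sort)
end=$(date +%s)

printf '%s\n' "$results"
echo "elapsed: $((end - start))s"
```

For a cheap command like basename, batching matters far more than parallelism; -P is more likely to pay off when each invocation does real work.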