Hi all,
I am trying to use parallel with regular Linux commands, as I have to deal with huge files on a daily basis. But the few times I have tried it, I haven't seen any improvement. Is there a file-size threshold above which parallel becomes beneficial? Or am I doing it wrong?
E.g.,
$ time head -n 1000000 huge.vcf | parallel --pipe "awk '{print $123}'" | wc -l
1000000
Wall Time 0m29.326s
User Mode 0m22.489s
Kernel Mode 17m55.061s
CPU Usage 3745.90%
$ time head -n 1000000 huge.vcf | awk '{print $123}' | wc -l
1000000
Wall Time 0m10.329s
User Mode 0m12.447s
Kernel Mode 0m4.540s
CPU Usage 164.46%
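
A side note on the first command, offered as a sketch rather than a diagnosis: because the awk program sits inside double quotes there, the calling shell (bash or a POSIX sh is assumed) expands $123 as $1 followed by 23 before parallel ever runs, so awk would receive {print 23} instead of a reference to field 123. The variable names cmd_double and cmd_escaped below are purely illustrative.

```shell
# Assumption: bash or POSIX sh, where "$123" means "$1" followed by "23".
set --   # clear positional parameters, as in a fresh interactive shell

# Same quoting as the parallel command above: the shell expands $123 first.
cmd_double="awk '{print $123}'"

# Escaping the dollar sign passes $123 through to awk untouched.
cmd_escaped="awk '{print \$123}'"

printf '%s\n' "$cmd_double"    # prints: awk '{print 23}'
printf '%s\n' "$cmd_escaped"   # prints: awk '{print $123}'
```

If that is what happened, writing \$123 inside the double quotes would make the two timings a like-for-like comparison. Separately, --pipe splits stdin into blocks of 1 MB by default; a larger --block value, or --pipepart with -a on a seekable file, might reduce the splitting overhead, though whether either helps for this file is untested.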