Greetings,
I have been trying to speed up some processes that use tar on large archives.
Passing the command-line option --use-compress-program pbzip2 helped a lot.
I was wondering whether anyone has experimented with using pthreads in the tar implementation. I don't see any references to it in the version 1.30 source or README that I downloaded. Has anyone discussed a strategy for using pthreads in previous postings?
My first thoughts:
- Create a thread pool for file reading, to limit the number of threads in use.
- Have one thread that writes from a file queue.
- Files below a set size threshold are read whole into a buffer and inserted into the write queue. The logic would limit how many small files are held in memory at once.
- Large files are handled by a thread dedicated to reading them. Once an initial buffer is filled, it is inserted into the write queue. The write queue would then cause the reader to read more as needed, to keep the processing efficient.
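
To make the idea above more concrete, here is a rough sketch of the scheme. It is written in Python rather than C with pthreads, purely to keep it short: queue.Queue stands in for a queue protected by a mutex and condition variables, and ThreadPoolExecutor stands in for the reader pool. Every name and constant in it (archive, SMALL_FILE_LIMIT, CHUNK_SIZE and so on) is made up for illustration and nothing here comes from the tar source. It also only concatenates file contents to the output instead of writing real tar headers, and entries are written in the order they arrive, not the order given on the command line.

```python
import io
import os
import queue
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

SMALL_FILE_LIMIT = 64 * 1024  # hypothetical small/large threshold
CHUNK_SIZE = 64 * 1024        # hypothetical read size for large files
MAX_PENDING_ENTRIES = 8       # bound on entries waiting in the write queue
MAX_PENDING_CHUNKS = 4        # bound on buffered chunks of one large file
DONE = None                   # sentinel marking the end of a queue


def read_small(path, write_queue):
    """Pool worker: read a whole small file and hand it to the writer."""
    with open(path, "rb") as f:
        data = f.read()
    write_queue.put((path, data))  # blocks while the write queue is full


def large_reader(large_paths, write_queue):
    """Dedicated thread: stream each large file to the writer in chunks."""
    while (path := large_paths.get()) is not DONE:
        chunks = queue.Queue(maxsize=MAX_PENDING_CHUNKS)
        write_queue.put((path, chunks))  # announce the entry first
        try:
            with open(path, "rb") as f:
                while chunk := f.read(CHUNK_SIZE):
                    chunks.put(chunk)  # blocks until the writer catches up
        finally:
            chunks.put(DONE)  # always release the writer, even on error


def writer(write_queue, out, written):
    """Single writer: emit one entry at a time so entries never interleave."""
    while (entry := write_queue.get()) is not DONE:
        path, payload = entry
        if isinstance(payload, bytes):  # small file, already in memory
            out.write(payload)
            size = len(payload)
        else:  # large file: payload is its chunk queue
            size = 0
            while (chunk := payload.get()) is not DONE:
                out.write(chunk)
                size += len(chunk)
        written.append((path, size))


def archive(paths, out, pool_size=4):
    """Copy the contents of all paths to out; return [(path, size), ...]."""
    write_queue = queue.Queue(maxsize=MAX_PENDING_ENTRIES)
    large_paths = queue.Queue()
    written = []
    w = threading.Thread(target=writer, args=(write_queue, out, written))
    lr = threading.Thread(target=large_reader, args=(large_paths, write_queue))
    w.start()
    lr.start()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = []
        for path in paths:
            if os.path.getsize(path) <= SMALL_FILE_LIMIT:
                futures.append(pool.submit(read_small, path, write_queue))
            else:
                large_paths.put(path)
        large_paths.put(DONE)
        for fut in futures:
            fut.result()  # re-raise any read error from the pool
    lr.join()
    write_queue.put(DONE)  # all readers finished: tell the writer to stop
    w.join()
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for name, size in [("a.txt", 100), ("b.txt", 2000), ("big.bin", 300_000)]:
            p = os.path.join(d, name)
            with open(p, "wb") as f:
                f.write(b"x" * size)
            paths.append(p)
        out = io.BytesIO()
        for path, size in archive(paths, out):
            print(os.path.basename(path), size)
```

The bounded queues are what provide the limits: small-file readers block once MAX_PENDING_ENTRIES entries are waiting, so the number of small files in memory is roughly that bound plus one per pool thread, and the large-file reader blocks once MAX_PENDING_CHUNKS chunks are buffered, so it only reads as fast as the writer drains. A real implementation would also have to preserve member order and write proper headers, which this sketch ignores.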
Thanks,