From: Giuseppe Aprea
Subject: Re: parallel + blast + LSF
Date: Wed, 15 Apr 2015 19:20:40 +0200
Hi,
Thanks for clarifying. I want to use GNU Parallel to bsub jobs. This way I can use GNU Parallel to throttle the number of jobs that are submitted to LSF, and it is easier than writing a loop.
parallel -j 100 my_script [bsub options] ::: {1..2000}
my_script (pseudo-code):
#!/bin/bash
...
bsub [bsub options] command ...
post-process data
This way I can submit jobs in batches of, say, 100 at a time. Submitting all 2000 jobs at once gets problematic: I start hitting limits on file descriptors, etc.
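For the -j limit to throttle anything, the wrapper has to stay alive until its LSF job is finished. A minimal sketch of such a my_script follows. It assumes the installed LSF supports bsub -K (wait for the job to complete before returning). The names my_command and post_process are placeholders, and BSUB_CMD is a hypothetical override added only so the wrapper can be dry-run on a machine without LSF.

```shell
#!/bin/bash
# Sketch of my_script: submit one task to LSF and block until it finishes.
# Assumptions (not from the original post): the -K option of bsub, the
# placeholder names my_command and post_process, and the BSUB_CMD override.

# Hypothetical post-processing step; replace with the real one.
post_process() {
    echo "post-processing task $1"
}

# Usage: submit_and_wait TASK_ID [bsub options...]
submit_and_wait() {
    local task_id="$1"
    shift
    # -K keeps bsub in the foreground until the job completes, so each
    # GNU Parallel slot stays occupied for the lifetime of one LSF job.
    "${BSUB_CMD:-bsub}" -K "$@" my_command "$task_id" || return 1
    post_process "$task_id"
}

# Run only when executed directly with arguments, not when sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" && $# -gt 0 ]]; then
    submit_and_wait "$@"
fi
```

In this sketch the task ID comes first and the bsub options follow, so the call would look like `parallel -j 100 ./my_script {} -q large ::: {1..2000}`. Setting BSUB_CMD=echo prints the bsub arguments instead of submitting anything.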
Thanks for sharing,
Martin
On 15-04-15 11:35 AM, Giuseppe Aprea wrote:
Hi Martin,
I am not sure I understand. As far as I can see, things work exactly the opposite way: you have an LSF script which launches GNU Parallel on some hosts provided by LSF. Something like:
----------------------------------------------------------------------
#!/bin/bash
#BSUB -J gnuParallel_blast_test # Name of the job.
#BSUB -o %J.out # Appends std output to file %J.out. (%J is the Job ID)
#BSUB -e %J.err # Appends std error to file %J.err.
#BSUB -q large # Queue name.
#BSUB -n 30 # Number of CPUs.

module load 4.8.3/ncbi/12.0.0
module load 4.8.3/parallel/20150122

SLOTS=`cat ${LSB_DJOB_HOSTFILE} |wc -l`
SERVER=""
for i in `cat ${LSB_DJOB_HOSTFILE}| sort`
do
echo "/afs/enea.it/software/bin/blaunch.sh ${i}" >> servers
done

cat absolute_path_to_sequences.fasta | parallel --no-notice -vv -j ${SLOTS} --slf servers --plain --recstart '>' -N 1 --pipe blastp -evalue 1e-05 -outfmt 6 -db absolute_path_to_db_file -query - -out absolute_path_to_result_file_{%}
----------------------------------------------------------------------
LSF is what assigns the execution hosts, so if you launch bsub from GNU Parallel, how do you know what to pass to the --slf option?
g
On Wed, Apr 15, 2015 at 4:24 PM, Martin d'Anjou <martin.danjou14@gmail.com> wrote:
On 15-04-15 09:34 AM, Giuseppe Aprea wrote:
Hi all,
I would like to ask for some help with using GNU Parallel together with the BLAST alignment software.
I am trying to use GNU Parallel v. 20150122 with BLAST for a very large sequence alignment. I am running Parallel on a cluster that uses LSF as its queue system.
Hello Giuseppe,
I am an avid LSF user, and I want to use GNU Parallel to dispatch jobs to LSF. Could you please explain a little bit to me how GNU Parallel works with LSF? I do not see it in the on-line tutorials. For example, I would like to understand how to pass "bsub" options like -oo, -q queue_name, etc. to LSF from GNU Parallel.
Thanks,
Martin