blast_off

This Perl 5 program sends DNA sequences of any length to a BLAST email server to be searched and then recovers the search results from the user's mailbox. Long sequences are broken up and sent to the BLAST server in pieces. Options that remain constant for a particular user are specified in a text configuration file, blast_off.config by default. There are four modes in which the program can be invoked:

1) Help mode: usage information is printed.

2) Standard mode: a DNA sequence is given as input, searches are prepared and mailed off a few at a time, and the BLAST search results are recovered from the user's mailbox once they have all completed.

3) Send and receive searches mode (restart mode): if searches were prepared but not sent (mode 4), or were sent but the program was terminated or the results took too long to come back (currently more than 3 days, for example because of network problems or high BLAST server load), this mode picks up where things left off and finishes the search.

4) Prepare searches mode: a DNA sequence is given as input, email-formatted searches are prepared, and then the program quits. This is useful if you want to prepare searches but not send them now, or to prepare them and then send only a few at a time by invoking the program in mode 3 with the -fStart_file,End_file option.
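This README does not say how a long sequence is divided into pieces, so the following is only a minimal sketch of the idea. It is written in Python for illustration and is not taken from the blast_off Perl source. The function name split_sequence, the piece length, and the overlap between neighbouring pieces are all assumptions.

```python
def split_sequence(seq, piece_len=1000, overlap=100):
    """Split seq into pieces of at most piece_len bases.

    Neighbouring pieces share `overlap` bases so that a match spanning a
    boundary is not lost. Both numbers are illustrative assumptions, not
    values used by blast_off.

    Returns a list of (start_bp, end_bp, piece) tuples with 1-based,
    inclusive coordinates.
    """
    if piece_len <= overlap:
        raise ValueError("piece_len must be larger than overlap")
    pieces = []
    step = piece_len - overlap
    start = 0
    while start < len(seq):
        piece = seq[start:start + piece_len]
        pieces.append((start + 1, start + len(piece), piece))
        if start + piece_len >= len(seq):
            break  # this piece reached the end of the sequence
        start += step
    return pieces


if __name__ == "__main__":
    # Hypothetical 2500 bp input; each piece would become one email search.
    for start_bp, end_bp, piece in split_sequence("ACGT" * 625):
        print(start_bp, end_bp, len(piece))
```

Keeping the coordinates of each piece alongside its bases would make it possible to map hits back onto the full sequence once all results have been collected.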

Calling the program as 'blast_off -i' (interactive) makes the program request the needed options by querying the user. If desired, some of the options can be entered on the command line, and the program will query the user for the rest. Typing 'blast_off' alone is equivalent to 'blast_off -i'.

The config file: Parameters which remain the same, e.g. the user's email address, are stored in a text configuration file. By default this file is named blast_off.config. Each line contains a description of the parameter followed by a colon. After the colon, the value of the parameter is entered. This file can be edited by the user (using pico or vi, for example). The program looks for the config file in two places: first the directory the program is run from, then the directory where the program resides. The first copy found is used.
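The lookup order and the 'description: value' line format described above can be sketched as follows. This is an illustrative Python sketch, not the blast_off Perl code. The function names find_config and parse_config are assumptions, as are the parameter descriptions in the sample text; the real descriptions are whatever appears in your own blast_off.config.

```python
import os
import sys


def find_config(name="blast_off.config", run_dir=None, program_dir=None):
    """Return the path of the first config file found, or None.

    The directory the program is run from is checked first, then the
    directory where the program itself resides.
    """
    if run_dir is None:
        run_dir = os.getcwd()
    if program_dir is None:
        program_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    for directory in (run_dir, program_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def parse_config(text):
    """Parse 'description: value' lines into a dict.

    Lines without a colon are skipped. Splitting on the first colon only
    is an assumption; it lets a value contain further colons.
    """
    params = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        description, value = line.split(":", 1)
        params[description.strip()] = value.strip()
    return params


if __name__ == "__main__":
    # Hypothetical parameter descriptions, for illustration only.
    sample = "User email address: user@example.org\nOutbox: results.out\n"
    print(find_config())
    print(parse_config(sample))
```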

As blast_off can take a long time to run, it should be invoked using nohup, if this is required on your system. To do this, invoke blast_off as 'nohup blast_off ...options...'. If nohup is not installed on your system, this will give you an error message :).

Usage:

Help mode: blast_off -q

Standard mode: blast_off -dDb -tBLAST_program [-bconfig.file -cBLAST_email -eEmail -gMailbox1,Mailbox2... -i -jLog.file -mX -nSearch_name -oOutbox -pSearch_dir -sStart_bp,End_bp] <DNA.seq

Send and receive searches Mode; Restart Mode: blast_off -a -pSearch_dir [-bconfig.file -fStart_file,End_file -gMailbox1,Mailbox2... -i -jLog.file -nSearch_name -oOutbox]

Prepare searches mode: blast_off -dDb -h -tBLAST_program [-bconfig.file -cBLAST_email -eEmail -i -jLog.file -mX -nSearch_name -pSearch_dir -sStart_bp,End_bp] <DNA.seq
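Judging from the usage lines above, the mode appears to be selected by the presence of -q, -a, or -h, with standard mode as the default. A minimal Python sketch of that dispatch follows. It is not the program's Perl logic: the function name choose_mode, the returned labels, and the precedence used when more than one of these switches is given are assumptions.

```python
def choose_mode(args):
    """Return the mode implied by a list of command-line switches.

    Only the switch letter is inspected, so '-pSearch_dir' counts as '-p'.
    Checking -q, then -a, then -h is an assumed order of precedence.
    """
    flags = {arg[:2] for arg in args if arg.startswith("-")}
    if "-q" in flags:
        return "help"
    if "-a" in flags:
        return "send_and_receive"
    if "-h" in flags:
        return "prepare"
    return "standard"


if __name__ == "__main__":
    print(choose_mode(["-q"]))
    print(choose_mode(["-a", "-pblast_off.tmp0", "-oc143.BLASTN"]))
    print(choose_mode(["-h", "-dnr", "-tBLASTN"]))
    print(choose_mode(["-dnr", "-tBLASTN", "-oc143.BLASTN"]))
```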

Switches:

-a Run searches only. Used after a previous invocation of the program has written formatted email BLAST files.

-bConfig.file Use the given config file instead of the default config file, blast_off.config. The program looks first in the calling directory, then in the directory in which the program resides.

-cBLAST_server_email_addr Indicate email address of BLAST server.

-dBLAST_database BLAST database to search.

-eUser_email_addr Indicate user's email address. This is the address the searches will be mailed to by the BLAST server.

-fStart_file,End_file When invoking the program in 'run searches only' (-a) mode, this option specifies a subset of the formatted email files for searching. Useful if some searches finished but others didn't during the previous invocation.

-gMail_box1,Mail_box2,... List of mailboxes to be checked for BLAST search results. Mailboxes can be 'true' mailboxes, or files holding partial results of previous searches.

-h Make email files only. When run in this mode, the program will write formatted BLAST email searches to the indicated directory, then quit.

-i Interactively enter searching options.

-jLog.file Specify a log file. The default log file is blast_off.report, but with this option you can choose another.

-mX DNA masking bp. Default is 'X'. Use -m alone to indicate no masking bp.

-nSearch_name Sequence label. This label identifies the DNA sequence and should be unique. The label may not contain spaces or commas.

-oOutbox Outbox. This is the file to which the BLAST search results will be written.

-pSearch_directory The formatted BLAST email files are stored in a temporary directory while the searches are being performed and are deleted when the searches finish successfully. By default the program picks a name for this temporary directory, but you can specify it using this option. If run in 'run searches only' mode, the user must indicate the directory containing the email-formatted files.

-sStart_bp,End_bp Using this option, a subsequence of the input DNA sequence can be selected for BLASTing.

-tBLAST_program BLAST program to be used for searches, e.g., BLASTN or TBLASTX.
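In the switches above, each value is attached directly to its switch letter (for example -dnr or -sStart_bp,End_bp). The following Python sketch shows one way such arguments could be taken apart and how a -s value could select a subsequence. It is not the blast_off Perl code. The names parse_switches and select_subsequence are assumptions, and so is the reading of Start_bp,End_bp as 1-based, inclusive coordinates, which this README does not state.

```python
def parse_switches(args):
    """Map each switch letter to its attached value ('' if none).

    '-dnr' becomes {'d': 'nr'} and a bare '-m' becomes {'m': ''}.
    """
    options = {}
    for arg in args:
        if len(arg) < 2 or not arg.startswith("-"):
            raise ValueError(f"not a switch: {arg!r}")
        options[arg[1]] = arg[2:]
    return options


def select_subsequence(seq, s_value):
    """Apply a 'Start_bp,End_bp' value to seq.

    Coordinates are assumed to be 1-based and inclusive.
    """
    start_text, end_text = s_value.split(",")
    start, end = int(start_text), int(end_text)
    if not 1 <= start <= end <= len(seq):
        raise ValueError("subsequence out of range")
    return seq[start - 1:end]


if __name__ == "__main__":
    opts = parse_switches(["-dnr", "-tBLASTN", "-s3,6", "-ginbox,old.results"])
    print(opts)
    print(opts["g"].split(","))  # mailbox list given with -g
    print(select_subsequence("ACGTACGT", opts["s"]))
```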

Examples:

Standard Mode:

nohup blast_off -dnr -tBLASTN -oc143.BLASTN <c143.mask &

Help Mode:

blast_off -q

Send and receive searches Mode; Restart Mode:

nohup blast_off -a -pblast_off.tmp0 -oc143.BLASTN

Prepare searches mode:

blast_off -h -dnr -tBLASTN <c143.mask



Last updated: 11/98

Written by Jim Lund in the lab of Roger Reeves, Johns Hopkins University