Comparative Genomic Analysis Tools (CGAT)
These programs can handle arbitrarily large DNA sequences. They were
written to analyze genomic sequences. Sets of BAC or PAC analyses can be merged
as the sequence of overlapping clones becomes available.
The programs run on a UNIX platform, and all use a text interface. They are written
in Perl (with the exception of lineplot, written in C), and can thus be easily
ported to the Mac or to Windows. Some of the programs have been ported to the Macintosh.
News: updates and changes.
Instructions for downloading and installing the package of programs.
Instructions for using the package of programs.
Getting started using the package of programs.
Instructions on using fplot and on the fplot file format.
DNA formats used by these programs.
Follow the links to view the help page for each program.
Runs the most common set of analyses automatically. Genomic DNA is masked,
searched against the GenBank databases using BLASTN, and searched for exons and
other features using GRAIL 2. cDNA sequence is also translated and searched
against the nr database. For cDNA sequences, ORF analysis is performed, but
GRAIL 2 is not used. The results of these analyses are combined and displayed as
an HTML page.
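
As a rough illustration of the workflow just described, the sketch below lists which analyses would run for each kind of input. It is written in Python for brevity and is not the package's Perl code. The function name and step labels are hypothetical. It assumes that "also" means cDNA receives the BLASTN search as well. The text does not say whether cDNA is masked, so masking is listed for genomic DNA only.

```python
# Hypothetical sketch of the automatic analysis plan described above.
# Step labels are illustrative text, not the names of programs in the package.

def planned_analyses(sequence_type):
    """Return the ordered analysis steps for 'genomic' or 'cdna' input."""
    if sequence_type == "genomic":
        steps = [
            "mask repeats",
            "BLASTN search against GenBank",
            "GRAIL 2 exon and feature prediction",
        ]
    elif sequence_type == "cdna":
        steps = [
            "BLASTN search against GenBank",   # assumption: implied by "also"
            "translated search against nr",
            "ORF analysis",                    # GRAIL 2 is not used for cDNA
        ]
    else:
        raise ValueError("sequence_type must be 'genomic' or 'cdna'")
    # Both branches end by merging the results into one report.
    steps.append("combine results into an HTML page")
    return steps


if __name__ == "__main__":
    for kind in ("genomic", "cdna"):
        print(kind + ":")
        for step in planned_analyses(kind):
            print("  - " + step)
```

Keeping the plan as plain data makes it easy to see that the two input types differ only in the feature-prediction and translation steps.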
- Displays sequence features graphed along DNA.
Produces output in PostScript or dynamic HTML formats. The input file is a lightly structured
text file that can easily be edited by hand or with an associated program, preplot.
- Produces a text file showing the DNA sequence with features from an fplot format file, splice sites and polyA sites, and the conceptual translation graphed along the DNA.
- Imports output from other sequence analysis programs into
an fplot input file, and manipulates an fplot input file. It can currently read GRAIL services,
lineplot, GRAIL repeats, and MZEF output files.
- Searches arbitrarily long sequences against the NCBI databases using
the BLAST email server.
- Summarizes and filters BLAST output. Works with blast_off
output, or with saved BLAST searches. Output can optionally be sent to preplot and
fplot. Uninteresting database sequences or classes of sequences
can be filtered out. BLAST results can also be filtered based on the strength of the
matches (percent match, P value, or HSP score).
- Performs a filtered dotplot comparison of two DNA sequences.
- Sends DNA off for GRAIL services: exon prediction, Pol II
promoter prediction, polyA site prediction, and CpG island prediction.
- Masks repetitive sequences using the GRAIL server. The sequence is
masked for both complex repeats (B1, Alu, LINE) and simple sequence repeats.
- Sends off DNA to the RepeatMasker email server (results come
- Gives the requested DNA subsequence.
Can be used to filter line numbers or other garbage from
DNA, and will also give the reverse complement, or format the DNA.
- Gives count of bp in DNA.
- Calculates GC content of DNA and makes an fplot graph of GC content along the sequence.
- Lists restriction enzyme sites in DNA.
- Generates a list of open reading frames (ORFs) in DNA, and the corresponding amino acid (AA) sequences.
- Translates DNA to AA. The translations can be printed directly, or with the AAs printed along the DNA.
- Formats the results of RNASPL as an fplot file.
RNASPL is a program which finds exon-exon splice sites in cDNA sequence,