Primer name update

9-24-01 update! Corrected a number of errors.

Changes:

The previous version of the update
The first version of the update

Description of the Kim lab chip primer set

1323 Primers were poor and were re-made
1323 Re-made poor primers
17924 Primers made once
---------
20570 total primer pairs

During the following discussion, I break the primers into 2 lists:

1) Poor primers (remade)
2) Primers made once and the re-made primers (all_primers)


Approach to updating the names

1) Download AceDB. The feature file (.gff) for each chromosome contains the bp positions of each exon and repetitive sequence feature.

2) Compare the Kim chip primers with the chromosome sequence. Find where the primers match the genomic sequence. Assemble primer matches into PCR products with their location in the genomic sequence.

3) Find which AceDB features (exons and repetitive features) overlap the PCR products.

4) Filter and summarize the results.

5) Correct for some irregularities.
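
Step 2 above can be sketched roughly as follows. This is only an illustration, not the code used for the update: it assumes exact string matching of the primers, assumes the 5 kb cutoff is a maximum product size, and the function names (reverse_complement, find_all, predict_products) are made up for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_all(seq, sub):
    """Return every 0-based start position of sub in seq."""
    hits = []
    pos = seq.find(sub)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(sub, pos + 1)
    return hits


def predict_products(chrom, left, right, max_size=5000):
    """Return sorted (start, end) half-open intervals of predicted PCR products.

    Assumptions (not from the original notes): primers must match exactly,
    and max_size stands in for the 5 kb cutoff.
    """
    chrom = chrom.upper()
    products = set()
    # Try both orientations: either primer may lie on the top strand.
    for top, bottom in ((left, right), (right, left)):
        top = top.upper()
        bottom_rc = reverse_complement(bottom)
        bottom_hits = find_all(chrom, bottom_rc)
        for start in find_all(chrom, top):
            for hit in bottom_hits:
                end = hit + len(bottom_rc)
                # The second primer site must lie downstream of the first.
                if hit >= start + len(top) and end - start <= max_size:
                    products.add((start, end))
    return sorted(products)
```

A primer pair with no entry in the returned list would fall in the "No predicted PCR product" class, and a pair with several entries would amplify multiple products.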


Decisions to be made in the analysis

1) 5 kb cutoff used.

2) 150 bp cutoff used for the overlap between a PCR product and a gene.

3) Where the PCR product matches in the gene is not taken into account.

4) Repetitive sequence filtering is not done. See histogram below.

5) Some genes amplify multiple PCR products. Additional PCR products are included up to 1.5 times the size of the smallest predicted product.
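
The overlap cutoff and the 1.5x size rule might look something like this in code. It is a sketch under assumptions: the overlap is taken to be the summed exon overlap per gene and the cutoff is treated as a minimum, neither of which is spelled out above, and the names overlap_bp, genes_hit and keep_products are hypothetical.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of bp shared by two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def genes_hit(product, exons, min_overlap=150):
    """Return the genes whose exons overlap the product by >= min_overlap bp.

    product: (start, end); exons: list of (gene_name, start, end).
    Summing exon overlap per gene is an assumption for this sketch.
    """
    totals = {}
    for gene, start, end in exons:
        bp = overlap_bp(product[0], product[1], start, end)
        if bp:
            totals[gene] = totals.get(gene, 0) + bp
    return sorted(gene for gene, bp in totals.items() if bp >= min_overlap)


def keep_products(products, size_factor=1.5):
    """Keep products no larger than size_factor times the smallest one."""
    if not products:
        return []
    smallest = min(end - start for start, end in products)
    return [(s, e) for s, e in products if e - s <= size_factor * smallest]
```

For example, with a smallest predicted product of 1000 bp, a second product of 1400 bp would be kept and one of 1600 bp would be dropped.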


Updated gene names

Updated!

10-01-01 update all_primers file

10-02-01 update remade file


Overall summary

Which PCR products changed names?

all_primers

Changed 2899
Same 15879
None 353
No predicted PCR product 82

remade

Changed 115
Same 1212
None 11
No predicted PCR product 5

Compare the full set of genes in AceDB with the list of genes the Kim primer set amplifies, after discarding PCR products that contain more than 1 gene:

Kim primers amplify single genes 18412 (96% of primer pairs)
AceDB genes 19733
AceDB genes not amplified 2271 (11.5% of AceDB genes)

Now use stricter criteria. Require that the 2000_pcr result be good, faint, or not_run_on_a_gel in addition to the above criteria.

Kim primers amplify single genes 17036 (89% of primer pairs)
AceDB genes 19733
AceDB genes not amplified 3461 (17.5% of AceDB genes)
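
The "not amplified" percentages can be checked with a couple of lines of arithmetic. The counts come from the tables above; the helper name percent is just for this example.

```python
ACEDB_GENES = 19733  # total AceDB genes, from the tables above


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


not_amplified = percent(2271, ACEDB_GENES)         # single-gene products only
not_amplified_strict = percent(3461, ACEDB_GENES)  # plus the 2000_pcr gel criteria
print(not_amplified, not_amplified_strict)
```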


Summary histograms

Number of genes matching the PCR products

=0	434
1	17478
2	1100
3	107
4	23
5	13
6	11
7	7
8	4
9	1
10	0
>10	33

Number of PCR products per primer pair

=0	200
1	18412
2	377
3	45
4	15
5	17
6	16
7	6
8	5
9	1
10	2
>10	117

PCR product size histogram:

=0	307
1 - <101	3
101 - <201	7
201 - <301	4
301 - <401	17
401 - <501	29
501 - <601	30
601 - <701	146
701 - <801	159
801 - <901	860
901 - <1001	1553
1001 - <1101	2568
1101 - <1201	6049
1201 - <1301	16
1301 - <1401	369
1401 - <1501	641
1501 - <1601	1109
1601 - <1701	2091
1701 - <1801	84
1801 - <1901	162
1901 - <2001	295
2001 - <2101	339
2101 - <2201	482
2201 - <2301	813
2301 - <2401	224
2401 - <2501	435
2501 - <2601	74
2601 - <2701	129
2701 - <2801	210
2801 - <2901	0
2901 - <3001	1
>3000	5
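
The bin labels in these histograms follow a simple rule: zero gets its own bin, values above a ceiling are lumped together, and everything else goes into 100 bp bins. A small sketch of that rule is below. The names bin_label and histogram are invented for the example, and the original binning code may have differed.

```python
from collections import Counter


def bin_label(value, width=100, top=3000):
    """Return the histogram bin label for a non-negative integer value."""
    if value == 0:
        return "=0"
    if value > top:
        return ">%d" % top
    low = ((value - 1) // width) * width + 1  # 1, 101, 201, ...
    return "%d - <%d" % (low, low + width)


def histogram(values, width=100, top=3000):
    """Count how many values fall in each bin."""
    return Counter(bin_label(v, width, top) for v in values)
```

Passing top=2000 gives the ">2000" ceiling used for the repetitive content histogram.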

PCR product exon content (bp) histogram:

=0	447
1 - <101	15
101 - <201	280
201 - <301	822
301 - <401	1170
401 - <501	1278
501 - <601	1333
601 - <701	1421
701 - <801	5085
801 - <901	3402
901 - <1001	2068
1001 - <1101	1031
1101 - <1201	275
1201 - <1301	123
1301 - <1401	88
1401 - <1501	76
1501 - <1601	47
1601 - <1701	41
1701 - <1801	36
1801 - <1901	25
1901 - <2001	29
2001 - <2101	23
2101 - <2201	20
2201 - <2301	18
2301 - <2401	19
2401 - <2501	3
2501 - <2601	6
2601 - <2701	7
2701 - <2801	1
2801 - <2901	3
2901 - <3001	0
>3000	21

Simple sequence repeats up to 9mers were tracked.

PCR product repetitive content (bp) histogram:

=0	17726
1 - <101	571
101 - <201	343
201 - <301	136
301 - <401	97
401 - <501	62
501 - <601	73
601 - <701	15
701 - <801	11
801 - <901	29
901 - <1001	19
1001 - <1101	20
1101 - <1201	37
1201 - <1301	2
1301 - <1401	9
1401 - <1501	6
1501 - <1601	5
1601 - <1701	8
1701 - <1801	2
1801 - <1901	1
1901 - <2001	10
>2000	26


Jim Lund
Beckman Center, B365
279 Campus Dr.
Stanford University
Stanford, CA 94305
Phone: (650) 723-5996
FAX: (650) 725-7739
E-Mail: jiml@stanford.edu
Home page: worm-chip.stanford.edu