9-24-01 update! Corrected a number of errors.
Changes:
The previous version of the update
The first version of the update
In the following discussion, I split the primers into two lists:
1) Poor primers (remade)
2) Primers made once plus the remade primers (all_primers)
2) Compare the Kim chip primers with the chromosome sequence. Find where the primers match the genomic sequence. Assemble primer matches into PCR products with their location in the genomic sequence.
3) Find what AceDB features (exons and repetitive features) overlap the PCR products.
4) Filter and summarize the results.
5) Correct for some irregularities.
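The primer-matching step can be illustrated with a minimal sketch. It assumes exact string matching, a forward primer on the top strand, and a reverse primer whose reverse complement appears downstream. The function names and the max_size limit are hypothetical and are not taken from the original analysis, which may have used a different matching method.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_all(text, pattern):
    """Return every 0-based start position of pattern in text (overlaps allowed)."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def predict_products(genome, fwd, rev, max_size=5000):
    """Pair forward-primer hits with downstream reverse-primer hits.

    Returns (start, end, size) tuples, 0-based and half-open, on the top strand.
    max_size is an assumed upper limit, not a value from the original analysis.
    """
    genome = genome.upper()
    fwd_hits = find_all(genome, fwd.upper())
    rev_hits = find_all(genome, reverse_complement(rev))
    products = []
    for f in fwd_hits:
        for r in rev_hits:
            end = r + len(rev)
            if r >= f and end - f <= max_size:
                products.append((f, end, end - f))
    return products
```

A primer pair with no hits gives an empty list (no predicted PCR product), and a pair with several hits gives several candidate products. A fuller version would also search the opposite orientation.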
2) The cutoff for a match is a 150 bp overlap between a PCR product and a gene.
4) Where the PCR product matches in the gene is not taken into account.
4) Repetitive sequence filtering is not done. See histogram below.
5) Some genes amplify multiple PCR products. Additional PCR products are included if they are no larger than 1.5 times the size of the smallest predicted product.
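The overlap cutoff and the 1.5x size rule can be sketched as below. This is one reading of the notes above: it assumes the 150 bp cutoff is a minimum overlap with the gene span, that coordinates are half-open intervals, and that "up to 1.5 times" is inclusive. The function names and the gene names in the usage note are hypothetical.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Length in bp of the overlap between two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def genes_hit(product, genes, min_overlap=150):
    """Names of genes that overlap the product by at least min_overlap bp.

    product is (start, end); genes maps a name to (start, end).
    Where in the gene the product matches is not taken into account.
    """
    start, end = product
    return [name for name, (g_start, g_end) in genes.items()
            if overlap_bp(start, end, g_start, g_end) >= min_overlap]


def keep_products(sizes, factor=1.5):
    """Keep products no larger than factor times the smallest predicted product."""
    if not sizes:
        return []
    limit = factor * min(sizes)
    return [size for size in sizes if size <= limit]
```

For example, with a product spanning 0 to 1000 and two hypothetical genes starting at 800 and 900, only the first passes the cutoff (200 bp versus 100 bp of overlap). With predicted sizes of 1000, 1400, and 1600 bp, the 1600 bp product is dropped because it exceeds 1500 bp.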
10-01-01 update all_primers file
all_primers
Changed 2899
Same 15879
None 353
No predicted PCR product 82
remade
Changed 115
Same 1212
None 11
No predicted PCR product 5
Compare the full set of genes in AceDB with the list of genes that the Kim primer set amplifies. If we discard PCR products that contain more than one gene and repeat the comparison:
Kim primers amplify single genes 18412 (96% of primer pairs)
AceDB genes 19733
AceDB genes not amplified 2271 (11.5% of AceDB genes)
Now use stricter criteria. Require that the 2000_pcr result be good, faint, or not_run_on_a_gel in addition to the above criteria.
Kim primers amplify single genes 17036 (89% of primer pairs)
AceDB genes 19733
AceDB genes not amplified 3461 (17.5% of AceDB genes)
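The percentages in the two summaries can be checked with a few lines of arithmetic. The sketch assumes that the denominator for "% of primer pairs" is the total of the "Number of PCR products per primer pair" histogram; that denominator is inferred from the counts and is not stated explicitly in the text.

```python
# Counts copied from the "Number of PCR products per primer pair" histogram.
products_per_pair = {"=0": 200, "1": 18412, "2": 377, "3": 45, "4": 15, "5": 17,
                     "6": 16, "7": 6, "8": 5, "9": 1, "10": 2, ">10": 117}

# Assumed denominator for "% of primer pairs" (every pair falls in exactly one bin).
total_pairs = sum(products_per_pair.values())
acedb_genes = 19733


def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


summary = {
    "single-gene primer pairs": pct(18412, total_pairs),
    "AceDB genes not amplified": pct(2271, acedb_genes),
    "single-gene primer pairs (strict)": pct(17036, total_pairs),
    "AceDB genes not amplified (strict)": pct(3461, acedb_genes),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

Under that assumption the ratios come out to 95.8%, 11.5%, 88.7%, and 17.5%, which agrees with the rounded figures quoted above.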
Number of genes matching the PCR products
=0     434
1      17478
2      1100
3      107
4      23
5      13
6      11
7      7
8      4
9      1
10     0
>10    33
Number of PCR products per primer pair
=0     200
1      18412
2      377
3      45
4      15
5      17
6      16
7      6
8      5
9      1
10     2
>10    117
PCR product size histogram:
=0               307
1 - <101         3
101 - <201       7
201 - <301       4
301 - <401       17
401 - <501       29
501 - <601       30
601 - <701       146
701 - <801       159
801 - <901       860
901 - <1001      1553
1001 - <1101     2568
1101 - <1201     6049
1201 - <1301     16
1301 - <1401     369
1401 - <1501     641
1501 - <1601     1109
1601 - <1701     2091
1701 - <1801     84
1801 - <1901     162
1901 - <2001     295
2001 - <2101     339
2101 - <2201     482
2201 - <2301     813
2301 - <2401     224
2401 - <2501     435
2501 - <2601     74
2601 - <2701     129
2701 - <2801     210
2801 - <2901     0
2901 - <3001     1
>3000            5
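The bin labels used in these histograms can be reproduced with a short helper. This is a sketch that assumes 100 bp bins starting at 1, a separate "=0" bin, and an overflow bin above a fixed top value, as the labels suggest. The names size_bin and histogram are hypothetical.

```python
from collections import Counter


def size_bin(size, width=100, top=3000):
    """Return the histogram label for a size in bp, e.g. '101 - <201'."""
    if size == 0:
        return "=0"
    if size > top:
        return f">{top}"
    low = ((size - 1) // width) * width + 1
    return f"{low} - <{low + width}"


def histogram(sizes, width=100, top=3000):
    """Count how many sizes fall into each labelled bin."""
    return Counter(size_bin(size, width, top) for size in sizes)
```

Calling histogram with top=2000 would give the ">2000" overflow bin used for the repetitive content histogram.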
PCR product exon content (bp) histogram:
=0               447
1 - <101         15
101 - <201       280
201 - <301       822
301 - <401       1170
401 - <501       1278
501 - <601       1333
601 - <701       1421
701 - <801       5085
801 - <901       3402
901 - <1001      2068
1001 - <1101     1031
1101 - <1201     275
1201 - <1301     123
1301 - <1401     88
1401 - <1501     76
1501 - <1601     47
1601 - <1701     41
1701 - <1801     36
1801 - <1901     25
1901 - <2001     29
2001 - <2101     23
2101 - <2201     20
2201 - <2301     18
2301 - <2401     19
2401 - <2501     3
2501 - <2601     6
2601 - <2701     7
2701 - <2801     1
2801 - <2901     3
2901 - <3001     0
>3000            21
Simple sequence repeats up to 9mers were tracked.
PCR product repetitive content (bp) histogram:
=0               17726
1 - <101         571
101 - <201       343
201 - <301       136
301 - <401       97
401 - <501       62
501 - <601       73
601 - <701       15
701 - <801       11
801 - <901       29
901 - <1001      19
1001 - <1101     20
1101 - <1201     37
1201 - <1301     2
1301 - <1401     9
1401 - <1501     6
1501 - <1601     5
1601 - <1701     8
1701 - <1801     2
1801 - <1901     1
1901 - <2001     10
>2000            26
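Tracking simple sequence repeats with units of up to 9 bp could look roughly like the sketch below. It is only an illustration: the text does not say how many copies of a unit were required, so min_repeats is a hypothetical parameter. The function counts only simple tandem repeats, while the repetitive content above presumably also includes the repetitive features annotated in AceDB.

```python
import re


def ssr_bp(seq, max_unit=9, min_repeats=3):
    """Count bases covered by tandem repeats of 1..max_unit bp units.

    min_repeats is an assumed threshold, not a value from the original analysis.
    """
    seq = seq.upper()
    covered = [False] * len(seq)
    for unit in range(1, max_unit + 1):
        # A lookahead lets overlapping repeats be found; group 1 is the whole
        # repeat run and group 2 is the repeated unit.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (unit, min_repeats - 1))
        for match in pattern.finditer(seq):
            for i in range(match.start(1), match.end(1)):
                covered[i] = True
    return sum(covered)
```

For instance, in GGCACACACATT the run of four CA units covers 8 bases, while the flanking GG and TT are too short to count under the assumed threshold.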