Preliminary primer name update

These are preliminary results! They need double-checking, and the filter criteria need some tuning.

8-3-01 update! Added genes found on 3 cosmids that aren't assembled into a chromosome.

Changes:

The first version of the update

Description of the Kim lab chip primer set

1323 Primers were poor and were re-made
1323 Re-made poor primers
17924 Primers made once
---------
20570 total primer pairs

In the discussion that follows, I split the primers into 2 lists:

1) Poor primers (remade)
2) Primers made once and the re-made primers (all_primers)


Plan to update names

1) Download AceDB. The feature file (.gff) for each chromosome contains the bp positions of each exon and repetitive sequence feature.

2) Compare the Kim chip primers with the chromosome sequence. Find where the primers match the genomic sequence. Assemble primer matches into PCR products with their location in the genomic sequence.

3) Find what AceDB features (exons and repetitive features) overlap the PCR products.

4) Filter and summarize the results.
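
Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration, not the script actually used: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF, primer hits are assumed to be given as positions on one chromosome, and exons are assumed not to overlap each other.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive, as in GFF


def assemble_products(fwd_hits: List[int], rev_hits: List[int],
                      max_size: int = 5000) -> List[Interval]:
    """Pair forward-primer hits with downstream reverse-primer hits.

    fwd_hits: leftmost base of each forward-primer match (assumed input).
    rev_hits: rightmost base of each reverse-primer match (assumed input).
    Pairs longer than max_size bp are not reported.
    """
    products = []
    for f in sorted(fwd_hits):
        for r in sorted(rev_hits):
            size = r - f + 1
            if 0 < size <= max_size:
                products.append((f, r))
    return products


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def exon_content(product: Interval, exons: List[Interval]) -> int:
    """Total exon bp inside a PCR product (exons assumed non-overlapping)."""
    return sum(overlap_bp(product, e) for e in exons)


if __name__ == "__main__":
    products = assemble_products([100], [1200, 9000])
    print(products)
    print(exon_content(products[0], [(150, 249), (1100, 1300), (2000, 2100)]))
```

The same overlap function would work for repetitive sequence features, since they are also listed as bp intervals in the feature file.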


Decisions to be made in the analysis

1) PCR product size cutoff? Currently a 5 kb cutoff is used.

2) How large must the overlap between a PCR product and a gene be to count? Currently 150 bp.

3) Where does the PCR product match the gene? PCR products containing the 5' ends of genes may not hybridize, as the 5' ends of long genes don't get labelled. Currently ignored.

4) Repetitive sequence filtering. Currently ignored.

5) Some genes amplify multiple PCR products. This information is now included. Additional PCR products are included up to 1.5 times the size of the smallest predicted product. NEW!
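
The current settings for decisions 1, 2 and 5 could be expressed as a small filter like the one below. This is a sketch, not the actual analysis code: the function and constant names are hypothetical, and treating each threshold as inclusive (a 150 bp overlap counts, a product exactly 1.5 times the smallest is kept) is an assumption, since the text above doesn't say.

```python
MAX_PCR_SIZE = 5000      # bp, decision 1: size cutoff
MIN_GENE_OVERLAP = 150   # bp, decision 2: minimum overlap with a gene
SIZE_RATIO_LIMIT = 1.5   # decision 5: relative to the smallest product


def keep_products(sizes):
    """Return product sizes within the cutoff and within 1.5x the smallest.

    Inclusive comparisons are an assumption.
    """
    within = sorted(s for s in sizes if 0 < s <= MAX_PCR_SIZE)
    if not within:
        return []
    smallest = within[0]
    return [s for s in within if s <= SIZE_RATIO_LIMIT * smallest]


def genes_hit(overlaps):
    """overlaps maps gene name -> overlap in bp; return genes that count."""
    return sorted(g for g, bp in overlaps.items() if bp >= MIN_GENE_OVERLAP)


if __name__ == "__main__":
    print(keep_products([1000, 1400, 1600, 6000]))
    # "geneA" and "geneB" are made-up names for illustration only.
    print(genes_hit({"geneA": 150, "geneB": 149}))
```

Keeping the thresholds as named constants makes it easy to re-run the summary while tuning the criteria.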


Updated gene names

Updated!

8-3-01 update all_primers file

8-2-01 update remade file

8-3-01 good PCR product, unpredicted (all_primers file)

8-2-01 good PCR product, unpredicted (remade file)


Overall summary

Which PCR products changed names?

all_primers

Changed 3866
Same 14132
None 1132
No predicted PCR product 76

remade

Changed 163
Same 1107
None 50
No predicted PCR product 3

Compare the full set of genes in AceDB with the list of genes the Kim primer set amplifies. Here PCR products that contain more than 1 gene are discarded before making the comparison:

Kim primers amplify single genes 16049 (81%)
AceDB genes 19733
AceDB genes not amplified 3692 (19%)

Now use stricter criteria. Require that the 2000_pcr result be good, faint, or not_run_on_a_gel in addition to the above criteria.

Kim primers amplify single genes 14982 (76%)
AceDB genes 19733
AceDB genes not amplified 4758 (24%)
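
The percentages in the two comparisons above are fractions of the 19733 AceDB genes. A quick way to re-derive them, assuming rounding to the nearest whole percent (the helper name is hypothetical):

```python
ACEDB_GENES = 19733  # total AceDB genes, from the summary above


def percent_of_acedb(count, total=ACEDB_GENES):
    """Percentage of AceDB genes, rounded to the nearest whole percent."""
    return round(100 * count / total)


if __name__ == "__main__":
    counts = [
        ("amplify single genes", 16049),
        ("not amplified", 3692),
        ("amplify single genes (stricter)", 14982),
        ("not amplified (stricter)", 4758),
    ]
    for label, count in counts:
        print(f"{label}: {count} ({percent_of_acedb(count)}%)")
```

In both comparisons the amplified and not-amplified counts add up to slightly more than 19733 (19741 and 19740), so the two categories may not be strictly complementary. This is one of the things to double check.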


Summary histograms

Number of genes matching the PCR products

0	1127
1	17089
2	774
3	43
4	14
5	10
6	10
7	9
9	2
10	1
11	3
17	9
23	6
25	6
26	4
51	1
52	1
53	2
55	1

Number of PCR products per primer pair

=0	95
1	18574
2	382
3	50
4	19
5	20
6	16
7	6
8	5
9	1
10	2
>10	132

PCR product size (bp) histogram:

=0	0
1 - <101	3
101 - <201	7
201 - <301	6
301 - <401	21
401 - <501	34
501 - <601	37
601 - <701	158
701 - <801	163
801 - <901	879
901 - <1001	1568
1001 - <1101	2588
1101 - <1201	6084
1201 - <1301	18
1301 - <1401	379
1401 - <1501	649
1501 - <1601	1108
1601 - <1701	2099
1701 - <1801	86
1801 - <1901	163
1901 - <2001	306
2001 - <2101	342
2101 - <2201	486
2201 - <2301	823
2301 - <2401	231
2401 - <2501	446
2501 - <2601	74
2601 - <2701	129
2701 - <2801	216
2801 - <2901	0
2901 - <3001	1
>3000	6
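
The bin labels used in the histogram above (and in the ones that follow) can be reproduced with a small helper. This is a sketch under assumptions: values are non-negative integers in bp, bins are 100 bp wide starting at 1, and the function names are hypothetical.

```python
from collections import Counter


def bin_label(value, width=100, top=3000):
    """Return the histogram row label for a non-negative integer value."""
    if value == 0:
        return "=0"
    if value > top:
        return f">{top}"
    # Bins start at 1, 101, 201, ... so shift by one before dividing.
    low = ((value - 1) // width) * width + 1
    return f"{low} - <{low + width}"


def histogram(values, width=100, top=3000):
    """Count how many values fall under each row label."""
    return Counter(bin_label(v, width, top) for v in values)


if __name__ == "__main__":
    for label, n in histogram([0, 1, 100, 101, 3000, 3001]).items():
        print(label, n, sep="\t")
```

Passing top=2000 gives the labels used for the repetitive content histogram.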

PCR product exon content (bp) histogram:

=0	1164
1 - <101	0
101 - <201	552
201 - <301	1318
301 - <401	1578
401 - <501	1608
501 - <601	1506
601 - <701	1481
701 - <801	4534
801 - <901	3058
901 - <1001	1871
1001 - <1101	925
1101 - <1201	243
1201 - <1301	121
1301 - <1401	81
1401 - <1501	69
1501 - <1601	47
1601 - <1701	30
1701 - <1801	41
1801 - <1901	20
1901 - <2001	24
2001 - <2101	17
2101 - <2201	15
2201 - <2301	10
2301 - <2401	15
2401 - <2501	2
2501 - <2601	5
2601 - <2701	6
2701 - <2801	1
2801 - <2901	2
2901 - <3001	1
>3000	20

Simple sequence repeats up to 9mers were tracked.

PCR product repetitive content (bp) histogram:

=0	17777
1 - <101	601
101 - <201	365
201 - <301	124
301 - <401	92
401 - <501	64
501 - <601	58
601 - <701	6
701 - <801	9
801 - <901	6
901 - <1001	2
1001 - <1101	1
1101 - <1201	3
1201 - <1301	0
1301 - <1401	1
1401 - <1501	0
1501 - <1601	0
1601 - <1701	0
1701 - <1801	1
1801 - <1901	0
1901 - <2001	1
>2000	1


Jim Lund
Beckman Center, B365
279 Campus Dr.
Stanford University
Stanford, CA 94305
Phone: (650) 723-5996
FAX: (650) 725-7739
E-Mail: jiml@stanford.edu
Home page: worm-chip.stanford.edu