The Iowa State University Maize EST Project

Conducted by the Schnable Laboratory with assistance from Drs. Hui-Hsien Chou (Iowa State University), Bento Soares (University of Iowa), and Michael Scanlon (Cornell University). Funded by the National Science Foundation.

As of 04/01/2005, over 63,000 ESTs have been submitted to GenBank. Most of these ESTs were generated via the 3' sequencing of clones from B73 cDNA libraries. The corresponding clones are available for distribution. Details on the cDNA libraries that are being sequenced are provided in the table below:

| Library | ISUM3 | ISUM4 | ISUM5 | ISUM6 | ISUM7 | UGA-ZmSAM-XZ1 | UGA-ZmSAM-XZ2 |
|---|---|---|---|---|---|---|---|
| Inbred | B73 | B73 | B73 | B73 | Mo17 | B73 | B73 |
| Vector | pT7T3PAC | pT7T3PAC | pT7T3PAC | pSlip7 | pCMV-Script EX | Uni-Zap XR | Uni-Zap XR |
| Host Strain | DH10B | DH10B | DH10B | DH10B | DH10B | XL1-Blue | XL1-Blue |
| Normalized? | No | Yes | Yes | No | No | No | No |
| Complexity | 4.6 x 10^6 | 4.6 x 10^6 | 1.7 x 10^6 | 1.0 x 10^6 | 6.0 x 10^6 | 3.0 x 10^6 | 3.0 x 10^6 |
| Plate #s | MEST8, MEST33-36 | MEST9-32, MEST37-125, MEST595-616 | MEST126-410 | MEST500, MEST502-554 | MEST501, MEST555, MEST564-583 | MEST700 | MEST900, MEST1220 |
| Vector Primer (Forward / Reverse) | T7-1 / T3 | T7-1 / T3 | T7-1 / T3 | T7-1 / Sp6 | T3 / T7-1 | T7-1 / T3 | T7-1 / T3 |
| Cloning Site (5' / 3') | EcoRI/NotI | EcoRI/NotI | EcoRI/NotI | EcoRI/NotI | EcoRI/XhoI | EcoRI/XhoI | EcoRI/XhoI |
| Sequence Count | 628 | 9,095 | 20,250 | 3,685 | 815 | 64 | 30,973 |
| Fasta File (tar.gz) | 114 KB | 1.0 MB | 3.0 MB | 631 KB | 116 KB | 9 KB | 6.0 MB |

ISUM3 Library

The ISUM3 library was prepared by Dr. Fang Qiu using mRNA isolated from B73 seedling and silk tissue. ds-cDNA molecules were generated as follows. First-strand cDNA was prepared from oligo-dT selected mRNA by priming with a NotI oligo-dT primer (5'-AACTGGAAGAATTCGCGGCCGCAGGAATTTTTTTTTTTTTTTTTT-3'). The resulting DNA:RNA hybrid was treated with RNase H and used as a template for DNA PolI-catalyzed second-strand synthesis. After the addition of EcoRI adaptors (5'-d[AATTCGGCACGAGG]-3', 3'-(GCCGTGCTCC)p-5'; Amersham 27-7805-01), the ds-cDNAs were digested with NotI. Molecules between 0.5 and 2.0 kb were directionally cloned into the EcoRI and NotI sites of the pT7T3PAC vector.

ISUM4 Library

Dr. Fang Qiu prepared ISUM4 by normalizing library ISUM3 using the protocols of Dr. Bento Soares (University of Iowa), modified from the following:

Bonaldo MF, G Lennon, MB Soares (1996) Normalization and subtraction: Two approaches to facilitate gene discovery. Genome Research 6(9): 791-806

ISUM5 Library

ISUM5 was prepared by Dr. Fang Qiu and normalized (via Bento Soares' methods) using a complex mixture of mRNAs that would be expected to contain substantially more maize genes than ISUM3 and ISUM4. These mRNAs were extracted from almost 70 different B73 tissue samples covering many different organs, stages of development, and conditions, including treatment with the hormones gibberellic acid, cytokinin, ethylene, abscisic acid, auxin, brassinolide, and jasmonate, and from calli treated with cycloheximide (see ISUM6 for details). In addition to causing difficulties in sequencing cDNA clones, long poly-A tails have the potential to reduce gene complexity following normalization. To circumvent these problems, this library was prepared using a strategy developed by Wang et al. (2000).

Wang SM, SC Fears, L Zhang, JJ Chen, JD Rowley (2000) Screening poly(dA/dT)- cDNAs for gene identification. Proceedings of the National Academy of Sciences 97(8): 4162-7

ISUM6 Library

ISUM6 was prepared by Dr. Fang Qiu using the same mRNA samples as ISUM5. First-strand cDNAs were prepared from 21 pools of oligo-dT selected mRNAs by priming with 21 different NotI oligo-dT tag primers (5'-AACTGGAAGAATTCGCGGCCGCNNNNNNTTTTTTTTTTTTTTTTTT-3'). Distinguishable "bar code" tags, (N)6, were used for each separate first-strand cDNA synthesis. Hence, these bar code tags can be used to identify the mRNA pool from which a particular cDNA clone was derived. The "bar code" tags associated with specific tissue sources are:

  • ATACGC--Germinated seeds and seedlings (1, 2, 8, 11 DAG)
  • ACTGGC--Mixed mature tissues (17, 21, 38, 69, 77 DAG)
  • CACAGC--Kernels (3, 5, 10, 15, 20, 25, 30 DAP)
  • TAACCC--Adventitious roots (65 DAG)
  • CAGGCG--Tassels (3-39 cm, 53 and 56 DAG)
  • AGGTAC--Immature ears (0.2-3.0 cm, 53, 56, 59 DAG)
  • TGAGCG--Husks (73 DAG)
  • GACCAC--Silks
  • AATCGG--Unpollinated first ears
  • CTAAGG--Ear shanks
  • GAAGAG--Etiolated seedlings
  • AGTGAG--Callus
  • GTGGAC--Cycloheximide-treated callus
  • GTCACC--Anaerobically treated seedlings
  • CGTCCA--NAA (α-naphthaleneacetic acid)-treated seedlings
  • GATGCC--Kinetin-treated seedlings
  • AAGACC--ACPC (1-aminocyclopropane-1-carboxylic acid)-treated seedlings
  • GCCTCA--Brassinolide-treated seedlings
  • CTAGCC--ABA (Abscisic acid)-treated seedlings
  • TACGGA--GA (Gibberellic acid)-treated seedlings
  • GCAGGA--JA (Jasmonic acid)-treated seedlings
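The tag list above lends itself to a simple lookup. The following Python sketch is illustrative only and is not the project's own software: it assumes that a raw (untrimmed) read still contains the primer-derived NotI site, the 6-nt tag, and a poly-T run in the same orientation as the primer shown above, and the requirement of at least 10 Ts is an arbitrary assumption. The names TAG_TO_POOL and pool_for_read are hypothetical.

```python
import re

# Bar code tag -> mRNA pool, taken from the list above.
TAG_TO_POOL = {
    "ATACGC": "Germinated seeds and seedlings",
    "ACTGGC": "Mixed mature tissues",
    "CACAGC": "Kernels",
    "TAACCC": "Adventitious roots",
    "CAGGCG": "Tassels",
    "AGGTAC": "Immature ears",
    "TGAGCG": "Husks",
    "GACCAC": "Silks",
    "AATCGG": "Unpollinated first ears",
    "CTAAGG": "Ear shanks",
    "GAAGAG": "Etiolated seedlings",
    "AGTGAG": "Callus",
    "GTGGAC": "Cycloheximide-treated callus",
    "GTCACC": "Anaerobically treated seedlings",
    "CGTCCA": "NAA-treated seedlings",
    "GATGCC": "Kinetin-treated seedlings",
    "AAGACC": "ACPC-treated seedlings",
    "GCCTCA": "Brassinolide-treated seedlings",
    "CTAGCC": "ABA-treated seedlings",
    "TACGGA": "GA-treated seedlings",
    "GCAGGA": "JA-treated seedlings",
}

# NotI site (GCGGCCGC), then the (N)6 tag, then the oligo-dT stretch.
# Assumption: at least 10 of the primer's Ts survive in the read.
TAG_PATTERN = re.compile(r"GCGGCCGC([ACGT]{6})T{10,}")


def pool_for_read(read):
    """Return the mRNA pool for a raw read, or None if no known tag is found."""
    match = TAG_PATTERN.search(read.upper())
    if match is None:
        return None
    return TAG_TO_POOL.get(match.group(1))


if __name__ == "__main__":
    example = "AACTGGAAGAATTCGCGGCCGC" + "GACCAC" + "T" * 18 + "ACGTACGT"
    print(pool_for_read(example))
```

Anchoring the match on the NotI site keeps a random 6-mer elsewhere in the insert from being mistaken for a tag; reads in which the site or tag has been trimmed away simply return None.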

Equal amounts of first-strand cDNA from each reaction were combined and used as template for DNA PolI-catalyzed second-strand synthesis. After the addition of EcoRI adaptors, the ds-cDNAs were digested with NotI. Molecules between 0.5 and 2.0 kb were directionally cloned into the EcoRI and NotI sites of the pSlip7 expression vector (pSPORT1, GIBCO BRL) (manuscript in preparation). Plasmid DNA isolated from the library was digested with NotI to remove empty vector clones. Linear DNAs from 5.4 to 7 kb were gel purified and ligated at low concentration to promote recircularization. Ligation products were precipitated and transformed into DH10B host cells.

ISUM7 Library

ISUM7 was prepared by Yongjie Yang using RNA from one-month-old Mo17 plants. First-strand cDNA was generated from oligo-dT selected mRNA by priming with an XhoI oligo-dT primer (5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTT-3'). The resulting DNA:RNA hybrid was treated with RNase H and used as a template for DNA PolI-catalyzed second-strand synthesis. After the addition of EcoRI adaptors (5'-OH-AATTCGGCACGAGG-3', 3'-GCCGTGCTCCp-5'; Stratagene), the ds-cDNAs were digested with XhoI. Molecules between 0.5 and 1.5 kb were directionally cloned into the EcoRI and XhoI sites of the lambdaZAP-CMV XR vector. The resulting phage library was converted to a pCMV-Script EX phagemid library by en masse in vivo excision. DNA isolated from the phagemid library was digested with NotI to remove empty vector clones. Linear DNAs from 4.8 to 6 kb were gel purified and ligated at low concentration to promote recircularization. Ligation products were precipitated and transformed into DH10B host cells.

UGA-ZmSAM-XZ1 Library

UGA-ZmSAM-XZ1 was constructed by Xiaolan Zhang. Vegetative shoot apical meristems (SAMs) and leaf primordia staged P1-P4 from seedlings 14-17 days after germination were quickly dissected into dry ice under a light microscope. Total RNA was isolated using Trizol and mRNA was purified with Dynal Oligo-DT25. ds-cDNA molecules were generated as follows. First-strand cDNA was prepared from oligo-dT selected mRNA by priming with an XhoI oligo-dT primer (5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTT-3'). The resulting DNA:RNA hybrid was treated with RNase H and used as a template for DNA PolI-catalyzed second-strand synthesis. After the addition of EcoRI adaptors, the ds-cDNAs were digested with XhoI and size-selected from 300 bp to 600 bp. The resulting molecules were directionally cloned into the EcoRI and XhoI sites of the Uni-Zap XR vector. The lambda library was packaged with Gigapack III Gold packaging extract and was mass excised using XL1-Blue cells and ExAssist helper phage. Excised phagemids were titered in SOLR cells and plated onto LB-ampicillin agar plates. Base calling was conducted using Phred. Trimming was performed using Lucy with the following criteria: (-minimum 200 -error 0.01 0.01 -bracket 10 0.01). A low-complexity filter was applied and additional trimming was conducted to remove E. coli, vector, and organelle contamination.

UGA-ZmSAM-XZ2 Library

UGA-ZmSAM-XZ2 was constructed by Xiaolan Zhang. Vegetative shoot apical meristems (SAMs) and leaf primordia staged P1-P4 from seedlings 14-17 days after germination were quickly dissected into dry ice under a light microscope. Total RNA was isolated using Trizol and mRNA was purified with Dynal Oligo-DT25. ds-cDNA molecules were generated as follows. First-strand cDNA was prepared from oligo-dT selected mRNA by priming with an XhoI oligo-dT primer (5'-GAGAGAGAGAGAGAGAGAGAACTAGTCTCGAGTTTTTTTTTTTTTTTTTT-3'). The resulting DNA:RNA hybrid was treated with RNase H and used as a template for DNA PolI-catalyzed second-strand synthesis. After the addition of EcoRI adaptors, the ds-cDNAs were digested with XhoI and size-selected to be >600 bp. The resulting molecules were directionally cloned into the EcoRI and XhoI sites of the Uni-Zap XR vector. The lambda library was packaged with Gigapack III Gold packaging extract and was mass excised using XL1-Blue cells and ExAssist helper phage. Excised phagemids were titered in SOLR cells and plated onto LB-ampicillin agar plates. Base calling was conducted using Phred. Trimming was performed using Lucy with the following criteria: (-minimum 200 -error 0.01 0.01 -bracket 10 0.01). A low-complexity filter was applied and additional trimming was conducted to remove E. coli, vector, and organelle contamination. After processing, ~30% of the sequences contained a minimum of 10 Ts at the beginning of the sequence. For reasons that are not understood, many of the clones in this library lack an XhoI site at their 3' ends.

Sequencing Pipeline

The Schnable laboratory isolates plasmid sequencing templates (see http://schnablelab.plantgenomics.iastate.edu for protocols); electrophoresis is conducted on an ABI 3700 at the ISU DNA Sequencing and Synthesis Facility (DSSF).

Bioinformatic Analyses

After sequences are received from the DSSF, they are subjected to Phred analysis (Ewing et al., 1998; Ewing and Green, 1998) to improve base calling and to define the limits of quality sequence. Only a few percent of sequence attempts fail Phred, primarily as a consequence of sequencing failures. Subsequently, Lucy (version 1.16s) is used to trim vector sequences from the ESTs and perform additional quality control. Specifically, Lucy's parameters are set to ensure an overall trimmed sequence quality of >97.5%. Assistance implementing Lucy was obtained from ISU faculty member Dr. Hui-Hsien Chou, who developed Lucy while employed at TIGR. Between 13% and 18% of sequence attempts fail Lucy, primarily as a consequence of empty vector sequences, low-quality sequences, or short sequence reads. Even so, the modal lengths of trimmed sequence reads that pass Lucy exceed 500 bp in all libraries. Prior to deposition in GenBank, trimmed insert sequences that are less than 200 bp and those that contain various sequence irregularities are filtered out. The remaining sequences are deposited in GenBank and are downloaded by ZmDB and the TIGR Gene Index.
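The quality and length thresholds described above can be illustrated with a little arithmetic. The sketch below is not Lucy's trimming algorithm; it only shows how Phred quality scores map to error probabilities (P = 10^(-Q/10)) and how an average-error ceiling of 0.025 corresponds to the ">97.5%" figure. The function names, the use of a simple mean over the whole trimmed read, and the combination of the quality check with the 200 bp cutoff in one function are assumptions made for illustration.

```python
def phred_to_error(q):
    """Convert a Phred quality score to an estimated base-call error probability."""
    return 10 ** (-q / 10)


def mean_error(quals):
    """Average per-base error probability over a trimmed read."""
    return sum(phred_to_error(q) for q in quals) / len(quals)


def passes_filters(quals, max_avg_error=0.025, min_length=200):
    """Hypothetical check: read is long enough and >97.5% accurate on average."""
    if len(quals) < min_length:
        return False
    return mean_error(quals) < max_avg_error


if __name__ == "__main__":
    good = [30] * 500   # Q30 everywhere: 0.1% expected error per base
    noisy = [10] * 500  # Q10 everywhere: 10% expected error per base
    short = [30] * 150  # high quality but under 200 bp
    for name, quals in (("good", good), ("noisy", noisy), ("short", short)):
        print(name, passes_filters(quals))
```

Averaging error probabilities rather than Phred scores matters because the scale is logarithmic: a handful of very poor bases can dominate the expected error of an otherwise clean read.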

ESTs generated by sequencing the 3' ends of cDNA clones are expected to contain a poly-T prefix. Low-quality bases between the poly-T and the high-quality region were replaced with N's, which serve as spacers, using a Perl program (est_process.pl) written by Dr. Hui-Hsien Chou. Those 3' sequences without a poly-T prefix are flagged in GenBank with the phrase, "Caution: this insert was apparently cloned in the wrong orientation." Clones that lack both a poly-T prefix and a poly-A at the other end are flagged in GenBank with the phrase, "Caution: this clone's poly-T may have been lost during the procedure used to remove empty vector clones from the library."
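The flagging rules above can be expressed as a small decision function. This Python sketch is one reading of those rules, not a port of est_process.pl: it assumes the full insert sequence is available in the orientation read from the 3' sequencing primer, so that a poly-T prefix sits at the start and any poly-A sits at the far end, and it assumes a run of at least 10 bases counts as poly-T or poly-A. The function name and the threshold are hypothetical.

```python
import re

WRONG_ORIENTATION = (
    "Caution: this insert was apparently cloned in the wrong orientation."
)
POLY_T_LOST = (
    "Caution: this clone's poly-T may have been lost during the procedure "
    "used to remove empty vector clones from the library."
)


def caution_flag(insert_seq, min_run=10):
    """Return the GenBank caution phrase for an insert, or None if none applies.

    insert_seq is assumed to be oriented as read from the 3' sequencing primer.
    """
    seq = insert_seq.upper()
    has_poly_t_prefix = re.match(rf"T{{{min_run},}}", seq) is not None
    has_poly_a_far_end = re.search(rf"A{{{min_run},}}$", seq) is not None

    if has_poly_t_prefix:
        return None               # expected structure for a 3' EST
    if has_poly_a_far_end:
        return WRONG_ORIENTATION  # poly-A at the other end, no poly-T prefix
    return POLY_T_LOST            # neither poly-T prefix nor poly-A


if __name__ == "__main__":
    print(caution_flag("T" * 15 + "ACGT" * 20))
    print(caution_flag("ACGT" * 20 + "A" * 15))
    print(caution_flag("ACGT" * 20))
```

Real reads would also need tolerance for the N spacers and for miscalled bases inside the homopolymer run, which this sketch deliberately leaves out.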