454 GS FLX

Revision as of 20:01, 28 July 2009 by WikiSysop (talk | contribs)

454 Sequencing is a large-scale parallel pyrosequencing system capable of sequencing roughly 400-600 megabases of DNA per 10-hour run on the Genome Sequencer FLX with GS FLX Titanium series reagents. The technology is known for its unbiased sample preparation and long, highly accurate sequence reads (400-500 base pairs in length), including paired reads.[citation needed] Software analysis tools, including an assembler, mapper and amplicon variant analyzer, are included with the system.
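As a rough illustration of what these figures imply, the number of reads per run can be estimated by dividing the run yield by the read length. The sketch below uses only the ranges quoted above; the function name `reads_per_run` is a hypothetical helper, and the result is an approximation rather than an instrument specification.

```python
def reads_per_run(bases_per_run: float, read_length: float) -> float:
    """Approximate read count implied by a run yield and a mean read length."""
    return bases_per_run / read_length


# Figures quoted in the text: 400-600 megabases per run, reads of 400-500 bp.
low = reads_per_run(400e6, 500)   # fewest reads: low yield, long reads
high = reads_per_run(600e6, 400)  # most reads: high yield, short reads
print(f"{low:,.0f} to {high:,.0f} reads per run")
# prints "800,000 to 1,500,000 reads per run"
```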

The system relies on fixing nebulized and adapter-ligated DNA fragments to small DNA-capture beads in a water-in-oil emulsion. The DNA fixed to these beads is then amplified by PCR. Each DNA-bound bead is placed into a ~29 μm well on a PicoTiterPlate, a fiber optic chip. A mix of enzymes such as DNA polymerase, ATP sulfurylase, and luciferase is also packed into the well. The PicoTiterPlate is then placed into the GS FLX System for sequencing.

Genome Sequencer FLX workflow

DNA library preparation and emPCR

Genomic DNA is fractionated into smaller fragments (300-800 base pairs) that are subsequently polished (made blunt at each end). Short adaptors are then ligated onto the ends of the fragments. These adaptors provide priming sequences for both amplification and sequencing of the sample-library fragments. One adaptor (Adaptor B) contains a 5'-biotin tag for immobilization of the DNA library onto streptavidin-coated beads. After nick repair, the non-biotinylated strand is released and used as a single-stranded template DNA (sstDNA) library. The sstDNA library is assessed for its quality and the optimal amount (DNA copies per bead) needed for emPCR is determined by titration.[citation needed]

The sstDNA library is immobilized onto beads. The beads containing a library fragment carry a single sstDNA molecule. The bead-bound library is emulsified with the amplification reagents in a water-in-oil mixture. Each bead is captured within its own microreactor where PCR amplification occurs. This results in bead-immobilized, clonally amplified DNA fragments.

Sequencing

sstDNA library beads are added to the DNA Bead Incubation Mix (containing DNA polymerase) and are layered with Enzyme Beads (containing sulfurylase and luciferase) onto a PicoTiterPlate device. The device is centrifuged to deposit the beads into the wells. The layer of Enzyme Beads ensures that the DNA beads remain positioned in the wells during the sequencing reaction. The bead-deposition process maximizes the number of wells that contain a single amplified library bead (avoiding more than one sstDNA library bead per well).

The loaded PicoTiterPlate device is placed into the Genome Sequencer FLX Instrument. The fluidics sub-system delivers sequencing reagents (containing buffers and nucleotides) across the wells of the plate. The four DNA nucleotides are added sequentially in a fixed order across the PicoTiterPlate device during a sequencing run. During the nucleotide flow, millions of copies of DNA bound to each of the beads are sequenced in parallel. When a nucleotide complementary to the template strand is added into a well, the polymerase extends the existing DNA strand by adding nucleotide(s). Addition of one (or more) nucleotide(s) generates a light signal that is recorded by the CCD camera in the instrument. This technique is based on sequencing-by-synthesis and is called pyrosequencing.[citation needed] The signal strength is proportional to the number of nucleotides incorporated; for example, homopolymer stretches incorporated in a single nucleotide flow generate a greater signal than single nucleotides. However, the signal strength for homopolymer stretches is linear only up to eight consecutive nucleotides, after which the signal falls off rapidly.[3] Data are stored in standard flowgram format (SFF) files for downstream analysis.
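The relationship between flow signals and bases can be sketched in a few lines of Python. This is a minimal illustration, not the instrument's base-calling software. It assumes that signals have already been normalized so that a value near 1.0 corresponds to one incorporated nucleotide, and that the fixed flow order is T, A, C, G; both are assumptions made for the example. The function name `call_bases` and its parameters are hypothetical. Flows whose rounded count exceeds the eight-nucleotide linear range mentioned above are reported separately, since their lengths are unreliable.

```python
def call_bases(flow_values, flow_order="TACG", max_linear=8):
    """Convert per-flow signal intensities into a base sequence.

    Each flow value is rounded to the nearest integer, which is taken as
    the number of nucleotides incorporated during that flow.
    flow_order and max_linear are illustrative assumptions.
    """
    bases = []
    uncertain_flows = []
    for i, signal in enumerate(flow_values):
        nucleotide = flow_order[i % len(flow_order)]
        count = int(signal + 0.5)  # round half up; signals are non-negative
        if count > max_linear:
            # Beyond the linear range the homopolymer length is unreliable.
            uncertain_flows.append(i)
        bases.append(nucleotide * count)
    return "".join(bases), uncertain_flows


# Made-up signals for eight flows (T, A, C, G, T, A, C, G).
example_flows = [1.02, 0.05, 2.10, 0.98, 0.03, 1.05, 0.00, 3.07]
sequence, uncertain = call_bases(example_flows)
print(sequence, uncertain)  # prints "TCCGAGGG []"
```

Simple rounding is the design choice that makes homopolymers error-prone: a signal of 4.5 sits exactly between a run of four and a run of five, and the ambiguity grows as the signal leaves the linear range.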

Applications

454 Sequencing can sequence any double-stranded DNA and enables a variety of applications including de novo whole genome sequencing, re-sequencing of whole genomes and target DNA regions, metagenomics and RNA analysis.

Full (Whole) Genome Sequencing (de novo sequencing and resequencing)

Full Genome Sequencing (FGS), also referred to as Whole Genome Sequencing (WGS), consists of projects dealing with the sequencing of the entire genome of an organism, for example, humans, dogs, mice, viruses or bacteria. 454 Sequencing technology is well suited to whole genome assembly due to its ultra-high throughput and long single reads (500 base pairs). The ability to combine shotgun reads with paired-end reads facilitates high genome coverage with minimal gaps. As a result, the 454 platform has effectively sequenced and assembled a number of complex genomes. In June 2006 454 Life Sciences launched a project with the Max Planck Institute for Evolutionary Anthropology to sequence the genome of the Neanderthal, the extinct closest relative of humans. This has implications for the understanding of human evolution and development. At 3 billion base pairs, a complete sequence of the Neanderthal genome is expected to take two years to finish.[4][5] In September 2008 the complete Neanderthal mitochondrial genome was sequenced, establishing the divergence between humans and Neanderthals at 660,000 +/- 140,000 years.[6]

In May 2007, researchers at the Baylor College of Medicine used the 454 Sequencing system to sequence and assemble the complete human diploid genome of Dr. James Watson. The feat demonstrated that generating high-quality sequence from humans, quickly and affordably, is now feasible.

Amplicon (Ultra Deep) Sequencing

Amplicon (Ultra Deep) Sequencing is a new field which is largely being enabled through 454 Sequencing technology. Unlike Sanger sequencing, 454 sequencing allows mutations to be detected at extremely low levels. Researchers are able to PCR amplify specific, targeted regions of DNA. This method is particularly useful for identifying low-frequency somatic mutations in cancer samples or discovering rare variants in HIV-infected individuals. The Genome Sequencer FLX system offers dedicated analysis software, the GS Amplicon Variant Analyzer, which automatically computes the alignment of reads from amplicon-based samples against a reference sequence.
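The idea of detecting a low-level mutation from deep coverage can be illustrated with a simple count of the bases observed at one reference position. The sketch below is not the GS Amplicon Variant Analyzer's algorithm. It assumes reads have already been aligned and padded to the reference length, with '-' marking uncovered positions, and the function name `variant_frequencies` and the sample data are hypothetical. A real analysis would also need to separate true variants from sequencing errors.

```python
from collections import Counter


def variant_frequencies(aligned_reads, reference, position):
    """Fraction of covering reads that carry each non-reference base.

    aligned_reads: strings already aligned to the reference and padded to
    its length, with '-' where a read does not cover a position.
    """
    ref_base = reference[position]
    observed = Counter(
        read[position] for read in aligned_reads if read[position] != "-"
    )
    depth = sum(observed.values())
    if depth == 0:
        return {}
    return {
        base: count / depth
        for base, count in observed.items()
        if base != ref_base
    }


# Made-up data: 199 reads match the reference, one carries G>A at index 2.
reference = "ACGT"
reads = ["ACGT"] * 199 + ["ACAT"]
print(variant_frequencies(reads, reference, 2))  # prints "{'A': 0.005}"
```

With 200 reads covering the position, a single variant read corresponds to a frequency of 0.5%, which shows why read depth sets the lower limit on the variant frequencies that can be observed.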

Transcriptome Sequencing

Transcriptome sequencing encompasses experiments including small RNA profiling and discovery, mRNA transcript expression analysis (full-length mRNA, expressed sequence tags (ESTs) and ditags, and allele-specific expression) and the sequencing and analysis of full-length mRNA transcripts. The transcriptome data derived from the Genome Sequencer FLX is ideally suited to detailed transcriptome investigation in the areas of novel gene discovery, gene space identification in novel genomes, assembly of full-length genes, single nucleotide polymorphism (SNP), insertion-deletion and splice-variant discovery.

Metagenomics

Metagenomics is the study of the genomic content in a complex sample. The two primary goals of this approach are to characterize the organisms present in a sample and identify what roles each organism has within a specific environment. Metagenomics samples are found nearly everywhere, including several microenvironments within the human body, soil samples, extreme environments such as deep mines and the various layers within the ocean. The Genome Sequencer FLX System enables a comprehensive view into the diversity and metabolic profile of an environmental habitat. The system’s long reads ensure the enormous specificity needed to compare sequenced reads against DNA or protein databases. Researchers often use the platform for counting environmental gene tags to analyze the relative abundance of microbial species under varying environmental conditions.
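Counting tags to estimate relative abundance amounts to tallying how many reads were assigned to each organism and dividing by the total. The sketch below assumes each read has already been assigned a taxon label by some database comparison, which is not shown; the function name `relative_abundance` and the labels are hypothetical.

```python
from collections import Counter


def relative_abundance(tag_assignments):
    """Map each taxon label to its share of all assigned tags."""
    counts = Counter(tag_assignments)
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


# Made-up assignments for four reads from one sample.
sample_tags = ["taxon_A", "taxon_A", "taxon_B", "taxon_A"]
print(relative_abundance(sample_tags))
# prints "{'taxon_A': 0.75, 'taxon_B': 0.25}"
```

Comparing these fractions between samples taken under different conditions is one simple way to express the shifts in abundance described above.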

Advantages and Disadvantages

454 Sequencing with GS FLX Titanium series reagents sequences 400-600 million bases per 10-hour run, allowing large amounts of DNA to be sequenced at low cost compared to Sanger chain-termination methods. With Q20 read lengths of 400 bases (99% accuracy at the 400th base and higher for preceding bases) and significantly higher throughput, de novo assembly with 454 Sequencing is at least equivalent to Sanger assembly, while being dramatically faster and an order of magnitude less expensive. G-C rich content is not as much of a problem, and the lack of reliance on cloning means that unclonable segments are not skipped. It is also capable of detecting mutations in an amplicon pool at a high sensitivity level, which may have implications in clinical research, especially cancer and HIV.[7][8]

A limitation of 454 sequencing remains the resolution of homopolymer DNA segments, i.e. regions of template that contain multiple consecutive copies of a single base (A, C, G or T). Since pyrosequencing relies on the magnitude of light emitted to determine the number of repeated bases, erroneous base calls can be a problem with homopolymers. Another disadvantage of 454 sequencing is that, while it is cheaper and faster per base, each run is quite expensive, and it is therefore unsuited for sequencing targeted fragments from small numbers of DNA samples, such as for phylogenetic analysis. For some sequencing applications the high cost of an individual 454 sequencing run can be offset by subdividing sequencing plates into multiple regions and using sample-specific molecular identifier (MID) tags of 10 bp to multiplex many individual samples in each sequencing run.
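Multiplexing with MID tags implies a demultiplexing step after the run, in which each read is assigned to a sample by its tag. The following is a minimal sketch of that step, assuming the 10 bp tag sits at the very start of each read and must match exactly. The tag sequences, sample names and the function name `demultiplex` are made up for illustration and are not taken from any vendor tag set. Production tools typically also tolerate sequencing errors within the tag.

```python
def demultiplex(reads, mid_to_sample, mid_length=10):
    """Group reads by the MID tag at their start and strip the tag.

    Returns (reads per sample, reads whose tag matched no sample).
    Exact matching at the read start is an assumption of this sketch.
    """
    by_sample = {sample: [] for sample in mid_to_sample.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:mid_length], read[mid_length:]
        sample = mid_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)
        else:
            by_sample[sample].append(insert)
    return by_sample, unassigned


# Hypothetical 10 bp tags and reads.
mids = {"ACGTACGTAC": "sample_1", "TGCATGCATG": "sample_2"}
pooled_reads = ["ACGTACGTACTTGACC", "TGCATGCATGGGATCA", "NNNNNNNNNNACGT"]
assigned, leftover = demultiplex(pooled_reads, mids)
print(assigned)  # prints "{'sample_1': ['TTGACC'], 'sample_2': ['GGATCA']}"
print(leftover)  # prints "['NNNNNNNNNNACGT']"
```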

References

A complete listing of peer-reviewed research articles can be found on the Roche/454 Sequencing website.

  1. ^ Wheeler, David A. (2008-04-17). "The complete genome of an individual by massively parallel DNA sequencing". Nature 452: 872–876. doi:10.1038/nature06884. http://www.nature.com/nature/journal/v452/n7189/pdf/nature06884.pdf. 
  2. ^ "Project Jim: Watson’s Personal Genome Goes Public". http://www.health-itworld.com/newsitems/2007/may/05-31-07-watson-genome. 
  3. ^ Margulies, Marcel; Michael Egholm and 54 additional coauthors (2005-09-15). "Genome Sequencing in Open Microfabricated High Density Picoliter Reactors". Nature (Nature Publishing Group) 437 (7057): 376–380. doi:10.1038/nature03959. PMID 16056220. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1464427. Retrieved on 2007-12-16. 
  4. ^ 454 Life Sciences (2006-07-20). 454 Life Sciences and Max Planck Institute to Sequence Neandertal Genome. Press release. http://www.454.com/news-events/press-releases.asp?display=detail&id=51. Retrieved on 2007-12-16. 
  5. ^ Max Planck Society (2006-07-20). Neandertal Genome to be Deciphered. Press release. http://www.mpg.de/english/illustrationsDocumentation/documentation/pressReleases/2006/pressRelease20060720/index.html. Retrieved on 2007-12-16. 
  6. ^ Green, Richard E.; et al. (2008-08-08). "A complete Neandertal Mitochondrial Genome Sequence Determined by High-Throughput Sequencing". Cell (Elsevier) 134: 416–426. doi:10.1016/j.cell.2008.06.021. http://www.eva.mpg.de/genetics/pdf/Green_Complete_Cell_2008.pdf.
  7. ^ Jan Fredrik Simons; Michael Egholm and 11 additional coauthors (2005-11-17). "Ultra-Deep Sequencing of HIV from Drug Resistant Patients" (JPG). 454 Life Sciences. http://www.454.com/downloads/news-events/publications/HIV%20Resistance%20Workshop_454_final.jpg. Retrieved on 2007-12-16. 
  8. ^ "Report on the XIV International HIV Drug Resistance Workshop: Parts 4-5". Selected Highlights from the XIV International HIV Drug Resistance Workshop. HIVandHepatitis.com. 2005-07-08. http://www.hivandhepatitis.com/2005icr/hivdrug/docs/070805_b.html. Retrieved on 2007-12-16.