Solid sequencing

From Kogic.net
Revision as of 16:03, 16 January 2011 by WikiSysop

SOLiD (Sequencing by Oligonucleotide Ligation and Detection) is a next-generation sequencing technology developed by Life Technologies that has been commercially available since 2008. Next-generation technologies generate hundreds of millions to billions of short sequence reads in a single run. Well-known examples include 454 pyrosequencing (introduced in 2005, generating millions of 200-400 bp reads in 2009), the Solexa system (introduced in 2006, generating hundreds of millions of 50-100 bp reads in 2009) and the SOLiD system (introduced in 2007, generating billions of 50 bp reads in 2009). These methods reduced the cost of sequencing from $0.01/base in 2004 to nearly $0.0001/base in 2006 and increased sequencing capacity from 1,000,000 bases/machine/day in 2004 to more than 5,000,000,000 bases/machine/day in 2009. Over 30 publications describe the use of SOLiD, first for nucleosome positioning by Valouev et al.,[1] then for transcriptional profiling (strand-sensitive RNA-Seq) by Cloonan et al.,[2] single-cell transcriptional profiling by Tang et al.,[3] and ultimately human resequencing by McKernan et al.[4]

Chemistry

A library of DNA fragments is prepared from the sample to be sequenced and is used to prepare clonal bead populations; that is, only one species of fragment is present on the surface of each magnetic bead. Each fragment attached to a bead carries a universal P1 adapter sequence, so the starting sequence of every fragment is both known and identical. The fragments are amplified by emulsion PCR, which takes place in microreactors containing all the necessary reagents for PCR. The resulting PCR products attached to the beads are then covalently bound to a glass slide.

Primers hybridize to the P1 adapter sequence within the library template. A set of four fluorescently labeled di-base probes competes for ligation to the sequencing primer. Specificity of the di-base probe is achieved by interrogating the 1st and 2nd base of the probe in each ligation reaction. Multiple cycles of ligation, detection and cleavage are performed, with the number of cycles determining the eventual read length. Following a series of ligation cycles, the extension product is removed and the template is reset with a primer complementary to the n-1 position for a second round of ligation cycles.
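The di-base probes can be illustrated with a small sketch. It assumes the commonly described two-base encoding, in which each base gets a 2-bit code (A=0, C=1, G=2, T=3) and the color of a dinucleotide is the XOR of the two codes; this mapping, the function names and the example sequence are not given in the text above and are illustrative only. Because the last adapter base is known, the colors can be decoded back into bases.

```python
# Assumed 2-bit codes; a dinucleotide's color (0-3) is the XOR of its two codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode_colors(seq):
    """Return one color (0-3) per pair of adjacent bases in seq."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        # XOR-ing the previous base's code with the color yields the next base.
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


# Hypothetical template; the leading 'T' stands in for the last adapter base.
template = "TACGGTCA"
colors = encode_colors(template)
print(colors)
print(decode_colors("T", colors))
```

Four colors are enough to label all 16 possible dinucleotides only because the preceding base is always known when a color is decoded, which is why the identical, known adapter sequence matters.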

Five rounds of primer reset are completed for each sequence tag. Through the primer reset process, each base is interrogated in two independent ligation reactions by two different primers. For example, the base at read position 5 is assayed by primer number 2 in ligation cycle 2 and by primer number 3 in ligation cycle 1.
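The double interrogation can be checked with a short sketch. It assumes a simplified model that the text does not spell out: each ligation reads two adjacent bases, successive ligations of one primer are five positions apart, primer number 1 reads positions 1 and 2 first, and each later primer is shifted back by one position. The function name and parameters are hypothetical.

```python
def primers_covering(position, n_primers=5, step=5):
    """Return the 1-based primer numbers whose di-base windows include position."""
    hits = []
    for primer in range(1, n_primers + 1):
        # Assumed offsets: primer 1 starts at position 1, primer 2 at 0, primer 3 at -1, ...
        offset = 2 - primer
        # A window covers its start position and the one after it.
        if (position - offset) % step in (0, 1):
            hits.append(primer)
    return hits


print(primers_covering(5))
for pos in range(1, 11):
    print(pos, primers_covering(pos))
```

Under these assumptions every read position is reported by exactly two primers, and position 5 is reported by primers 2 and 3, in line with the example above.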

Throughput and Accuracy

According to ABI, the SOLiD 3 Plus platform yields 60 gigabases of usable DNA data per run. Because of the two-base encoding system, an inherent accuracy check is built into the technology, which offers 99.94% accuracy. The chemistry of the system also means that, unlike the Roche 454 FLX system, it is not hindered by homopolymers, so large and difficult homopolymer repeat regions are no longer a problem to sequence.
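One way to picture the built-in accuracy check is the sketch below. It assumes the commonly described two-base encoding, where each color (0-3) is the XOR of the 2-bit codes of two adjacent bases, so a true single-base change alters two adjacent colors while leaving their XOR unchanged, and an isolated color change is more likely a measurement error. The function name, labels and example color strings are hypothetical, and the greedy pairing is a simplification rather than a description of the vendor's software.

```python
def classify_color_mismatches(ref_colors, read_colors):
    """Label each mismatching color index as part of a likely SNP or a likely miscall."""
    mismatches = [i for i, (r, q) in enumerate(zip(ref_colors, read_colors)) if r != q]
    calls = {}
    i = 0
    while i < len(mismatches):
        pos = mismatches[i]
        nxt = mismatches[i + 1] if i + 1 < len(mismatches) else None
        # A single-base change flips two adjacent colors but preserves their XOR.
        if nxt == pos + 1 and (ref_colors[pos] ^ ref_colors[nxt]) == (
            read_colors[pos] ^ read_colors[nxt]
        ):
            calls[pos] = calls[nxt] = "snp"
            i += 2
        else:
            calls[pos] = "miscall"
            i += 1
    return calls


reference = [3, 1, 3, 0, 1, 2, 1]        # colors of a hypothetical reference
read_with_snp = [3, 1, 1, 2, 1, 2, 1]    # two adjacent, consistent color changes
read_with_error = [3, 1, 3, 0, 1, 0, 1]  # one isolated color change

print(classify_color_mismatches(reference, read_with_snp))
print(classify_color_mismatches(reference, read_with_error))
```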

Applications

Naturally the technology will be used to sequence DNA, but because of the highly parallel nature of all next-generation technologies, they also have applications in transcriptomics and epigenomics.

Microarrays have been the mainstay of the transcriptomics world for the last ten years, and array-based technology has branched out to other areas. But arrays are limited in that information can be obtained only for probes that are on the chip, and only for organisms for which chips are available. They also come with all the problems of hybridizing large numbers of molecules (differing hybridization temperatures).

Transcriptomics by next-generation sequencing removes these barriers. Any organism's entire transcriptome could potentially be sequenced in one run (for very small bacterial genomes), and not only would each transcript be identified, but expression profiling is also possible because the reads are quantitative.
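As a rough illustration of how read counts become an expression profile, the sketch below counts the reads assigned to each transcript and scales the counts to reads per million. The normalization, the function name and the transcript labels are assumptions made for the example; the text does not prescribe a particular method.

```python
from collections import Counter


def reads_per_million(assignments):
    """Count reads per transcript and scale the counts to reads per million."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical read-to-transcript assignments from an aligner.
assignments = ["geneA", "geneA", "geneB", "geneA"]
print(reads_per_million(assignments))
```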

Chromatin immunoprecipitation (ChIP) is a method for determining transcription factor binding sites and DNA-protein interactions. It has in the past been combined with array technology (ChIP-chip) with some success, and next-generation sequencing can also be applied in this area. Methylated DNA immunoprecipitation (MeDIP) can likewise be performed, and it too has been combined with arrays.

The ability to learn more about methylation and transcription factor binding sites on a genome-wide scale is valuable and could teach us much about disease and molecular biology in general.

References

  1. Valouev A, Ichikawa J, Tonthat T, et al. (July 2008). "A high-resolution, nucleosome position map of C. elegans reveals a lack of universal sequence-dictated positioning". Genome Research 18 (7): 1051–63. doi:10.1101/gr.076463.108. PMID 18477713.
  2. Cloonan N, Forrest AR, Kolle G, et al. (July 2008). "Stem cell transcriptome profiling via massive-scale mRNA sequencing". Nature Methods 5 (7): 613–9. doi:10.1038/nmeth.1223. PMID 18516046.
  3. Tang F, Barbacioru C, Wang Y, et al. (May 2009). "mRNA-Seq whole-transcriptome analysis of a single cell". Nature Methods 6 (5): 377–82. doi:10.1038/nmeth.1315. PMID 19349980.
  4. McKernan KJ, Peckham HE, Costa GL, et al. (September 2009). "Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding". Genome Research 19 (9): 1527–41. doi:10.1101/gr.091868.109. PMID 19546169.