Next-generation sequencing (NGS) is a relatively new technology that has found uses in many branches of biology and could become the biggest methodological breakthrough since PCR. The technique allows sequence data from up to 300 million inserts in a DNA library to be determined simultaneously. This information can in turn either allow the origin of the library inserts to be deduced or identify the complete sequence of the DNA used to make the library. Just as with PCR, the technique's utility arises from the ease with which different sources of nucleic acid can be analysed. Consequently, NGS is widely used in both basic and translational research, and as a routine diagnostic tool.
For example, libraries made from a patient’s circulating cell-free DNA can be used to detect undiagnosed cancers or to screen pregnancies non-invasively for deleterious copy number variation in an embryo.
RNA-Seq data allow a researcher to determine the relative levels of all the transcripts expressed in a sample and to identify important transcriptional changes linked to various stimuli or to a tissue’s disease status. Even where no annotated transcriptome is available, differentially expressed genes can be identified by first generating a de novo transcriptome assembly from the RNA-Seq data and then using this assembly to screen for altered transcriptional activity. Diagnostically, de novo contig assembly from RNA-Seq data can be used to detect the presence of an RNA virus in a sample and, where applicable, to screen the virus for clinically important sequence variants at the same time.
Human whole-genome sequencing (WGS) and whole-exome sequencing (WES) libraries are routinely used to identify novel disease genes and/or deleterious variants in patient cohorts. Less than three years ago, routine diagnostic screening was limited to a very small pool of common disease-causing genes. However, as WGS and WES screen a patient’s entire genome or all of their protein-coding sequences at once, a single test methodology can be used to screen any patient for any set of disease-causing genes. This allows 50% of patients with a rare recessive condition to receive a diagnosis within 40 days. Prior to the use of diagnostic WES, the vast majority of these patients would not have received a molecular diagnosis. Where patients cannot be given a diagnosis, the sequence data can be used in a research setting to identify novel disease genes, which can ultimately provide a diagnosis to a further 10% of patients.
While the generation of NGS data has become almost trivial, the volume and complexity of the data create a wide range of bioinformatics challenges that vary with library type and sample origin. This gives members of the LeedOmics community wide scope to develop analysis methodologies that answer both basic and translational research questions which, until the development of NGS techniques, could not even be attempted.
The University of Leeds Next Generation Sequencing facility is able to perform a wide range of standard sequencing methodologies on samples of varying quality, as well as to develop novel NGS techniques. The facility has very strong links to the University’s Single Cell Genomics centre and is able to use the centre’s automation equipment to process large sample cohorts rapidly. The facility is also a partnership between the University of Leeds and the local NHS trust. It processes several thousand patient samples per year and consequently plays an important role in patient diagnosis across the Yorkshire regions.