UCLA Genotyping and Sequencing Core Resources
Last updated 8/19/08
Overview
The UCLA Genotyping and Sequencing Core Facility offers a wide range of core genetic techniques to research groups on the UCLA campus and in the broader scientific community. Services include DNA sequencing; genotyping of SNP and microsatellite markers, both in genome scans and in fine-mapping studies; mutation detection; quantitative PCR for gene expression analysis; database and bioinformatics services; and statistical genetics. The Core also provides training and trouble-shooting for students and postdoctoral fellows, and offers frequent seminars on emerging genetic technologies.
Key Personnel
Dr. Papp is an Associate Professor in the Department of Human Genetics at UCLA, Director of the UCLA Genotyping and Sequencing Core since 2000, and an active member of the Human Genetics Bioinformatics Core. She has spent the last ten years of her career working in genetic core facilities. Dr. Papp has extensive experience directing both large and small genomics projects, and in developing genomic databases and error-checking algorithms to improve the quality of genotyping data. Her databases have been used at the Wellcome Trust Centre for Human Genetics at the University of Oxford, Saint Bartholomew's Hospital in London, the Pasteur Institute of France, and the French National Center for Genotyping. Before coming to UCLA, Dr. Papp worked in large genome centers in France (the National Institute of Genotyping in Paris) and England (the Wellcome Trust Centre for Human Genetics at the University of Oxford).
The Core also employs a statistical genetics consultant, five Staff Research Associates, one Laboratory Assistant, and two programmers to operate a wide range of analytic equipment.
Major Equipment: Laboratory
- Biomek NX 8-channel robot
- Biomek FX 96-channel robot
- Robbins Hydra Microdispenser robot
- GeneMachine Hydroshear
- Qiagen TissueLyser
- Five ABI GeneAmp PCR System 9700 dual block thermal cyclers
- Two ABI 3730 capillary DNA sequencers
- ABI TaqMan 7900 Real-time PCR instrument
- Two Beckman CEQ 8000 capillary instruments
- Roche LightCycler 480
- PSQ96 Pyrosequencer
- Roche Genome Sequencer FLX (454)
Laboratory Resources
The Core consists of 1944 square feet of wet laboratory fully equipped for standard molecular genetic techniques, and an additional 122 square feet of office space. In addition to the standard equipment, the lab is equipped for automated high-throughput SNP and microsatellite genotyping, sequencing, and mutation detection. High-throughput sample preparation is performed on a Biomek NX 8-channel robot, a Biomek FX 96-channel robot, and a Robbins Hydra Microdispenser robot. These instruments are used for DNA dilutions, reaction set-up, and reaction pooling. DNA amplification is performed in 96- and 384-well PCR plates on ABI GeneAmp PCR System 9700 thermal cyclers. Polymorphism detection is conducted using a Roche LightCycler 480. For SNP genotyping, the Core has platforms appropriate to high-, medium-, and low-throughput projects. For high throughput, the SNPlex assay is performed on two ABI 3730 capillary instruments; for medium throughput, the ABI TaqMan 7900 Real-time PCR instrument or the Beckman CEQ 8000 capillary instrument can be used; for low throughput, the PSQ96 Pyrosequencer is employed. The ABI TaqMan 7900 Real-time, Quantitative PCR platform can also be used for gene expression analysis, mutation screening, and determining dissociation curves. Sequencing is performed primarily on the ABI 3730 capillary sequencer, which gives the longest reads and highest data quality. For data interpretation and analysis, the lab is equipped with 19 personal computers running Windows, Macintosh, and Linux operating systems.
Sanger Sequencing
The Sequencing Unit of the Core has two ABI 3730 capillary sequencers available for running sequences: one 48-capillary ABI 3730S and one 96-capillary ABI 3730XL. Capillary sequencing is more accurate and precise than slab gel sequencing because there is no spillover or bleed-through between lanes. Having multiple platforms guards against costly downtime, as work can be shifted onto the other machine if one is temporarily out of operation. The normal sequencing turn-around time is 12-24 hours.
The facility offers two sequencing services: 1. Full Service, in which the customer provides the template as plasmid, PCR product, or BAC, and Core personnel carry out the sequencing reactions and run the sample; and 2. Ready-to-Run, in which the customer performs the sequencing reaction and brings in the reacted sample for running on the capillary sequencer. All samples entering the Core laboratory are logged into the Core's sample-tracking database (IGDB) and given a unique tracking ID. On completion of sequencing and analysis, data is immediately available on Web-Seq, the Core's password-protected, web-based file retrieval system. The Core's scientists are also available for trouble-shooting and technical support at any stage of the sequencing process.
Massively Parallel High-Throughput Sequencing
The Core operates a Roche Genome Sequencer FLX (formerly 454) for next-generation sequencing. Applications possible on this instrument include:
- De novo Sequencing
- Comparative Genomics
- Whole Genome Sequencing
- Amplicon Resequencing
- Transcriptome Analysis
- Gene Regulation Studies
- Small RNA Identification
- Methylation Analysis
Genotyping of Multiallelic Markers
The Genotyping unit of the Core is highly automated in order to produce data quickly, accurately, and as inexpensively as possible. Genotyping is performed on the two ABI 3730 instruments. These instruments are capable of generating over 10,000 genotypes per instrument per day, 7 days a week. Liquid handling is performed using three robotic systems: a Robbins Hydra Microdispenser and a Biomek FX 96-channel robot for rapid, accurate preparation of 96- and 384-well microtiter plates, and a Biomek NX 8-channel robot to set up PCR reactions and to perform other liquid handling tasks. Marker amplification is performed using ABI GeneAmp PCR System 9700 thermal cyclers. Together, this equipment increases both genotyping speed and accuracy over manual methods.
PCR products are resolved using the ABI 3730 data collection software and sized using the GeneMapper software package from Applied Biosystems. Two positive control samples of CEPH 1347-2 per 96-well plate are used to validate the accuracy of genotype calls for each marker. The computer-generated genotypes are checked by technicians blind to disease diagnosis and family structure. The called genotypes are imported into a relational database for error-checking and archival.
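The control-sample check described above can be sketched in Python. This is a minimal illustration, not the Core's actual procedure: the function names, the marker names, and the allele sizes are all hypothetical, and the only rule assumed is that every control call for a marker must match the known reference genotype for that marker.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, int]  # two allele sizes in base pairs


def normalize(genotype: Genotype) -> Genotype:
    """Order the two alleles so that (152, 148) and (148, 152) compare equal."""
    first, second = sorted(genotype)
    return (first, second)


def failed_markers(
    expected: Dict[str, Genotype],
    control_calls: Dict[str, List[Genotype]],
) -> List[str]:
    """Return the markers where any control call disagrees with the reference."""
    failures = []
    for marker, calls in control_calls.items():
        reference = normalize(expected[marker])
        if any(normalize(call) != reference for call in calls):
            failures.append(marker)
    return failures


# Hypothetical reference genotypes for the control sample (illustrative values).
expected = {"marker_A": (148, 152), "marker_B": (201, 201)}

# Two control wells per plate, as in the text; the calls are made up.
control_calls = {
    "marker_A": [(152, 148), (148, 152)],  # both controls match the reference
    "marker_B": [(201, 201), (201, 203)],  # second control disagrees
}

print(failed_markers(expected, control_calls))
```

A marker returned by `failed_markers` would be a candidate for manual review or re-running before its genotypes are imported.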
Genotyping of SNPs
The Core currently has three technologies available for different applications of SNP detection, from discovery and validation to screening, linkage, and association studies. Low-throughput SNP genotyping can be performed on the PSQ96 Pyrosequencer. For medium throughput, the allelic discrimination assay is run on an ABI PRISM 7900 Sequence Detection System, or the primer extension assay on the Beckman CEQ 8000 capillary instrument. For high throughput, the SNPlex oligonucleotide ligation assay is run on the ABI 3730 capillary sequencer. The 3730, ABI's latest sequencing platform, is capable of performing over 200,000 SNP genotypes per day using the multiplexed SNPlex assay. With the current 48-plex primer sets, the cost is as low as $0.14 per SNP genotype, depending on the project size.
Data Management and Quality Control
Once the genotypes have been called, they are imported into a Microsoft SQL Server database for error-checking and data-cleaning. This Integrated Genetic Database (IGDB) holds all the genotypes generated in the Core, as well as information on the individual projects, such as locus information, pedigree information, phenotypic data, tissue source, DNA concentration, sample location, and instruments and technicians generating the data for the project. The database also holds marker information such as size range, heterozygosity, allele frequency, and genetic and physical maps.
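To make the relational layout concrete, here is a small sketch of how samples, markers, and genotypes might be linked. It is not the IGDB schema: every table and column name is hypothetical, the data are invented, and Python's built-in `sqlite3` module stands in for Microsoft SQL Server only so that the example is self-contained. The query shows one plausible error check, flagging alleles that fall outside a marker's recorded size range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: one row per sample, per marker, and per genotype call.
conn.executescript("""
CREATE TABLE sample (
    barcode        TEXT PRIMARY KEY,
    project        TEXT NOT NULL,
    tissue_source  TEXT,
    dna_conc_ng_ul REAL
);
CREATE TABLE marker (
    marker_id   TEXT PRIMARY KEY,
    min_size_bp INTEGER NOT NULL,
    max_size_bp INTEGER NOT NULL
);
CREATE TABLE genotype (
    barcode    TEXT NOT NULL REFERENCES sample(barcode),
    marker_id  TEXT NOT NULL REFERENCES marker(marker_id),
    allele1_bp INTEGER NOT NULL,
    allele2_bp INTEGER NOT NULL,
    instrument TEXT,
    technician TEXT,
    PRIMARY KEY (barcode, marker_id)
);
""")

# Invented example rows.
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?)",
    [("BC0001", "demo_project", "blood", 25.0),
     ("BC0002", "demo_project", "blood", 30.0)],
)
conn.execute("INSERT INTO marker VALUES ('marker_A', 140, 160)")
conn.executemany(
    "INSERT INTO genotype VALUES (?, ?, ?, ?, ?, ?)",
    [("BC0001", "marker_A", 148, 152, "instrument_1", "tech_1"),
     ("BC0002", "marker_A", 148, 175, "instrument_1", "tech_1")],  # 175 is out of range
)

# Flag genotype calls with an allele outside the marker's expected size range.
out_of_range = conn.execute("""
SELECT g.barcode, g.marker_id
FROM genotype AS g
JOIN marker AS m ON m.marker_id = g.marker_id
WHERE g.allele1_bp NOT BETWEEN m.min_size_bp AND m.max_size_bp
   OR g.allele2_bp NOT BETWEEN m.min_size_bp AND m.max_size_bp
ORDER BY g.barcode
""").fetchall()

print(out_of_range)
```

Keeping marker properties in their own table means a check like this is a single join rather than per-project code.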
Scientists in the UCLA Department of Human Genetics have developed statistical methods to trap errors and allow a more accurate dataset to be passed on for further statistical analysis. The quality checks are both local and global. That is, each genotype is evaluated independently according to a number of quality parameters, then the overall dataset is judged by population-based statistical methods [1], [2], including checks of Hardy-Weinberg equilibrium and comparisons to published allele frequencies. These methods are relevant for datasets both with and without family structure information. Finally, for pedigree-based datasets, a statistical analysis providing posterior mistyping probabilities at each genotype is performed [3]. All results obtained during the quality control process are fed back into the relational database. The use of a relational database confers the additional advantages of improved integrity, management, manipulation, and presentation of the considerable amounts of data generated in large genome studies. Tests can be applied during the course of a study, as more data become available. When the data have been thoroughly checked and validated, the results can be exported in a variety of formats for analysis by different statistical packages.
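As an illustration of a population-based check, the sketch below applies the standard textbook chi-square test of Hardy-Weinberg equilibrium to a biallelic marker. It is not the method of references [1] to [3], whose details are not given here; the function name and the genotype counts are made up for the example. The p-value uses the fact that, for one degree of freedom, the chi-square survival function equals erfc(sqrt(x/2)).

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic marker.

    Takes observed counts of the AA, AB, and BB genotypes and returns
    (chi-square statistic, p-value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")

    # Allele frequencies estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)

    # Skip classes with zero expectation (monomorphic markers).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: the first set is exactly in equilibrium,
# the second has no heterozygotes at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

A marker with a very small p-value, such as the second example, would be a candidate for closer inspection, since a strong heterozygote deficit can indicate systematic miscalling. For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.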
Special consideration is given to the issues of data security and patient confidentiality. To safeguard patient confidentiality to the highest degree, no information that could identify a patient is stored in the Genotyping Core databases connected to the network. Serial back-ups of the databases are stored at a remote site. Raw image data is maintained online for a period of several months while the data is likely to require frequent referencing. After this period, the image files are archived permanently.
Data Sharing
All data submitted to or generated by the Core is entered into the Core’s Integrated Genetic Database (IGDB). The Core accepts only de-identified data, so all data in the IGDB is identified by barcode only. Every data record has two release fields: 1. Release Private, that is, ready to return to the Principal Investigator or owner of the data; and 2. Release Public, that is, ready to post publicly in an anonymized form. The Release Private field is checked True as soon as a set of data is complete and fully cleaned. Release Public will generally be permitted shortly after publication of the data, on approval of the Principal Investigator. Once the Release Public field is checked True, de-identified data can be made publicly available through the Core’s web-based front-end. Other non-sensitive categories of data (statistics, metadata, and marker allele frequencies without trait information) will be publicly available over the web as soon as they are cleaned and flagged ready for Release Private. Information on experimental protocols, methodology, and definitions of variables is also posted on the same site. Researchers who want more information than the de-identified versions available on the web may submit on-line requests to the Core, which will forward them to the owner of the data.
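The two release fields can be pictured as flags derived from the state of a record. The sketch below is one possible reading of the policy, not the Core's implementation: the class and attribute names are hypothetical, and it treats "published" and "approved by the Principal Investigator" as strict requirements, whereas the text says only that public release is "generally" permitted under those conditions.

```python
from dataclasses import dataclass


@dataclass
class DataRecord:
    """Hypothetical record keyed by barcode only (no identifying information)."""
    barcode: str
    cleaned: bool = False       # dataset complete and fully cleaned
    published: bool = False     # data have appeared in a publication
    pi_approved: bool = False   # Principal Investigator has approved release

    @property
    def release_private(self) -> bool:
        # Ready to return to the Principal Investigator or data owner.
        return self.cleaned

    @property
    def release_public(self) -> bool:
        # Assumption: public release needs cleaning, publication, and approval.
        return self.release_private and self.published and self.pi_approved


record = DataRecord(barcode="BC0001", cleaned=True)
print(record.release_private, record.release_public)

record.published = True
record.pi_approved = True
print(record.release_private, record.release_public)
```

Deriving Release Public from Release Private in this way means a record can never be flagged for public release before it has been cleaned.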
Computing Resources
The Integrated Genetic Database is hosted by the UCLA Department of Human Genetics Bioinformatics Core. The hardware resources provided by the Bioinformatics Core include a Gigabit Ethernet Local Area Network and a connection to the second-generation Internet (Internet2). The database is implemented in Microsoft SQL Server 2000 and runs on a dedicated database server. The Bioinformatics Core also provides the necessary staffing to maintain these systems. Transaction logs and serial back-ups of the database are stored off-site.
User Base
The Core provides services to research groups on the UCLA campus and in the broader scientific community. Scientists from across the U.S. and Europe have made use of the facility. The Core currently has accounts with 396 research groups. The Core’s user database holds information on 3163 users, and there have been 1186 separate visitors to the laboratory in the last year. The Core website provides information about the Core, technical protocols, genetic education including a genetic glossary, and links and ordering information. The Core website receives approximately 6000 hits per month.
Administrative Support and Institutional Commitment
UCLA has made a commitment to supporting and facilitating the research of all its scientists by providing a system of core facilities. These cores are staffed and equipped at a cutting-edge level, allowing scientists to concentrate more time on intellectual problem solving, and less on acquiring and learning new technologies. The David Geffen School of Medicine at UCLA provides administrative support for many of the financial aspects of running a core. The Department of Human Genetics, where the Genotyping and Sequencing Core is housed, provides the space in which the Core operates, and the daily assistance of the Department's staff of five administrative specialists.
Budget
To calculate a budget for your grant, visit our Prices webpage. For genotyping projects, these figures should be used as estimates only. Before you run your project, you must meet with the Core to obtain a quote, as details of your project set-up may affect the cost. You should also consider making allowances in your budget to cover the possibility of having to do a small percentage of repeats of genotyping experiments, in case of missing data.
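A rough budget with an allowance for repeats can be worked out as in the sketch below. The function name, the sample and marker counts, the 5% repeat rate, and the per-genotype price are illustrative assumptions only; actual prices come from the Prices webpage and the quote provided by the Core.

```python
def estimate_genotyping_cost(
    n_samples: int,
    n_markers: int,
    price_per_genotype: float,
    repeat_fraction: float = 0.05,
) -> float:
    """Estimate total cost, inflating the genotype count to allow for repeats."""
    if not 0.0 <= repeat_fraction < 1.0:
        raise ValueError("repeat_fraction must be between 0 and 1")
    planned_genotypes = n_samples * n_markers
    # Budget for the planned genotypes plus a fraction that must be re-run.
    budgeted_genotypes = planned_genotypes * (1.0 + repeat_fraction)
    return budgeted_genotypes * price_per_genotype


# Illustrative numbers: 1000 samples, 48 SNPs, $0.14 per genotype, 5% repeats.
cost = estimate_genotyping_cost(1000, 48, 0.14, repeat_fraction=0.05)
print(f"Estimated cost: ${cost:,.2f}")
```

With these inputs, the 48,000 planned genotypes become 50,400 budgeted genotypes, so the estimate is a little over $7,000 rather than the $6,720 that the planned count alone would suggest.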
References
- [1] Papp JC, Kearsey G, Lange K. 2000. Improving the quality of genotyping data. Am J Hum Genet Suppl 67:A1658.
- [2] Presson A, Sobel E, Lange K, Papp JC. 2006. Merging microsatellite data. J Comput Biol 13:1131-1147.
- [3] Sobel E, Papp JC, Lange K. 2002. Detection and integration of genotyping errors in statistical genetics. Am J Hum Genet 70:496-508.