Researchers in California today unveiled what they describe as the world's largest repository for cancer genomes. The database will make it easier for scientists to analyze the large amounts of sequencing data pouring out of the genome projects of the U.S. National Cancer Institute (NCI).
The Cancer Genomics Hub (CGHub), built by a team from the University of California, Santa Cruz (UCSC), will first hold sequencing data from The Cancer Genome Atlas (TCGA). The atlas is NCI's enormous effort to sequence the DNA of normal cells and tumor cells from more than 10,000 people with 20 types of cancer. (In some cases, the project is sequencing entire genomes; in other cases, only the 1% of the genome that codes for proteins.) CGHub will also hold data from NCI's genome projects on childhood cancers and HIV-associated cancers. It will take over from the National Center for Biotechnology Information at the National Institutes of Health (NIH), which had been collecting cancer sequencing data until last August.
Physically based at the San Diego Supercomputer Center, the CGHub computer system is ready to store 5 petabytes of DNA and RNA data from cancer patients. (TCGA generates 10 terabytes of data per month and will eventually produce 10 petabytes [10,000 terabytes] of data.)
TCGA is building a catalog of the key genetic changes that drive cancer, which researchers can use to develop treatments tailored to the genetics of an individual's tumor. A central database will allow researchers to compare the changes and disrupted pathways across cancer types, said UCSC bioinformatician David Haussler, director of the project, which is funded by a $10.3 million contract from NCI: "What is very important is to gather the data in one place and make it easy for researchers to make comparisons between the data sets." CGHub will not hold data from other international cancer genome projects, however.
For now, researchers will only be able to download data. But sending genome data over the Internet is becoming less practical as the data sets balloon in size (see our 2011 story "Will Computers Crash Genomics?"). Haussler said that eventually, researchers will be able to work with the data remotely on CGHub's servers through cloud computing, as NIH has done with Amazon for data from its 1000 Genomes Project.