

The principal mission of the Theodosius Dobzhansky Center for Genome BioInformatics is to build a community where genetic and computer scientists, students, fellows and staff labor to develop, test, and apply innovative computational routines for human gene discovery, vertebrate genome assembly and annotation, functional imputation, and translation to global health and environmental well-being.

The Center will have five major project areas of emphasis:

  1. Human gene discovery based upon genome-wide association studies (GWAS), whole genome sequence analysis and application with newly designed population cohorts in Russia, China and the USA. The diseases for which we seek resistance/susceptibility loci include HIV/AIDS, hepatitis/liver cancer, and nasopharyngeal carcinoma. At the NCI Laboratory of Genomic Diversity, our group has assembled cohorts of 20,000 study participants. We propose cohort development for HIV in Russia, providing an unusual opportunity to uncover genetic components of these complex and important human diseases. Genome technologies designed to test several million single nucleotide polymorphism variants in these study populations will be supported by developing bioinformatics and computational routines that automate gene discovery based on the combination of allele/haplotype association and linkage disequilibrium in human populations.
  2. Genome sequencing, assembly and annotation of vertebrate species’ genomes. Through Genome10K (http://genome10k.org/), an international collaborative project founded and led by Dr. O’Brien that is meant to facilitate and accomplish whole genome sequencing, assembly and annotation of approximately 10,000 vertebrate species, we will usher in a new biology of genome sequence capacity for the coming generation. The hope is that one day soon every available vertebrate species will have a whole genome sequence for study and for application to comparative medicine as well as species betterment and conservation. Our strategy is to develop fast, proven, efficient computational algorithms for genome assembly, annotation, interpretation, gene discovery, genome mining and comparative genomics.
  3. Resolving ethical quandaries in human data and bio-specimen access in the post-genomic era. The excitement and hope of gene discovery for complex disease in large longitudinal disease population cohorts is confounded by an ethical issue that pits scientists’ right to open access and discovery against patients’ rights to privacy and informed consent over present and future investigations. This issue is being debated in the bioethics and human genomics community because, left unresolved, it can threaten the whole discipline of gene association studies. We are developing and implementing a compromise analytical and disclosure solution which allows open access to cohort-based gene association study results but still protects the privacy, anonymity, and consent preferences of study participants.
  4. Species conservation and genomic inference. We shall encourage and facilitate the application of genomics technology to understand the natural history of endangered species and inform management action around them. Studies across several decades have examined cheetahs and lions in Africa, pumas, leopards, whales and many other species to interpret the threats to endangered species in the context of recent history. Most recently we have identified the population structure of tiger subspecies including the Amur and Caspian tigers, which lived in Russian states until recently. The demonstration of the genetic identity of the Amur and Caspian tiger subspecies stimulated a bold proposal to consider restoration of tigers in the Caspian tiger habitat, which includes a dozen states of the former Soviet Union. With WWF-Russia we are working to monitor and facilitate that proposal by genetic monitoring of source tigers. We also propose to establish a Russian repository of animal species biospecimens as a prelude to empowering a generation of genetic investigation of Russian and other Asian species.
  5. Training and education. Each of the four areas above will be developed into graduate student and postdoctoral fellow training in Russia. In addition, we are developing short, intensive 2-week courses on each area to transfer the latest state-of-the-art bioinformatics and applications to the students of Russia and Asia.

We propose to accomplish these goals by actively facilitating international scientific expertise, access, training and cooperation to build accessible bioinformatics tools for gene discovery, gene transfer, gene therapy, imputation for therapy design, and species conservation.