Research Center of Bioinformatics
Editor: 贾岩  Updated: 2016-01-07

     

1. Mission

With the ubiquitous application of high-throughput sequencing (HTS) technologies, the volume of biological data continues to grow, and such data have become one of the most important big data resources, with enormous potential in many areas, including (but not limited to) life science, healthcare, medicine, environment, agriculture and national security. However, managing, analyzing and applying datasets at up to exabyte scale is a major challenge for current information technologies and systems. This has become a serious bottleneck for the development of many related areas, and bioinformatics is fundamental to breaking through it.

The goal of bioinformatics and computational biology research at HIT is to develop and apply advanced computer science theory, algorithms and information systems to improve the management, storage, search, analysis, mining, translation and visualization of biological data, and thereby to meet the wide and urgent demand for biological data applications. In particular, the HIT Center for Bioinformatics focuses on developing cutting-edge algorithms for HTS data management, indexing, analysis and visualization, and on building large-scale biological data management platforms, knowledge bases, and data processing and analysis systems. These provide effective solutions for many national-level tasks, such as the establishment and operation of China’s national biological data center, the development of precision medicine, and many of China’s fundamental genomics research projects.

2. Overview

Bioinformatics and computational biology is an interdisciplinary field that develops methods and software tools for understanding biological data. It combines computer science, statistics, mathematics and engineering to analyze and interpret biological data.

The bioinformatics research of the school focuses on the analysis and application of high-throughput biological data produced by technologies introduced in the last decade, e.g. microarrays and next-generation sequencing, with the ultimate goal of improving the interpretation of the genetic code and developing precision medicine for human diseases. Building on both theoretical and practical computing achievements since the last century, the faculty members have developed and applied computational technologies for tasks such as detecting genome mutations, annotating and visualizing the human genome, and mining relations between the genetic code and diseases.

There are now three professors, two associate professors and three lecturers, of whom four are doctoral supervisors and six are master’s supervisors. Over fifty graduate students are currently enrolled in this field, about half of them pursuing doctoral degrees. Since its inception, fifteen doctoral students and over forty master’s students have graduated from this field of the school.

In the past five years, faculty members in this research field have been awarded more than twenty research projects, of which ten were funded by NSF China and three by the National 863 Program. Some of these projects received Science and Technology Progress awards at the ministerial or provincial level, including one Top Prize, two Second Prizes and three Third Prizes. The faculty members have published over five hundred research papers in international and domestic academic journals and conferences, more than seventy of them in top venues.

Detailed information on the research work in this field can be found at http://mlg.hit.edu.cn/bit/index_en.html.

3. Research Topics

  • Short read mapping and assembly technologies: Beneficial to precision medicine that requires highly accurate personal genome identification.

  • Visualization technologies for human genomes: Targeting both personalized and pedigree genome interpretation.

  • Biological big data quality control: Protecting genomic research from low-quality data and data-bias traps.

  • Computational studies of functional genomics: Connecting biological big data analysis to downstream biological or clinical outcome predictions.

  • Biological ontology: Linking genomic molecules to systematic analysis of organism phenotypes and diseases.

4. The Faculty

Prof. Yadong Wang


He is a professor and the dean of the School of Computer Science and Technology, Director of the Center for Biomedical Information Technology and Software Systems of Heilongjiang Province, Director of the Bioinformatics and Computational Biology Key Lab of Heilongjiang Province, and a member of the Chinese Computer Association. During the last decade he was also a member of the expert committee for the Biology and Medicine field and an expert of the National High-Tech R&D Program of China (863 Program).

His research focuses on bioinformatics, machine learning and knowledge engineering. In the field of bioinformatics, he is particularly interested in the analysis of personal genomics and translational bioinformatics through high-throughput biomedical data (e.g. microarray data, next-generation sequencing data and clinical data).

He has led and completed more than 20 research projects, funded by the National Natural Science Foundation of China (NSFC), the National High-tech R&D Program of China (863 Program), and International Cooperation Programs. Four of these projects were awarded the Second Prize of National Science and Technology Progress, the Second-Class Prize of Science and Technology Progress of Heilongjiang Province and two Second-Class Prizes of Natural Science Foundation of Heilongjiang Province of China, respectively. He has published more than 80 papers in peer-reviewed academic journals, including Nucleic Acids Research and Bioinformatics.

Prof. Yunlong Liu

He is a professor at the Academy of Fundamental and Interdisciplinary Sciences, a deputy director of the Center for Biomedical Information Technology and Software Systems of Heilongjiang Province, and a deputy director of the Bioinformatics and Computational Biology Key Lab of Heilongjiang Province.

He is an associate editor of BMC Genomics and a member of the editorial boards of the following academic journals: Journal of Functional Informatics and Personalized Medicine; Journal of Computational Biology and Drug Design; Journal of Computational Intelligence in Bioinformatics and Systems Biology; and Protein and Peptide Letters.

His research focuses on the management, analysis and visualization of high-throughput sequencing data, decoding genetic mutation functions, and modeling gene regulation networks.

He has led and completed more than 10 research projects, funded by the National Natural Science Foundation of China (NSFC), the National High-tech R&D Program of China (863 Program), the National Key Technology R&D Program of China during the 9th Five-year Plan Period, and International Cooperation. He has published more than 50 papers in peer-reviewed academic journals, including Genome Biology, Nucleic Acids Research and Bioinformatics.

Other researchers

  • Prof. Guohua Wang, focusing on functional genomics analysis and modeling;

  • Associate Prof. Jie Li, focusing on drug response modeling and knowledge bases;

  • Associate Prof. Chunguang Ji, focusing on metabolic network modeling;

  • Assistant Prof. Bo Liu, focusing on sequencing reads mapping and assembly;

  • Assistant Prof. Mingxiang Teng, focusing on biological data bias and metrics;

  • Assistant Prof. Jian Liu, focusing on functional genomics analysis and modeling.

5. Selected Publications

The faculty members have published their findings extensively, especially in top journals such as Bioinformatics, Nucleic Acids Research and BMC Bioinformatics, and at top conferences such as the IEEE International Conference on Bioinformatics & Biomedicine (BIBM) and the IEEE International Conference on Bioinformatics and Bioengineering (BIBE).

5.1 Selected Journal Papers

  1. Yongzhuang Liu, Jian Liu, Jianguo Lu, Jiajie Peng, Liran Juan, Xiaolin Zhu, Bingshan Li and Yadong Wang. Joint Detection of Copy Number Variations in Parent-offspring Trios. Bioinformatics, 2015.

  2. Jiajie Peng, Tao Wang, Jixuan Wang, Yadong Wang and Jin Chen. Extending Gene Ontology with Gene Association Networks. Bioinformatics, 2015.

  3. Xiao Zhu, Henry C. M. Leung, Rongjie Wang, Francis Y. L. Chin, Siu Ming Yiu, Guangri Quan, Yajie Li, Rui Zhang, Qinghua Jiang, Bo Liu, Yucui Dong, Guohui Zhou and Yadong Wang. misFinder: Identify Mis-assemblies in an Unbiased Manner Using Reference and Paired-end Reads. BMC Bioinformatics, 2015.

  4. Bo Liu, Dengfeng Guan, Mingxiang Teng and Yadong Wang. rHAT: Fast Alignment of Noisy Long Reads with Regional Hashing. Bioinformatics, 2015.

  5. Liran Juan, Yongzhuang Liu, Yongtian Wang, Mingxiang Teng, Tianyi Zang and Yadong Wang. Family Genome Browser: Visualizing Genomes with Pedigree Information. Bioinformatics, 2015, 31(14): 2262-2268.

  6. Qinghua Jiang, Jixuan Wang, Xiaoliang Wu, Rui Ma, Tianjiao Zhang, Shuilin Jin, Zhijie Han, Renjie Tan, Jiajie Peng, Guiyou Liu, Yu Li and Yadong Wang. LncRNA2Target: A Database for Differentially Expressed Genes After lncRNA Knockdown or Overexpression. Nucleic Acids Research, 2015, 43(D1): D193-D196.

  7. Liran Juan, Mingxiang Teng, Tianyi Zang, Yafeng Hao, Zhenxing Wang, Chengwu Yan, Yongzhuang Liu, Jie Li, Tianjiao Zhang and Yadong Wang. The Personal Genome Browser: Visualizing Functions of Genetic Variants. Nucleic Acids Research, 2014, 42(W1): W192-W197.

  8. Yongzhuang Liu, Bingshan Li, Renjie Tan, Xiaolin Zhu and Yadong Wang. A Gradient-boosting Approach for Filtering de novo Mutations in Parent-offspring Trios. Bioinformatics, 2014, 30(13): 1830-1836.

  9. Yue Jiang, Yadong Wang and Michael Brudno. PRISM: Pair Read Informed Split Read Mapping for Base-pair Level Detection of Insertion, Deletion and Structural Variants. Bioinformatics, 2012, 28(20): 2576-2583.

  10. Qinghua Jiang, Yadong Wang, Yangyang Hao, Liran Juan, Mingxiang Teng, Xinjun Zhang, Meimei Li, Guohua Wang and Yunlong Liu. miR2Disease: A Manually Curated Database for microRNA Deregulation in Human Disease. Nucleic Acids Research, 2008, 37(1): D98-D104.

5.2 Selected Top Conference Papers

  1. Yang Bai, Shufan Ji, Yadong Wang. ESclassifier: A Random Forest Classifier for Detection of Exon Skipping Events from RNA-Seq Data. Proceedings of IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2014), 2014.11.2-2014.11.5, Belfast UK, 205-208.

  2. He Liu, Yadong Wang, Lei Wang. A Review of Non-contact, Low-cost Physiological Information Measurement Based on Photoplethysmographic Imaging. Proceedings of IEEE Engineering in Medicine and Biology Society (EMBC 2012), 2012.8.28-2012.9.1, San Diego USA, 2088-2091.

  3. Qinghua Jiang, Guohua Wang, Yadong Wang. An Approach for Prioritizing Disease-related MicroRNAs Based on Genomic Data Integration. Proceedings of International Conference on BioMedical Engineering and Informatics (BMEI 2010), 2010.10.16-2010.10.18, Yantai China, 2270-2274.

  4. Mingxiang Teng, Yadong Wang, Guohua Wang, Jeesun Jung, Howard J Edenberg, Jeremy R Sanford, Yunlong Liu. Prioritizing Single-nucleotide Variations that Potentially Regulate Alternative Splicing. Proceedings of Genetic Analysis Workshop (GAW 2010), 2010.10.13-2010.10.16, Boston USA, 1-7.

  5. Qinghua Jiang, Guohua Wang, Tianjiao Zhang, Yadong Wang. Predicting Human microRNA-disease Associations Based on Support Vector Machine. Proceedings of IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2010), 2010.12.18-2010.12.21, Hong Kong China, 467-472.

  6. Yongzhuang Liu, Shuang Qiu, Haijun Tao, Tianyi Zang, Ya-Dong Wang. A Case-based Reasoning Approach to Support Web Service Composition. Proceedings of International Conference on Machine Learning and Cybernetics (ICMLC 2009), 2009.7.12-2009.7.15, Baoding China, 1471-1476.

  7. Guangri Quan, Yu Du, Junheng Huang, Yadong Wang. Prediction of Human Disease Genes Based on Associations between Phenome and Proteins. Proceedings of International Conference on Biomedical Engineering and Informatics (BMEI 2009), 2009.10.17-2009.10.19, Tianjin China, 1-4.

  8. Yongdong Xu, Guangri Quan, Yadong Wang, Zhiming Xu. Multiple Features Fusion Method for Identifying Text Topic Boundaries. Proceedings of International Conference on Machine Learning and Cybernetics (ICMLC 2008), 2008.7.12-2008.7.15, Kunming China, 2950-2956.

  9. Haijun Tao, Yadong Wang, Maozu Guo. Research on Knowledge-Based Intelligent Agent Model. Proceedings of International Conference on Knowledge Generation, Communication and Management (KGCM 2008), 2008.6.29-2008.7.2, Orlando USA.

  10. Haijun Tao, Yadong Wang, Maozu Guo. An Extended Contract-Net Negotiation Model Based on Task Coalition and Genetic Algorithm. Proceedings of International Conference on Machine Learning and Cybernetics (ICMLC 2007), 2007.8.19-2007.8.22, Hong Kong China, 879-884.

6. Selected Research Projects

  1. National High-tech R&D Program of China (863 Program), Digital Medicine Project and Technology (Grant No. 2012AA0204). $200M, 2012-2015.

    This project focuses on the development of digital medicine technologies, including the acquisition, management, sharing and standardization of medical data; the integration of omics, clinical and health data; and the development and application of a clinical knowledge base. The technologies developed by this project will be integrated to build next-generation EHR/EMR systems and applied to medical practice in China.

  2. National High-tech R&D Program of China (863 Program), Research on the Critical Technologies of the Development and Application of Biological Big Data (Grant No. 2015AA0201). $180M, 2015-2017.

    To meet the wide and pressing demand for the management, sharing and application of biological big data in China, this project aims to develop a series of key technologies for managing and sharing biological big data, including its representation, storage, indexing, querying, searching and analysis, which will support the establishment of the biological big data center of China.

  3. National High-tech R&D Program of China (863 Program), Research on the Critical Technologies of the Integration and Information Service of Microbial Digital Resources (Grant No. 2014AA021505). $5M, 2014-2016.

    This project focuses on the development of a series of critical technologies for the integration, management, sharing and analysis of microbial data. With this funding, a cloud-based microbial information service system will be built to support the management, sharing and searching of microbial data resources in China. The system will also provide pipelines with which users can analyze microbial data.

7. Selected Awards

  1. National Science and Technology Progress Award, second-class prize. Research and Application of Expert System in Agriculture. Prof. Yadong Wang. 2006. This award recognizes work on developing expert systems in agriculture to help operate and manage agricultural production more effectively.

  2. Heilongjiang Province Natural Science Award, second-class prize. Research on Automatic Color Matching Technology through Machine Learning. Prof. Yadong Wang, Prof. Xiaohong Su. 2006. This award recognizes work on matching and transferring computer colors to printing colors with less distortion.

  3. Heilongjiang Province Natural Science Award, second-class prize. Research on Pattern Recognition Methods for Gene Mapping of Complex Diseases. Prof. Yadong Wang. 2007. This award recognizes work on developing methods for better understanding the mechanisms of complex diseases, specifically at the gene and molecular levels.

  4. Top 100 high impact academic papers in China. MiR2Disease: A Manually Curated Database for MicroRNA Deregulation in Human Disease. Prof. Yadong Wang. 2010.

8. Social Contribution

The HIT Center for Bioinformatics is one of the most famous bioinformatics research teams in China. The director of the center, Prof. Yadong Wang, is a member of the expert groups of many national fundamental science and technology programs, including the National Key Research and Development Project of China and the National High Technology Research and Development Program of China (863 Program). He is also a member of the National Strategic Guidance Expert Committee on Development of Biological Technology. Under the leadership of Prof. Wang, the HIT Center for Bioinformatics has carried out more than 30 national fundamental science and technology development projects on bioinformatics and genomics, and has developed hundreds of key algorithms and information systems for the management, analysis, mining and application of big biological data, which have contributed greatly to improving China’s ability to apply big biological data. In particular, the HIT Center for Bioinformatics has achieved distinguished results in the following two areas.

1) China’s 100,000 Genomes Project

China’s 100,000 Genomes Project is the first large-scale genome study held in China, and it is currently also one of the largest genomics science programs in the world. The main goals of this world-famous project are to accomplish whole genome sequencing and analysis for over 100,000 Chinese people, to build the first comprehensive genomic variant map of China as well as the first multi-omics “health map of China”, and to investigate in depth the complex relationship between genomic variants and the environment of China and its effect on the phenotypes and health status of Chinese people. This project will lay the cornerstone for the development of precision medicine in China, and will be a milestone in the history of genomics for both China and the world. Prof. Yadong Wang is the chief scientist, and the HIT Center for Bioinformatics is the leading unit of this project. The team of the HIT Center for Bioinformatics is developing and applying hundreds of cutting-edge big genomics data management and analysis technologies, as well as genome sequencing technologies, to accomplish this historic task.

2) The development of key technologies for national biological data center

The amount of omics data and other biological data is growing rapidly in China, and the management and sharing of these data is critical to the development of genomics and of related areas such as advanced medicine and healthcare. China still does not have a national biological data center like NCBI in the US or EMBL-EBI in the EU, which is a bottleneck to better data sharing and use. Establishing and running a national biological data center is a highly technical task, because it requires many cutting-edge big biological data technologies. The HIT Center for Bioinformatics has made outstanding contributions to this task. For nearly 17 years, this team has been the main task force developing key technologies and building the IT framework of China’s national biological data center. The team has developed over 100 key technologies for the management, storage, transmission and analysis of big biological data, including the world’s first personal and family big genomics data management and visualization systems (PGBrowser and FGBrowser), the world’s first graph index-based sequence query and alignment system (deBGA), and the world’s first large-scale genome sequence index construction system (deBWT). These technologies have been adopted for the construction of China’s national biological data center. Construction of the national biological data center is expected to begin soon, and the HIT Center for Bioinformatics will, as always, be the main technological task force contributing to it.


Website: http://mlg.hit.edu.cn/bit/index_en.html