Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Institute of Mathematics

Academic Seminar

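Speaker: Associate Professor Shanfeng Zhu (Fudan University)

Title: Large-scale Multilabel Learning and its applications in Bioinformatics

Time: November 3, 2017 (Friday), 11:00-12:00

Venue: Room N913, South Building, Academy of Mathematics and Systems Science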

Abstract:

Multi-label learning deals with classification problems in which each instance can be assigned multiple class labels simultaneously. In large-scale multi-label learning there are thousands of labels or more. Many important problems in bioinformatics can be modeled as large-scale multi-label learning problems, such as MeSH indexing, drug-target interaction prediction, and protein function prediction. Using the learning-to-rank framework, we have developed MeSHLabeler and DeepMeSH for large-scale MeSH indexing, DrugE-Rank for drug-target interaction prediction, and GOLabeler for protein function prediction. DeepMeSH achieved first place in both the BioASQ4 and BioASQ5 challenges, and MeSHLabeler achieved first place in both the BioASQ2 and BioASQ3 challenges. Specifically, on BioASQ3 challenge data with 6000 citations, DeepMeSH achieved a Micro F-measure of 0.6323, about 2% higher than the 0.6218 of MeSHLabeler and 12% higher than the 0.5637 of MTI (NLM's official solution). In addition, on benchmark data from DrugBank, experimental results show that DrugE-Rank significantly outperforms competing methods, with an improvement of more than 30% in the area under the precision-recall curve for new FDA-approved drugs and FDA experimental drugs. Finally, according to the initial evaluation of CAFA3 (the Critical Assessment of protein Function Annotation algorithms) in July 2017, GOLabeler achieved first place in terms of F-max among nearly 200 submissions from around 50 labs worldwide.
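To illustrate the Micro F-measure quoted in the abstract, here is a minimal Python sketch of the standard micro-averaged definition for multi-label predictions. It is not the official BioASQ evaluation code; the function names (`micro_f_measure`, `relative_gain`) and the toy label sets are hypothetical and chosen only for illustration. The second helper shows how the quoted relative improvements can be read from the reported scores.

```python
from typing import List, Set


def micro_f_measure(true_labels: List[Set[str]],
                    predicted_labels: List[Set[str]]) -> float:
    """Micro-averaged F-measure: counts are pooled over all instances
    and labels before precision and recall are computed."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("both lists must have one label set per instance")

    # Pool true positives, predicted labels, and true labels over all instances.
    true_positives = sum(len(t & p) for t, p in zip(true_labels, predicted_labels))
    n_predicted = sum(len(p) for p in predicted_labels)
    n_true = sum(len(t) for t in true_labels)

    if n_predicted == 0 or n_true == 0 or true_positives == 0:
        return 0.0

    precision = true_positives / n_predicted
    recall = true_positives / n_true
    return 2 * precision * recall / (precision + recall)


def relative_gain(new_score: float, baseline_score: float) -> float:
    """Relative improvement of new_score over baseline_score."""
    return (new_score - baseline_score) / baseline_score


# Toy example (hypothetical labels): two documents, three true labels in total.
truth = [{"A", "B"}, {"C"}]
predictions = [{"A"}, {"C", "D"}]
print(micro_f_measure(truth, predictions))

# Relative gains implied by the scores reported in the abstract.
print(relative_gain(0.6323, 0.5637))  # DeepMeSH vs. MTI
print(relative_gain(0.6323, 0.6218))  # DeepMeSH vs. MeSHLabeler
```

With the reported scores, the relative gain over MTI is roughly 12%, and the gain over MeSHLabeler is roughly 1.7%, which the abstract rounds to 2%.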

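Biography:

Shanfeng Zhu is an associate professor and doctoral supervisor at the School of Computer Science, Fudan University. He received his PhD from City University of Hong Kong (2003), was a postdoctoral researcher at Kyoto University, Japan (2004-2008), held a JSPS (Japan Society for the Promotion of Science) Invitation Fellowship (2012), was a visiting scholar at the University of Illinois at Urbana-Champaign (2013-2014), and was a visiting associate professor at Kyoto University (2016). His main research interests are bioinformatics, information retrieval, and data mining. He has published more than 40 papers as first or corresponding author in well-known international journals and conferences in these fields, including KDD, IJCAI, ISMB, Bioinformatics, NAR, and Briefings in Bioinformatics. He has served on the program committees of international bioinformatics conferences including BIBM 2014-2017, InCoB 2012-2017, GIW 2015-2017, and APBC 2014-2018. From 2014 to 2017 his team took part in four consecutive editions of BioASQ, the international challenge on large-scale biomedical text annotation, and won first place each time, with accuracy about 12% higher than that of the software used by the U.S. National Library of Medicine.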
