Journal of Guangxi Normal University (Natural Science Edition) ›› 2022, Vol. 40 ›› Issue (3): 40-48. doi: 10.16088/j.issn.1001-6600.2021091503

• Research Article •

Identification of Adverse Drug Reaction on Social Media Using Bi-directional Language Model

LI Zhengguang1, CHEN Heng1*, LIN Hongfei2

  1. Research Center for Language Intelligence, Dalian University of Foreign Languages, Dalian, Liaoning 116044, China;
  2. School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
  • Received: 2021-09-15 Revised: 2021-12-20 Online: 2022-05-25 Published: 2022-05-27
  • Corresponding author: CHEN Heng (born 1982), male, from Taihe, Anhui; associate professor at Dalian University of Foreign Languages, Ph.D. E-mail: chenheng@dlufl.edu.cn
  • Funding:
    National Natural Science Foundation of China (61806038); Liaoning Province Higher Education Innovative Talents Program (WR2019005); Scientific Research Fund of the Education Department of Liaoning Province (2020JYT03); Humanities and Social Sciences Project of the Ministry of Education (18YJCZH2018); Research Fund of Dalian University of Foreign Languages (2021XJYB16, 2021XJYB19)

Abstract: Tweets about users' experiences of taking medication conceal timelier and more wide-ranging information on adverse drug reactions (ADRs). However, extracting ADRs from these tweets is difficult because the texts are relatively short and sparse. This paper therefore proposes a neural network model that combines a pretrained bidirectional language model with an attention mechanism to identify ADRs. First, specific character-level features are extracted with a pretrained bidirectional character-level neural language model. Second, the attention mechanism captures local and global semantic context while ADRs are being extracted. Third, to improve the efficiency of the method, character-level features are combined with word-level features. Finally, co-training is replaced with pretrained word-level embeddings and fine-tuned pretrained character-level embeddings. Together, these optimizations improve identification performance. On the PSB 2016 Social Media Mining Shared Task 2 (ADR extraction), the proposed model achieves a macro-averaged F1 score of 82.2% on the official dataset. The experiments indicate that character features help distinguish ADRs from non-ADRs morphologically, and that the attention mechanism improves ADR identification by capturing local and global semantic context.

Key words: adverse drug reaction, social media, bi-directional language model, attention mechanism, pretrained model
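
The abstract describes combining character-level and word-level features and using attention to bring in context from the whole tweet. The following is a minimal sketch of that idea, not the authors' implementation: it assumes simple concatenation of the two feature vectors and plain dot-product attention, neither of which is specified in the abstract. The function names and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_features(word_vecs, char_vecs):
    """Concatenate each token's word-level and character-level vector.

    Assumption: concatenation is used here for illustration only.
    """
    return [w + c for w, c in zip(word_vecs, char_vecs)]


def attend(token_vecs, query_index):
    """Dot-product attention of one token over every token in the tweet.

    Returns the attention weights and the weighted-sum context vector.
    """
    query = token_vecs[query_index]
    scores = [sum(q * k for q, k in zip(query, key)) for key in token_vecs]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * vec[d] for w, vec in zip(weights, token_vecs))
        for d in range(dim)
    ]
    return weights, context


# Toy inputs (made-up numbers): three tokens, each with a 2-dim word
# vector and a 2-dim character-derived vector.
word_vecs = [[0.1, 0.3], [0.7, 0.2], [0.4, 0.9]]
char_vecs = [[0.5, 0.1], [0.2, 0.8], [0.6, 0.3]]

tokens = combine_features(word_vecs, char_vecs)
weights, context = attend(tokens, query_index=1)
```

In a real tagger the context vector would typically be fed, together with the token's own representation, into the layer that predicts the ADR label for that token.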

CLC number: TP391.1
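
The 82.2% figure in the abstract is a macro-averaged F1 score. As a reminder of what that metric computes, here is a small sketch under stated assumptions: labels are compared per token, and the label set (`"ADR"` and `"O"`) and the toy sequences are hypothetical. The shared task's official scorer may match mentions differently.

```python
def f1_for_label(gold, pred, label):
    """F1 score for one label, computed from aligned gold/predicted tags."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels):
    """Unweighted mean of the per-label F1 scores."""
    return sum(f1_for_label(gold, pred, lab) for lab in labels) / len(labels)


# Toy example (made-up tags, not data from the paper).
gold = ["ADR", "O", "O", "ADR", "O", "ADR"]
pred = ["ADR", "O", "ADR", "ADR", "O", "O"]
score = macro_f1(gold, pred, ["ADR", "O"])
```

Because every label contributes equally to the mean, a rare class such as ADR mentions weighs as much as the frequent non-ADR class.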