Journal of Guangxi Normal University (Natural Science Edition), 2021, Vol. 39, Issue (2): 1-12. doi: 10.16088/j.issn.1001-6600.2020082603

Survey on Modeling Factors of Neural Information Retrieval Model

YANG Zhou1,2, FAN Yixing3, ZHU Xiaofei1*, GUO Jiafeng3, WANG Yue2   

    1. School of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China;
    2. Intelligent Media R & D Center SOHU, Beijing 100190, China;
    3. CAS Key Lab of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2020-08-26 Revised: 2020-09-22 Online: 2021-03-25 Published: 2021-04-15

Abstract: Information retrieval models are widely used in search engines. In the information retrieval task, different models focus on different signals, which leads to large differences in model performance. At present, most models rely on some or all of the following factors: exact signals, similar signals, signal differentiation, query word weight, proximity, text structure, and different distribution assumptions. This paper explains the specific meaning of each modeling factor and illustrates, through relevant experiments, the positive effect of each factor on modeling. Based on these experiments and analysis, the paper concludes by discussing future directions and trends for information retrieval models.

Key words: information retrieval, deep learning, convolutional neural network, recurrent neural network, survey

CLC Number: TP391.3