Journal of Guangxi Normal University(Natural Science Edition) ›› 2022, Vol. 40 ›› Issue (3): 13-30.doi: 10.16088/j.issn.1001-6600.2021071101

Previous Articles     Next Articles

Survey of Algorithms Oriented to Complex High Utility Pattern Mining

LI Muhang, HAN Meng*, CHEN Zhiqiang, WU Hongxin, ZHANG Xilong   

  1. School of Computer Science and Engineering, North Minzu University, Yinchuan Ningxia 750021, China
  • Received:2021-07-11 Revised:2021-09-09 Online:2022-05-25 Published:2022-05-27

Abstract: High utility complex pattern is a new topic in data mining. For the first time, the complex high utility pattern mining algorithms are reviewed from two perspectives: high utility integrated pattern and high utility derivative pattern. Firstly, high utility sequential pattern(HUSP) is classified and discussed according to different data structures; high utility episode pattern(HUEP) and periodic high utility pattern(PHUP) are summarized in chronological order. Secondlyly, for the derivative model, the high average utility pattern, the high-utility pattern with negative unit profits and high on-shelf utility pattern are summarized from the perspective of data structure; the concise high utility pattern is summarized from the perspective of concise types. Thirdly, the advantages and disadvantages and upper boundaries of the existing integrated pattern and derivative pattern mining algorithms are compared and analyzed. Finally, based on the current research deficiencies, further research direction is given, including high utility pattern mining in uncertain data, high utility on-shelf pattern mining over data stream, and parallel high utility pattern mining in a big data environment.

Key words: survey, pattern mining, complex high utility pattern, high utility integrated pattern, high utility derivative pattern

CLC Number: 

  • TP311.13
[1]FOURNIER-VIGER P, WU C W, ZIDA S, et al. FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning[C]// Foundations of Intelligent Systems:LNAI volume 8502. Berlin: Springer, 2014: 83-92.
[2]YIN J F, ZHENG Z G, CAOL B. USpan: an efficient algorithm for mining high utility sequential patterns[C]// Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2012: 660-668. DOI: 10.1145/2339530.2339636.
[3]FOURNIER-VIGER P, YANG P, LIN J C W, et al.HUE-Span: fast high utility episode mining[C]// Advanced Data Mining and Applications:LNAI volume 11888. Berlin: Springer, 2019: 169-184.
[4]FOURNIER-VIGER P, LIN J C W, DUONG Q H, et al. PHM: mining periodic high-utility itemsets[C]// Advances in Data Mining: Applications and Theoretical Aspects:LNAI volume 9728. Berlin: Springer, 2016: 64-79.
[5]WU J M T, LIN J C W, PIROUZ M, et al. TUB-HAUPM: tighter upper bound for mining high average-utility patterns[J]. IEEE Access, 2018, 6: 18655-18669. DOI: 10.1109/ACCESS.2018.2820740.
[6]FOURNIER-VIGER P, ZIDA S, LIN J C W, et al. EFIM-closed: Fast and memory efficient discovery of closed high-utilityitemsets[C]// Machine Learning and Data Mining in Pattern Recognition:LNAI volume 9729. Berlin: Springer, 2016: 199-213.
[7]FOURNIER-VIGER P. FHN: efficient mining of high-utilityitemsets with negative unit profits[C]// Advanced Data Mining and Applications:LNAI volume 8933. Berlin: Springer, 2014: 16-29.
[8]FOURNIER-VIGER P, ZIDA S. FOSHU: faster on-shelf high utility itemset mining-with or without negative unit profit[C]// Proceedings of the 30th annual ACM symposium on applied computing. New York: Association for Computing Machinery, 2015: 857-864. DOI: 10.1145/2695664.2695823.
[9]MABROUKEH N R, EZEIFE C I. A taxonomy of sequential pattern mining algorithms[J]. ACM Computing Surveys (CSUR), 2010, 43(1): 1-41. DOI: 10.1145/1824795.1824798.
[10]FOURNIER-VIGER P, LIN J C W, NKAMBOU R, et al. High-utility pattern mining[M].Berlin:Springer, 2019: 97-129.
[11]RAHMATI B, SOHRABI M K. A systematic survey on high utility itemset mining[J]. International Journal of Information Technology & Decision Making, 2019, 18(4):1113-1185. DOI: 10.1142/S0219622019300027.
[12]SINGH K, SINGH S S, KUMAR A, et al. High utility itemsets mining with negative utility value: a survey[J]. Journal of Intelligent and Fuzzy Systems, 2018,35(6):6551-6562. DOI: 10.3233/JIFS-18965.
[13]王少峰,韩萌,贾涛,等.数据流高效用模式挖掘综述[J].计算机应用研究,2020,37(9):2571-2578. DOI: 10.19734/j.issn.1001-3695.2019.03.0105.
[14]张春砚,韩萌,孙蕊,等.高效用模式挖掘关键技术综述[J].计算机应用研究,2021,38(2):330-340. DOI: 10.19734/j.issn.1001-3695.2020.02.0027.
[15]孙蕊,韩萌,张春砚,等.精简高效用模式挖掘综述[J].计算机应用研究,2021,38(4):975-981. DOI: 10.19734/j.issn.1001-3695.2020.02.0092.
[16]高曼,韩萌,雷冰冰.高效用模式产生策略综述[J].计算机工程与应用,2020,56(16):1-12.
[17]GAN W S, LIN J C W, ZHANG J X, et al. Fast utility mining on complex sequences[EB/OL].(2019-04-28)[2021-09-09].https://arxiv.org/abs/1904.12248.
[18]SHIE B E, HSIAO H F, TSENG V S, et al. Mining high utility mobile sequential patterns in mobile commerce environments[C]// Proceedings of the International Conference on Database Systems for Advanced Applications, DASFAA 2011. Berlin: Springer, 2011: 224-238.
[19]AHMED C F, TANBEER S K, JEONG B S. Mining high utility web access sequences in dynamic web logdata[C]// Proceedings of the 2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Piscataway, NJ: IEEE, 2010: 76-81. DOI: 10.1109/SNPD.2010.21.
[20]ZIHAYAT M, WU C W, AN A J, et al. Efficiently mining high utility sequential patterns in static and streaming data[J]. Intelligent Data Analysis, 2017, 21(S1): S103-S135.
[21]唐辉军,王乐,樊成立.基于模式增长的高效用序列模式挖掘算法[J].自动化学报,2021,47(4):943-954. DOI: 10.16383/j.aas.c180660.
[22]WANG J Z, HUANG J L. Incremental mining of high utility sequential patterns in incremental databases[C]// Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. New York: Association for Computing Machinery, 2016: 2341-2346. DOI: 10.1145/2983323.2983691.
[23]WANG J Z, HUANG J L. On incremental high utility sequential pattern mining[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2018, 9(5): 1-26. DOI: 10.1145/3178114.
[24]ZIHAYAT M, CHEN Y, AN A J. Memory-adaptive high utility sequential pattern mining over data streams[J]. Machine Learning, 2017, 106(6): 799-836. DOI: 10.1007/s10994-016-5617-1.
[25]TANG H J, LIU Y G, WANG L. A new algorithm of mining high utility sequential pattern in streaming data[J]. International Journal of Computational Intelligence Systems, 2018, 12(1):342-350. DOI: 10.2991/ijcis.2019.125905650.
[26]ALKAN O K, KARAGOZ P.CRoM and HuspExt: improving efficiency of high utility sequential pattern extraction[C]// 2016 IEEE 32nd International Conference on Data Engineering (ICDE). Piscataway, NJ: IEEE,2016: 1472-1473. DOI: 10.1109/ICDE.2016.7498380.
[27]WANG J Z, HUANG J L, CHEN Y C. On efficiently mining high utility sequential patterns[J]. Knowledge and Information Systems, 2016, 49(2): 597-627. DOI: 10.1007/s10115-015-0914-8.
[28]XU T T, DONG X J, XU J L, et al. Mining high utility sequential patterns with negative item values[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2017, 31(10): 1750035. DOI: 10.1142/S0218001417500355.
[29]XU T T, XU J L, DONG X J. Mining high utility sequential patterns using multiple minimum utility[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2018, 32(10): 1859017. DOI: 10.1142/S0218001418590176.
[30]SONG W, RONG K K. Mining high utility sequential patterns using maximal remaining utility[C]// Data Mining and Big Data:LNISA volume 10943. Berlin: Springer, 2018: 466-477.
[31]ZHANG C K, ZU Y W, NIE J L, et al. Two efficient algorithms for mining high utility sequential patterns[C]// 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SustainCom/SocialCom 2019s), Piscataway, NJ: IEEE, 2019: 905-911. DOI: 10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00132.
[32]YIN J F, ZHENG Z G, CAO L B, et al. Efficiently mining top-k high utility sequential patterns[C]// 2013 IEEE 13th International Conference on Data Mining. Piscataway, NJ: IEEE, 2013: 1259-1264. DOI: 10.1109/ICDM.2013.148.
[33]LI T X, XU T T, DONG X J. HUNSPM: an efficient algorithm for mining high utility negative sequential patterns[C]// 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). Piscataway, NJ: IEEE, 2017: 1833-1837. DOI: 10.1109/FSKD.2017.8393045.
[34]LIN J C W, ZHANG J X, FOURNIER-VIGER P. High-utility sequential pattern mining with multiple minimum utility thresholds[C]// Proceedings of the Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, APWeb-WAIM 2017. Berlin: Springer, 2017: 215-229.
[35]BUFFETT S. Candidate list maintenance in high utility sequential pattern mining[C]// 2018 IEEE International Conference on Big Data (Big Data). Piscataway, NJ: IEEE, 2018: 644-652. DOI: 10.1109/BigData.2018.8622138.
[36]BUFFETT S. Dramatically reducing search for high utility sequential patterns by maintaining candidate lists[J]. Information, 2020, 11(1):44. DOI: 10.3390/info11010044.
[37]AHMED C F, TANBEER S K, JEONG B S. A novel approach for mining high utility sequential patterns in sequence databases[J]. ETRI journal, 2010, 32(5): 676-686.
[38]PEI J, HAN J W, MORTAZAVI-ASL B, et al. Mining sequential patterns by pattern-growth:the prefixspan approach[J]. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(11): 1424-1440. DOI: 10.1109/TKDE.2004.77.
[39]SHIE B E, CHENG J H, CHUANG K T, et al. A one-phase method for mining high utility mobile sequential patterns in mobile commerce environments[C]// Advanced Research in Applied Artificial Intelligence:LNAI volume 7345. Berlin: Springer, 2012: 616-626.
[40]LE B, HUYNH U, DINH D T. A pure array structure and parallel strategy for high-utility sequential patternmining[J]. Expert Systems with Applications, 2018, 104: 107-120. DOI: 10.1016/j.eswa.2018.03.019.
[41]GAN W S, LIN J C W, ZHANG J X, et al.ProUM: high utility sequential pattern mining[C]// Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). Piscataway, NJ: IEEE, 2019: 767-773. DOI: 10.1109/SMC.2019.8914402.
[42]LAN G C, HONG T P, TSENG V S, et al. Applying the maximum utility measure in high utility sequential pattern mining[J]. Expert Systems with Applications, 2014, 41(11): 5071-5081. DOI: 10.1016/j.eswa.2014.02.022.
[43]LIN J C W, LI Y F, FOURNIER-VIGER P, et al. An efficient chain structure to mine high-utility sequential patterns[C]// 2019 International Conference on Data Mining Workshops (ICDMW) . Piscataway, NJ: IEEE, 2019:1013-1019. DOI: 10.1109/ICDMW.2019.00146.
[44]WU C W, LIN Y F, YU P S, et al. Mining high utility episodes in complex event sequences[C]// Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2013: 536-544. DOI: 10.1145/2487575.2487654.
[45]GUO G M, ZHANG L, LIU Q, et al. High utility episode mining made practical and fast[C]// Advanced Data Mining and Applications:LNAI volume 8933. Berlin: Springer, 2014: 71-84.
[46]RATHORE S, DAWAR S, GOYAL V, et al. Top-K high utility episode mining from a complex event sequence[C]// Proceedings of the 21st International Conference on Mmanagement of Data. Telengana:Computer Society of India, 2016:1-8.
[47]LIN Y F, HUANG C F, TSENG V S. A novel methodology for stock investment using high utility episode mining and genetic algorithm[J]. Applied Soft Computing, 2017, 59: 303-315. DOI: 10.1016/j.asoc.2017.05.032.
[48]GAN W S, LIN J C W, CHAO H C, et al. Discovering high utility episodes in sequences[EB/OL].(2019-12-25)[2021-09-09].https://arxiv.org/abs/1912.11670.
[49]HONG T P, HSU J H, YANG K J, et al. Mining high utility partial periodic pattern by GPA[C]// 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). Piscataway, NJ: IEEE, 2017: 820-824. DOI: 10.1109/SMC.2017.8122710.
[50]LIN J C W, ZHANG J X, FOURNIER-VIGER P, et al. Efficient mining of short periodic high-utility itemsets[C]// 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC). Piscataway, NJ: IEEE, 2016: 003083-003088. DOI: 10.1109/SMC.2016.7844710.
[51]DINH T, HUYNH V N, LE B. Mining periodic high utility sequential patterns[C]// Intelligent Information and Database Systems:LNAI volume 10191. Berlin: Springer, 2017: 545-555.
[52]ISMAIL W, HASSAN M M, FORTINO G. Productive-associated periodic high-utility itemsets mining[C]// 2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC). Piscataway, NJ: IEEE, 2017: 637-642. DOI: 10.1109/ICNSC.2017.8000165.
[53]DINH D T, LE B, FOURNIER-VIGER P, et al. An efficient algorithm for mining periodic high-utility sequential patterns[J]. Applied Intelligence, 2018, 48(12): 4694-4714. DOI: 10.1007/s10489-018-1227-x.
[54]REDDY T Y, KIRAN R U, TOYODA M, et al. Discovering partial periodic high utility itemsets in temporal databases[C]// Database and Expert Systems Applications:LNISA volume 11707. Berlin: Springer, 2019: 351-361.
[55]HONG T P, LEE C H, WANG S L. Mining high average-utility itemsets[C]// 2009 IEEE International Conference on Systems, Man and Cybernetics. Piscataway,NJ: IEEE, 2009: 2526-2530. DOI: 10.1109/ICSMC.2009.5346333.
[56]YUN U, KIM D. Mining of high average-utility itemsets using novel list structure and pruning strategy[J]. Future Generation Computer Systems, 2017, 68: 346-360. DOI: 10.1016/j.future.2016.10.027.
[57]LAN G C, HONG T P, TSENG V S. A projection-based approach for discovering high average-utility itemsets[J]. Journal of Information Science and Engineering, 2012, 28(1): 193-209.
[58]LU T, VO B, NGUYEN H T, et al. A new method for mining high average utility itemsets[C]// Computer Information Systems and Industrial Management:LNISA volume 8838. Berlin: Springer, 2015: 33-42.
[59]LIN J C W, REN S, FOURNIER-VIGER P, et al. EHAUPM: efficient high average-utility pattern mining with tighter upper bounds[J]. IEEE Access, 2017, 5: 12927-12940. DOI: 10.1109/ACCESS.2017.2717438.
[60]DAM T L, RAMAMPIARO H, NØRVÅG K, et al. Towards efficiently mining closed high utilityitemsets from incremental databases[J]. Knowledge-Based Systems, 2019, 165: 13-29. DOI: 10.1016/j.knosys.2018.11.019.
[61]DAM T L, LI K L, FOURNIER-VIGER P, et al. CLS-Miner: efficient and effective closed high-utility itemset mining[J]. Frontiers of Computer Science, 2019, 13(2): 357-381. DOI: 10.1007/s11704-016-6245-4.
[62]SHIE B E, YU P S, TSENG V S. Efficient algorithms for mining maximal high utility itemsets from data streams with different models[J]. Expert Systems with Applications, 2012, 39(17): 12947-12960. DOI: 10.1016/j.eswa.2012.05.035.
[63]WU C W, FOURNIER-VIGER P, GU J Y, et al. Mining compact high utility itemsets without candidate generation[M]// FOURNIER-VIGER P, LIN J C W, NKAMBOU R, et al. High-Utility Pattern Mining. Berlin:Springer, 2019: 279-302.
[64]CHU C J, TSENG V S, LIANG T. An efficient algorithm for mining high utility itemsets with negative item values in large databases[J]. Applied Mathematics and Computation, 2009, 215: 767-778.
[65]SUBRAMANIAN K, KANDHASAMY P. UP-GNIV: an expeditious high utility pattern mining algorithm for itemsets with negative utility values[J]. International Journal of Information Technology and Management, 2015, 14(1): 26-42. DOI: 10.1504/IJITM.2015.066056.
[66]SINGH K, SHAKYA H K, SINGH A, et al. Mining of high utility itemsets with negative utility[J]. Expert Systems, 2018, 35(6): e12296. DOI: 10.1111/exsy.12296.
[67]KRISHNAMOORTHY S. Efficiently mining high utility itemsets with negative unit profits[J]. Knowledge-Based Systems, 2018, 145: 1-14. DOI: 10.1016/j.knosys.2017.12.035.
[68]GAN W S, LIN J C W, FOURNIER-VIGER P, et al. Mining high-utility itemsets with both positive and negative unit profits from uncertain databases[C]// Advances in Knowledge Discovery and Data Mining:LNAI volume 10234. Berlin: Springer, 2017: 434-446.
[69]LAN G C, HONG T P, HUANG J P, et al. On-shelf utility mining with negative item values[J]. Expert Systems with Applications, 2014, 41(7): 3450-3459. DOI: 10.1016/j.eswa.2013.10.049.
[70]DAM T L, LI K L, FOURNIER-VIGER P, et al. An efficient algorithm for mining top-k on-shelf high utility itemsets[J]. Knowledge and Information Systems, 2017, 52(3): 621-655. DOI: 10.1007/s10115-016-1020-2.
[71]LAN G C, HONG T P, TSENG V S. Discovery of high utility itemsets from on-shelf time periods of products[J]. Expert Systems with Applications, 2011, 38(5): 5851-5857. DOI: 10.1016/j.eswa.2010.11.040.
[72]RADKAR A N, PAWAR S S. Mining high on-shelf utility itemsets with negative values from dynamic updated database[EB/OL].(2015-07-07)[2021-09-09].https://arxiv.org/abs/1507.01759.
[73]LIN J C W, GAN W S, FOURNIER-VIGER P, et al. Efficient algorithms for mining high-utility itemsets in uncertain databases[J]. Knowledge-Based Systems, 2016, 96: 171-187. DOI: 10.1016/j.knosys.2015.12.019.
[74]BUI N, VO B, HUYNH V N, et al. Mining closed high utility itemsets in uncertain databases[C]// Proceedings of the Seventh Symposium on Information and Communication Technology. New York: Association for Computing Machinery, 2016: 7-14. DOI: 10.1145/3011077.3011124.
[75]LIU J Q, ZHAO R, YANG X C, et al. Efficient parallel algorithm for mining high utility patterns based on spark[C]// 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). Piscataway,NJ:IEEE, 2019: 367-372. DOI: 10.1109/DSC.2019.00062.
[1] YANG Zhou, FAN Yixing, ZHU Xiaofei, GUO Jiafeng, WANG Yue. Survey on Modeling Factors of Neural Information Retrieval Model [J]. Journal of Guangxi Normal University(Natural Science Edition), 2021, 39(2): 1-12.
[2] HU Yucong, XIE Yichen,HUANG Jingxiang. The Change of Private Car Travel Rate under the Influence of Policy: a Case Study of Guangzhou [J]. Journal of Guangxi Normal University(Natural Science Edition), 2019, 37(1): 98-105.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] AI Yan, JIA Nan, WANG Yuan, GUO Jing, PAN Dongdong. Review of Statistical Methods and Applications of Genetic Association Analysis for Multiple Traits and Multiple Locus[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 1 -14 .
[2] BAI Defa, XU Xin, WANG Guochang. Review of Generalized Linear Models and Classification for Functional Data[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 15 -29 .
[3] ZENG Qingfan, QIN Yongsong, LI Yufang. Empirical Likelihood Inference for a Class of Spatial Panel Data Models[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 30 -42 .
[4] ZHANG Zhifei, DUAN Qian, LIU Naijia, HUANG Lei. High-dimensional Nonlinear Regression Model Based on JMI[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 43 -56 .
[5] YANG Di, FANG Yangxin, ZHOU Yan. New Category Classification Research Based on MEB and SVM Methods[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 57 -67 .
[6] CHEN Zhongxiu, ZHANG Xingfa, XIONG Qiang, SONG Zefang. Estimation and Test for Asymmetric DAR Model[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(1): 68 -81 .
[7] DU Jinfeng, WANG Hairong, LIANG Huan, WANG Dong. Progress of Cross-modal Retrieval Methods Based on Representation Learning[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(3): 1 -12 .
[8] CHAO Rui, ZHANG Kunli, WANG Jiajia, HU Bin, ZHANG Weicong, HAN Yingjie, ZAN Hongying. Construction of Chinese Multimodal Knowledge Base[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(3): 31 -39 .
[9] LI Zhengguang, CHEN Heng, LIN Hongfei. Identification of Adverse Drug Reaction on Social Media Using Bi-directional Language Model[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(3): 40 -48 .
[10] ZHOU Shengkai, FU Lizhen, SONG Wen’ai. Semantic Similarity Computing Model for Short Text Based on Deep Learning[J]. Journal of Guangxi Normal University(Natural Science Edition), 2022, 40(3): 49 -56 .