|
广西师范大学学报(自然科学版) ›› 2022, Vol. 40 ›› Issue (3): 13-30.doi: 10.16088/j.issn.1001-6600.2021071101
李慕航, 韩萌*, 陈志强, 武红鑫, 张喜龙
LI Muhang, HAN Meng*, CHEN Zhiqiang, WU Hongxin, ZHANG Xilong
摘要: 复杂高效用模式挖掘是当前研究的一个新兴主题。本文首次从高效用融合模式和衍生模式2个角度进行讨论。首先,对于融合模式,根据数据结构的不同对高效用序列模式进行分类论述;按照时间顺序对高效用片段模式、周期高效用模式进行概述。针对衍生模式,从数据结构角度对高平均效用模式、带有负项的高效用模式、on-shelf高效用模式进行总结;从精简类型角度概述精简高效用模式,并对现有融合模式和衍生模式挖掘算法的优缺点、上边界等进行对比分析。最后,针对现阶段研究缺陷与不足,给出了下一步研究方向,包括不确定数据中的高效用模式挖掘方法、数据流上的高效用on-shelf模式挖掘方法和大数据环境下的并行高效用模式挖掘方法。
中图分类号:
[1]FOURNIER-VIGER P, WU C W, ZIDA S, et al. FHM: faster high-utility itemset mining using estimated utility co-occurrence pruning[C]// Foundations of Intelligent Systems:LNAI volume 8502. Berlin: Springer, 2014: 83-92. [2]YIN J F, ZHENG Z G, CAOL B. USpan: an efficient algorithm for mining high utility sequential patterns[C]// Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2012: 660-668. DOI: 10.1145/2339530.2339636. [3]FOURNIER-VIGER P, YANG P, LIN J C W, et al.HUE-Span: fast high utility episode mining[C]// Advanced Data Mining and Applications:LNAI volume 11888. Berlin: Springer, 2019: 169-184. [4]FOURNIER-VIGER P, LIN J C W, DUONG Q H, et al. PHM: mining periodic high-utility itemsets[C]// Advances in Data Mining: Applications and Theoretical Aspects:LNAI volume 9728. Berlin: Springer, 2016: 64-79. [5]WU J M T, LIN J C W, PIROUZ M, et al. TUB-HAUPM: tighter upper bound for mining high average-utility patterns[J]. IEEE Access, 2018, 6: 18655-18669. DOI: 10.1109/ACCESS.2018.2820740. [6]FOURNIER-VIGER P, ZIDA S, LIN J C W, et al. EFIM-closed: Fast and memory efficient discovery of closed high-utilityitemsets[C]// Machine Learning and Data Mining in Pattern Recognition:LNAI volume 9729. Berlin: Springer, 2016: 199-213. [7]FOURNIER-VIGER P. FHN: efficient mining of high-utilityitemsets with negative unit profits[C]// Advanced Data Mining and Applications:LNAI volume 8933. Berlin: Springer, 2014: 16-29. [8]FOURNIER-VIGER P, ZIDA S. FOSHU: faster on-shelf high utility itemset mining-with or without negative unit profit[C]// Proceedings of the 30th annual ACM symposium on applied computing. New York: Association for Computing Machinery, 2015: 857-864. DOI: 10.1145/2695664.2695823. [9]MABROUKEH N R, EZEIFE C I. A taxonomy of sequential pattern mining algorithms[J]. ACM Computing Surveys (CSUR), 2010, 43(1): 1-41. DOI: 10.1145/1824795.1824798. [10]FOURNIER-VIGER P, LIN J C W, NKAMBOU R, et al. High-utility pattern mining[M].Berlin:Springer, 2019: 97-129. [11]RAHMATI B, SOHRABI M K. A systematic survey on high utility itemset mining[J]. International Journal of Information Technology & Decision Making, 2019, 18(4):1113-1185. DOI: 10.1142/S0219622019300027. [12]SINGH K, SINGH S S, KUMAR A, et al. High utility itemsets mining with negative utility value: a survey[J]. Journal of Intelligent and Fuzzy Systems, 2018,35(6):6551-6562. DOI: 10.3233/JIFS-18965. [13]王少峰,韩萌,贾涛,等.数据流高效用模式挖掘综述[J].计算机应用研究,2020,37(9):2571-2578. DOI: 10.19734/j.issn.1001-3695.2019.03.0105. [14]张春砚,韩萌,孙蕊,等.高效用模式挖掘关键技术综述[J].计算机应用研究,2021,38(2):330-340. DOI: 10.19734/j.issn.1001-3695.2020.02.0027. [15]孙蕊,韩萌,张春砚,等.精简高效用模式挖掘综述[J].计算机应用研究,2021,38(4):975-981. DOI: 10.19734/j.issn.1001-3695.2020.02.0092. [16]高曼,韩萌,雷冰冰.高效用模式产生策略综述[J].计算机工程与应用,2020,56(16):1-12. [17]GAN W S, LIN J C W, ZHANG J X, et al. Fast utility mining on complex sequences[EB/OL].(2019-04-28)[2021-09-09].https://arxiv.org/abs/1904.12248. [18]SHIE B E, HSIAO H F, TSENG V S, et al. Mining high utility mobile sequential patterns in mobile commerce environments[C]// Proceedings of the International Conference on Database Systems for Advanced Applications, DASFAA 2011. Berlin: Springer, 2011: 224-238. [19]AHMED C F, TANBEER S K, JEONG B S. Mining high utility web access sequences in dynamic web logdata[C]// Proceedings of the 2010 11th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Piscataway, NJ: IEEE, 2010: 76-81. DOI: 10.1109/SNPD.2010.21. [20]ZIHAYAT M, WU C W, AN A J, et al. Efficiently mining high utility sequential patterns in static and streaming data[J]. Intelligent Data Analysis, 2017, 21(S1): S103-S135. [21]唐辉军,王乐,樊成立.基于模式增长的高效用序列模式挖掘算法[J].自动化学报,2021,47(4):943-954. DOI: 10.16383/j.aas.c180660. [22]WANG J Z, HUANG J L. Incremental mining of high utility sequential patterns in incremental databases[C]// Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. New York: Association for Computing Machinery, 2016: 2341-2346. DOI: 10.1145/2983323.2983691. [23]WANG J Z, HUANG J L. On incremental high utility sequential pattern mining[J]. ACM Transactions on Intelligent Systems and Technology (TIST), 2018, 9(5): 1-26. DOI: 10.1145/3178114. [24]ZIHAYAT M, CHEN Y, AN A J. Memory-adaptive high utility sequential pattern mining over data streams[J]. Machine Learning, 2017, 106(6): 799-836. DOI: 10.1007/s10994-016-5617-1. [25]TANG H J, LIU Y G, WANG L. A new algorithm of mining high utility sequential pattern in streaming data[J]. International Journal of Computational Intelligence Systems, 2018, 12(1):342-350. DOI: 10.2991/ijcis.2019.125905650. [26]ALKAN O K, KARAGOZ P.CRoM and HuspExt: improving efficiency of high utility sequential pattern extraction[C]// 2016 IEEE 32nd International Conference on Data Engineering (ICDE). Piscataway, NJ: IEEE,2016: 1472-1473. DOI: 10.1109/ICDE.2016.7498380. [27]WANG J Z, HUANG J L, CHEN Y C. On efficiently mining high utility sequential patterns[J]. Knowledge and Information Systems, 2016, 49(2): 597-627. DOI: 10.1007/s10115-015-0914-8. [28]XU T T, DONG X J, XU J L, et al. Mining high utility sequential patterns with negative item values[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2017, 31(10): 1750035. DOI: 10.1142/S0218001417500355. [29]XU T T, XU J L, DONG X J. Mining high utility sequential patterns using multiple minimum utility[J]. International Journal of Pattern Recognition and Artificial Intelligence, 2018, 32(10): 1859017. DOI: 10.1142/S0218001418590176. [30]SONG W, RONG K K. Mining high utility sequential patterns using maximal remaining utility[C]// Data Mining and Big Data:LNISA volume 10943. Berlin: Springer, 2018: 466-477. [31]ZHANG C K, ZU Y W, NIE J L, et al. Two efficient algorithms for mining high utility sequential patterns[C]// 2019 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SustainCom/SocialCom 2019s), Piscataway, NJ: IEEE, 2019: 905-911. DOI: 10.1109/ISPA-BDCloud-SustainCom-SocialCom48970.2019.00132. [32]YIN J F, ZHENG Z G, CAO L B, et al. Efficiently mining top-k high utility sequential patterns[C]// 2013 IEEE 13th International Conference on Data Mining. Piscataway, NJ: IEEE, 2013: 1259-1264. DOI: 10.1109/ICDM.2013.148. [33]LI T X, XU T T, DONG X J. HUNSPM: an efficient algorithm for mining high utility negative sequential patterns[C]// 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). Piscataway, NJ: IEEE, 2017: 1833-1837. DOI: 10.1109/FSKD.2017.8393045. [34]LIN J C W, ZHANG J X, FOURNIER-VIGER P. High-utility sequential pattern mining with multiple minimum utility thresholds[C]// Proceedings of the Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data, APWeb-WAIM 2017. Berlin: Springer, 2017: 215-229. [35]BUFFETT S. Candidate list maintenance in high utility sequential pattern mining[C]// 2018 IEEE International Conference on Big Data (Big Data). Piscataway, NJ: IEEE, 2018: 644-652. DOI: 10.1109/BigData.2018.8622138. [36]BUFFETT S. Dramatically reducing search for high utility sequential patterns by maintaining candidate lists[J]. Information, 2020, 11(1):44. DOI: 10.3390/info11010044. [37]AHMED C F, TANBEER S K, JEONG B S. A novel approach for mining high utility sequential patterns in sequence databases[J]. ETRI journal, 2010, 32(5): 676-686. [38]PEI J, HAN J W, MORTAZAVI-ASL B, et al. Mining sequential patterns by pattern-growth:the prefixspan approach[J]. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(11): 1424-1440. DOI: 10.1109/TKDE.2004.77. [39]SHIE B E, CHENG J H, CHUANG K T, et al. A one-phase method for mining high utility mobile sequential patterns in mobile commerce environments[C]// Advanced Research in Applied Artificial Intelligence:LNAI volume 7345. Berlin: Springer, 2012: 616-626. [40]LE B, HUYNH U, DINH D T. A pure array structure and parallel strategy for high-utility sequential patternmining[J]. Expert Systems with Applications, 2018, 104: 107-120. DOI: 10.1016/j.eswa.2018.03.019. [41]GAN W S, LIN J C W, ZHANG J X, et al.ProUM: high utility sequential pattern mining[C]// Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). Piscataway, NJ: IEEE, 2019: 767-773. DOI: 10.1109/SMC.2019.8914402. [42]LAN G C, HONG T P, TSENG V S, et al. Applying the maximum utility measure in high utility sequential pattern mining[J]. Expert Systems with Applications, 2014, 41(11): 5071-5081. DOI: 10.1016/j.eswa.2014.02.022. [43]LIN J C W, LI Y F, FOURNIER-VIGER P, et al. An efficient chain structure to mine high-utility sequential patterns[C]// 2019 International Conference on Data Mining Workshops (ICDMW) . Piscataway, NJ: IEEE, 2019:1013-1019. DOI: 10.1109/ICDMW.2019.00146. [44]WU C W, LIN Y F, YU P S, et al. Mining high utility episodes in complex event sequences[C]// Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: Association for Computing Machinery, 2013: 536-544. DOI: 10.1145/2487575.2487654. [45]GUO G M, ZHANG L, LIU Q, et al. High utility episode mining made practical and fast[C]// Advanced Data Mining and Applications:LNAI volume 8933. Berlin: Springer, 2014: 71-84. [46]RATHORE S, DAWAR S, GOYAL V, et al. Top-K high utility episode mining from a complex event sequence[C]// Proceedings of the 21st International Conference on Mmanagement of Data. Telengana:Computer Society of India, 2016:1-8. [47]LIN Y F, HUANG C F, TSENG V S. A novel methodology for stock investment using high utility episode mining and genetic algorithm[J]. Applied Soft Computing, 2017, 59: 303-315. DOI: 10.1016/j.asoc.2017.05.032. [48]GAN W S, LIN J C W, CHAO H C, et al. Discovering high utility episodes in sequences[EB/OL].(2019-12-25)[2021-09-09].https://arxiv.org/abs/1912.11670. [49]HONG T P, HSU J H, YANG K J, et al. Mining high utility partial periodic pattern by GPA[C]// 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC). Piscataway, NJ: IEEE, 2017: 820-824. DOI: 10.1109/SMC.2017.8122710. [50]LIN J C W, ZHANG J X, FOURNIER-VIGER P, et al. Efficient mining of short periodic high-utility itemsets[C]// 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC). Piscataway, NJ: IEEE, 2016: 003083-003088. DOI: 10.1109/SMC.2016.7844710. [51]DINH T, HUYNH V N, LE B. Mining periodic high utility sequential patterns[C]// Intelligent Information and Database Systems:LNAI volume 10191. Berlin: Springer, 2017: 545-555. [52]ISMAIL W, HASSAN M M, FORTINO G. Productive-associated periodic high-utility itemsets mining[C]// 2017 IEEE 14th International Conference on Networking, Sensing and Control (ICNSC). Piscataway, NJ: IEEE, 2017: 637-642. DOI: 10.1109/ICNSC.2017.8000165. [53]DINH D T, LE B, FOURNIER-VIGER P, et al. An efficient algorithm for mining periodic high-utility sequential patterns[J]. Applied Intelligence, 2018, 48(12): 4694-4714. DOI: 10.1007/s10489-018-1227-x. [54]REDDY T Y, KIRAN R U, TOYODA M, et al. Discovering partial periodic high utility itemsets in temporal databases[C]// Database and Expert Systems Applications:LNISA volume 11707. Berlin: Springer, 2019: 351-361. [55]HONG T P, LEE C H, WANG S L. Mining high average-utility itemsets[C]// 2009 IEEE International Conference on Systems, Man and Cybernetics. Piscataway,NJ: IEEE, 2009: 2526-2530. DOI: 10.1109/ICSMC.2009.5346333. [56]YUN U, KIM D. Mining of high average-utility itemsets using novel list structure and pruning strategy[J]. Future Generation Computer Systems, 2017, 68: 346-360. DOI: 10.1016/j.future.2016.10.027. [57]LAN G C, HONG T P, TSENG V S. A projection-based approach for discovering high average-utility itemsets[J]. Journal of Information Science and Engineering, 2012, 28(1): 193-209. [58]LU T, VO B, NGUYEN H T, et al. A new method for mining high average utility itemsets[C]// Computer Information Systems and Industrial Management:LNISA volume 8838. Berlin: Springer, 2015: 33-42. [59]LIN J C W, REN S, FOURNIER-VIGER P, et al. EHAUPM: efficient high average-utility pattern mining with tighter upper bounds[J]. IEEE Access, 2017, 5: 12927-12940. DOI: 10.1109/ACCESS.2017.2717438. [60]DAM T L, RAMAMPIARO H, NØRVÅG K, et al. Towards efficiently mining closed high utilityitemsets from incremental databases[J]. Knowledge-Based Systems, 2019, 165: 13-29. DOI: 10.1016/j.knosys.2018.11.019. [61]DAM T L, LI K L, FOURNIER-VIGER P, et al. CLS-Miner: efficient and effective closed high-utility itemset mining[J]. Frontiers of Computer Science, 2019, 13(2): 357-381. DOI: 10.1007/s11704-016-6245-4. [62]SHIE B E, YU P S, TSENG V S. Efficient algorithms for mining maximal high utility itemsets from data streams with different models[J]. Expert Systems with Applications, 2012, 39(17): 12947-12960. DOI: 10.1016/j.eswa.2012.05.035. [63]WU C W, FOURNIER-VIGER P, GU J Y, et al. Mining compact high utility itemsets without candidate generation[M]// FOURNIER-VIGER P, LIN J C W, NKAMBOU R, et al. High-Utility Pattern Mining. Berlin:Springer, 2019: 279-302. [64]CHU C J, TSENG V S, LIANG T. An efficient algorithm for mining high utility itemsets with negative item values in large databases[J]. Applied Mathematics and Computation, 2009, 215: 767-778. [65]SUBRAMANIAN K, KANDHASAMY P. UP-GNIV: an expeditious high utility pattern mining algorithm for itemsets with negative utility values[J]. International Journal of Information Technology and Management, 2015, 14(1): 26-42. DOI: 10.1504/IJITM.2015.066056. [66]SINGH K, SHAKYA H K, SINGH A, et al. Mining of high utility itemsets with negative utility[J]. Expert Systems, 2018, 35(6): e12296. DOI: 10.1111/exsy.12296. [67]KRISHNAMOORTHY S. Efficiently mining high utility itemsets with negative unit profits[J]. Knowledge-Based Systems, 2018, 145: 1-14. DOI: 10.1016/j.knosys.2017.12.035. [68]GAN W S, LIN J C W, FOURNIER-VIGER P, et al. Mining high-utility itemsets with both positive and negative unit profits from uncertain databases[C]// Advances in Knowledge Discovery and Data Mining:LNAI volume 10234. Berlin: Springer, 2017: 434-446. [69]LAN G C, HONG T P, HUANG J P, et al. On-shelf utility mining with negative item values[J]. Expert Systems with Applications, 2014, 41(7): 3450-3459. DOI: 10.1016/j.eswa.2013.10.049. [70]DAM T L, LI K L, FOURNIER-VIGER P, et al. An efficient algorithm for mining top-k on-shelf high utility itemsets[J]. Knowledge and Information Systems, 2017, 52(3): 621-655. DOI: 10.1007/s10115-016-1020-2. [71]LAN G C, HONG T P, TSENG V S. Discovery of high utility itemsets from on-shelf time periods of products[J]. Expert Systems with Applications, 2011, 38(5): 5851-5857. DOI: 10.1016/j.eswa.2010.11.040. [72]RADKAR A N, PAWAR S S. Mining high on-shelf utility itemsets with negative values from dynamic updated database[EB/OL].(2015-07-07)[2021-09-09].https://arxiv.org/abs/1507.01759. [73]LIN J C W, GAN W S, FOURNIER-VIGER P, et al. Efficient algorithms for mining high-utility itemsets in uncertain databases[J]. Knowledge-Based Systems, 2016, 96: 171-187. DOI: 10.1016/j.knosys.2015.12.019. [74]BUI N, VO B, HUYNH V N, et al. Mining closed high utility itemsets in uncertain databases[C]// Proceedings of the Seventh Symposium on Information and Communication Technology. New York: Association for Computing Machinery, 2016: 7-14. DOI: 10.1145/3011077.3011124. [75]LIU J Q, ZHAO R, YANG X C, et al. Efficient parallel algorithm for mining high utility patterns based on spark[C]// 2019 IEEE Fourth International Conference on Data Science in Cyberspace (DSC). Piscataway,NJ:IEEE, 2019: 367-372. DOI: 10.1109/DSC.2019.00062. |
[1] | 杨州, 范意兴, 朱小飞, 郭嘉丰, 王越. 神经信息检索模型建模因素综述[J]. 广西师范大学学报(自然科学版), 2021, 39(2): 1-12. |
|
版权所有 © 广西师范大学学报(自然科学版)编辑部 地址:广西桂林市三里店育才路15号 邮编:541004 电话:0773-5857325 E-mail: gxsdzkb@mailbox.gxnu.edu.cn 本系统由北京玛格泰克科技发展有限公司设计开发 |