Semantic Learning and Retrieval of Images Based on Probabilistic Topic Modeling

Journal of Guangxi Normal University (Natural Science Edition), 2012, Vol. 30, Issue (3): 125-134.

LI Zhi-xin, CHEN Hong-chao, WU Jing-li, ZHOU Sheng-ming

College of Computer Science and Information Technology, Guangxi Normal University, Guilin, Guangxi 541004, China

Received: 2012-05-23 Online: 2012-09-20 Published: 2018-12-04
Corresponding author: LI Zhi-xin (born 1971), male, from Wuming, Guangxi; associate professor at Guangxi Normal University; Ph.D. E-mail: lizx@mailbox.gxnu.edu.cn
About the author: LI Zhi-xin, male, born October 1971, Han ethnicity, from Wuming, Guangxi; Ph.D.; associate professor and master's supervisor at the College of Computer Science and Information Technology, Guangxi Normal University; member of the China Computer Federation.
Supported by:
National Natural Science Foundation of China (61165009); Guangxi Natural Science Foundation (2012GXNSFAA053219, 2011GXNSFB018068); "Bagui Scholars" Program special fund

Abstract

To bridge the semantic gap in image retrieval, this paper proposes a semantic learning model for automatic image annotation. First, a continuous probabilistic latent semantic analysis (PLSA) model and its corresponding parameter estimation algorithm are presented, and maximum penalized likelihood is adopted to resolve the singularity problem of the covariance matrix. Second, a probabilistic model is proposed that handles each modality according to its own characteristics: it employs continuous PLSA to model visual features and traditional PLSA to model textual keywords, and it discovers the semantic topics shared by the two modalities through an asymmetric learning algorithm, so that unseen images can be annotated more precisely. Finally, experiments are conducted on two standard Corel datasets containing 5,000 and 31,695 images respectively. Compared with several representative image annotation approaches, the proposed approach achieves higher accuracy and better results.

Key words: automatic image annotation, topic model, continuous PLSA, semantic learning, image retrieval
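
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `plsa_em` is the textbook EM procedure for traditional (discrete) PLSA, and `penalized_covariance` is a simplified ridge-style stand-in for a penalized covariance estimate, assumed here only to show why a penalty term keeps the matrix non-singular. All function names, parameters, and the toy data are hypothetical, and each document is assumed to contain at least one word.

```python
import numpy as np


def plsa_em(counts, n_topics, n_iter=50, seed=0):
    """Fit discrete PLSA to a (documents x words) count matrix with EM.

    Returns P(z|d), P(w|z), and the log-likelihood before each update.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random positive initialisation, normalised to valid distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    log_liks = []
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (docs, topics, words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        p_w_d = joint.sum(axis=1)
        log_liks.append(float((counts * np.log(p_w_d + 1e-12)).sum()))
        post = joint / (p_w_d[:, None, :] + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z, log_liks


def penalized_covariance(x, resp, mean, penalty=1e-3):
    """Responsibility-weighted covariance with a ridge-style penalty.

    Assumption: this simple regulariser only illustrates the idea; it is
    not the estimator used in the paper.
    """
    diff = x - mean
    scatter = (resp[:, None] * diff).T @ diff
    dim = x.shape[1]
    # Adding penalty * I makes the result positive definite even when
    # the scatter matrix is rank deficient.
    return (scatter + penalty * np.eye(dim)) / (resp.sum() + penalty)


if __name__ == "__main__":
    toy_counts = np.array([[4, 3, 0, 0], [5, 2, 0, 1],
                           [0, 0, 6, 3], [0, 1, 4, 5]], dtype=float)
    p_z_d, p_w_z, log_liks = plsa_em(toy_counts, n_topics=2, n_iter=30)
    print(np.round(p_z_d, 3))
```

In a continuous variant, the discrete word distribution for each topic would be replaced by a Gaussian over visual feature vectors, which is where a covariance estimate of this kind would enter the M-step.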

CLC Number:

TP391

LI Zhi-xin, CHEN Hong-chao, WU Jing-li, ZHOU Sheng-ming. Semantic Learning and Retrieval of Images Based on Probabilistic Topic Modeling[J]. Journal of Guangxi Normal University (Natural Science Edition), 2012, 30(3): 125-134.

