Details
Document type: Journal article
Chinese title: 基于CNN和SOM的评论主题发现
English title: Topics Discovery of Travel Reviews Based on CNN and SOM
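Authors: 谢宗彦[1]; 黎巎[2]; 周纯洁[1]
First author: 谢宗彦
Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University (北京联合大学北京市信息服务工程重点实验室); [2] Tourism Informatization Collaborative Innovation Center, Beijing Union University (北京联合大学旅游信息化协同创新中心)
First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University (北京联合大学北京市信息服务工程重点实验室)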
Year: 2018
Volume: 36
Issue: 6
Pages: 30-34
Chinese journal name: 情报科学
Foreign-language journal name: Information Science
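Indexed in: CSTPCD; National Philosophy and Social Sciences Academic Journal Database (国家哲学社会科学学术期刊数据库); Peking University Core Journals (北大核心): 北大核心2017; CSSCI: CSSCI2017_2018
Funding: National Natural Science Foundation of China, Youth Project (41101111); Beijing Union University 人才强校优选计划 project (BPHR2017CS15)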
Language: Chinese
Chinese keywords: 深度学习;卷积神经网络;主题发现;短文本;旅游评论
English keywords: deep learning; convolutional neural networks; topic discovery; short text; travel comments
Abstract: 【Purpose/significance】As travel websites multiply, the volume of online reviews written by tourists keeps growing. Traditional methods for classifying the topics of short travel reviews suffer from excessively high feature dimensionality and data sparseness. To address these problems, this paper proposes a topic discovery method for travel reviews based on convolutional neural networks (CNN) and a self-organizing map (SOM).【Method/process】First, texts are represented with word vectors, which mitigates the high-dimensionality problem. Second, a convolutional neural network extracts high-order abstract features from the review texts. Finally, the SOM model clusters the topics based on the extracted abstract features.【Result/conclusion】Experimental results show that, compared with traditional text clustering algorithms, the CNN-SOM algorithm significantly improves accuracy, recall, and F-value, and is better able to discover topics in travel reviews.
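The three stages named in the abstract (word-vector representation, convolutional feature extraction, SOM clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not give their architecture, training procedure, or data, so everything here is an assumption made for illustration. The embeddings and convolution filters are random and untrained, the reviews are random token ids, and the names `conv_features`, `train_som`, and `assign_clusters` as well as all sizes and hyperparameters are hypothetical.

```python
import numpy as np

def conv_features(token_ids, embeddings, filters):
    """Embed one review, apply a 1D convolution with ReLU, then max-over-time pooling."""
    x = embeddings[token_ids]                      # (seq_len, emb_dim)
    width = filters.shape[1]                       # filters: (n_filters, width, emb_dim)
    windows = np.stack([x[i:i + width] for i in range(len(token_ids) - width + 1)])
    activations = np.maximum(0.0, np.einsum("nwd,fwd->nf", windows, filters))
    return activations.max(axis=0)                 # (n_filters,)

def train_som(data, rows, cols, rng, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a rows x cols self-organizing map with linearly decaying rate and radius."""
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    weights = rng.normal(size=(rows * cols, data.shape[1]))
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            # Best matching unit: the node whose weight vector is closest to x.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            # Gaussian neighborhood on the grid, centered at the BMU.
            grid_dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def assign_clusters(data, weights):
    """Label each sample with the index of its nearest SOM node."""
    dist2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

# Toy data: all sizes below are illustrative assumptions, not values from the paper.
rng = np.random.default_rng(0)
vocab_size, emb_dim, n_filters, width = 50, 8, 6, 3
embeddings = rng.normal(size=(vocab_size, emb_dim))       # stand-in for word vectors
filters = rng.normal(size=(n_filters, width, emb_dim))    # untrained convolution filters
reviews = [rng.integers(0, vocab_size, size=int(rng.integers(5, 12))) for _ in range(30)]

features = np.stack([conv_features(r, embeddings, filters) for r in reviews])
som_weights = train_som(features, rows=2, cols=2, rng=rng)
labels = assign_clusters(features, som_weights)
```

In this sketch each SOM node plays the role of one candidate topic cluster, and `labels` gives the node index assigned to each review. With real data, the random embeddings would be replaced by trained word vectors and the filters by a trained CNN.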