Details
Evaluation of text semantic features using latent dirichlet allocation model (EI indexed)
Document type: Journal article
English title: Evaluation of text semantic features using latent dirichlet allocation model
Authors: Zhou, Chunjie[1]; Li, Nao[2,3]; Zhang, Chi[2]; Yang, Xiaoyu[1]
First author: Zhou, Chunjie
Corresponding author: Li, Nao
Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing, 100101, China; [2] Collaborative Innovation Center of eTourism, Tourism College, Beijing Union University, Beijing, 100101, China; [3] Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, 100101, China
First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University
Year: 2020
Volume: 16
Issue: 6
Pages: 968-978
Journal: International Journal of Performability Engineering
Indexed in: EI (accession number: 20203609127803); Scopus (accession number: 2-s2.0-85090030102)
Funding: This study was jointly supported by the Funding Project for Postgraduates in Beijing Union University, the Premium Funding Project for Academic Human Resources Development in Beijing Union University, and the State Key Laboratory of Resources and Environmental Information System.
Language: English
Keywords: Statistics - Tourism
Abstract: Obtaining useful information from mass data on the Internet has been a hot topic in information processing research in recent years. For unstructured data such as online reviews written in natural language, the task becomes more challenging. Online consumer reviews reflect customers' real experiences and opinions on products or services. However, there is a shortage of methods or tools to help potential customers find high-quality, helpful reviews among a large number of reviews. This paper applied the concepts and ideas of creative computing to solve this problem. TF-IDF, a traditional method for extracting text features, measures the importance of words through word frequency and ignores the semantic information in the text data, while the topic model makes up for this deficiency. This paper proposed using the vector of reviews allocated by the LDA topic model to represent text semantic features. Based on the semantic features of reviews, it calculated the cosine similarity between the thumbs-up reviews and other reviews and thus obtained the simulated helpfulness scores of all reviews. Then, a linear regression was designed over two features, i.e., the syntactic and semantic features, to determine the simulated helpfulness scores. The proposed method was validated on online tourism reviews of the Forbidden City and Mount Huang collected from three representative Chinese online tourism platforms. The results showed that the proposed method can effectively obtain and thus compare the helpfulness of online reviews in a creative way. © 2020 Totem Publisher, Inc. All rights reserved.
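The similarity step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each review has already been mapped to a topic-proportion vector by an LDA model, the vectors and review names below are made-up illustrative values, and averaging the similarities over the thumbs-up reviews is an assumption, since the abstract does not say how they are aggregated.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)


def simulated_helpfulness(review_topics, upvoted_topics):
    """Mean cosine similarity between one review and the thumbs-up reviews.

    Averaging is an assumption made for this sketch; the abstract does not
    specify the aggregation.
    """
    if not upvoted_topics:
        raise ValueError("at least one thumbs-up review is required")
    sims = [cosine_similarity(review_topics, t) for t in upvoted_topics]
    return sum(sims) / len(sims)


# Hypothetical topic proportions over 3 topics (illustrative values only).
upvoted = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
candidates = {
    "review_a": [0.65, 0.25, 0.10],  # topic mix close to the upvoted reviews
    "review_b": [0.05, 0.15, 0.80],  # dominated by a different topic
}

scores = {
    name: simulated_helpfulness(vec, upvoted)
    for name, vec in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

In this sketch, a review whose topic mix resembles the thumbs-up reviews receives a higher score than one concentrated on a different topic. In practice the topic vectors would come from a fitted LDA model rather than being written by hand.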