Chinese Scenic Spot Named Entity Recognition Based on BERT+BiLSTM+CRF
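Document type: Journal article
Chinese title: 基于BERT+BiLSTM+CRF的中文景点命名实体识别
English title: Chinese Scenic Spot Named Entity Recognition Based on BERT+BiLSTM+CRF
Authors: Zhao Ping (赵平)[1]; Sun Lianying (孙连英)[2]; Wan Ying (万莹)[1]; Ge Na (葛娜)[1]
First author: Zhao Ping (赵平)
Affiliations: [1] Smart City College, Beijing Union University, Beijing 100101; [2] College of Urban Rail Transit and Logistics, Beijing Union University, Beijing 100101
First affiliation: Smart City College, Beijing Union University
Year: 2020
Volume: 29
Issue: 6
Pages: 169-174
Journal name (Chinese): 计算机系统应用
Journal name (English): Computer Systems & Applications
Indexed in: CSTPCD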
Funding: National Key R&D Program of China (2018YFC0807806); China Society of Logistics project (2018CSLKT3-184).
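Language: Chinese
Keywords (translated from Chinese): BERT language model; BiLSTM; conditional random field; scenic spot entity recognition
Keywords (English): BERT model; BiLSTM; conditional random field; scenic spot entity recognition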
Abstract: To address polysemy in the feature representation of tourism text, and the problem of scenic spot aliases in entity recognition over travelogue text, this paper studies a Chinese scenic spot entity recognition model that incorporates a language model. The BERT language model first extracts text features to produce a character-level vector matrix. A BiLSTM then extracts contextual information, and a CRF model is combined with it to obtain the globally optimal label sequence, from which the scenic spot named entities are obtained. Experiments show that the proposed model improves performance significantly: in a scenic spot recognition test on real tourism-domain data, precision and recall increase by 8.33% and 1.71%, respectively, compared with previous researchers' methods.
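The final step described in the abstract, where the CRF selects a globally optimal label sequence instead of labeling each character independently, can be illustrated with a small Viterbi decoder. This is a minimal sketch and not the authors' implementation: the BIO tag set, the function name `viterbi_decode`, and every score below are assumptions made up for illustration. The per-character emission scores stand in for what a BERT+BiLSTM encoder might output for the four characters of 去颐和园 ("go to the Summer Palace"), and the transition scores stand in for learned CRF parameters.

```python
from typing import Dict, List

# Assumed BIO tagging scheme; the abstract does not state the tag set.
TAGS = ["B", "I", "O"]


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the tag sequence with the highest total linear-chain score."""
    if not emissions:
        return []
    # best[t][tag] is the best score of any path ending in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back = []  # back[t-1][tag] is the previous tag on that best path
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical emission scores for the characters 去, 颐, 和, 园.
# The third character (和, often "and") locally prefers O.
emissions = [
    {"B": 0.2, "I": 0.0, "O": 2.0},
    {"B": 2.0, "I": 0.5, "O": 1.0},
    {"B": 0.0, "I": 1.0, "O": 1.5},
    {"B": 0.3, "I": 2.0, "O": 0.5},
]
# Hypothetical transition scores: O -> I is heavily penalised because an
# entity cannot continue without having begun.
transitions = {
    "B": {"B": 0.0, "I": 1.0, "O": 0.0},
    "I": {"B": 0.0, "I": 1.0, "O": 0.0},
    "O": {"B": 0.0, "I": -10.0, "O": 0.0},
}

greedy = [max(TAGS, key=lambda tag: e[tag]) for e in emissions]
decoded = viterbi_decode(emissions, transitions)
print(greedy)   # ['O', 'B', 'O', 'I']  (invalid: I follows O)
print(decoded)  # ['O', 'B', 'I', 'I']  (颐和园 tagged as one entity)
```

With these assumed scores, per-character argmax splits the place name, while sequence-level decoding keeps it intact, which is the kind of benefit the abstract attributes to the CRF layer.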