考虑文本情感特征的电商小微企业信用风险预警
Credit Risk Warning for E-commerce Small and Micro Enterprises Considering Textual Sentiment Features
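Document type: Journal article
Chinese title: 考虑文本情感特征的电商小微企业信用风险预警
English title: Credit Risk Warning for E-commerce Small and Micro Enterprises Considering Textual Sentiment Features
Authors: Xu Kun (徐鲲)[1]; Li Ying (李莹)[1]; Bao Xinzhong (鲍新中)[1]
First author: Xu Kun (徐鲲)
Affiliation: [1] School of Management, Beijing Union University, Beijing 100101
First affiliation: School of Management, Beijing Union University
Year: 2023
Volume: 32
Issue: 12
Pages: 195-201
Chinese journal name: 运筹与管理
Foreign-language journal name: Operations Research and Management Science
Indexed in: CSTPCD; National Philosophy and Social Sciences Academic Journal Database (国家哲学社会科学学术期刊数据库); PKU Core (北大核心 2020); CSCD (CSCD_E2023_2024); CSSCI (CSSCI_E2023_2024)
Funding: Humanities and Social Sciences Fund Project of the Ministry of Education (20YJC630175).
Language: Chinese
Chinese keywords: 文本情感特征;信用风险预警;随机森林;网格搜索
English keywords: text sentiment features; credit risk warning; random forest; grid search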
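Abstract: E-commerce small and micro enterprises create diverse jobs and promote the development of advanced productive forces, but credit risk hampers their normal financing and growth. To improve credit risk early warning for these enterprises, this study uses real transaction data from small and micro enterprises in Taobao's fresh produce sector, takes the textual sentiment features of online reviews into account, and builds a credit risk indicator system with subjective and objective dimensions. It constructs a random forest model optimized by a "two-step" grid search algorithm and applies SMOTE to obtain a balanced dataset for a more rigorous warning model; Logistic, CART, and random forest models are built as control groups. The empirical results show that: (1) the two-dimensional subjective-objective indicator system built after considering textual sentiment features is effective and reasonable, and passes the ROC validity test; (2) the random forest model optimized by the "two-step" grid search algorithm outperforms the other three warning models; (3) a balanced dataset is crucial for both single and ensemble warning models. The study offers new ideas for e-commerce platforms and financial institutions to build a unified warning model, scientifically predict the credit of e-commerce small and micro enterprises, and lend efficiently.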
Small and micro e-commerce enterprises contribute to employment diversity and promote the development of advanced productivity. However, their normal financing and development are impeded by credit risks. As cloud computing and big data empower internet financing in areas such as information collection and intelligent decision-making, the perspective of credit risk assessment has expanded. The academic community currently emphasizes the importance of qualitative indicators in assessing credit risks for small enterprises. For small and micro e-commerce enterprises, the most distinctive form of unstructured data is the publicly available consumer review text on platforms. The subjective sentiments hidden in these texts can subtly influence subsequent consumers' attitudes toward products, their preferences for companies, and consequently their perception of risk, which can significantly affect the credit of these enterprises. On this basis, the study collects online review text data, considers textual sentiment features, and analyzes in depth the credit risks of small and micro e-commerce enterprises in the fresh produce industry. The marginal contribution of the article is twofold. In theory, it provides new insights for exploring credit risk early warning models and scientifically predicting credit risks for small and micro e-commerce enterprises in the context of big data and unstructured data. In practice, it helps advance credit risk early warning for these enterprises and helps them attend to their credit risks from the perspective of online review texts.
To further improve credit risk early warning for small and micro e-commerce enterprises, the study focuses on C2C fresh produce businesses on the Taobao platform. Firstly, a two-dimensional credit risk indicator system is designed on subjective and objective criteria. Using Python, 822 Taobao fresh produce stores are crawled and filtered, yielding 33,756 online reviews. To capture the sentiment features embedded in the review text, an LDA topic model is constructed to obtain subjective indicators, and a sentiment lexicon is built through sentiment analysis to quantify them. Combined with objective indicators such as qualification and operational metrics, this forms a two-dimensional credit risk early warning indicator system tailored to small and micro e-commerce enterprises. Secondly, a random forest early warning model is constructed and optimized with a "two-step" grid search algorithm: a large parameter range is first divided coarsely to locate the optimal parameter region, and a refined search within that region then pinpoints the optimal parameters. The SMOTE algorithm is applied to address the imbalance in the dataset, making the early warning model more rigorous. Finally, empirical analysis is conducted using real-world data. To assess the random forest early warning model optimized by the "two-step" grid search algorithm, Logistic, CART, and random forest models are established as control models for comparative analysis.
The empirical results show: (1) the two-dimensional indicator system constructed after considering textual sentiment features is effective and reasonable, as verified through the ROC validity assessment; (2) the random forest model optimized by the "two-step" grid search algorithm performs better than the other three warning models; (3) a balanced dataset is critically important for both individual and ensemble warning models.
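The "two-step" grid search described in the abstract (a coarse pass over a wide parameter range, then a refined pass around the coarse optimum) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter names `n_estimators` and `max_depth`, the search ranges, and the `toy_score` objective are all assumptions made for illustration. In practice the objective would be something like the cross-validated score of a random forest.

```python
from itertools import product


def grid_search(score, grid):
    """Evaluate every combination in `grid` and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


def two_step_grid_search(score, coarse_grid, fine_radius):
    """Coarse search over wide integer ranges, then a fine search around the coarse optimum."""
    # Step 1: locate the promising region on the coarse grid.
    coarse_best, _ = grid_search(score, coarse_grid)
    # Step 2: search every integer within `fine_radius` of the coarse optimum,
    # clipped so the fine grid never leaves the original coarse range.
    fine_grid = {}
    for name, values in coarse_grid.items():
        low = max(min(values), coarse_best[name] - fine_radius[name])
        high = min(max(values), coarse_best[name] + fine_radius[name])
        fine_grid[name] = list(range(low, high + 1))
    return grid_search(score, fine_grid)


def toy_score(params):
    """Hypothetical stand-in for a cross-validated model score (higher is better)."""
    return -((params["n_estimators"] - 130) ** 2) / 1000 - (params["max_depth"] - 7) ** 2


# Assumed coarse ranges; the record does not state the ranges used in the paper.
coarse = {
    "n_estimators": list(range(50, 301, 50)),
    "max_depth": list(range(2, 21, 4)),
}
best_params, best_score = two_step_grid_search(
    toy_score, coarse, fine_radius={"n_estimators": 25, "max_depth": 2}
)
print(best_params, best_score)
```

The coarse pass here evaluates 30 combinations and the fine pass 255, compared with several thousand for an exhaustive integer search over the same ranges. That saving is the usual motivation for a coarse-to-fine scheme.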