Details
Document type: Journal article
Chinese title: 软件缺陷预测模型可解释性对比
English title: Explainable Comparison of Software Defect Prediction Models
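Authors: Li Huilai (李汇来)[1]; Yang Bin (杨斌)[2]; Yu Xiuli (于秀丽)[3]; Tang Xiaomei (唐晓梅)[4]
First author: Li Huilai (李汇来)
Affiliations: [1] School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876; [2] Research Institute, China United Network Communications Co., Ltd., Beijing 100048; [3] School of Modern Post, Beijing University of Posts and Telecommunications, Beijing 100876; [4] Information Network Center, Beijing Union University, Beijing 100101
First affiliation: School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876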
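Year: 2023
Volume: 50
Issue: 5
Pages: 21-30
Chinese journal name: 计算机科学
Foreign-language journal name: Computer Science
Indexed in: CSTPCD; Peking University Core Journals (北大核心, 2020); CSCD (CSCD_E2023_2024)
Language: Chinese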
Chinese keywords: 软件缺陷预测; 可解释性; 软件度量; 神经网络; 抽象语法树
English keywords: Software defect prediction; Explicability; Software metrics; Neural network; Abstract syntax tree
Abstract: Software defect prediction has become an important research direction in software testing, and how comprehensive the prediction is directly affects testing efficiency and program operation. However, existing defect prediction infers from historical data, and most approaches cannot give a reasonable explanation of the prediction process. Such black-box prediction shows only the output, making it hard to know how the model's internal structure affects that output. To address this problem, the paper selects software metrics methods and several typical deep learning models, briefly compares their inputs, outputs, and structures, analyzes them from two perspectives (the degree of difference in the data and how each model processes code), and explains their similarities and differences. Experiments show that deep learning is more effective for defect prediction than traditional software metrics methods, mainly because the two process raw data differently. When convolutional neural networks and long short-term memory (LSTM) networks are used for defect prediction, the differences in results stem mainly from how completely each model understands the code information. In summary, to improve defect prediction capability, a model's computation should fully take into account the semantics, logic, and contextual relationships of the code so that useful information is not missed.