Details
Document type: Journal article
Chinese title: 基于符号集合近似的城市轨道交通站点分类研究
English title: Classification of Urban Rail Transit Stations Based on SAX
Authors: 张丽英 [1,2]; 孟斌 [3]; 尹芹 [4]
First author: 张丽英
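Affiliations: [1] College of Geoscience and Surveying Engineering, China University of Mining and Technology (Beijing); [2] College of Geophysics and Information Engineering, China University of Petroleum (Beijing); [3] College of Applied Arts and Sciences, Beijing Union University; [4] College of Resource Environment and Tourism, Capital Normal University
First affiliation: College of Geoscience and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083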
Year: 2016
Volume: 18
Issue: 12
Pages: 1597-1607
Chinese journal name: 地球信息科学学报
English journal name: Journal of Geo-Information Science
Indexed in: CSTPCD; PKU Core Journals (北大核心, 2014 edition); CSCD (2015-2016)
Funding: Beijing Philosophy and Social Science Foundation (14CSA002); National Natural Science Foundation of China (41171136)
Language: Chinese
Chinese keywords: 轨道交通站点; 时间序列; 符号集合近似(SAX); 层次聚类; 时空特征
English keywords: rail transit stations; time series; Symbolic Aggregate approXimation (SAX); hierarchical clustering; spatio-temporal characteristics
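Abstract (translated from the Chinese): Rail stations are key nodes in the basic network of an urban rail transit system, and a sound classification of stations matters both for understanding urban functional zoning and for evaluating rail transit infrastructure construction. Station time series objectively record important information about the observed stations at each time point. Clustering these time series is an important means of understanding how they arise, and an important method for mining the valuable regularities hidden in them. Using Beijing IC-card swipe data for rail stations, this paper proposes four data sets to describe the stations: weekday boarding (WB), weekday alighting (WA), rest-day boarding (RB), and rest-day alighting (RA). It introduces, for the first time, a time series analysis method, Symbolic Aggregate approXimation (SAX), to cluster the four data sets, achieving effective dimensionality reduction of the high-dimensional data and a similarity measure between stations. Using hierarchical clustering, and guided by the DB cluster validity index, the 195 stations are most reasonably divided into 8 classes. Analysis of each class's daily passenger flow characteristics and spatial distribution provides an objective reference for the planning, design, and management of rail transit stations.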
Urban rail transit stations are the key nodes of the basic urban rail transit network. A scientific classification of these stations is important for understanding urban functional zoning and for evaluating the construction of rail transit infrastructure. Time series data from urban rail transit stations objectively record important information about the observed stations at all time points. These data contain different patterns, which reflect different origins of the sequences. Clustering the time series is therefore an important means of recognizing and understanding how the data arise, and a major method for mining the valuable rules and knowledge implicit in them. In this paper, we use smart card data from urban rail transit stations in Beijing and divide the data into four data sets that describe each station's daily passenger volume: weekday boarding (WB), weekday alighting (WA), weekend (rest day) boarding (RB), and weekend alighting (RA). Symbolic Aggregate approXimation (SAX) is introduced for the first time to analyze the four data sets; it effectively reduces the dimensionality of the high-dimensional data and provides a similarity measure between stations. Finally, hierarchical clustering, guided by the DB index, indicates that the 195 rail transit stations are most reasonably classified into 8 types: residential stations; work stations; mixed residential-work stations that are predominantly residential; dislocation stations; tourist-attraction and commercial stations; mixed residential-work stations that are predominantly work-oriented; integrated stations; and other stations. The performance of SAX is compared with a Euclidean distance similarity measure. The results indicate that SAX outperforms Euclidean distance in both accuracy and efficiency.
The paper analyzes the daily boarding and alighting volumes in the four data sets and the spatial distribution of each station type. Residential and dislocation stations are mostly located at the outer ends of the subway lines, while work stations, tourist-attraction and commercial stations, predominantly work-oriented mixed residential-work stations, and integrated stations are concentrated in the urban area. Predominantly residential mixed residential-work stations are scattered around the city center. The results help to interpret the city's functional zoning and the characteristics of residents' travel behavior. They provide a basis for understanding the urban spatial pattern and its evolution, and an objective reference for the planning, design, and management of rail transit stations.
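As a rough illustration of the SAX step described in the abstract, the following Python sketch z-normalizes a station's passenger-count series, averages it into segments (piecewise aggregate approximation), maps each segment to a letter using equiprobable breakpoints of the standard normal distribution, and computes the usual SAX lower-bounding distance (MINDIST) between two words. It is not the authors' implementation: the record does not give the word length, alphabet size, or time resolution, so the values used here, the function names, and the two sample series are all assumptions chosen for illustration.

```python
import bisect
import math
from statistics import NormalDist, mean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # a flat series carries no shape information
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-width segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of the segment count")
    width = len(series) // segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(segments)]


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    normal = NormalDist()
    return [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, segments, alphabet_size):
    """Convert a numeric series into a SAX word such as 'ad'."""
    cuts = breakpoints(alphabet_size)
    reduced = paa(z_normalize(series), segments)
    return "".join(chr(ord("a") + bisect.bisect_right(cuts, v)) for v in reduced)


def mindist(word_a, word_b, series_length, alphabet_size):
    """SAX lower-bounding distance between two words of equal length."""
    if len(word_a) != len(word_b):
        raise ValueError("SAX words must have the same length")
    cuts = breakpoints(alphabet_size)
    total = 0.0
    for sym_a, sym_b in zip(word_a, word_b):
        lo, hi = sorted((ord(sym_a) - ord("a"), ord(sym_b) - ord("a")))
        if hi - lo > 1:  # identical or adjacent symbols contribute zero
            total += (cuts[hi - 1] - cuts[lo]) ** 2
    return math.sqrt(series_length / len(word_a)) * math.sqrt(total)


# Hypothetical boarding counts for two stations (not data from the paper).
morning_heavy = [10, 10, 10, 10, 0, 0, 0, 0]
evening_heavy = [0, 0, 0, 0, 10, 10, 10, 10]

word_m = sax_word(morning_heavy, segments=2, alphabet_size=4)
word_e = sax_word(evening_heavy, segments=2, alphabet_size=4)
print(word_m, word_e, mindist(word_m, word_e, len(morning_heavy), 4))
```

A pairwise MINDIST matrix built this way could then be passed to a hierarchical clustering routine, with a cluster validity index used to choose the number of classes, as the abstract describes.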