Similar articles
 20 similar articles found (search time: 589 ms)
1.
The scaling of human mobility by taxis is exponential   Total citations: 4, self-citations: 0, citations by others: 4
As a significant factor in urban planning, traffic forecasting and the prediction of epidemics, the modeling of human mobility patterns has drawn intensive attention from researchers for decades. Power-law distributions and their variations have been observed in quite a few real-world human mobility datasets, such as the movements of banknotes, the tracking of cell phone users’ locations and the trajectories of vehicles. In this paper, we build models for 20 million fine-grained trajectories collected from more than 10 thousand taxis in Beijing. In contrast to most models observed in human mobility data, the taxis’ traveling displacements in urban areas tend to follow an exponential distribution instead of a power law. Similarly, the elapsed time can also be well approximated by an exponential distribution. Notably, analysis of the interevent time indicates the bursty nature of human mobility, similar to many other human activities.
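The exponential-versus-power-law comparison described in this abstract can be illustrated with a minimal sketch. It assumes maximum-likelihood fits above a lower cutoff and a plain log-likelihood comparison; the abstract does not say how the authors fitted their data, and the sample, `XMIN` and the function names below are hypothetical.

```python
import math
import random

def fit_exponential(xs, xmin):
    """MLE for a shifted exponential p(x) = lam * exp(-lam * (x - xmin)), x >= xmin."""
    n = len(xs)
    excess = sum(x - xmin for x in xs)
    lam = n / excess
    loglik = n * math.log(lam) - lam * excess
    return lam, loglik

def fit_power_law(xs, xmin):
    """MLE for a continuous power law p(x) = (a - 1) / xmin * (x / xmin) ** -a, x >= xmin."""
    n = len(xs)
    log_sum = sum(math.log(x / xmin) for x in xs)
    alpha = 1.0 + n / log_sum
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * log_sum
    return alpha, loglik

# Synthetic displacements (hypothetical data, in km) drawn from an exponential law.
random.seed(0)
XMIN = 1.0
displacements = [XMIN + random.expovariate(0.5) for _ in range(5000)]

lam, ll_exp = fit_exponential(displacements, XMIN)
alpha, ll_pow = fit_power_law(displacements, XMIN)
better = "exponential" if ll_exp > ll_pow else "power law"
print(f"lambda={lam:.3f}, alpha={alpha:.3f}, better fit: {better}")
```

On real trajectories the choice of cutoff and a formal likelihood-ratio test would matter; this sketch only shows the shape of the comparison.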

2.
Spatiotemporal Patterns of Urban Human Mobility   Total citations: 2, self-citations: 0, citations by others: 2
The modeling of human mobility is taking new directions due to the increasing availability of big data sources on human activity. These sources contain digital information about the locations visited daily by a large number of individuals. Examples of these data include mobile phone calls, credit card transactions, banknote dispersal and check-ins in internet applications, among several others. In this study, we use data obtained from smart subway fare card transactions to characterize and model urban mobility patterns. We present a simple mobility model for predicting people’s visited locations using the popularity of places in the city as an interaction parameter between different individuals. This ingredient is sufficient to reproduce several characteristics of the observed travel behavior, such as the number of trips between different locations in the city, the exploration of new places and the frequency of individual visits to a particular location. Moreover, we indicate the limitations of the proposed model and discuss open questions in current state-of-the-art statistical models of human mobility.

3.
The empirical study of network dynamics has been limited by the lack of longitudinal data. Here we introduce a quantitative indicator of link persistence to explore the correlations between the structure of a mobile phone network and the persistence of its links. We show that persistent links tend to be reciprocal and are more common for people with low degree and high clustering. We study the redundancy of the associations between persistence, degree, clustering and reciprocity and show that reciprocity is the strongest predictor of tie persistence. The method presented can be easily adapted to characterize the dynamics of other networks and can be used to identify the links that are most likely to survive in the future.

4.
Intra-urban human mobility patterns: An urban morphology perspective   Total citations: 9, self-citations: 0, citations by others: 9
This paper provides a new perspective on human motion by investigating whether and how patterns of human mobility inside cities are affected by two urban morphological characteristics: compactness and size. Mobile phone data were collected in eight cities in Northeast China and used to extract individuals’ movement trajectories. The massive mobile phone data provide wide coverage and a detailed depiction of individuals’ movement in space and time. Considering that most individuals’ movement is limited to particular urban areas, boundaries of urban agglomerations are demarcated based on the spatial distribution of mobile phone base towers. Results indicate that the distribution of intra-urban travel in general follows an exponential law. The exponents, however, vary from city to city and indicate the impact of city size and shape. Individuals living in large or less compact cities generally need to travel farther on a daily basis, and vice versa. A Monte Carlo simulation analysis based on Levy flight is conducted to further examine and validate the relation between intra-urban human mobility and urban morphology.

5.
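徐赞新  王钺  司洪波  冯振明  Acta Physica Sinica (《物理学报》) 2011,60(4):40501-040501
Mobile communication applications provide a unique source of data for studying the regularities of human movement. Using data on the distribution of mobile phone users in a city, this paper studies the collective dynamical behavior of the urban mobile population. Using random matrix theory, we compare the spectral distribution of the cross-correlation matrix of the mobile-population data with that of random data. We find that the mean correlation coefficient, the largest eigenvalue and its corresponding eigenvector for the mobile-population data deviate markedly from the distribution for random cross-correlation matrices. This difference reflects the collective behavior of the urban mobile population, and it varies across time periods. The results show the fluctuation trends of the mean correlation coefficient and the largest eigenvalue, and indicate that the spatiotemporal pattern of the eigenvector component weights and the fluctuations in the collective behavior of the urban mobile population ... Keywords: random matrix theory; mobile population; macroscopic behavior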
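A common random-matrix benchmark for this kind of comparison is the Marchenko-Pastur eigenvalue range for correlation matrices of uncorrelated series. The abstract does not confirm that the authors used this particular benchmark, so the sketch below is an assumption, and the cell count, time steps and spectrum are hypothetical.

```python
import math

def marchenko_pastur_bounds(n_series, n_obs):
    """Asymptotic eigenvalue range for the correlation matrix of n_series
    uncorrelated, unit-variance series observed n_obs times (n_obs >= n_series)."""
    ratio = n_series / n_obs
    lower = (1.0 - math.sqrt(ratio)) ** 2
    upper = (1.0 + math.sqrt(ratio)) ** 2
    return lower, upper

def non_random_eigenvalues(eigenvalues, n_series, n_obs):
    """Return the eigenvalues lying above the random-matrix upper bound."""
    _, upper = marchenko_pastur_bounds(n_series, n_obs)
    return [ev for ev in eigenvalues if ev > upper]

# Hypothetical example: 100 city cells observed over 400 time steps.
low, high = marchenko_pastur_bounds(100, 400)
print(low, high)  # 0.25 2.25

# Hypothetical spectrum: one large "collective" eigenvalue and two bulk values.
print(non_random_eigenvalues([35.2, 1.9, 0.4], 100, 400))  # [35.2]
```

An eigenvalue far above the upper bound is the usual sign of a collective mode shared by many series.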

6.
Peter Grindrod  Mark Parsons 《Physica A》2011,390(21-22):3970-3981
The plethora of digital communication technologies, and their mass take-up, has resulted in a wealth of interest in social network data collection and analysis in recent years. Within many such networks the interactions are transient: thus those networks evolve over time. In this paper we introduce a class of models for such networks using evolving graphs with memory-dependent edges, which may appear and disappear according to their recent history. We consider discrete-time and continuous-time variants of the model. We consider the long-term asymptotic behaviour as a function of parameters controlling the memory dependence. In particular we show that such networks may continue evolving forever, or else may quench and become static (containing immortal and/or extinct edges). This depends on the existence or otherwise of certain infinite products and series involving age-dependent model parameters. We show how to differentiate between the alternatives based on a finite set of observations. To test these ideas we show how model parameters may be calibrated based on limited samples of time-dependent data, and we apply these concepts to three real networks: summary data on mobile phone use from a developing region; online social-business network data from China; and disaggregated mobile phone communications data from a reality mining experiment in the US. In each case we show that there is evidence for memory-dependent dynamics, such as that embodied within the class of models proposed here.

7.
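余晓平  裴韬  Acta Physica Sinica (《物理学报》) 2013,62(20):208901-208901
Mobile phone communication data record people's communication behavior in detail and have become an important resource for studying social relationships and behavior patterns. The number of distinct numbers called, the number of calls and the call duration are basic attributes of a mobile phone communication network. Building on complex network theory, this paper uses statistical methods to study four days of call data, covering holidays and workdays, from more than three million mobile phone users in a city in western China. At different scales, it examines the distributions of number degree, call degree and duration degree, together with the characteristics of the average number degree, average call degree and average duration degree. The results show that at all scales the number degree, call degree and duration degree follow power-law distributions, with exponents that vary with scale, date and indicator within the range [1.3, 4]. Overall, the number-degree exponent is larger than the call-degree and duration-degree exponents, and in-degree exponents are larger than out-degree exponents; holiday exponents are larger than the workday exponents for the same indicator, and exponents for rest periods are larger than those for working periods; compared with workdays, holidays have a smaller average number degree and average call degree but a larger average duration degree. This reveals that the vast majority of users exchange calls with only one number per day, and that during holidays the number of calling users, the number of calls and the total duration decrease while the average call duration increases. Keywords: mobile phone call network; complex network; degree distribution; call pattern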

8.
Mobile phone communication as a digital service generates ever-increasing datasets of human communication actions, which in turn allow us to investigate the structure and evolution of social interactions and their networks. These datasets can be used to study the structuring of such egocentric networks with respect to the strength of the relationships by assuming direct dependence of the communication intensity on the strength of the social tie. Recently we have discovered that there are significant differences between the first and further “best friends” from the point of view of age and gender preferences. Here we introduce a control parameter p_max based on the statistics of communication with the first and second “best friend” and use it to filter the data. We find that when p_max is decreased the identification of the “best friend” becomes less ambiguous and the earlier observed effects get stronger, thus corroborating them.

9.
The most common machine-learning methods solve supervised and unsupervised problems based on datasets where the problem’s features belong to a numerical space. However, many problems include data in which numerical and categorical variables coexist, which makes them challenging to handle. Transforming categorical data into numeric form requires preprocessing. Methods such as one-hot and feature hashing have been the most widely used encoding approaches, at the expense of a significant increase in the dimensionality of the dataset. This effect introduces additional challenges in dealing with the overabundance of variables and/or noisy data. In this paper we propose a novel encoding approach that maps mixed-type data into an information space, using Shannon’s theory to model the amount of information contained in the original data. We evaluated our proposal with ten mixed-type datasets from the UCI repository and two datasets representing real-world problems, obtaining promising results. To demonstrate its performance, we applied it to prepare these datasets for classification, regression and clustering tasks. We demonstrate that our encoding proposal is remarkably superior to one-hot and feature-hashing encoding in terms of memory efficiency. Our proposal can preserve the information conveyed by the original data.
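To make the dimensionality argument concrete, here is a minimal sketch contrasting one-hot width with a single-column, information-based code. It is not the authors' encoding, which the abstract does not specify: as an assumption, each category is simply replaced by its Shannon self-information, and the `colors` feature is hypothetical.

```python
import math
from collections import Counter

def one_hot_width(values):
    """Number of columns a one-hot encoding of this feature would need."""
    return len(set(values))

def self_information_encode(values):
    """Map each category to -log2 of its empirical frequency (in bits)."""
    counts = Counter(values)
    total = len(values)
    return [-math.log2(counts[v] / total) for v in values]

colors = ["red", "red", "blue", "green"]  # hypothetical categorical feature
print(one_hot_width(colors))              # 3 columns under one-hot
print(self_information_encode(colors))    # [1.0, 1.0, 2.0, 2.0] in one column
```

The single column saves memory, but equally frequent categories ("blue" and "green" here) receive the same code, so a practical scheme would need an extra mechanism to keep them apart.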

10.
In this article, we consider a version of the challenging problem of learning from datasets whose size is too limited to allow generalisation beyond the training set. To address the challenge, we propose to use a transfer learning approach whereby the model is first trained on a synthetic dataset replicating features of the original objects. In this study, the objects were smartphone photographs of near-complete Roman terra sigillata pottery vessels from the collection of the Museum of London. Taking the replicated features from published profile drawings of pottery forms allowed the integration of expert knowledge into the process through our synthetic data generator. After this initial training, the model was fine-tuned with data from photographs of real vessels. We show, through exhaustive experiments across several popular deep learning architectures and different test priors, and considering the impact of the photograph viewpoint and excessive damage to the vessels, that the proposed hybrid approach enables the creation of classifiers with appropriate generalisation performance. This performance is significantly better than that of classifiers trained exclusively on the original data, which shows the promise of the approach to alleviate the fundamental issue of learning from small datasets.

11.
Marcin Owczarczuk 《Physica A》2012,391(4):1428-1433
In this article we show that usage of a mobile phone, i.e. the daily series of the number of calls made by a customer, exhibits long memory. We use a sample of 4502 postpaid users from a Polish mobile operator and study their two-year billing history. We estimate the Hurst exponent with nine estimators: the aggregated variance method, differencing the variance, absolute values of the aggregated series, Higuchi’s method, residuals of regression, the R/S method, the periodogram method, the modified periodogram method and the Whittle estimator. We also analyze empirically the relations between the estimators. Long memory implies an inertial effect in clients’ behavior which may be used by mobile operators to accelerate usage and gain additional profit.
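Of the nine estimators listed, the aggregated variance method is the simplest to sketch. The version below is a minimal illustration, assuming the standard scaling Var(block mean) ∝ m^(2H − 2); the block sizes and the synthetic noise series are hypothetical and unrelated to the paper's billing data.

```python
import math
import random
import statistics

def hurst_aggregated_variance(series, block_sizes=(4, 8, 16, 32, 64)):
    """Estimate H from the slope of log Var(block means) against log block size.

    For a long-memory series the variance of block means scales as m ** (2H - 2).
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        means = [statistics.fmean(series[i * m:(i + 1) * m]) for i in range(n_blocks)]
        log_m.append(math.log(m))
        log_var.append(math.log(statistics.pvariance(means)))
    # Ordinary least-squares slope of log_var on log_m.
    mean_x = statistics.fmean(log_m)
    mean_y = statistics.fmean(log_var)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_m, log_var))
             / sum((x - mean_x) ** 2 for x in log_m))
    return 1.0 + slope / 2.0

# Hypothetical stand-in for a daily call-count series: uncorrelated noise,
# for which the estimate should be close to H = 0.5 (no long memory).
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(8192)]
print(round(hurst_aggregated_variance(noise), 2))
```

An estimate clearly above 0.5 on real usage data would be consistent with the long memory the abstract reports.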

12.
Unsupervised domain adaptation is a challenging task in person re-identification (re-ID). Recently, cluster-based methods have achieved good performance; clustering and training are the two important phases in these methods. For clustering, one major issue of existing methods is that they do not fully exploit the information in outliers, either discarding them or simply merging them into clusters. For training, existing methods only use source features for pretraining and target features for fine-tuning, and do not make full use of all the valuable information in the source and target datasets. To solve these problems, we propose a Threshold-based Hierarchical clustering method with Contrastive loss (THC). THC has two features: (1) it regards outliers as single-sample clusters that participate in training, which preserves the information in outliers without setting a cluster number and combines the advantages of existing clustering methods; (2) it uses contrastive loss to make full use of all valuable information, including source-class centroids, target-cluster centroids and single-sample clusters, thus achieving better performance. We conduct extensive experiments on Market-1501, DukeMTMC-reID and MSMT17. Results show that our method achieves state-of-the-art performance.

13.
The first objective data showing the geographical locations of people in Fukushima after the Fukushima Dai-ichi nuclear power plant accident, obtained by an analysis of GPS (Global Positioning System)-enabled mobile phone logs, are presented. The method of estimation is explained, and the flow of people into and out of the 20 km evacuation zone during the accident is visualized.

14.
Recent interest in human dynamics has stimulated the investigation of the stochastic processes that explain human behaviour in various contexts, such as mobile phone networks and social media. In this paper, we extend the stochastic urn-based model proposed in [T. Fenner, M. Levene, G. Loizou, J. Stat. Mech. 2015, P08015 (2015)] so that it can generate mixture models, in particular, a mixture of exponential distributions. The model is designed to capture the dynamics of survival analysis, traditionally employed in clinical trials, reliability analysis in engineering, and more recently in the analysis of large data sets recording human dynamics. The mixture modelling approach, which is relatively simple and well understood, is very effective in capturing heterogeneity in data. We provide empirical evidence for the validity of the model, using a data set of popular search engine queries collected over a period of 114 months. We show that the survival function of these queries is closely matched by the exponential mixture solution for our model.
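The survival function of a mixture of exponentials has the closed form S(t) = Σ w_i exp(−λ_i t). The sketch below evaluates it for a two-component mixture; the weights and rates are hypothetical and are not taken from the paper's fitted model.

```python
import math

def mixture_survival(t, weights, rates):
    """S(t) = sum_i w_i * exp(-rate_i * t) for a mixture of exponentials."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * math.exp(-r * t) for w, r in zip(weights, rates))

# Hypothetical two-component mixture: short-lived and long-lived queries.
weights = [0.7, 0.3]
rates = [0.5, 0.02]  # per month
for month in (0, 12, 114):
    print(month, round(mixture_survival(month, weights, rates), 4))
```

The fast component dominates the early decay and the slow one the tail, which is how a mixture captures heterogeneity that a single exponential cannot.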

15.
A clustering method has been developed to group signals that display similar dynamic behavior. The procedure involves using the method of time delay embedding to construct a trajectory in state space from a time series. Certain features that characterize the geometry of the trajectory have been defined. These features were subjected to a series of statistical tests to determine their usefulness in a hierarchical clustering analysis. The latter is aimed at finding groups of similar trajectories. The trajectory-based clustering algorithm has been applied to simulated data, which included both stochastic data generated by a linear AR model, and nonlinear data generated by a Duffing oscillator. The results show that the algorithm works reliably in both cases.
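Time delay embedding, the first step of the procedure described above, can be sketched in a few lines. The function name and the toy series are hypothetical; the abstract does not give the embedding dimension or delay the authors used.

```python
def delay_embed(series, dim, delay):
    """Return the delay vectors (x[t], x[t + delay], ..., x[t + (dim - 1) * delay])."""
    span = (dim - 1) * delay
    return [tuple(series[t + k * delay] for k in range(dim))
            for t in range(len(series) - span)]

trajectory = delay_embed([0, 1, 2, 3, 4, 5], dim=3, delay=2)
print(trajectory)  # [(0, 2, 4), (1, 3, 5)]
```

Each tuple is one point of the reconstructed state-space trajectory, from which geometric features could then be computed.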

16.
We study the well-known sociological phenomenon of gang aggregation and territory formation through an interacting agent system defined on a lattice. We introduce a two-gang Hamiltonian model where agents have red or blue affiliation but are otherwise indistinguishable. In this model, all interactions are indirect and occur only via graffiti markings, on-site as well as on nearest neighbor locations. We also allow for gang proliferation and graffiti suppression. Within the context of this model, we show that gang clustering and territory formation may arise under specific parameter choices and that a phase transition may occur between well-mixed, possibly dilute configurations and well separated, clustered ones. Using methods from statistical mechanics, we study the phase transition between these two qualitatively different scenarios. In the mean-field rendition of this model, we identify parameter regimes where the transition is first or second order. In all cases, we have found that the transitions are a consequence solely of the gang to graffiti couplings, implying that direct gang to gang interactions are not strictly necessary for gang territory formation; in particular, graffiti may be the sole driving force behind gang clustering. We further discuss possible sociological, as well as ecological, ramifications of our results.

17.
Trust region methods, which originated from the Levenberg–Marquardt (LM) algorithm, are considered for mixed effect model estimation in the context of second-level functional magnetic resonance imaging (fMRI) data analysis. We first present the mathematical and optimization details of the method for mixed effect model analysis, then compare the proposed methods with the conventional expectation-maximization (EM) algorithm on a series of datasets (synthetic and real human fMRI datasets). From simulation studies, we found that a higher damping factor for the LM algorithm works better than a lower one for fMRI data analysis. More importantly, in most cases the expectation trust region algorithm is superior to the EM algorithm in terms of accuracy if the random effect variance is large. We also compare these algorithms on real human datasets comprising repeated measures of fMRI in phase-encoded and random block experiment designs. We observed that the proposed method is faster in computation and robust to Gaussian noise for fMRI analysis. The advantages and limitations of the suggested methods are discussed.

18.
Construction of graph-based approximations for multi-dimensional data point clouds is widely used in a variety of areas. Notable examples of applications of such approximators are cellular trajectory inference in single-cell data analysis, analysis of clinical trajectories from synchronic datasets, and skeletonization of images. Several methods have been proposed to construct such approximating graphs, with some based on computation of minimum spanning trees and some based on principal graphs generalizing principal curves. In this article we propose a methodology to compare and benchmark these two graph-based data approximation approaches, as well as to define their hyperparameters. The main idea is to avoid comparing graphs directly: first, a clustering of the data point cloud is induced from the graph approximation; second, well-established methods are used to compare and score the data cloud partitionings induced by the graphs. In particular, mutual information-based approaches prove to be useful in this context. The induced clustering is based on decomposing a graph into non-branching segments, and then clustering the data point cloud by the nearest segment. Such a method allows efficient comparison of graph-based data approximations of arbitrary topology and complexity. The method is implemented in Python using the standard scikit-learn library, which provides high speed and efficiency. As a demonstration of the methodology we analyse and compare graph-based data approximation methods using synthetic as well as real-life single-cell datasets.
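The scoring step, comparing two partitions of the same point cloud through mutual information, can be sketched without any external library. The abstract says the authors rely on scikit-learn; the plain-Python function below is only an illustrative stand-in, and the segment labels are hypothetical.

```python
import math
from collections import Counter

def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two partitions of the same points."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    return sum((n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
               for (a, b), n_ab in joint.items())

# Hypothetical segment labels induced by different graph approximations.
graph_1 = [0, 0, 1, 1]
graph_2 = [5, 5, 7, 7]  # same partition, different label names
graph_3 = [0, 1, 0, 1]  # unrelated partition
print(mutual_information(graph_1, graph_2))  # ln 2: the partitions agree
print(mutual_information(graph_1, graph_3))  # 0.0: the partitions are independent
```

Because the score depends only on how points are grouped, not on label names or graph topology, it suits comparisons between graphs of different shape.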

19.
Raman spectroscopy has the potential to significantly aid in the research and diagnosis of cancer. The information-dense, complex spectra generate massive datasets in which subtle correlations may provide critical clues for biological analysis and pathological classification. Therefore, implementing advanced data mining techniques is imperative for complete, rapid and accurate spectral processing. Numerous recent studies have applied various data mining methods to Raman spectra for classification and biochemical analysis. However, because Raman datasets from biological specimens are often characterized by high dimensionality and low sample numbers, many of these classification models are subject to overfitting. Furthermore, attempts to reduce dimensionality result in transformed feature spaces, making the biological evaluation of significant and discriminative spectral features problematic. We have developed a novel data mining framework optimized for Raman datasets, called Fisher‐based Feature Selection Support Vector Machines (FFS‐SVM). This framework provides simultaneous supervised classification and user‐defined Fisher criterion‐based feature selection, reducing overfitting and directly yielding significant wavenumbers from the original feature space. Herein, we investigate five cancerous and non‐cancerous breast cell lines using Raman microspectroscopy and our FFS‐SVM framework. The framework's classification performance is then compared to several other frequently employed classification methods on four classification tasks. The four tasks were constructed by an unsupervised clustering method yielding the four different categories of cell line groupings (e.g. cancer vs non‐cancer) studied. FFS‐SVM achieves both high classification accuracies and the extraction of biologically significant features. The top ten most discriminative features are discussed in terms of cell‐type specific biological relevance.
Our framework provides comprehensive cellular level characterization and could potentially lead to the discovery of cancer biomarker‐type information, which we have informally termed ‘Raman‐based spectral biomarkers’. The FFS‐SVM framework along with Raman spectroscopy will be used in future studies to investigate in‐situ dynamic biological phenomena. Copyright © 2013 John Wiley & Sons, Ltd.
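Fisher criterion-based feature selection ranks each feature by how well it separates two classes relative to their spread. The sketch below uses the common two-class form (mean gap squared over summed variances); it covers only the ranking step, not the SVM, it is not the authors' FFS-SVM implementation, and the spectra are hypothetical.

```python
import statistics

def fisher_score(values_a, values_b):
    """Two-class Fisher criterion for one feature:
    (mean_a - mean_b) ** 2 / (var_a + var_b)."""
    gap = statistics.fmean(values_a) - statistics.fmean(values_b)
    spread = statistics.pvariance(values_a) + statistics.pvariance(values_b)
    return gap ** 2 / spread

def rank_features(class_a, class_b):
    """Rank feature indices by Fisher score, highest first.
    class_a and class_b are lists of spectra (one list of intensities each)."""
    n_features = len(class_a[0])
    scores = [fisher_score([s[j] for s in class_a], [s[j] for s in class_b])
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)

# Hypothetical intensities at two wavenumbers for two cell lines.
cancer = [[1.0, 5.0], [2.0, 5.2], [3.0, 4.9]]
normal = [[7.0, 5.1], [8.0, 5.0], [9.0, 5.3]]
print(rank_features(cancer, normal))  # [0, 1]: feature 0 separates the classes
```

Because the ranking is computed on the original features, the selected indices map directly back to wavenumbers, which is the interpretability benefit the abstract emphasizes.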

20.
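Different varieties of a food differ in nutritional composition and efficacy, and their Fourier transform infrared (FTIR) spectra differ accordingly. To classify varieties accurately, a variety identification method is designed that combines FTIR spectroscopy with fuzzy clustering analysis. Building on the fuzzy Kohonen clustering network (FKCN), fuzzy K-harmonic means clustering (FKHM) is introduced into the learning rate and update strategy of the Kohonen clustering network, yielding the fuzzy K-Harmonic-Kohonen network (FKHKCN) algorithm. FKHKCN computes its learning rate from the fuzzy memberships of fuzzy C-means (FCM) clustering and derives its cluster centers from those of FKHM, which addresses the instability of FKCN clustering results caused by sensitivity to the initial cluster centers. As a fuzzy clustering algorithm, FKHKCN can be used for cluster analysis of FTIR spectral data. Three datasets are used: (1) three kinds of tea from Sichuan (high-quality and low-quality Leshan Zhuyeqing, and Emeishan Maofeng), 96 samples in total; (2) coffee samples of two varieties (robusta and arabica); (3) meat samples of three types (chicken, pork and turkey). The three spectral datasets are first preprocessed: multiplicative scatter correction is used to reduce scattering effects in the raw spectra of the tea samples, and Savitzky-Gol...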
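The FCM fuzzy membership that FKHKCN is said to use for its learning rate has a standard closed form, u_i = 1 / Σ_k (d_i / d_k)^(2/(m−1)). The sketch below computes it for one sample; it illustrates standard FCM only, not the FKHKCN update itself, and the distances are hypothetical.

```python
def fcm_memberships(distances, m=2.0):
    """Fuzzy C-means membership of one sample in each cluster, given its
    distances to the cluster centers: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1))."""
    if any(d == 0.0 for d in distances):
        # Sample coincides with a center: membership is shared among those centers.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in distances)
            for d_i in distances]

print([round(u, 3) for u in fcm_memberships([1.0, 1.0])])  # [0.5, 0.5]
print([round(u, 3) for u in fcm_memberships([1.0, 3.0])])  # [0.9, 0.1]
```

Memberships always sum to 1, and a larger fuzzifier `m` makes them more evenly spread across clusters.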

