Found 20 similar articles; search took 15 ms
1.
Chul Moon, Noah Giansiracusa, Nicole A. Lazar 《Journal of computational and graphical statistics》2018,27(3):576-586
Topological data analysis (TDA) is a rapidly developing collection of methods for studying the shape of point cloud and other data types. One popular approach, designed to be robust to noise and outliers, is to first use a smoothing function to convert the point cloud into a manifold and then apply persistent homology to a Morse filtration. A significant challenge is that this smoothing process involves the choice of a parameter and persistent homology is highly sensitive to that choice; moreover, important scale information is lost. We propose a novel topological summary plot, called a persistence terrace, that incorporates a wide range of smoothing parameters and is robust, multi-scale, and parameter-free. This plot allows one to isolate distinct topological signals that may have merged for any fixed value of the smoothing parameter, and it also allows one to infer the size and point density of the topological features. We illustrate our method in some simple settings where noise is a serious issue for existing frameworks and then we apply it to a real dataset by counting muscle fibers in a cross-sectional image. Supplementary material for this article is available online.
2.
3.
Yang Yimin (杨益民) 《数学的实践与认识》 (Mathematics in Practice and Theory) 2005,35(3):203-208
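Using matrix eigenvalue methods, a general formula for the n-th self-iterate of a linear fractional function is derived. It is proved that for every natural number n ≥ 3 there exist infinitely many linear fractional functions with self-iteration period n, and a fairly complete characterization of the self-iteration periods and attractors of linear fractional functions is given.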
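The link between linear fractional functions and 2x2 matrices that this abstract relies on can be illustrated with a small sketch. It assumes the standard correspondence f(x) = (a*x + b) / (c*x + d) with the matrix ((a, b), (c, d)), under which the n-th self-iterate corresponds to the n-th matrix power. The example function f(x) = 1 / (1 - x) and the helper names `mat_mul`, `mat_pow`, and `apply_mobius` are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def mat_mul(p, q):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mat_pow(m, n):
    """n-th power of a 2x2 matrix (n >= 1) by repeated multiplication."""
    result = m
    for _ in range(n - 1):
        result = mat_mul(result, m)
    return result


def apply_mobius(m, x):
    """Evaluate f(x) = (a*x + b) / (c*x + d) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return (a * x + b) / (c * x + d)


# Illustrative choice (not from the paper): f(x) = 1 / (1 - x).
M = ((0, 1), (-1, 1))

# The third power is -I, a scalar matrix, so it represents the identity map.
M3 = mat_pow(M, 3)

x0 = Fraction(5, 7)
x1 = apply_mobius(M, x0)   # one application of f
x3 = apply_mobius(M3, x0)  # three applications of f, via the matrix power
```

Here the characteristic polynomial of `M` is λ² − λ + 1, so the two eigenvalues have a ratio that is a primitive cube root of unity, which is consistent with the map returning every point to itself after exactly three steps.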
4.
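Against the backdrop of massive credit-reporting data, a shrinkage nearest-neighbor imputation method is proposed to reduce the computational cost of imputing missing data. The method completes the imputation in three stages. In the first stage, inclusion probabilities are computed from the missing proportions of samples and variables, and the data are shrunk by unequal-probability sampling. In the second stage, samples that are close to each incomplete sample, as measured by inter-sample distance, are selected to form a training set. In the third stage, a random forest model is built for iterative imputation. A simulation study using the Australian dataset and datasets from Chinese banks shows that the shrinkage nearest-neighbor method substantially reduces computation while maintaining a given level of imputation accuracy.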
5.
Change point hazard rate models arise in many lifetime data analyses, for example, in studying times until undesirable side effects occur in clinical trials. In this paper we propose a general class of change point hazard models for survival data. This class includes and extends different types of change point models for survival data, e.g., the cure rate model and the lag model. Most classical approaches develop estimates of model parameters, with particular interest in the change point parameter and often the whole hazard function, but exclusively in terms of asymptotic properties. We propose a Bayesian approach that avoids asymptotics and provides inference conditional upon the observed data. The proposed Bayesian models are fitted using Markov chain Monte Carlo methods. We illustrate the proposed methodology with an application to modeling the lifetimes of printed circuit boards.
6.
A Strong Representation of the Product-Limit Estimator for Left Truncated and Right Censored Data (total citations: 1; self-citations: 0; citations by others: 1)
In this paper we consider the TJW product-limit estimator Fn(x) of an unknown distribution function F when the data are subject to random left truncation and right censorship. An almost sure representation of the PL-estimator Fn(x) is derived with an improved error bound under some weaker assumptions. We obtain the strong approximation of Fn(x) − F(x) by Gaussian processes, and the functional law of the iterated logarithm is proved for the maximal deviation of the product-limit estimator from F. A sharp rate of convergence theorem concerning the smoothed TJW product-limit estimator is obtained. Asymptotic properties of kernel estimators of the density function based on the TJW product-limit estimator are given.
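For orientation, a product-limit estimate under left truncation and right censoring can be computed as in the minimal sketch below. It assumes the usual risk-set form of the estimator, in which a subject is at risk at time z if its truncation time is at most z and its observed time is at least z, and it assumes there are no tied observed times. The function name `tjw_product_limit` and the toy data are hypothetical and do not come from the paper.

```python
def tjw_product_limit(trunc, obs, event, x):
    """Product-limit estimate of F(x) from left-truncated, right-censored data.

    trunc[i] is the truncation time T_i, obs[i] the observed time Z_i
    (with T_i <= Z_i), and event[i] is 1 for an uncensored failure and
    0 for a censored observation. Assumes no tied observed times.
    """
    surv = 1.0
    for z_i, d_i in zip(obs, event):
        if d_i == 1 and z_i <= x:
            # Risk set at z_i: already entered (T_j <= z_i) and still
            # under observation (Z_j >= z_i). Subject i itself is included.
            at_risk = sum(1 for t_j, z_j in zip(trunc, obs) if t_j <= z_i <= z_j)
            surv *= 1.0 - 1.0 / at_risk
    return 1.0 - surv


# Toy data (hypothetical): three subjects, the second one censored.
trunc = [0.0, 1.0, 2.0]
obs = [3.0, 4.0, 5.0]
event = [1, 0, 1]
f_at_3 = tjw_product_limit(trunc, obs, event, 3.0)
```

With no truncation and no censoring the same function reduces to the empirical distribution function, which is a convenient sanity check.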
7.
《Journal of computational and graphical statistics》2013,22(4):925-945
The focus of this article is on fitting regression models and testing of general linear hypotheses for correlated data using quasi-likelihood based techniques. The class of generalized method of moments or GMMs provides an elegant approach for estimating a vector of regression parameters from a set of score functions. Extending the principle of the GMMs, in the generalized estimating equation framework, leads to a quadratic inference function or QIF approach for the analysis of correlated data. We derive an iteratively reweighted generalized least squares or IRGLS algorithm for finding the QIF estimator and establish its convergence properties. A software library implementing the techniques is demonstrated through several datasets.
8.
Facundo Mémoli 《Applied and Computational Harmonic Analysis》2011,30(3):363-401
We introduce a spectral notion of distance between objects and study its theoretical properties. Our distance satisfies the properties of a metric on the class of isometric shapes, which means, in particular, that two shapes are at 0 distance if and only if they are isometric when endowed with geodesic distances. Our construction is similar to the Gromov–Wasserstein distance, but rather than viewing shapes merely as metric spaces, we define our distance via the comparison of heat kernels. This allows us to establish precise relationships of our distance to previously proposed spectral invariants used for data analysis and shape comparison, such as the spectrum of the Laplace–Beltrami operator, the diagonal of the heat kernel, and certain constructions based on diffusion distances. In addition, the heat kernel encodes a natural notion of scale, which is useful for multi-scale shape comparison. We prove a hierarchy of lower bounds for our distance, which provide increasing discriminative power at the cost of an increase in computational complexity. We also explore the definition of other spectral metrics on collections of shapes and study their theoretical properties.
9.
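Linear data fitting by least squares can suffer from large errors under certain conditions. To let linear fitting determine the relationship between quantities more accurately in scientific experiments and engineering practice, this paper analyzes the error of the commonly used linear fitting method, least squares, and on that basis proposes a minimum sum of squared distances method as an improvement. Finally, worked examples are used to compare the strengths and weaknesses of the two linear fitting methods, and reasonable conditions for applying each are given.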
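The contrast this abstract draws can be sketched as follows. The abstract does not define its distance, so the sketch assumes it means the perpendicular distance from each point to the fitted line (orthogonal regression), which may differ from the paper's actual method. Ordinary least squares minimizes vertical squared residuals instead. The function name `fit_lines` and the sample points are hypothetical.

```python
import math


def fit_lines(xs, ys):
    """Fit y = intercept + slope * x in two ways.

    Returns (ols_slope, ols_intercept, orth_slope, orth_intercept).
    Requires a non-zero sample covariance between xs and ys.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    # Ordinary least squares: minimizes the sum of squared vertical residuals.
    ols_slope = sxy / sxx

    # Orthogonal regression (assumed reading of the abstract): minimizes
    # the sum of squared perpendicular distances to the line.
    orth_slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

    # Both fitted lines pass through the centroid (mx, my).
    return ols_slope, my - ols_slope * mx, orth_slope, my - orth_slope * mx


# Hypothetical noisy sample: the two criteria give different lines.
ols_b, ols_a, orth_b, orth_a = fit_lines([0, 1, 2, 3], [0, 2, 1, 3])
```

For points that lie exactly on a line the two fits coincide; they diverge as scatter grows, because least squares attributes all error to the y direction.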
10.
Research is an incremental, iterative process, with new results relying and building upon previous ones. Scientists need to find, retrieve, understand, and verify results to confidently extend them, even when the results are their own. We present the trackr framework for organizing, automatically annotating, discovering, and retrieving results. We identify sources of automatically extractable metadata for computational results, and we define an extensible system for organizing, annotating, and searching for results based on these and other metadata. We present an open-source implementation of these concepts for plots, computational artifacts, and woven dynamic reports generated in the R statistical computing language. Supplementary materials for this article are available online.
11.
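Applying optimization theory, this paper proposes a pattern search method for quantile regression models for panel data with fixed effects. The method simultaneously estimates the coefficients of the explanatory variables at all quantiles and the fixed effects of all individuals. Monte Carlo simulations are then used to compare it with the panel data quantile regression methods in the existing literature. The results show that, whether or not the error term satisfies the classical assumptions, pattern search quantile regression is more efficient than the other quantile regression estimators.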
12.
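Motivated by Hille's lemma, a smooth estimator of the mean residual life function with Poisson weights is proposed for length-biased data. Asymptotic properties of the smooth estimator, such as strong consistency and asymptotic normality, are also studied.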
13.
14.
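A new algorithm for maximum likelihood estimation of the exponentiated Weibull distribution based on censored data (total citations: 1; self-citations: 0; citations by others: 1)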
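This paper discusses maximum likelihood estimation of the parameters of the exponentiated Weibull distribution when the observed data are censored. Because censored data are a form of incomplete data, the EM algorithm is used to compute approximate maximum likelihood estimates of the parameters. The EM algorithm is computationally complex, however, and its efficiency is unsatisfactory. To overcome the limitations of the Newton-Raphson and EM algorithms, a new method is proposed. It combines a transformation from the exponentiated Weibull distribution to the exponential distribution with the technique of equivalent lifetime data, and it is easier to apply than either the Newton-Raphson or the EM algorithm. A simulation study examines the feasibility of the method, and an analysis of real lifetime data is provided as an illustration.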
15.
16.
Examples of scientific problems and data analyses are presented for the fields of demography, neurophysiology, and seismology. The examples are connected by the involvement of space or time. The demographic problem is to display quantities derived from spatially aggregated data and associated measures of uncertainty. The neurophysiological problem is to infer the presence of complex pathways among groups of neurons given sequences of firing times. There are two seismological problems: (1) to determine isoseismals of recorded intensities following the Loma Prieta earthquake and (2) to relate intensity and acceleration values measured at distinct locations. The statistical analyses are connected to each other by the application of smoothing in some form and by the provision of consequent graphical displays.
17.
Xin M. Tu 《Journal of computational and graphical statistics》2013,22(1):97-112
Survival data with censored initiating and terminating times have surfaced in some recent epidemiologic studies. Unlike standard survival analysis with known initiating times, analysis of data with both censored initiating and terminating times requires maximization of a complicated bivariate likelihood, which is often difficult to carry out. This article considers a missing-data formulation of the problem and focuses on the use of EM-type algorithms to simplify the computation of maximum likelihood estimates. This approach provides a feasible way of performing regression analysis with such bivariate survival data. Several illustrative examples are provided, including a real-data analysis application involving a cohort of HIV-infected hemophiliac patients.
18.
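This paper uses the counting process approach developed in [1] to study the asymptotic normality of kernel estimators of the hazard rate function under truncated samples. More general results are obtained under conditions weaker than those in [7] and [10]. The approach is then applied to kernel density estimation, and the asymptotic normality of kernel density estimators under truncated samples is obtained under weaker conditions.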
19.