Similar documents
 20 similar documents found (search time: 15 ms)
1.
We consider the problem of estimating the marginals in the case where there is knowledge of the copula. If the copula is smooth, it is known that it is possible to improve on the empirical distribution functions: optimal estimators still have a rate of convergence n^{-1/2}, but a smaller asymptotic variance. In this paper we show that for non-smooth copulas it is sometimes possible to construct superefficient estimators of the marginals: we construct both a copula and, exploiting the information our copula provides, estimators of the marginals with the rate of convergence log n/n.

2.
Parallel to Cox's [JRSS B 34 (1972) 187-230] proportional hazards model, generalized logistic models have been discussed by Anderson [Bull. Int. Statist. Inst. 48 (1979) 35-53] and others. The essential assumption is that the ratio of the two densities has a known parametric form. A nice property of this model is that it naturally relates to the logistic regression model for categorical data. In astronomical, demographic, epidemiological, and other studies the variable of interest is often truncated by an associated variable. This paper studies generalized logistic models for the two-sample truncated data problem, where the ratio of the two lifetime densities is assumed to have the form exp{α+φ(x;β)}. Here φ is a known function of x and β, and the baseline density is unspecified. We develop a semiparametric maximum likelihood method for the case where the two samples have a common truncation distribution. It is shown that inferences for β do not depend on the nonparametric components. We also derive an iterative algorithm to maximize the semiparametric likelihood in the general case where different truncation distributions are allowed. We further discuss how to check the goodness of fit of the generalized logistic model. The developed methods are illustrated and evaluated using both simulated and real data.

3.
Nonparametric quantile regression with multivariate covariates is a difficult estimation problem due to the “curse of dimensionality”. To reduce the dimensionality while still retaining the flexibility of a nonparametric model, we propose modeling the conditional quantile by a single-index function g0(X⊤γ0), where a univariate link function g0(⋅) is applied to a linear combination of covariates X⊤γ0, often called the single-index. We introduce a practical algorithm where the unknown link function g0(⋅) is estimated by local linear quantile regression and the parametric index is estimated through linear quantile regression. Large sample properties of the estimators are studied, which facilitates further inference. Both the modeling and estimation approaches are demonstrated by simulation studies and real data applications.
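Both estimation steps in the algorithm above minimize the quantile check (pinball) loss. A minimal pure-Python sketch of that loss, with illustrative function names not taken from the paper, showing that the τ-th sample quantile minimizes the summed check loss over the data:

```python
def check_loss(u, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def empirical_quantile(ys, tau):
    """The tau-th sample quantile minimizes the summed check loss;
    searching over the data points themselves is enough here."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))
```

For tau = 0.5 the minimizer is the sample median; the paper's local linear step applies the same loss with kernel weights.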

4.
We consider the estimation of the regression operator r in the functional model Y=r(x)+ε, where the explanatory variable x is of functional fixed-design type, the response Y is a real random variable and the error process ε is a second-order stationary process. We construct a kernel-type estimate of r from functional data curves and correlated errors, and study its performance in terms of mean square convergence and convergence in probability. In particular, we consider the cases of short and long range error processes. When the errors are negatively correlated or come from a short memory process, the asymptotic normality of this estimate is derived. Finally, some simulation studies are conducted for fractional autoregressive integrated moving average and Ornstein-Uhlenbeck error processes.

5.
We consider the problem of setting bootstrap confidence regions for multivariate parameters based on data depth functions. We prove, under mild regularity conditions, that depth-based bootstrap confidence regions are second-order accurate in the sense that their coverage error is of order n^{-1}, given a random sample of size n. The results hold in general for depth functions of types A and D, which cover as special cases the Tukey depth, the majority depth, and the simplicial depth. A simulation study is also provided to investigate empirically the bootstrap confidence regions constructed using these three depth functions.
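The Tukey depth mentioned above is easiest to see in one dimension, where the halfspace depth of a point is the smaller fraction of sample points on either side of it. A minimal sketch (the paper works with multivariate depth; the one-dimensional case and the function name are illustrative only):

```python
def halfspace_depth_1d(x, sample):
    """One-dimensional Tukey (halfspace) depth: the smaller fraction of
    sample points lying in either closed half-line at x. Points near the
    median get depth close to 1/2; points outside the range get depth 0."""
    n = len(sample)
    below = sum(1 for y in sample if y <= x)
    above = sum(1 for y in sample if y >= x)
    return min(below, above) / n
```

A depth-based confidence region then collects the points whose depth with respect to the bootstrap sample exceeds a cutoff.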

6.
Support vector machines (SVMs) have attracted much attention in theoretical and applied statistics. The main topics of recent interest are consistency, learning rates and robustness. We address the open problem of whether SVMs are qualitatively robust. Our results show that SVMs are qualitatively robust for any fixed regularization parameter λ. However, under extremely mild conditions on the SVM, it turns out that SVMs are no longer qualitatively robust for any null sequence λn, which is the classical choice needed to obtain universal consistency. This lack of qualitative robustness is of a rather theoretical nature, because we show that, in any case, SVMs fulfill a finite sample qualitative robustness property. For a fixed regularization parameter, SVMs can be represented by a functional on the set of all probability measures. Qualitative robustness is proven by showing that this functional is continuous with respect to the topology generated by weak convergence of probability measures. Combined with the existence and uniqueness of SVMs, our results show that SVMs are the solutions of a well-posed mathematical problem in Hadamard’s sense.

7.
Homogeneity tests based on several progressively Type-II censored samples   (Cited by: 2; self-citations: 0; citations by others: 2)
In this paper, we discuss the problem of testing the homogeneity of several populations when the available data are progressively Type-II censored. Defining a univariate counting process for each sample, we can adapt to this problem all the methods developed during the last two decades (see e.g. [P.K. Andersen, Ø. Borgan, R. Gill, N. Keiding, Statistical Models Based on Counting Processes, Springer, New York, 1993]). An important aspect of these tests is that they are based on either linear or non-linear functionals of a discrepancy process (DP) that compares the cumulative hazard rate (chr) estimated from each sample with the chr estimated from the whole sample (viz., the aggregation of all the samples), leading to either linear or non-linear tests. Both kinds of tests suffer from some serious drawbacks. For example, it is difficult to extend non-linear tests to the K-sample situation when K ≥ 3. For this reason, we propose here a new class of non-linear tests, based on a chi-square type functional of the DP, that can be applied to the K-sample problem for any K ≥ 2.
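The cumulative hazard rate that the discrepancy process compares is typically estimated by a Nelson-Aalen-type sum. The sketch below shows the simplest uncensored version only, as an assumption-laden illustration; the paper's estimator accounts for progressive Type-II censoring, which changes the risk sets:

```python
def nelson_aalen(event_times, t):
    """Nelson-Aalen estimate of the cumulative hazard at time t from
    uncensored, distinct event times: add 1/(number still at risk) for
    each event occurring at or before t. (Illustrative only; progressive
    Type-II censoring modifies the at-risk counts.)"""
    ts = sorted(event_times)
    n = len(ts)
    total = 0.0
    for i, ti in enumerate(ts):
        if ti <= t:
            total += 1.0 / (n - i)  # n - i subjects remain at risk
    return total
```

A homogeneity test then contrasts each sample's estimate with the estimate pooled over all samples.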

8.
An exhaustive search, as required by traditional variable selection methods, is impractical in high dimensional statistical modeling. Thus, to conduct variable selection, various forms of penalized estimators with good statistical and computational properties have been proposed during the past two decades. The attractive properties of these shrinkage and selection estimators, however, depend critically on the amount of regularization, which controls model complexity. In this paper, we consider the problem of consistent tuning parameter selection in high dimensional sparse linear regression where the dimension of the predictor vector is larger than the sample size. First, we propose a family of high dimensional Bayesian Information Criteria (HBIC), and then investigate their selection consistency, extending the results on the extended Bayesian Information Criterion (EBIC) in Chen and Chen (2008) to ultra-high dimensional situations. Second, we develop a two-step procedure, SIS+AENET, to conduct variable selection in p>n situations. The consistency of tuning parameter selection is established under fairly mild technical conditions. Simulation studies are presented to confirm the theoretical findings, and an empirical example illustrates the use of the method on internet advertising data.
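BIC-type tuning selection of this kind scores each candidate model by its fit plus a complexity penalty that grows with log(p), then picks the minimizer. The sketch below uses one common score of that shape; the exact form of the paper's HBIC family, and the constant c_n, are assumptions here, not taken from the paper:

```python
import math

def hbic(rss, model_size, n, p, c_n=1.0):
    """A BIC-type score with a log(p) penalty (hypothetical form):
    n * log(rss / n) + c_n * model_size * log(p)."""
    return n * math.log(rss / n) + c_n * model_size * math.log(p)

def select_model(candidates, n, p, c_n=1.0):
    """candidates: iterable of (label, rss, model_size) triples;
    returns the label of the score-minimizing candidate."""
    return min(candidates, key=lambda c: hbic(c[1], c[2], n, p, c_n))[0]
```

With p much larger than n, the log(p) factor penalizes large models more heavily than the classical BIC would, which is what drives selection consistency in the ultra-high dimensional regime.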

9.
In this paper we consider the problem of estimating E[(Y − E[Y|X])²] based on a finite sample of independent, but not necessarily identically distributed, random variables. We analyze the theoretical properties of a recently developed estimator. It is shown that the estimator has many theoretically interesting properties, while its practical implementation is simple.

10.
In this paper we propose a new test for the multivariate two-sample problem. The test statistic is the difference between the sum of all the Euclidean interpoint distances between the random variables from the two different samples and one-half of the two corresponding sums of distances of the variables within the same sample. The asymptotic null distribution of the test statistic is derived using the projection method and shown to be the limit of the bootstrap distribution. A simulation study includes the comparison of univariate and multivariate normal distributions for location and dispersion alternatives. For normal location alternatives the new test is shown to have power similar to that of the t- and T²-tests.
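The statistic described above is directly computable. A minimal sketch under one reading of the description (the within-sample sums below run over all ordered pairs and are then halved; the paper's exact normalization may differ):

```python
import math

def euclid(u, v):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def interpoint_statistic(xs, ys):
    """Between-sample distance sum minus one-half of each within-sample
    distance sum. The statistic is zero for identical samples and grows
    as the two samples separate."""
    between = sum(euclid(x, y) for x in xs for y in ys)
    within_x = sum(euclid(a, b) for a in xs for b in xs) / 2.0
    within_y = sum(euclid(a, b) for a in ys for b in ys) / 2.0
    return between - within_x - within_y
```

In practice the null distribution would be approximated by the bootstrap, as the abstract indicates.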

11.
Consider the model Y=m(X)+ε, where m(⋅)=med(Y|⋅) is unknown but smooth. It is often assumed that ε and X are independent, but in practice this assumption is violated in many cases. In this paper we propose modeling the dependence between ε and X by means of a copula model, i.e. (ε,X)∼Cθ(Fε(⋅),FX(⋅)), where Cθ is a copula function depending on an unknown parameter θ, and Fε and FX are the marginals of ε and X. Since many parametric copula families contain the independence copula as a special case, the resulting regression model is more flexible than the ‘classical’ regression model. We estimate the parameter θ via a pseudo-likelihood method and prove the asymptotic normality of the estimator, based on delicate empirical process theory. We also study the estimation of the conditional distribution of Y given X. The procedure is illustrated by means of a simulation study, and the method is applied to data on food expenditures in households.

12.
In a range of practical problems the boundary of the support of a bivariate distribution is of interest, for example where it describes a limit to efficiency or performance, or where it determines the physical extremities of a spatially distributed population in forestry, marine science, medicine, meteorology or geology. We suggest a tracking-based method for estimating a support boundary when it is composed of a finite number of smooth curves, meeting together at corners. The smooth parts of the boundary are assumed to have continuously turning tangents and bounded curvature, and the corners are not allowed to be infinitely sharp; that is, the angle between the two tangents should not equal π. In other respects, however, the boundary may be quite general. In particular it need not be uniquely defined in Cartesian coordinates, its corners may be either concave or convex, and its smooth parts may be neither concave nor convex. Tracking methods are well suited to such generalities, and they also have the advantage of requiring relatively small amounts of computation. It is shown that they achieve optimal convergence rates, in the sense of uniform approximation.

13.
Testing for the independence between two categorical variables R and S forming a contingency table is a well-known problem: the classical chi-square and likelihood ratio tests are used. Suppose now that for each individual a set of p characteristics is also observed. Those explanatory variables, likely to be associated with R and S, can play a major role in their possible association, and it can therefore be interesting to test the independence between R and S conditionally on them. In this paper, we propose two nonparametric tests which generalise the chi-square and the likelihood ratio ideas to this case. The procedure is based on a kernel estimator of the conditional probabilities. The asymptotic law of the proposed test statistics under the conditional independence hypothesis is derived; the finite sample behaviour of the procedure is analysed through some Monte Carlo experiments and the approach is illustrated with a real data example.
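The unconditional baseline that these kernel-based tests generalise is the classical Pearson chi-square statistic for a contingency table, which is a few lines of code:

```python
def chi_square_stat(table):
    """Classical Pearson chi-square statistic for independence in an
    r-by-c contingency table: sum of (observed - expected)^2 / expected,
    with expected counts from the product of the margins."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat
```

The paper's conditional versions replace the raw cell proportions with kernel-smoothed conditional probabilities given the p covariates.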

14.
In this paper we aim to estimate the direction in general single-index models and to select important variables simultaneously when a diverging number of predictors is involved in the regression. To this end, we propose the nonconcave penalized inverse regression method. Specifically, the resulting estimator with the SCAD penalty enjoys an oracle property in semi-parametric models even when the dimension, p_n, of the predictors goes to infinity. Under regularity conditions we also establish asymptotic normality when the dimension of the predictor vector goes to infinity at the rate p_n = o(n^{1/3}), where n is the sample size, which enables us to construct confidence intervals/regions for the estimated index. The asymptotic results are augmented by simulations and illustrated by an analysis of an air pollution dataset.

15.
A minimum volume (MV) set, at level α, is a set having minimum volume among all those sets containing at least α probability mass. MV sets provide a natural notion of the ‘central mass’ of a distribution and, as such, have recently become popular as a tool for the detection of anomalies in multivariate data. Motivated by the fact that anomaly detection problems frequently arise in settings with temporally indexed measurements, we propose here a new method for the estimation of MV sets from dependent data. Our method is based on the concept of complexity-penalized estimation, extending recent work of Scott and Nowak for the case of independent and identically distributed measurements, and has both desirable theoretical properties and a practical implementation. Of particular note is the fact that, for a large class of stochastic processes, choice of an appropriate complexity penalty reduces to the selection of a single tuning parameter, which represents the data dependency of the underlying stochastic process. While in reality the dependence structure is unknown, we offer a data-dependent method for selecting this parameter, based on subsampling principles. Our work is motivated by and illustrated through an application to the detection of anomalous traffic levels in Internet traffic time series.

16.
Consider the nonparametric regression model Yni=g(xni)+εni for i=1,…,n, where g is unknown, the xni are fixed design points, and the εni are negatively associated random errors. A nonparametric estimator gn(x) of g(x) is introduced and its asymptotic properties are studied. In particular, the pointwise and uniform convergence of gn(x) and its asymptotic normality are investigated. This extends earlier work on independent random errors (e.g. see J. Multivariate Anal. 25(1) (1988) 100).
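A common estimator g_n(x) in this setting is a kernel-weighted average of the responses. The sketch below uses a Nadaraya-Watson-type smoother with a Gaussian kernel as one concrete choice; the paper's exact weight scheme for fixed designs may differ, and the function names are illustrative:

```python
import math

def kernel_regression(x, design, responses, bandwidth):
    """Kernel-weighted average estimate g_n(x): each response y_i is
    weighted by a Gaussian kernel of the scaled distance from x to its
    design point x_i, then normalized by the total weight."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in design]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no mass near x; return a neutral value
    return sum(w * y for w, y in zip(weights, responses)) / total
```

Negative association of the errors affects the variance of this average, not its form, which is why the estimator itself carries over from the independent-error case.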

17.
A general depth measure, based on the use of one-dimensional linear continuous projections, is proposed. The applicability of this idea in different statistical setups (including inference in functional data analysis, image analysis and classification) is discussed. Special emphasis is placed on the possible usefulness of this method in statistical problems where the data are elements of a Banach space. The asymptotic properties of the empirical approximation of the proposed depth measure are investigated. In particular, its asymptotic distribution is obtained through U-statistics techniques. The practical aspects of these ideas are discussed through a small simulation study and a real-data example.

18.
Recent advances in the transformation model have made it possible to use this model for analyzing a variety of censored survival data. For inference on the regression parameters, there are semiparametric procedures based on the normal approximation. However, the accuracy of such procedures can be quite low when the censoring rate is heavy. In this paper, we apply an empirical likelihood ratio method and derive its limiting distribution via U-statistics. We obtain confidence regions for the regression parameters and compare the proposed method with the normal approximation based method in terms of coverage probability. The simulation results demonstrate that the proposed empirical likelihood method overcomes the under-coverage problem substantially and outperforms the normal approximation based method. The proposed method is illustrated with a real data example. Finally, our method can be applied to general U-statistic type estimating equations.

19.
20.
Let (X,Y) be an R^d × N_0-valued random vector where the conditional distribution of Y given X=x is a Poisson distribution with mean m(x). We estimate m by a local polynomial kernel estimate defined by maximizing a localized log-likelihood function. We use this estimate of m(x) to estimate the conditional distribution of Y given X=x by a corresponding Poisson distribution and to construct confidence intervals of level α for Y given X=x. Under mild regularity conditions on m(x) and on the distribution of X, we show strong convergence of the integrated L1 distance between the Poisson distribution and its estimate. We also demonstrate that the corresponding confidence interval asymptotically (i.e., for sample size tending to infinity) has level α, and that the probability that the length of this confidence interval deviates from the optimal length by more than one converges to zero as the sample size tends to infinity.
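Two pieces of the construction above are easy to sketch: the localized Poisson MLE in its simplest, degree-0 form (which reduces to a kernel-weighted average of the responses; the paper uses higher-degree local polynomials) and a simple Poisson interval for Y. Both are illustrative sketches, not the paper's exact procedures:

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability mass function P(Y = k) for mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def local_constant_mean(x, xs, ys, bandwidth):
    """Degree-0 localized Poisson MLE of m(x): with a constant local
    model, maximizing the kernel-weighted log-likelihood gives the
    kernel-weighted average of the responses."""
    w = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
    s = sum(w)
    return sum(wi * yi for wi, yi in zip(w, ys)) / s if s > 0.0 else 0.0

def poisson_upper_interval(lam, alpha):
    """Smallest k with P(Y <= k) >= alpha under Poisson(lam); the
    interval [0, k] then covers Y with probability at least alpha
    (one simple construction; the paper builds near-optimal-length
    intervals)."""
    cum, k = 0.0, 0
    while cum + poisson_pmf(k, lam) < alpha:
        cum += poisson_pmf(k, lam)
        k += 1
    return k
```

Plugging the estimated mean m(x) into the interval construction yields a prediction interval for Y given X=x.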


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号