54 results found; search took 46 ms
31.
The aim of this study is to show the usefulness of robust multiple regression techniques implemented in the expectation maximization framework for successfully modeling data that contain missing elements and outlying objects. In particular, results from a comparative study of partial least squares and partial robust M-regression models implemented in the expectation maximization algorithm are presented. The performances of the proposed approaches are illustrated on simulated data with and without outliers, containing different percentages of missing elements, and on a real data set. The obtained results suggest that the proposed methodology can be used for constructing satisfactory regression models in terms of their trimmed root mean squared errors.
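The abstract above judges models by their trimmed root mean squared error. A minimal sketch of one common form of that metric follows; the function name `trimmed_rmse` and the trimming fraction are assumptions made for illustration, not details taken from the paper.

```python
import math


def trimmed_rmse(observed, predicted, trim=0.2):
    """RMSE computed after discarding the largest squared residuals.

    `trim` is the fraction of residuals dropped (an assumed convention);
    trim=0.0 gives the ordinary RMSE.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sorted((o - p) ** 2 for o, p in zip(observed, predicted))
    keep = len(squared) - int(trim * len(squared))  # number of residuals kept
    return math.sqrt(sum(squared[:keep]) / keep)
```

Because the largest residuals are dropped before averaging, a few gross outliers in the test set do not dominate the score, which is why a trimmed metric suits comparisons on contaminated data.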
32.
When drugs are poorly soluble, pH-spectrophotometric titration can be used instead of the potentiometric determination of dissociation constants, along with nonlinear regression of the absorbance response surface data. Generally, regression models are extremely useful for extracting the essential features from a multiwavelength set of data. Regression diagnostics represent procedures for examining the regression triplet (data, model, method) in order to check (a) the data quality for a proposed model; (b) the model quality for a given set of data; and (c) that all of the assumptions used for least squares hold. In the interactive, PC-assisted diagnosis of data, models and estimation methods, the examination of data quality involves the detection of influential points, outliers and high leverages, which cause many problems when fitting the absorbance response hyperplane by regression. All graphically oriented techniques are suitable for the rapid estimation of influential points. The reliability of the dissociation constants for the acid drug silybin may be proven with goodness-of-fit tests of the multiwavelength spectrophotometric pH-titration data. The uncertainty in the measurement of the pKa of a weak acid obtained by the least squares nonlinear regression analysis of absorption spectra is calculated. The procedure takes into account the drift in pH measurement, the drift in spectral measurement, and all of the drifts in analytical operations, as well as the relative importance of each source of uncertainty. The most important source of uncertainty in the experimental set-up for the example is the uncertainty in the pH measurement. The influences of various sources of uncertainty on the accuracy and precision are discussed using the example of the mixed dissociation constants of silybin, obtained using the SQUAD(84) and SPECFIT/32 regression programs.
33.
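Fast robust partial least squares regression and its application in near-infrared spectral analysis  Cited by: 4 (self-citations: 0, citations by others: 4)
Modern near-infrared (NIR) spectroscopy, as an indirect analytical technique, relies on building a calibration model to quantify unknown samples. To address the low sensitivity and poor interference resistance of NIR spectral analysis, a fast robust partial least squares regression (RRPLSR) algorithm is constructed. It uses a kurtosis-based method to identify outliers quickly, removes them, and then performs partial least squares regression to eliminate multicollinearity and build a robust, reliable quantitative calibration model. Applied to NIR spectral data of fish samples, the RRPLSR method quantifies fat content with good results. Compared with other existing methods, it identifies outliers accurately, yields models with good predictive performance, and is computationally fast and efficient, making it suitable for rapid detection.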
34.
35.
Covering points by disjoint boxes with outliers  Cited by: 1 (self-citations: 0, citations by others: 1)
36.
Jean-Louis Féménias 《Journal of Molecular Spectroscopy》2003,217(1):32-42
The goodness-of-fit problem is addressed, and two of the more efficient tests presently available are revisited and discussed: the autocorrelation method and the sign test recently proposed by Paolo Minguzzi [J. Mol. Spectrosc. 209, (2001) 169], both based on residual analysis. A mathematical proof of the empirical relations needed in the sign test is proposed; it allows more insight into the comparison of both methods and shows that more information may be available from the sign test. Some differences of efficiency are pointed out in special cases, but finally these methods prove to be closely related. A general procedure is deduced that takes advantage of the practical usefulness of both approaches, of their efficiency and of the statistical information each of them can provide.
37.
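For the multiple linear regression model, this paper defines the AP statistic and a distance influence function, decomposes each of them into a product of two terms, and points out the intrinsic connections among influential points, outliers, and high-leverage points. Methods for detecting these three kinds of points are discussed. Finally, a theorem related to the distance influence function is given.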
38.
A statistical procedure is called robust if it is insensitive to the occurrence of gross errors in the data. The ordinary least squares regression technique does not satisfy this property, because even a single outlier can totally offset the result. Therefore, the least trimmed squares (LTS) technique is introduced, which can resist the effect of a large percentage of outliers. The latter method is illustrated on data concerning life insurance, pension funds, health insurance, and inflation.
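To make the contrast between ordinary least squares and LTS concrete, here is a minimal sketch for a straight-line fit. It approximates the LTS estimate with random two-point starts followed by concentration steps (refit on the h points with the smallest squared residuals). That search strategy, the default h, and the names `ols_line` and `lts_line` are assumptions chosen for illustration; the abstract does not describe how the estimate is computed.

```python
import random


def ols_line(points):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:  # all x equal: fall back to a horizontal line
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b


def lts_line(points, h=None, starts=200, c_steps=10, seed=0):
    """Approximate least trimmed squares line; returns (a, b).

    Minimises the sum of the h smallest squared residuals, so up to
    n - h points can be arbitrary outliers without affecting the fit.
    """
    n = len(points)
    h = h or (n + 3) // 2  # assumed default: just over half of the data
    rng = random.Random(seed)
    best = None
    for _ in range(starts):
        subset = rng.sample(points, 2)  # random elemental start
        for _ in range(c_steps):
            a, b = ols_line(subset)
            # Concentration step: keep the h best-fitting points and refit.
            subset = sorted(points, key=lambda p: (p[1] - a - b * p[0]) ** 2)[:h]
        a, b = ols_line(subset)
        objective = sum(sorted((y - a - b * x) ** 2 for x, y in points)[:h])
        if best is None or objective < best[0]:
            best = (objective, a, b)
    return best[1], best[2]
```

On data that follow y = 2x + 1 except for a block of corrupted responses, `ols_line` is pulled toward the outliers while `lts_line` stays with the majority of the points, as long as the outliers number fewer than n - h.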
39.
PERT is a widely utilized framework for project management. However, as a result of underlying assumptions about the activity times, the PERT formulas prescribe a light-tailed distribution with a constant variance conditional on the range. Given the pervasiveness of heavy-tailed phenomena in business contexts as well as inherently differing levels of uncertainty about different activities, there is a need for a more flexible distribution which allows for varying amounts of dispersion and greater likelihoods of more extreme tail-area events. In particular, we argue that the tail-area decay of an activity time distribution is a key factor which has been insufficiently considered previously. We provide a distribution which permits varying amounts of dispersion and greater likelihoods of more extreme tail-area events that is straightforward to implement with expert judgments. Moreover, the distribution can be integrated into the PERT framework such that the classic PERT results represent an important special case of the method presented here.
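The "constant variance conditional on the range" can be seen directly from the classic three-point PERT formulas, sketched below. The function name `pert_mean_var` is an assumption for illustration; the formulas are the textbook ones, not the more flexible distribution the paper proposes.

```python
def pert_mean_var(optimistic, most_likely, pessimistic):
    """Classic PERT estimates of an activity time's mean and variance.

    mean     = (a + 4m + b) / 6
    variance = ((b - a) / 6) ** 2
    """
    a, m, b = optimistic, most_likely, pessimistic
    if not a <= m <= b:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    mean = (a + 4 * m + b) / 6
    variance = ((b - a) / 6) ** 2
    return mean, variance
```

The variance depends only on the range b - a, so two activities with the same optimistic and pessimistic times receive the same variance regardless of where the most likely value sits or how uncertain the expert actually is.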
40.
Efstathiou CE 《Talanta》2006,69(5):1068-1071
Common significance tests carried out using statistical software packages usually return to the user the probability p of type I error as the result. Based on p and the preset confidence level, the user will decide on the acceptance or the rejection of the associated null hypothesis. Dixon's test (Q-test) is commonly used for the detection of an outlier within a set of N observations (typically: N = 3–12). The Q-test can only be applied by comparing the experimental value of the statistic Q with tabulated critical Q-values corresponding to some standard values of p. Hence, for a given value of Q and a number of observations, N, the user knows only the range and not the value of the associated probability p of type I error (erroneous rejection). This is due to the lack of explicit expressions of the form p = F(Q,N). In this work, a simple stochastic (Monte Carlo) approach is presented for the estimation of p corresponding to a given experimental value of Q and size N of the data set. In addition, based on Dixon's equations, explicit expressions of p are given for N = 3 and 4.
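The Monte Carlo idea can be sketched as follows: draw many samples of size N from a normal population, compute Q for each, and report the fraction that reach or exceed the observed value. This is an illustration under stated assumptions, not the paper's procedure: it assumes a normal parent distribution, uses the two-sided form of Q (the suspect value may lie at either end), and the names `dixon_q` and `monte_carlo_p` are hypothetical.

```python
import random


def dixon_q(values):
    """Dixon's Q: gap between an extreme value and its nearest neighbour,
    divided by the range. The larger of the two end gaps is used
    (two-sided form, an assumption of this sketch)."""
    xs = sorted(values)
    spread = xs[-1] - xs[0]
    if spread == 0:
        return 0.0
    return max(xs[1] - xs[0], xs[-1] - xs[-2]) / spread


def monte_carlo_p(q_observed, n, trials=100_000, seed=0):
    """Estimate p = P(Q >= q_observed) for n normally distributed observations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if dixon_q(sample) >= q_observed:
            hits += 1
    return hits / trials
```

A call such as `monte_carlo_p(dixon_q(data), len(data))` then returns an estimate of p itself rather than only the bracket obtained from a table of critical values; its precision improves with the square root of `trials`.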