Similar Literature
20 similar documents found.
1.
In this study, 74 patients with acute cerebral infarction (ACI) were enrolled and classified on admission by National Institutes of Health Stroke Scale (NIHSS) score into a severe group (NIHSS > 15, n=21), a moderate group (NIHSS 5–15, n=24), and a mild group (NIHSS < 5, n=29). All patients underwent serum copeptin and IL-18 measurement and spiral CT imaging. The results showed that, as AC...

2.
This study evaluated the value of real-time echocardiography (RTE) and treadmill exercise electrocardiography (TET) for diagnosing coronary heart disease (CHD). One hundred patients with suspected CHD seen between June 2016 and June 2018 were enrolled; all underwent both RTE and TET, with coronary angiography (CAG) as the gold standard, and receiver operating characteristic (ROC) curve analysis was used to assess the diagnostic value of each test. By CAG, 40 of the 100 patients (40.00%) had CHD and 60 (60.00%) did not. ROC analysis gave a sensitivity, specificity, accuracy, and area under the curve of 85.00%, 86.67%, 86.00%, and 0.837 for RTE and 80.00%, 76.67%, 78.00%, and 0.768 for TET; the combination of RTE and TET reached 100.00%, 96.67%, 98.00%, and 0.924, significantly higher than either test alone (P<0.05). RTE outperformed TET, but the difference was not statistically significant (P>0.05). These results confirm that RTE and TET are important methods for diagnosing CHD, and that combining them yields higher sensitivity, specificity, accuracy, and area under the curve.
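For orientation, a minimal Python sketch (not from the study; scikit-learn assumed, names hypothetical) of how sensitivity, specificity, accuracy, and AUC are computed against a gold standard such as CAG:

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

def diagnostic_metrics(y_true, y_pred, y_score):
    """Diagnostic performance of a test versus a gold standard (1 = disease)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": tp / (tp + fn),          # true-positive rate
        "specificity": tn / (tn + fp),          # true-negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "auc": roc_auc_score(y_true, y_score),  # area under the ROC curve
    }
```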

3.
Glide SP mode enrichment results for two preparations of the DUD dataset and native ligand docking RMSDs for two preparations of the Astex dataset are presented. Following a best-practices preparation scheme, an average RMSD of 1.140 Å for native ligand docking with Glide SP is computed. Following the same best-practices preparation scheme for the DUD dataset, an average area under the ROC curve (AUC) of 0.80 and an average early enrichment via the ROC (0.1 %) metric of 0.12 were observed. 74 and 56 % of the 39 best-practices prepared targets showed AUC over 0.7 and 0.8, respectively. Average AUC was greater than 0.7 for all best-practices protein families, demonstrating consistent enrichment performance across a broad range of proteins and ligand chemotypes. In both Astex and DUD datasets, docking performance is significantly improved employing a best-practices preparation scheme over using minimally-prepared structures from the PDB. Enrichment results for WScore, a new scoring function and sampling methodology integrating WaterMap and Glide, are presented for four DUD targets: hivrt, hsp90, cdk2, and fxa. WScore performance in early enrichment is consistently strong, and all systems examined show AUC > 0.9 and superior early enrichment to DUD best-practices Glide SP results.
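As a hedged illustration (names hypothetical, not Glide's implementation), the ROC (0.1 %) early-enrichment metric can be read as the fraction of actives recovered at a 0.1 % decoy false-positive rate:

```python
import numpy as np

def early_enrichment(active_scores, decoy_scores, fpr_cut=0.001):
    """Fraction of actives scored above the top fpr_cut fraction of decoys
    (higher score assumed better)."""
    threshold = np.quantile(decoy_scores, 1.0 - fpr_cut)
    return float(np.mean(np.asarray(active_scores) > threshold))
```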

4.
Variable (wavelength or feature) selection techniques have become a critical step for the analysis of datasets with a high number of variables and relatively few samples. In this study, a novel variable selection strategy, variable combination population analysis (VCPA), was proposed. This strategy consists of two crucial procedures. First, an exponentially decreasing function (EDF), embodying the simple and effective 'survival of the fittest' principle of Darwinian evolution, is employed to determine the number of variables to keep and to continuously shrink the variable space. Second, in each EDF run, a binary matrix sampling (BMS) strategy, which gives each variable the same chance to be selected and generates different variable combinations, is used to produce a population of subsets and hence a population of sub-models. Model population analysis (MPA) is then employed to find the variable subsets with the lowest root mean square error of cross-validation (RMSECV), and the frequency of each variable appearing in the best 10% of sub-models is computed: the higher the frequency, the more important the variable. The performance of the proposed procedure was investigated using three real NIR datasets. The results indicate that VCPA is a good variable selection strategy when compared with four high-performing variable selection methods: genetic algorithm–partial least squares (GA–PLS), Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), competitive adaptive reweighted sampling (CARS), and iteratively retaining informative variables (IRIV). The MATLAB source code of VCPA is available for academic research at: http://www.mathworks.com/matlabcentral/fileexchange/authors/498750.
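A strongly simplified, hypothetical sketch of one VCPA round (BMS plus MPA), assuming numpy/scikit-learn; the MATLAB code linked above is the reference implementation:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def vcpa_round(X, y, var_idx, n_subsets=1000, keep_frac=0.1, n_comp=5, seed=0):
    """Binary matrix sampling (BMS) + model population analysis (MPA):
    returns each variable's frequency in the best 10% of sub-models."""
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_subsets, len(var_idx))).astype(bool)
    rmsecv = np.full(n_subsets, np.inf)
    for i, m in enumerate(masks):
        if m.sum() <= n_comp:            # too few variables for the PLS model
            continue
        pred = cross_val_predict(PLSRegression(n_components=n_comp),
                                 X[:, var_idx[m]], y, cv=5).ravel()
        rmsecv[i] = np.sqrt(np.mean((y - pred) ** 2))
    best = masks[np.argsort(rmsecv)[: int(keep_frac * n_subsets)]]
    return best.mean(axis=0)             # higher frequency = more important
```

In the full method, the EDF would shrink `var_idx` between rounds by dropping low-frequency variables.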

5.
In multivariate calibration with spectral datasets, variable selection is often applied to identify a relevant subset of variables, leading to improved prediction accuracy and easy interpretation of the selected fingerprint regions. Numerous variable selection methods have been proposed, but a proper choice among them is not trivial. Furthermore, in many cases a set of variables found by those methods might not be robust due to irreproducibility and uncertainty, posing a great challenge to the reliability of variable selection. In this study, the reproducibility of five variable selection methods was investigated quantitatively to evaluate their performance. Reproducibility was quantified using Monte Carlo sub-sampling (MCS) techniques together with a quantitative similarity measure designed for highly collinear spectral datasets. The investigation of reproducibility and prediction accuracy of the several variable selection algorithms on two different near-infrared (NIR) datasets illustrated that the methods varied widely in performance, especially in their ability to identify a consistent subset of variables from the spectral datasets. Thus, a thorough assessment of reproducibility together with the predictive accuracy of the identified variables improves the statistical validity and confidence of the selection outcome, which cannot be addressed by conventional evaluation schemes.
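A minimal sketch of the MCS reproducibility idea, with plain Jaccard similarity standing in for the paper's collinearity-aware similarity measure (`select_fn` is a hypothetical selector returning variable indices):

```python
import numpy as np

def selection_reproducibility(X, y, select_fn, n_runs=50, frac=0.8, seed=0):
    """Rerun a variable selector on Monte Carlo sub-samples and average
    the pairwise Jaccard similarity of the selected index sets."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    sets = []
    for _ in range(n_runs):
        idx = rng.choice(n, size=int(frac * n), replace=False)
        sets.append(frozenset(select_fn(X[idx], y[idx])))
    sims = [len(a & b) / len(a | b)
            for i, a in enumerate(sets) for b in sets[i + 1:] if a | b]
    return float(np.mean(sims))  # 1.0 = perfectly reproducible selection
```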

6.
7.
Structural spectroscopic studies of evolving mixtures can appear of limited value when the experimental spectra lack the quality needed for precise interpretation. This is the case when the chemical behaviour of macromolecules is studied from infrared spectra: if the effective resolution is low, the spectral profiles look similar regardless of changes in the applied chemical conditions. Interpretation of the raw spectra then becomes impossible, and mathematical treatments are required to separate the overlapping contributions. To determine the behaviour of the reactive sites of humic acids in binding heavy metals, infrared spectra were recorded under various chemical conditions. The cation considered is Pb2+, and the two chemical variables studied are pH and initial lead concentration. Four series of FTIR spectra were recorded, but no visible difference could be directly assigned to the different chemical states of the macromolecules. Multivariate self-modelling curve resolution is therefore proposed as a tool for resolving these complex, strongly overlapping datasets. First, initial estimates are obtained from pure-variable detection methods; it turns out that two spectra are enough to reconstruct the experimental matrices. In a further step, applying the multivariate curve resolution-alternating least squares (MCR-ALS) algorithm with additional constraints on each individual dataset, as well as on column-wise augmented matrices, makes it possible to optimise the profiles and spectra that strongly characterise the acid and salt forms of the molecule. Moreover, the concentration profiles associated with these two limiting spectral forms allow interpretation of the analytical measurements made during the reactions between humic acids and H+ or Pb2+. Consequently, depending on the initial state of the humic acid, two distinct reaction mechanisms are proposed.
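For readers unfamiliar with the algorithm, a bare-bones MCR-ALS sketch with only non-negativity constraints (illustrative; the paper applies additional constraints and column-wise matrix augmentation):

```python
import numpy as np

def mcr_als(D, S0, n_iter=100):
    """Factor D (spectra x wavenumbers) as C @ S by alternating least
    squares, clipping both factors to be non-negative. S0 holds initial
    spectral estimates, e.g. from pure-variable detection."""
    S = S0.copy()
    for _ in range(n_iter):
        C = np.clip(D @ np.linalg.pinv(S), 0.0, None)  # concentration profiles
        S = np.clip(np.linalg.pinv(C) @ D, 0.0, None)  # component spectra
    return C, S
```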

8.
Analytical Letters 2012, 45(13): 2238–2254
A new variable selection method called ensemble regression coefficient analysis is reported on the basis of model population analysis. To construct ensemble regression coefficients, many subsets of variables are randomly selected to calibrate corresponding partial least squares models. Based on ensemble theory, the mean of the regression coefficients of these models is taken as the ensemble regression coefficient, and its absolute value can then be applied as an informative vector for variable selection. The performance of ensemble regression coefficient analysis was assessed on four near-infrared datasets: two simulated datasets, one wheat dataset, and one tobacco dataset. The results showed that this approach can select important variables and obtain lower errors compared with regression coefficient analysis and Monte Carlo uninformative variable elimination.
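A hypothetical sketch of the ensemble-regression-coefficient construction (numpy/scikit-learn assumed; subset size and model count are illustrative):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def ensemble_regression_coefficients(X, y, n_models=500, frac=0.7, n_comp=5):
    """Average PLS regression coefficients over random variable subsets;
    the absolute mean coefficient is the informative vector."""
    rng = np.random.default_rng(0)
    p = X.shape[1]
    coef_sum, counts = np.zeros(p), np.zeros(p)
    for _ in range(n_models):
        vars_ = rng.choice(p, size=int(frac * p), replace=False)
        pls = PLSRegression(n_components=n_comp).fit(X[:, vars_], y)
        coef_sum[vars_] += pls.coef_.ravel()
        counts[vars_] += 1
    return np.abs(coef_sum / np.maximum(counts, 1))
```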

9.
This paper presents a preliminary study in building discriminant models from solid-state NMR spectrometry data to detect the presence of acetaminophen in over-the-counter pharmaceutical formulations. The dataset, containing 11 spectra of pure substances and 21 spectra of various formulations, was processed by partial least squares discriminant analysis (PLS-DA). The resulting model handled the discrimination well, and its quality parameters were acceptable. Standard normal variate preprocessing was found to have almost no influence on unsupervised investigation of the dataset. The influence of variable selection with the uninformative variable elimination by PLS method was studied, reducing the dataset from 7601 variables to around 300 informative variables but not improving model performance. The results show that well-working PLS-DA models can be constructed from such small datasets without a full experimental design.
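A minimal PLS-DA sketch via dummy-coded regression, one common way to implement the technique (settings are illustrative, not the paper's):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def pls_da_predict(X, y, n_comp=3):
    """Regress a 0/1 class-membership vector with PLS and threshold the
    cross-validated prediction at 0.5."""
    pred = cross_val_predict(PLSRegression(n_components=n_comp),
                             X, y.astype(float), cv=5).ravel()
    return (pred > 0.5).astype(int)
```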

10.
In the literature, much effort has been put into modeling dependence among variables and their interactions through nonlinear transformations of predictive variables. In this paper, we propose a nonlinear generalization of Partial Least Squares (PLS) using multivariate additive splines. We discuss the advantages and drawbacks of the proposed model, build it via the generalized cross-validation (GCV) criterion, and show its performance on a real dataset and on simulated datasets in comparison to other spline-based methods. Copyright © 2009 John Wiley & Sons, Ltd.
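The GCV criterion used for model building has a standard closed form; a short sketch with assumed notation (`effective_dof` is the model's effective degrees of freedom):

```python
import numpy as np

def gcv(y, y_hat, effective_dof):
    """Generalized cross-validation: (RSS / n) / (1 - df/n)**2."""
    n = len(y)
    rss = float(np.sum((np.asarray(y) - np.asarray(y_hat)) ** 2))
    return (rss / n) / (1.0 - effective_dof / n) ** 2
```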

11.
Biomarker discovery is one important goal in metabolomics; it is typically modeled as selecting the most discriminating metabolites for classification and is often referred to as variable importance analysis or variable selection. A number of variable importance analysis methods for discovering biomarkers in metabolomics studies have been proposed, but different methods are most likely to generate different variable rankings because of their different principles. Each method produces a variable ranking list much as an expert offers an opinion, and the inconsistency between different variable ranking methods is often ignored. To address this problem, a simple and natural solution is to take every ranking into account. In this study, a strategy called rank aggregation was employed: an indispensable tool for merging individual ranking lists into a single "super"-list reflective of the overall preference or importance within the population. This "super"-list is regarded as the final ranking for biomarker discovery and was used to select the best variable subset with the highest predictive classification accuracy. Nine methods were used, including three univariate filtering and six multivariate methods. When applied to two metabolic datasets (a childhood overweight dataset and a tubulointerstitial lesions dataset), rank aggregation performed considerably better, with higher prediction accuracy than using all variables. It also outperformed the penalized method least absolute shrinkage and selection operator (LASSO), giving higher prediction accuracy or fewer, more interpretable selected variables.
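A minimal Borda-style (mean-rank) sketch, one simple member of the rank-aggregation family and not necessarily the scheme used in the paper:

```python
import numpy as np

def aggregate_ranks(rank_lists):
    """Average each variable's rank across methods (rows = methods,
    columns = variables, rank 0 = most important) and re-rank."""
    mean_rank = np.mean(np.asarray(rank_lists), axis=0)
    return np.argsort(mean_rank)  # the "super"-list, best variable first
```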

12.
Probability density functions (PDFs) have been derived for a number of commonly used limit of detection definitions, including several variants of the Relative Standard Deviation of the Background–Background Equivalent Concentration (RSDB–BEC) method, for a simple linear chemical measurement system (CMS) having homoscedastic, Gaussian measurement noise and using ordinary least squares (OLS) processing. All of these detection limit definitions serve as both decision and detection limits, thereby implicitly resulting in 50% rates of Type 2 errors. It has been demonstrated that these are closely related to Currie decision limits, if the coverage factor, k, is properly defined, and that all of the PDFs are scaled reciprocals of noncentral t variates. All of the detection limits have well-defined upper and lower limits, thereby resulting in finite moments and confidence limits, and the problem of estimating the noncentrality parameter has been addressed. As in Parts 1–3, extensive Monte Carlo simulations were performed and all the simulation results were found to be in excellent agreement with the derived theoretical expressions. Specific recommendations for harmonization of detection limit methodology have also been made.
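For orientation, the RSDB–BEC family of definitions is commonly written in the following form (a hedged sketch with assumed notation, not the paper's exact variants):

```latex
% RSDB--BEC style detection limit: coverage factor k, relative standard
% deviation of the background RSD_B (as a fraction), and background
% equivalent concentration BEC (assumed notation).
x_D = k \,\mathrm{RSD}_B \,\mathrm{BEC}
```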

13.
In this study, 106 patients with differentiated thyroid carcinoma (DTC) were enrolled as the thyroid carcinoma group, with 106 concurrent thyroid adenoma patients as the thyroid adenoma group. Ultrasound elastography parameters (elasticity ratio, blue-area ratio) and serum levels of midkine (MK) and vascular endothelial growth factor (VEGF) were measured and compared between the two groups. The results showed that the elasticity ratio, blue-area ratio, and serum MK and VEGF levels were associated with DTC patients' lymph node metasta...

14.
A μs and ms pulsed argon glow discharge was investigated with respect to the breakdown condition (Paschen curve). Current–voltage profiles were also acquired for different discharge frequencies, pulse durations, cathode–anode spacings, and discharge pressures. The breakdown voltage depended on the cathode material (Cu, steel, Ti, and Al). No significant change in the breakdown voltage was observed for a 1 ms pulse at different frequencies; however, the theoretical breakdown curve calculated from the Paschen equation did not fit the experimental data. The current plots for different cathode–anode spacings showed a maximum at intermediate distances (8–10 mm). These data were consistent with mass spectrometric data acquired using the same instrument in a GC-GD-TOFMS chemical speciation study.
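A hedged sketch of the Paschen breakdown curve mentioned above; the argon gas constants A and B and the secondary-emission coefficient γ below are illustrative literature-style values, not those used in the study:

```python
import numpy as np

def paschen_breakdown_voltage(pd, A=11.0, B=176.0, gamma=0.07):
    """Paschen law V_b = B*pd / ln(A*pd / ln(1 + 1/gamma)), with pd in
    Torr*cm; only defined to the right of the low-pd cutoff."""
    return B * pd / np.log(A * pd / np.log(1.0 + 1.0 / gamma))
```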

15.
Many commercially available software programs claim similar efficiency and accuracy as variable selection tools. Genetic algorithms are commonly used variable selection methods in which the most relevant variables can be differentiated from 'less important' ones using evolutionary computing techniques. However, different vendors offer several algorithms, and the puzzling question is: which is the appropriate method of choice? In this study, several genetic algorithm tools (e.g. GFA from Cerius2, QuaSAR-Evolution from MOE, and Partek's genetic algorithm) were compared. Stepwise multiple linear regression models were generated using the most relevant variables identified by these genetic algorithms. This procedure led to the successful generation of Quantitative Structure–Activity Relationship (QSAR) models for (a) proprietary datasets and (b) the Selwood dataset.
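A minimal forward-stepwise MLR sketch of the model-building step described above (hypothetical helper; cross-validated R² as the greedy criterion is an assumption, not necessarily the authors' setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def forward_stepwise(X, y, max_vars=10):
    """Greedily add the GA-preselected variable that most improves
    cross-validated R^2; stop when no variable helps."""
    selected, remaining, best = [], list(range(X.shape[1])), -np.inf
    while remaining and len(selected) < max_vars:
        score, j = max((cross_val_score(LinearRegression(),
                                        X[:, selected + [k]], y, cv=5).mean(), k)
                       for k in remaining)
        if score <= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected
```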

16.
The selection abilities of two well-known variable selection techniques, synergy interval partial least squares (SiPLS) and genetic algorithm partial least squares (GA-PLS), were examined and compared. Using different simulated and real (corn and metabolite) datasets, and taking the spectral overlap of the components into account, the influence of selecting either intervals of variables or individual variables on prediction performance was examined. In the simulated datasets, GA-PLS results were better as the overlap of the component spectra decreased and for components with narrow bands; in contrast, SiPLS performed better for data with intermediate overlap. For mixtures of highly overlapping analytes, GA-PLS showed slightly better performance. However, significant differences between the results of the two selection methods were not observed in most cases. Although SiPLS gave slightly better prediction performance for the corn dataset (except for the prediction of moisture content), its improvement over GA-PLS was not significant. For real data with less overlapped components (the metabolite dataset), GA-PLS, which tends to select far fewer variables, did not give significantly better root mean square error of cross-validation (RMSECV), cross-validated R² (Q²), or root mean square error of prediction (RMSEP) than SiPLS. Irrespective of the type of dataset, GA-PLS resulted in models with fewer latent variables (LVs). In terms of computational time, GA-PLS is superior to SiPLS. Copyright © 2010 John Wiley & Sons, Ltd.
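A simplified SiPLS-style sketch (interval count, combination size, and cross-validation settings are illustrative assumptions):

```python
import numpy as np
from itertools import combinations
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def sipls(X, y, n_intervals=20, n_combine=2, n_comp=5):
    """Split the spectrum into contiguous intervals, test every
    combination of n_combine intervals by cross-validated PLS, and
    return the combination with the lowest RMSECV."""
    edges = np.linspace(0, X.shape[1], n_intervals + 1, dtype=int)
    intervals = [np.arange(a, b) for a, b in zip(edges[:-1], edges[1:])]
    best, best_rmsecv = None, np.inf
    for combo in combinations(range(n_intervals), n_combine):
        cols = np.concatenate([intervals[i] for i in combo])
        pred = cross_val_predict(PLSRegression(n_components=n_comp),
                                 X[:, cols], y, cv=5).ravel()
        rmsecv = np.sqrt(np.mean((y - pred) ** 2))
        if rmsecv < best_rmsecv:
            best, best_rmsecv = combo, rmsecv
    return best, best_rmsecv
```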

17.
Large datasets containing many spectra, commonly associated with in situ or operando experiments, call for new data treatment strategies, as conventional scan-by-scan data analysis methods have become a time-consuming bottleneck. Several convenient automated data processing procedures, such as least squares fitting of reference spectra, exist but are based on assumptions. Here we present the application of multivariate curve resolution (MCR) as a blind-source separation method to efficiently process a large dataset from an in situ X-ray absorption spectroscopy experiment in which the sample undergoes a periodic concentration perturbation. MCR was applied to data from a reversible reduction–oxidation reaction of a rhenium-promoted cobalt Fischer–Tropsch synthesis catalyst. The MCR algorithm was capable of extracting, in a highly automated manner, the component spectra with different kinetic evolution together with their respective concentration profiles, without the use of reference spectra. The modulative nature of the experiments allows averaging over a number of identical periods and hence an increase in the signal-to-noise ratio (S/N), which is efficiently exploited by MCR. The practical added value of the approach in extracting information from large and complex datasets, typical of in situ and operando studies, is highlighted.
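The period-averaging step that raises S/N before MCR can be sketched in a few lines (array shapes and names are hypothetical):

```python
import numpy as np

def average_periods(D, n_periods):
    """Average identical modulation periods of a spectral time series
    D with shape (n_periods * period_len, n_channels)."""
    period_len = D.shape[0] // n_periods
    return D[: n_periods * period_len].reshape(n_periods, period_len, -1).mean(axis=0)
```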

18.
The high dimensionality of modern datasets poses a great challenge for creating effective methods that can select an optimal subset of variables. In this study, a strategy called iteratively retaining informative variables (IRIV) was proposed, which considers possible interaction effects among variables through random combinations. The variables are classified into four categories: strongly informative, weakly informative, uninformative, and interfering. On this basis, IRIV retains both the strongly and weakly informative variables in every iterative round until no uninformative or interfering variables remain. Three datasets were employed to investigate the performance of IRIV coupled with partial least squares (PLS). The results show that IRIV is a good alternative variable selection strategy when compared with three high-performing and frequently used methods: genetic algorithm-PLS, Monte Carlo uninformative variable elimination by PLS (MC-UVE-PLS), and competitive adaptive reweighted sampling (CARS). The MATLAB source code of IRIV can be freely downloaded for academic research at: http://code.google.com/p/multivariate-calibration/downloads/list.
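A strongly simplified, hypothetical sketch of one IRIV round (the downloadable MATLAB code cited above is the reference implementation):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def iriv_round(X, y, n_subsets=500, n_comp=5, seed=0):
    """For each variable, compare RMSECV of random sub-models that
    include it with those that exclude it; keep variables whose
    inclusion lowers the error on average."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    masks = rng.integers(0, 2, size=(n_subsets, p)).astype(bool)
    rmsecv = np.empty(n_subsets)
    for i, m in enumerate(masks):
        if m.sum() <= n_comp:            # ensure enough variables for PLS
            m[rng.choice(p, n_comp + 1, replace=False)] = True
        pred = cross_val_predict(PLSRegression(n_components=n_comp),
                                 X[:, m], y, cv=5).ravel()
        rmsecv[i] = np.sqrt(np.mean((y - pred) ** 2))
    diff = np.array([rmsecv[masks[:, j]].mean() - rmsecv[~masks[:, j]].mean()
                     for j in range(p)])
    return np.where(diff < 0)[0]         # informative variables to retain
```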

19.
Montville D, Voigtman E. Talanta 2003, 59(3): 461–476
There are a number of possible ways to define the instrumental limit of detection (LOD) figure of merit. In the present work we define four such sample test statistics and derive their probability density functions (PDFs). Although the derived PDFs are irreducible integrals, they are easily evaluated via numerical integration and can be used to obtain expectation values, population precisions, and confidence intervals. Monte Carlo simulation methods were used to prepare normalized histograms of one million LOD variates each, for homoscedastic linear calibration curve systems, and these were found to be in excellent agreement with the numerically obtained PDFs and associated statistics. The software used in all aspects of the present work is available for free, with full, commented source code, and is designed to facilitate exploration of calibration curve systems of immediate interest to the reader.
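A hedged Monte Carlo sketch in the same spirit, using a generic k·s/slope detection-limit statistic for a homoscedastic linear calibration (parameter values are hypothetical, not the paper's four definitions):

```python
import numpy as np

def simulate_lod(slope=1.0, sigma=0.1, k=3.0, n_trials=10_000, seed=0):
    """Simulate LOD = k * s / b_hat over repeated OLS calibrations and
    return the variates for histogramming against a theoretical PDF."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, 11)           # calibration standards
    lods = np.empty(n_trials)
    for i in range(n_trials):
        y = slope * x + rng.normal(0.0, sigma, size=x.size)
        b_hat, a_hat = np.polyfit(x, y, 1)   # OLS slope and intercept
        resid = y - (b_hat * x + a_hat)
        s = np.sqrt(resid @ resid / (x.size - 2))
        lods[i] = k * s / b_hat
    return lods
```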

20.
Datasets of molecular compounds often contain outliers, that is, compounds which differ from the rest of the dataset. Outliers, while often interesting, may affect data interpretation, model generation, and decision making, and should therefore be removed from the dataset prior to modeling efforts. Here, we describe a new method for the iterative identification and removal of outliers based on a k-nearest neighbors optimization algorithm. We demonstrate for three different datasets that outlier removal with the new algorithm provides filtered datasets which are better than those produced by four alternative outlier removal procedures, as well as by random compound removal, in two important respects: (1) they better maintain the diversity of the parent datasets; (2) they give rise to quantitative structure–activity relationship (QSAR) models with much better prediction statistics. The new algorithm is therefore suitable for the pretreatment of datasets prior to QSAR modeling. © 2014 Wiley Periodicals, Inc.
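A minimal iterative k-NN distance-based outlier-removal sketch; the threshold rule here is an illustrative assumption, not the paper's optimization algorithm:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def remove_outliers_knn(X, k=5, z_cut=3.0, max_iter=10):
    """Iteratively drop compounds whose mean distance to their k nearest
    neighbors lies more than z_cut standard deviations above the mean."""
    keep = np.arange(len(X))
    for _ in range(max_iter):
        dist, _ = (NearestNeighbors(n_neighbors=k + 1)
                   .fit(X[keep]).kneighbors(X[keep]))
        d = dist[:, 1:].mean(axis=1)   # skip the zero self-distance
        mask = d <= d.mean() + z_cut * d.std()
        if mask.all():
            break
        keep = keep[mask]
    return keep                        # indices of retained compounds
```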

