Similar literature
 20 similar documents found; search took 78 ms
1.
Nonlinear underdetermined blind separation of nonnegative dependent sources consists of decomposing a set of observed nonlinearly mixed signals into a greater number of original nonnegative and dependent component (source) signals. This hard problem is practically relevant for contemporary metabolic profiling of biological samples, where sources (a.k.a. pure components or analytes) are to be extracted from mass spectra of nonlinear multicomponent mixtures. This paper presents a method for nonlinear underdetermined blind separation of nonnegative dependent sources that comply with a sparse probabilistic model, that is, sources are constrained to be sparse in support and amplitude. This model is validated on experimental pure component mass spectra. Under a sparse prior, a nonlinear problem is converted into an equivalent linear one composed of the original sources and their higher‐order, mostly second‐order, monomials. The influence of these monomials, which stand for error terms, is reduced by preprocessing a matrix of mixtures by means of robust principal component analysis and hard, soft and trimmed thresholding. Preprocessed data matrices are mapped into a high‐dimensional reproducing kernel Hilbert space (RKHS) of functions by means of an empirical kernel map. Sparseness‐constrained nonnegative matrix factorizations in RKHS yield sets of separated components. They are assigned to pure components from the library using a maximal correlation criterion. The methodology is exemplified on demanding numerical and experimental examples related, respectively, to the extraction of eight dependent components from three nonlinear mixtures and to the extraction of 25 dependent analytes from nine nonlinear mixture mass spectra recorded in a nonlinear chemical reaction of peptide synthesis. Copyright © 2014 John Wiley & Sons, Ltd.
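The hard and soft thresholding mentioned in this abstract can be illustrated with a minimal sketch. It uses the textbook definitions of the two operators; the function names, the threshold value and the sample data are assumptions for illustration, and the abstract does not say how the authors choose the threshold or define the trimmed variant, so that variant is left out.

```python
from math import copysign


def hard_threshold(values, t):
    """Keep entries whose magnitude exceeds t; set the rest to zero."""
    return [v if abs(v) > t else 0.0 for v in values]


def soft_threshold(values, t):
    """Shrink every entry toward zero by t; entries within [-t, t] become zero."""
    return [copysign(max(abs(v) - t, 0.0), v) for v in values]


# Hypothetical row of a mixture matrix and a hypothetical threshold.
row = [0.1, -0.5, 2.0]
print(hard_threshold(row, 0.3))
print(soft_threshold(row, 0.3))
```

Hard thresholding leaves the surviving entries unchanged, while soft thresholding also reduces their amplitude, which is why the two are often tried side by side.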

2.
The nonlinear, nonnegative single‐mixture blind source separation problem consists of decomposing an observed nonlinearly mixed multicomponent signal into nonnegative dependent component (source) signals. The problem is difficult and is a special case of the underdetermined blind source separation problem. However, it is practically relevant for the contemporary metabolic profiling of biological samples when only one sample is available for acquiring mass spectra; afterwards, the pure components are extracted. Herein, we present a method for the blind separation of nonnegative dependent sources from a single, nonlinear mixture. First, an explicit feature map is used to map a single mixture into a pseudo multi‐mixture. Second, an empirical kernel map is used for implicit mapping of a pseudo multi‐mixture into a high‐dimensional reproducing kernel Hilbert space. Under sparse probabilistic conditions that were previously imposed on sources, the single‐mixture nonlinear problem is converted into an equivalent linear, multiple‐mixture problem that consists of the original sources and their higher‐order monomials. These monomials are suppressed by robust principal component analysis and hard, soft, and trimmed thresholding. Sparseness‐constrained nonnegative matrix factorizations in reproducing kernel Hilbert space yield sets of separated components. Afterwards, separated components are annotated with the pure components from the library using the maximal correlation criterion. The proposed method is illustrated with a numerical example related to the extraction of eight dependent components from one nonlinear mixture. The method is further demonstrated on three nonlinear chemical reactions of peptide synthesis in which 25, 19, and 28 dependent analytes are extracted from one nonlinear mixture mass spectrum. The intended application of the proposed method is, in combination with other separation techniques, mass spectrometry‐based non‐targeted metabolic profiling, such as biomarker identification studies. Copyright © 2015 John Wiley & Sons, Ltd.
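The final annotation step, assigning each separated component to the library spectrum with maximal correlation, might be sketched as follows. This assumes Pearson correlation and spectra sampled on a common m/z grid; the function names, the library entries and the spectra are hypothetical, and the abstract does not specify the exact correlation measure used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def annotate(components, library):
    """For each separated component, return (best library name, correlation)."""
    result = []
    for comp in components:
        scored = ((name, pearson(comp, ref)) for name, ref in library.items())
        result.append(max(scored, key=lambda pair: pair[1]))
    return result


# Hypothetical library of pure-component spectra and two separated components.
library = {"A": [1, 0, 0, 5, 0], "B": [0, 3, 0, 0, 2]}
components = [[0.1, 2.9, 0.0, 0.2, 2.1], [0.9, 0.1, 0.0, 4.8, 0.1]]
for name, score in annotate(components, library):
    print(name, round(score, 3))
```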

3.
The paper presents sparse component analysis (SCA)‐based blind decomposition of mixtures of mass spectra into pure components, wherein the number of mixtures is less than the number of pure components. Standard solutions of the related blind source separation (BSS) problem that are published in the open literature require the number of mixtures to be greater than or equal to the unknown number of pure components. Specifically, we have demonstrated experimentally the capability of SCA to blindly extract five pure component mass spectra from only two mixtures. Two approaches to SCA are tested: the first is based on ℓ1‐norm minimization implemented through linear programming, and the second is implemented through multilayer hierarchical alternating least squares nonnegative matrix factorization with sparseness constraints imposed on the pure component spectra. In contrast to many existing blind decomposition methods, no a priori information about the number of pure components is required. It is estimated from the mixtures, together with the pure component concentration matrix, using a robust data clustering algorithm. The proposed methodology can be implemented as part of software packages used for the analysis of mass spectra and the identification of chemical compounds. Copyright © 2009 John Wiley & Sons, Ltd.

4.
Gao HT, Li TH, Chen K, Li WG, Bi X. Talanta 2005, 66(1):65-73
Non-negative matrix factorization (NMF), with the constraints of non-negativity, has been recently proposed for multivariate data analysis. Because it allows only additive, not subtractive, combinations of the original data, NMF is capable of producing region- or parts-based representations of objects. It has been used for image analysis and text processing. Unlike PCA, the resolutions of NMF are non-negative and can be easily interpreted and understood directly. Due to multiple solutions, the original algorithm of NMF [D.D. Lee, H.S. Seung, Nature 401 (1999) 788] is not suitable for resolving chemical mixed signals. In reality, NMF has never been applied to resolving chemical mixed signals. It must be modified according to the characteristics of the chemical signals, such as smoothness of spectra, unimodality of chromatograms, sparseness of mass spectra, etc. We have used the modified NMF algorithm to narrow the feasible solution region for resolving chemical signals, and found that it could produce reasonable and acceptable results for certain experimental errors, especially for overlapping chromatograms and sparse mass spectra. Simulated two-dimensional (2-D) data and real GUJINGGONG alcohol liquor GC-MS data have been resolved soundly by the NMF technique. Butyl caproate and its isomeric compound (butyric acid, hexyl ester) have been identified from the overlapping spectra. The result of NMF is preferable to that of heuristic evolving latent projections (HELP). It shows that NMF is a promising chemometric resolution method for complex samples.
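For orientation, the unmodified baseline that this abstract refers to can be sketched with the well-known multiplicative update rules for the squared-error objective. This is not the authors' modified algorithm: it carries none of the smoothness, unimodality or sparseness constraints they describe, and the helper names, iteration count and test matrix are assumptions chosen for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H


# Hypothetical 3 x 4 data matrix built from two nonnegative components.
V = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 1.0], [1.0, 3.0, 3.0, 2.0]]
W, H = nmf(V, rank=2)
print(frob_error(V, W, H))
```

Because the updates only multiply by nonnegative ratios, W and H stay nonnegative, but different random starts can reach different factorizations, which is the non-uniqueness the abstract points to.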

5.
Sparse component analysis (SCA) is demonstrated for blind extraction of three pure component spectra from only two measured mixed spectra in 13C and 1H nuclear magnetic resonance (NMR) spectroscopy. This appears to be the first report of such results, which is the first novelty of the paper. The presented concept is general and directly applicable to experimental scenarios that may require the use of more than two mixtures. However, it is important to emphasize that the number of required mixtures is always less than the number of components present in these mixtures. The second novelty is the formulation of blind NMR spectra decomposition exploiting the sparseness of the pure components in the wavelet basis defined by either the Morlet or the Mexican hat wavelet. This enabled accurate estimation of the concentration matrix and the number of pure components by means of a data clustering algorithm, and of the pure component spectra by means of linear programming with constraints, from both 1H and 13C NMR experimental data. The third novelty is the capability of the proposed method to estimate the number of pure components in a demanding underdetermined blind source separation (uBSS) scenario. This is in contrast to the majority of BSS algorithms, which assume this information to be known in advance. The presented results are important for NMR spectroscopy-associated data analysis in the pharmaceutical industry, medical diagnostics and natural products research.

6.
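To address the overfitting that readily arises when models are built on high-dimensional, small-sample mass spectrometry data, the severe collinearity among variables, and the nonlinear relationship between structure and properties, a new technique was adopted that combines kernel sliced inverse regression (KSIR) feature extraction with linear discriminant analysis (LDA). The KSIR algorithm first performs nonlinear feature extraction on the mass spectral data; linear discriminant functions for the sample classes are then constructed in the low-dimensional space spanned by the new feature vectors and are used to determine the class of each individual sample. The KSIR-LDA method was applied to the classification of mass spectral data of soft drinks. The results show that the method not only accommodates the nonlinear relationship between mass spectral data and properties, but also achieves higher classification accuracy with fewer and more interpretable feature variables, and enables interpretation and visualization of the data in the low-dimensional feature space.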

7.
Because mass spectrometers provide their own dispersion and resolution of analytes, electrospray ionization mass spectrometry (ESI‐MS) has become a workhorse for the characterization of complex mixtures from aerosols to crude oil. Unfortunately, ESI mass spectra commonly contain multimers, adducts and fragments. For the characterization of complex mixtures of unknown initial composition, this presents a significant concern. Mixed‐multimer formation could potentially lead to results that bear no resemblance to the original mixture. Conversely, ESI‐MS has continually reflected subtle differences between natural organic matter mixtures that are in agreement with prediction or theory. Knowing the real limitations of the technique is therefore critical to avoiding both over‐interpretation and unwarranted skepticism. Here, data were collected on four mass spectrometers under a battery of conditions. Results indicate that formation of unrepresentative ions cannot entirely be ruled out, but non‐covalent multimers do not appear to make a major contribution to typical natural organic matter spectra based on collision‐induced dissociation results. Multimers also appear notably reduced when a cooling gas is present in the accumulation region of the mass spectrometer. For less complex mixtures, the choice of spray solvent can make a difference, but generally spectrum cleanliness (i.e. representativeness) comes at the price of increased selectivity. Copyright © 2014 John Wiley & Sons, Ltd.

8.
Non‐negative matrix factorization (NMF) is a widely used approach in signal processing. In this work, we apply it to the component recognition of mixtures with multicomponent three‐dimensional fluorescence spectra. Compared with the popular PARAFAC for component recognition, NMF has the following advantages: on the one hand, the decomposed spectra are three dimensional, and thus more information can be obtained, which is beneficial for component recognition; on the other hand, the decomposed spectra are non‐negative and thus have a certain physical significance. More importantly, we propose a type of integrated similarity index for three‐dimensional fluorescence spectra, which, by construction, is good at component recognition from overlapping fluorescence spectra. Experimental results demonstrate that NMF combined with the integrated similarity index provides an effective method for component recognition of multicomponent three‐dimensional overlapping fluorescence spectra. Copyright © 2011 John Wiley & Sons, Ltd.

9.
A comprehensive analysis of high‐resolution mass spectra of aged natural dammar resin, obtained with a Fourier transform ion cyclotron resonance mass spectrometer (FT‐ICR‐MS) using matrix‐assisted laser desorption/ionization (MALDI) and atmospheric pressure chemical ionization (APCI), is presented. Dammar resin is one of the most important components of painting varnishes. It is a terpenoid resin (dominated by triterpenoids) with an intrinsically very complex composition. This complexity further increases with aging. Ten different solvents and two‐component solvent mixtures were tested for sample preparation. The most suitable solvent mixtures for the MALDI‐FT‐ICR‐MS analysis were dichloromethane‐acetone and dichloromethane‐ethanol. The obtained MALDI‐FTMS mass spectrum contains nine clusters of peaks in the m/z range of 420–2200, and the obtained APCI‐FTMS mass spectrum contains three clusters of peaks in the m/z range of 380–910. The peaks in the clusters correspond to the oxygenated derivatives of terpenoids differing by the number of C15H24 units. The clusters, in turn, are composed of subclusters differing by the number of oxygen atoms in the molecules. Thorough analysis and identification of the components (or groups of components) by their accurate m/z ratios was carried out, and molecular formulas (elemental compositions) of all major peaks in the MALDI‐FTMS and APCI‐FTMS spectra were identified (and groups of possible isomeric compounds were proposed). In the MALDI‐FTMS and APCI‐FTMS mass spectra, besides the oxidized C30 triterpenoids, peaks corresponding to C29 and C31 derivatives of triterpenoids (demethylated and methylated, respectively) were also detected. MALDI and APCI are complementary ionization sources for the analysis of natural dammar resin. In the MALDI source, polar (extensively oxidized) components of the resin are preferentially ionized (mostly as Na+ adducts), whereas in the APCI source, nonpolar (hydrocarbon and slightly oxidized) compounds are preferentially ionized (by protonation). Either of the two ionization methods, when used alone, gives an incomplete picture of the dammar resin composition. Copyright © 2012 John Wiley & Sons, Ltd.

10.
We consider blind source separation in chemical analysis, focusing on the 3D fluorescence spectroscopy framework. We present an alternative method to process the Fluorescence Excitation‐Emission Matrices (FEEM). First, preprocessing is applied to eliminate the Raman and Rayleigh scattering peaks that clutter the FEEM. To improve its robustness against possible improper settings, we suggest combining the classical Zepp's method with a morphological image filtering technique. Then, in the second stage, the Canonical Polyadic (CP or Candecomp/Parafac) decomposition of a nonnegative three‐way array has to be computed. In the fluorescence spectroscopy context, the constituent vectors of the loading matrices should be nonnegative (since they stand for spectra and concentrations). Thus, we suggest a new nonnegative third‐order CP decomposition algorithm (NNCP) based on a nonlinear conjugate gradient optimization algorithm with regularization terms and periodic restarts. Computer simulations performed on real experimental data are provided to illustrate the effectiveness and robustness of the whole processing chain and to validate the approach. Copyright © 2015 John Wiley & Sons, Ltd.

11.
The analysis of complex mixtures is becoming increasingly important in various fields, such as nutrition, medicinal plants and metabolomics. The components contained in such complex mixtures are always characterized by diverse physicochemical properties that pose a major challenge during the optimization of various parameters in liquid chromatography‐mass spectrometry (LC‐MS). The parameter ‘CE energy’, which is normally set at a fixed value with a moderate range of CE spread during data‐dependent acquisition (DDA) analysis, a prevalent approach for untargeted identification, often fails to generate sufficient MS/MS fragment ions for untargeted identification of components from complex mixtures. In this study, we developed a simple and generally applicable acquisition method named stepped MSAll (sMSAll), aiming to obtain optimal MS/MS spectra for the identification of chemically diverse compounds from complex mixtures. sMSAll collects serial MSAll scans acquired from low CE to gradually ramped‐up high CE values in a cycle that conventional DDA scans cannot afford. The resultant MS/MS spectra of each compound were compared and evaluated among serial MSAll scans, and the optimal spectra were used for identification. An untargeted data analysis strategy was then employed to analyze these optimal MS/MS spectra by searching for common diagnostic ions and connecting the diagnostic ion families into a network via bridging components. This sMSAll‐based route enabled identification of 71 natural products from a herbal preparation, whereas only 53 of the 71 compounds were identified using the classical DDA approach. Therefore, the sMSAll‐based approach is expected to find wide application in the characterization of vastly diverse compounds from various complex mixtures without a priori knowledge. Copyright © 2016 John Wiley & Sons, Ltd.

12.
A central problem in the emerging field of metabolomics is how to identify the compounds comprising a chemical mixture of biological origin. NMR spectroscopy can greatly assist in this identification process, by means of multi-dimensional correlation spectroscopy, particularly total correlation spectroscopy (TOCSY). This Communication demonstrates how non-negative matrix factorization (NMF) provides an efficient means of data reduction and clustering of TOCSY spectra for the identification of unique traces representing the NMR spectra of individual compounds. The method is applied to a metabolic mixture whose compounds could be unambiguously identified by peak matching of NMF components against the BMRB metabolomics database.

13.
DOSY is an NMR spectroscopy technique that resolves resonances according to the analytes’ diffusion coefficients. It has found use in correlating NMR signals and estimating the number of components in mixtures. Applications of DOSY in dilute mixtures are, however, held back by excessively long measurement times. We demonstrate herein how the enhanced NMR sensitivity provided by SABRE hyperpolarization allows DOSY analysis of low‐micromolar mixtures, thus reducing the concentration requirements by at least 100‐fold.

14.
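A new kurtosis-based method for resolving overlapping peaks   Total citations: 5 (self-citations: 1; citations by others: 4)
Kurtosis is a physical quantity that characterizes the steepness of a curve. This work proposes a component analysis based on kurtosis (CABK) method to resolve overlapping peaks and separate pure-component information from them. The advantage of the new method is that it performs semi-blind source separation: once the number of components contained in the overlapping peaks has been determined, the pure spectrum of each component can be separated from the mixed spectra. Applied to a simulated two-component overlapping-peak system and to the GC-MS mixed spectra of liquor samples, the method gave satisfactory results.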
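The quantity at the core of this entry can be illustrated with a minimal sketch. It computes the standard moment-based excess kurtosis of a sampled signal; the function name and the two sample signals are assumptions for illustration, and the abstract does not describe how CABK uses kurtosis inside its separation procedure, so none of that is reproduced here.

```python
def excess_kurtosis(signal):
    """Fourth standardized moment minus 3 (zero for a Gaussian distribution).

    Raises ZeroDivisionError for a constant signal, whose variance is zero.
    """
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((s - mean) ** 2 for s in signal) / n
    m4 = sum((s - mean) ** 4 for s in signal) / n
    return m4 / (m2 ** 2) - 3.0


# Hypothetical signals: a single sharp peak versus a flat ramp.
peaky = [0, 0, 0, 10, 0, 0, 0]
flat = [1, 2, 3, 4, 5]
print(excess_kurtosis(peaky), excess_kurtosis(flat))
```

A signal dominated by one sharp peak scores higher than a spread-out one, which is the sense in which kurtosis measures steepness.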

15.
This study reports an analytical strategy for the comprehensive identification and structural characterization of target components from Gelsemium elegans by high‐performance liquid chromatography quadrupole time‐of‐flight mass spectrometry (LC‐QqTOF MS), based on the use of accurate mass databases combined with MS/MS spectra. The databases created included accurate masses and elemental compositions of 204 components from Gelsemium and their structural data. The accurate MS and MS/MS spectra were acquired in data‐dependent auto MS/MS mode, followed by extraction of the potential compounds from the LC‐QqTOF MS raw data of the sample. These were matched against the databases to search for targeted components in the sample. The structures of the detected components were tentatively characterized by manually interpreting the accurate MS/MS spectra for the first time. A total of 57 components were successfully detected and structurally characterized from the crude extracts of G. elegans, although the strategy failed to differentiate some isomers. This analytical strategy is generic and efficient, avoids isolation and purification procedures, enables a comprehensive structural characterization of target components of Gelsemium and would be widely applicable to complicated mixtures derived from Gelsemium preparations. Copyright © 2017 John Wiley & Sons, Ltd.

16.
A matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry method for rapidly enumerating hydroxyl groups in analytes is described and applied to some common polyalcohols (erythritol, mannitol and xylitol). Polyalcohols were derivatized with trimethylsilylimidazole (TMSI), either separately or as mixtures, and were analyzed without chromatographic separation or purification. The mass spectra revealed consecutive peaks that are separated by 72 m/z units as a consequence of displacement of one hydroxyl hydrogen atom by one TMS group. The number of observed peaks was used to confirm the number of hydroxyl groups in each analyte.
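The peak-counting idea in this abstract can be sketched as a search for the longest series of peaks spaced 72 m/z units apart. Everything below is an assumption made for illustration: the function name, the tolerance and the peak list are invented, and the abstract does not state how the authors pick peaks or how exactly the peak count is mapped to the number of hydroxyl groups.

```python
def longest_ladder(peaks, spacing=72.0, tol=0.5):
    """Return the longest run of peaks in which each is `spacing` above the previous."""
    peaks = sorted(peaks)
    best = []
    for start in peaks:
        chain = [start]
        while True:
            target = chain[-1] + spacing
            candidates = [p for p in peaks if abs(p - target) <= tol]
            if not candidates:
                break
            # Take the candidate closest to the expected position.
            chain.append(min(candidates, key=lambda p: abs(p - target)))
        if len(chain) > len(best):
            best = chain
    return best


# Hypothetical peak list (m/z); the values are invented, not measured data.
peaks = [100.0, 217.1, 289.1, 361.2, 433.2, 500.0]
ladder = longest_ladder(peaks)
print(len(ladder), ladder)
```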

17.
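Li Pao, Cai Wensheng, Shao Xueguang. Se Pu (色谱, Chinese Journal of Chromatography) 2017, 35(1):8-13
Chemometric algorithms provide an effective means of resolving overlapping gas chromatography-mass spectrometry (GC-MS) signals. However, they generally require the data to be processed in segments, so that only certain regions of the signal are resolved, which makes truly high-throughput analysis difficult. This paper combines moving window target transformation factor analysis (MWTTFA) with a non-negative immune algorithm (NNIA) to establish a high-throughput resolution method. First, based on the standard mass spectra of all target components that may be present, MWTTFA is used to test which components are present in the complex signal and to determine the mass spectra and elution time regions of the target components. With the resulting mass spectra as input for the subsequent calculation, NNIA is used to resolve the corresponding chromatographic information. GC-MS signals of mixed standard samples of 17 and 42 pesticides, acquired with fast temperature programs, were analyzed, and the established method yielded the chromatographic and mass spectral information of all components within 10 min.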

18.
Raman spectra of aprotic N,N-dimethylformamide (DMF) and protic N-methylformamide (NMF) mixtures containing manganese(II), nickel(II), and zinc(II) perchlorate were obtained, and the individual solvation numbers around the metal ions were determined over the whole range of solvent compositions. Variation profiles of the individual solvation numbers with solvent composition showed no significant difference among the metal systems examined. In all of these metal systems, no preferential solvation occurs in mixtures with DMF mole fraction of x(DMF) < 0.5, whereas DMF preferentially solvates the metal ions at x(DMF) > 0.5. The liquid structure of the mixtures was also studied by means of small-angle neutron scattering (SANS) and low-frequency Raman spectroscopy. SANS experiments demonstrate that DMF molecules do not appreciably self-aggregate in the mixtures over the whole range of solvent composition. Low-frequency Raman spectroscopy suggests that DMF molecules are extensively hydrogen-bonded with NMF in NMF-rich mixtures, whereas NMF molecules extensively self-aggregate in DMF-rich mixtures, although the liquid structure in neat NMF is partly ruptured. The bulk solvent structure in the mixtures thus varies with solvent composition, which plays a decisive role in developing the varying profiles of the individual solvation numbers of metal ions in the solvent mixtures.

19.
The appearance of informative signals in the mass spectra of laser-ablated bio-aerosol particles depends on the effective ionization probabilities (EIP) of individual components during the laser ionization process. This study investigates how bio-aerosol chemical composition governs the EIP values of specific components and the overall features of the spectra from bio-aerosol mass spectrometry (BAMS). EIP values were determined for a series of amino acid, dipicolinic acid, and peptide aerosol particles to determine what chemical features aid in ionization. The spectra of individual amino acids and dipicolinic acid, as well as mixtures, were examined for the extent of fragmentation and the presence of molecular ion dimers, which are indicative of ionization conditions. Standard mixtures yielded information on the significance of secondary ion plume reactions for the observed spectra. A greater understanding of how these parameters affect the EIP and spectral characteristics of bio-aerosols will aid in the intelligent selection of viable future biomarkers for the identification of bio-terrorism agents.

20.
A fast and reliable nuclear magnetic resonance (NMR) method for quantitative analysis of targeted compounds with overlapped signals in complex mixtures has been established. The method is based on the combination of chemometric treatment for spectra deconvolution and the PULCON principle (pulse length based concentration determination) for quantification. Independent component analysis (ICA), using the mutual information least dependent component analysis (MILCA) algorithm, was applied for spectra deconvolution in mixtures of up to six components with known composition. The resolved matrices (independent components, ICs, and ICA scores) were used for identifying the analytes and for calculating their relative concentrations and the absolute integral intensity of selected resonances. The absolute analyte concentrations in multicomponent mixtures and authentic samples were then calculated using the PULCON principle. Instead of the conventional use of the absolute integral intensity of undisturbed signals, the product of the absolute integral of the resolved IC and its relative concentration in the mixture was used for each component. Correction factors that are required for quantification and are unique to each analyte were also estimated. The proposed method was applied to the analysis of up to five components in lemon and orange juice samples, with recoveries between 90% and 111%. The total duration of analysis is approximately 45 min, including measurements, spectra decomposition and quantification. The results demonstrated that the proposed method is a promising tool for rapid simultaneous quantification of up to six components in the case of spectral overlap and the absence of reference materials. Copyright © 2015 John Wiley & Sons, Ltd.
