Given a sample of binary random vectors with i.i.d. Bernoulli(p) components, each equal to 1 (resp. 0) with probability p (resp. 1−p), we first establish a formula for the mean size of the random Galois lattice built from this sample, and a more complex one for its variance. Then, noting that closed α-frequent itemsets are in bijection with closed α-winning coalitions, we establish similar formulas for the mean and the variance of the number of closed α-frequent itemsets. These results may be useful for studying the complexity of data mining problems such as association rule mining, sequential pattern mining and classification.
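The quantities in this abstract can be explored numerically. The following is a minimal sketch, not the paper's method: it estimates the mean lattice size by Monte Carlo simulation instead of using the closed-form formula, which the abstract does not state. All function names are hypothetical. It also assumes that the lattice size equals the number of distinct intersections of sample rows (counting the full item set as the top element), and that an itemset is "frequent" when at least `min_support` rows contain it.

```python
import random


def random_context(n_rows, n_cols, p, rng):
    """Draw n_rows binary vectors with i.i.d. Bernoulli(p) components, stored as bitmasks."""
    return [
        sum(1 << j for j in range(n_cols) if rng.random() < p)
        for _ in range(n_rows)
    ]


def closed_itemsets(rows, n_cols):
    """Return every closed itemset of the sample (assumed to be the lattice elements)."""
    full = (1 << n_cols) - 1  # top element: the intersection over an empty set of rows
    intents = {full}
    for row in rows:
        # A new row contributes its intersection with every closed set found so far.
        intents |= {row & s for s in intents}
    return intents


def support(itemset, rows):
    """Number of rows that contain every item of the itemset."""
    return sum(1 for row in rows if row & itemset == itemset)


def frequent_closed_itemsets(rows, n_cols, min_support):
    """Closed itemsets contained in at least min_support rows (assumed frequency rule)."""
    return {s for s in closed_itemsets(rows, n_cols) if support(s, rows) >= min_support}


def mean_lattice_size(n_rows, n_cols, p, trials, seed=0):
    """Monte Carlo estimate of the expected number of closed itemsets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += len(closed_itemsets(random_context(n_rows, n_cols, p, rng), n_cols))
    return total / trials
```

For the three-row sample `[0b011, 0b110, 0b101]` over three items, every subset of items is an intersection of rows, so `closed_itemsets` returns all 8 itemsets, of which 4 have support at least 2. Calling `mean_lattice_size(20, 8, 0.5, trials=200)` gives a simulated estimate that could be compared against the paper's formula for the mean.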
Widely publicized reports of fresh MBAs getting multiple job offers with six-figure annual salaries leave a long-lasting general impression about the high quality of selected business schools. While such spectacular achievement in job placement rightly deserves recognition, one should not lose sight of the resources expended in order to accomplish this result. In this study, we employ a measure of Pareto-Koopmans global efficiency to evaluate the efficiency levels of the MBA programs in Business Week’s top-rated list. We compute input- and output-oriented radial and non-radial efficiency measures for comparison. Among the three tier groups, schools from a higher tier are on average more efficient than those from lower tiers, although efficiency levels vary within each tier, and this variation persists across the different efficiency measures.
One of the typical findings in the financial literature is that the market tends to be overly pessimistic about value stocks, many of which are past losers. Over-reactions might therefore be captured by measuring how earnings surprises vary with past return levels. In this paper, we propose a new index for an effective investment strategy that captures the return-reversal effect, using both Data Envelopment Analysis (DEA) and Inverted DEA in order to take these market characteristics into account. Our investment strategy using the new index performs better than the naive return-reversal strategy that uses only past returns or earnings surprise. In addition, the correlations between our new index and commonly used value indices are insignificant, and the value indices cannot perfectly represent over-valued (under-valued) situations. Hence, by considering both the proposed index and value indices such as the book-to-price ratio, we could select value stocks more effectively than by using only one of these indices.
We report on ideas, problems and results, which occupied us during the past decade and which seem to extend the frontiers of information theory in several directions. The main contributions concern information transfer by channels. There are also new questions and some answers in new models of source coding. While many of our investigations are in an explorative state, there are also hard cores of mathematical theories. In particular we present a unified theory of information transfer, which naturally incorporates Shannon's theory of information transmission and the theory of identification in the presence of noise as extremal cases. It provides several novel coding theorems. On the source coding side we introduce data compression for identification. Finally we are led beyond information theory to new concepts of solutions for probabilistic algorithms.
The original paper [R. Ahlswede, General theory of information transfer, Preprint 97-118, SFB 343 Diskrete Strukturen in der Mathematik, Universität Bielefeld, 1997] both gave essential stimulus to the ZIF project and received it in return, which resulted in contributions added as the GTIT-Supplements “Search and channels with feedback” and “Noiseless coding for multiple purposes: a combinatorial model”.
Other contributions—also to areas initiated—are published in the recent book [R. Ahlswede et al. (Eds.), General Theory of Information Transfer and Combinatorics, Lecture Notes in Computer Science, vol. 4123, Springer, Berlin, 2006].
Readers are advised always to study the pioneering papers in a field, in this case the papers [R. Ahlswede, G. Dueck, Identification via channels, IEEE Trans. Inform. Theory 35 (1989) 15–29; R. Ahlswede, G. Dueck, Identification in the presence of feedback: a discovery of new capacity formulas, IEEE Trans. Inform. Theory 35 (1989) 30–39] on identification. This is not only the most rewarding way to arrive at new ideas, but it also helps one grasp the more advanced formalisms more quickly without going through too many technicalities. Perhaps the recent Shannon Lecture [R. Ahlswede, Towards a General Theory of Information Transfer, Shannon Lecture at ISIT in Seattle 13th July 2006, IEEE Information Theory Society Newsletter, 2007], aiming at an even wider scope, also gives further impetus.
The efficiency of decision processes which can be divided into two stages has been measured for the whole process as well as for each stage independently by using the conventional data envelopment analysis (DEA) methodology in order to identify the causes of inefficiency. This paper modifies the conventional DEA model by taking into account the series relationship of the two sub-processes within the whole process. Under this framework, the efficiency of the whole process can be decomposed into the product of the efficiencies of the two sub-processes. In addition to this sound mathematical property, the case of Taiwanese non-life insurance companies shows that some unusual results which have appeared in the independent model do not exist in the relational model. In other words, the relational model developed in this paper is more reliable in measuring the efficiencies and consequently is capable of identifying the causes of inefficiency more accurately. Based on the structure of the model, the idea of efficiency decomposition can be extended to systems composed of multiple stages connected in series.
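The product decomposition described in this abstract can be illustrated with a short worked equation. This is a sketch under assumed notation, not taken from the abstract itself: for decision unit k, let X denote the inputs of the first stage, Z the intermediate products passed from the first stage to the second, and Y the final outputs, with nonnegative multipliers v, w and u. The key assumption is that both stages value the intermediate products with the same multipliers w.

```latex
\[
E_k^{(1)} = \frac{\sum_{p} w_p Z_{pk}}{\sum_{i} v_i X_{ik}}, \qquad
E_k^{(2)} = \frac{\sum_{r} u_r Y_{rk}}{\sum_{p} w_p Z_{pk}}, \qquad
E_k = \frac{\sum_{r} u_r Y_{rk}}{\sum_{i} v_i X_{ik}}
\]
\[
E_k^{(1)} \times E_k^{(2)}
= \frac{\sum_{p} w_p Z_{pk}}{\sum_{i} v_i X_{ik}}
  \cdot \frac{\sum_{r} u_r Y_{rk}}{\sum_{p} w_p Z_{pk}}
= \frac{\sum_{r} u_r Y_{rk}}{\sum_{i} v_i X_{ik}}
= E_k
\]
```

Under these assumptions the weighted intermediate term cancels, so a unit with first-stage efficiency 0.8 and second-stage efficiency 0.5 has overall efficiency 0.4. If the two stages were evaluated independently, each with its own multipliers for Z, this cancellation would not be guaranteed.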
The possibility offered by chemometrics to extract and combine (fuse) information contained in NIR and MIR spectra in order to discriminate monovarietal extra virgin olive oils according to olive cultivar (Casaliva, Leccino, Frantoio) has been investigated. Linear discriminant analysis (LDA) was applied as a classification technique to these multivariate and non-specific spectral data, both separately and jointly (NIR and MIR data together). In order to ensure a more appropriate ratio between the number of objects (samples) and the number of variables (absorbance at different wavenumbers), LDA was preceded either by feature selection or by variable compression. For feature selection, the SELECT algorithm was used, while a wavelet transform was applied for data compression. Correct classification rates obtained by cross-validation varied between 60% and 90% depending on the procedure followed. The most accurate results were obtained using the fused NIR and MIR data, with either feature selection or data compression. Chemometric strategies applied to fused NIR and MIR spectra represent an effective method for classifying extra virgin olive oils on the basis of olive cultivar.