Similar Documents
20 similar documents found (search time: 390 ms)
1.
2.
Jordi Castro, Jordi Cuesta. TOP, 2013, 21(1): 25-47
The purpose of the field of statistical disclosure control is to prevent confidential information from being derived from statistical data released by, mainly, national statistical agencies. Controlled tabular adjustment (CTA) is an emerging technique for the protection of statistical tabular data. Given a table with sensitive information, CTA looks for the closest safe table. In this work we focus on CTA for three-dimensional tables using the L1 norm for the distance between the original and protected tables. Three L1-CTA models are presented, giving rise to six different primal block-angular structures of the constraint matrices. The resulting linear programming problems are solved by a specialized interior-point algorithm for this constraint structure, which solves the normal equations by a combination of Cholesky factorization and preconditioned conjugate gradients (PCG). In the past this algorithm was shown to be one of the most efficient approaches for some classes of block-angular problems. The effect of quadratic regularizations is also analyzed, showing that for three of the six primal block-angular structures the performance of PCG is guaranteed to improve. Computational results are reported for a set of large instances, which provide linear optimization problems of up to 50 million variables and 25 million constraints. The specialized interior-point algorithm is compared with the state-of-the-art barrier solver of the CPLEX 12.1 package and shown to be a more efficient choice for very large L1-CTA instances.
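
As a reference point, the following is a minimal sketch of the standard L1-CTA linear program; the symbols (adjustment variables z_i^+, z_i^-, weights w_i, protection levels lpl_i/upl_i) are generic notation, not taken from the paper:

```latex
% L1-CTA sketch: find minimal-cost adjustments z = z^+ - z^- to the cell
% values a so that the adjusted table a + z stays additive and every
% sensitive cell is moved outside its protection interval.
\begin{align*}
\min_{z^+,\, z^- \ge 0}\quad & \sum_i w_i \,(z_i^+ + z_i^-) \\
\text{s.t.}\quad & M(z^+ - z^-) = 0
  && \text{(additivity of the table is preserved)}\\
& z_i^+ \ge upl_i \ \text{ or } \ z_i^- \ge lpl_i
  && \text{for each sensitive cell } i
\end{align*}
```

Here M encodes the linear relations of the table; the "or" disjunction is typically resolved a priori (or with binary variables) so that the problem becomes a pure LP whose constraint matrix has the block-angular structure exploited by the specialized interior-point algorithm.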

3.
The increasing demand for information, coupled with the increasing capability of computer systems, has compelled information providers to reassess their procedures for preventing disclosure of confidential information. This paper considers the problem of protecting an unpublished, sensitive table by suppressing cells in related, published tables. A conventional integer programming technique for two-dimensional tables is extended to find an optimal suppression set for the public tables. This can be used to protect the confidentiality of sensitive data in three- and higher-dimensional tables. More importantly, heuristics that are intimately related to the structure of the problem are also presented to mitigate the computational difficulty of the integer program. An example is drawn from healthcare management. Data tables are randomly generated to assess the computational time and space requirements of the IP model, and to evaluate the heuristics.
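
To make the flavor of such suppression models concrete, here is a deliberately simplified sketch (not the paper's model): it enforces only the classic necessary condition that every row and column containing a suppressed sensitive cell must contain at least one additional suppression, while minimizing the total value suppressed. The toy data and the PuLP modeling choices are illustrative assumptions.

```python
# Toy complementary cell suppression sketch (hypothetical, simplified).
import pulp

values = {(0, 0): 20, (0, 1): 7, (1, 0): 5, (1, 1): 30}   # 2x2 toy table
sensitive = {(0, 1)}                                       # primary suppressions

prob = pulp.LpProblem("complementary_suppression", pulp.LpMinimize)
x = {c: pulp.LpVariable(f"x_{c[0]}_{c[1]}", cat="Binary") for c in values}

# Minimize the total value suppressed in non-sensitive cells.
prob += pulp.lpSum(values[c] * x[c] for c in values if c not in sensitive)

for (i, j) in sensitive:
    prob += x[(i, j)] == 1                      # the sensitive cell is suppressed
    row = [c for c in values if c[0] == i]
    col = [c for c in values if c[1] == j]
    prob += pulp.lpSum(x[c] for c in row) >= 2  # a complementary cell in row i
    prob += pulp.lpSum(x[c] for c in col) >= 2  # a complementary cell in col j

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print({c: int(x[c].value()) for c in values})
```

A real model must additionally verify, via attacker subproblems, that the suppressed values cannot be inferred too tightly from the published marginals, which is exactly what makes the full integer program hard.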

4.
One of the main services of National Statistical Agencies (NSAs) in the current Information Society is the dissemination of large amounts of tabular data, obtained from microdata by crossing one or more categorical variables. NSAs must guarantee that no confidential individual information can be obtained from the released tabular data. Several statistical disclosure control methods are available for this purpose. These methods result in large linear, mixed integer linear, or quadratic mixed integer linear optimization problems. This paper reviews some of the existing approaches, with an emphasis on two of them: the cell suppression problem (CSP) and controlled tabular adjustment (CTA). CSP and CTA have concentrated most of the recent research in the tabular data protection field. The particular focus of this work is on methods and results of practical interest for end-users (mostly NSAs). Therefore, in addition to the resulting optimization models and solution approaches, computational results comparing the main optimization techniques - both optimal and heuristic - using real-world instances are also presented.

5.
Models and algorithms for a staff scheduling problem   Total citations: 1 (self-citations: 0, cited by others: 1)
We present mathematical models and solution algorithms for a family of staff scheduling problems arising in real-life applications. In these problems, the daily assignments to be performed are given and the durations (in days) of the working and rest periods for each employee in the planning horizon are specified in advance, whereas the sequence in which these working and rest periods occur, as well as the daily assignment for each working period, have to be determined. The main objective is the minimization of the number of employees needed to perform all daily assignments in the horizon. We decompose the problem into two steps: the definition of the sequence of working and rest periods (called the pattern) for each employee, and the definition of the daily assignment to be performed in each working period by each employee. The first step is formulated as a covering problem, for which we present alternative ILP models and exact enumerative algorithms based on these models. Practical experience shows that the best approach is based on the model in which variables are associated with feasible patterns and generated either by dynamic programming or by solving another ILP. The second step is stated as a feasibility problem solved heuristically through a sequence of transportation problems. Although in general this procedure may not find a solution (even if one exists), we present sufficient conditions under which our approach is guaranteed to succeed. We also propose an iterative heuristic algorithm to handle the case in which no feasible solution is found in the second step. We present computational results on real-life instances associated with an emergency call center. The proposed approach is able to determine the optimal solution of instances involving up to several hundred employees and a planning horizon of up to 6 months. Mathematics Subject Classification (2000): 90B70, 90C10, 90C27, 90C39, 90C57, 90C59
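
For orientation, a minimal sketch of the pattern-based covering model that the first step suggests (the notation is generic, not the authors'): let P be the set of feasible working/rest patterns, d_t the number of employees required on day t, and y_p the number of employees assigned pattern p.

```latex
\begin{align*}
\min\quad & \sum_{p \in P} y_p
  && \text{(number of employees used)}\\
\text{s.t.}\quad & \sum_{p \in P:\ t \in p} y_p \ \ge\ d_t
  && \text{for every day } t \text{ of the horizon}\\
& y_p \in \mathbb{Z}_{\ge 0}
  && \text{for every pattern } p \in P
\end{align*}
```

where "t ∈ p" means that day t is a working day of pattern p; since P is exponentially large, the pattern variables are generated on the fly, e.g. by dynamic programming, as the abstract describes.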

6.
An iterative scheme based on a dynamic fixation of the variables is developed to solve the 0-1 multidimensional knapsack problem. Such a scheme has the advantage of generating memory information, which is used on the one hand to choose the variables to fix, either permanently or temporarily, and on the other hand to construct feasible solutions of the problem. Adaptations of this mechanism are proposed to explore different parts of the search space and to enhance the behaviour of the algorithm. Encouraging results are reported on the correlated instances of the 0-1 multidimensional knapsack problem.
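
The following Python sketch conveys the flavor of such a scheme under stated assumptions: the greedy profit/weight ratio, the frequency-based memory, and the fixing rule are illustrative choices, not the authors' exact mechanism.

```python
import random

def greedy_mkp(profits, weights, capacity, fixed, rng):
    """Randomized greedy packing for max sum(p_j * x_j) under m knapsack
    constraints; weights[i][j] is item j's weight in constraint i, and
    `fixed` maps a variable index to a forced value (0 or 1)."""
    n, m = len(profits), len(capacity)
    used, x = [0] * m, [0] * n

    def try_take(j):
        if all(used[i] + weights[i][j] <= capacity[i] for i in range(m)):
            x[j] = 1
            for i in range(m):
                used[i] += weights[i][j]

    for j in range(n):                       # variables fixed to 1 go first
        if fixed.get(j) == 1:
            try_take(j)
    ratio = lambda j: profits[j] * rng.uniform(0.9, 1.1) / (1 + sum(w[j] for w in weights))
    for j in sorted((j for j in range(n) if j not in fixed), key=ratio, reverse=True):
        try_take(j)
    return x

def iterative_fixing(profits, weights, capacity, rounds=50, share=0.2, seed=0):
    rng = random.Random(seed)
    n = len(profits)
    freq = [0] * n                           # memory: how often each var was 1
    best, best_val, fixed = None, -1, {}
    for _ in range(rounds):
        x = greedy_mkp(profits, weights, capacity, fixed, rng)
        val = sum(p * xi for p, xi in zip(profits, x))
        if val > best_val:
            best, best_val = x[:], val
        for j in range(n):
            freq[j] += x[j]
        ranked = sorted(range(n), key=lambda j: freq[j])
        k = int(share * n)                   # temporarily fix the extremes
        fixed = {j: 0 for j in ranked[:k]}
        fixed.update({j: 1 for j in ranked[-k:]})
    return best, best_val
```

Variables that are almost always (or almost never) selected across restarts are temporarily pinned, shrinking the search space explored by the next rounds.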

7.
In this paper, we present a new formulation for the local access network expansion problem. Previously, we showed that this problem can be seen as an extension of the well-known Capacitated Minimum Spanning Tree Problem and presented and tested two flow-based models. By including additional information in the definition of the variables, we propose a new flow-based model that permits us to effectively use variable elimination tests as well as coefficient reduction on some of the constraints. We present computational results for instances with up to 500 nodes in order to show the advantages of the new model in comparison with the others.

8.
The problem of rounding statistical tables to protect confidentiality is an important problem in the area of data publication, especially for official statistics. Controlled rounding involves rounding the table data to a prespecified base while ensuring additivity to totals. Previous research provided a formulation of the controlled rounding problem for a simple two-way table as a transportation problem. This paper extends that work to tables with subtotals by using a capacitated transshipment formulation. It is shown that some forms of tables with subtotals always have a controlled rounding solution. Other table structures cannot be guaranteed such a solution under zero-restrictedness. Initial computational experience suggests that the method is viable for use in practical situations.
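
In generic notation (not the paper's), zero-restricted controlled rounding to base b constrains each internal cell and each total to one of its two adjacent multiples of b while keeping the table additive:

```latex
\begin{align*}
& x_{ij} \in \bigl\{\, b\lfloor a_{ij}/b \rfloor,\ b\lceil a_{ij}/b \rceil \,\bigr\}
  && \text{(each cell moves to an adjacent multiple of } b\text{)}\\
& \textstyle\sum_j x_{ij} = x_{i\cdot}, \qquad \sum_i x_{ij} = x_{\cdot j}
  && \text{(rounded rows and columns stay additive)}
\end{align*}
```

For a simple two-way table, the binary choice between the two candidate multiples can be encoded as a transportation problem on the rows and columns; the paper's capacitated transshipment formulation extends this network view to tables with subtotals.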

9.
To obtain full cooperation from respondents, statistical offices must guarantee that confidential data will not be disclosed when their reports are published. For tabular data, cell suppression is one of the preferred techniques for statistical disclosure control. When suppressing only confidential values does not guarantee the desired data protection, it is also necessary to suppress the values in some non-confidential cells. The problem of finding an optimal set of complementary suppressions - the cell suppression problem (CSP) - is NP-hard. We present a three-phase algorithm for the CSP based on a binary relaxation derived from row and column protection conditions. To enforce violated single-cell conditions, integer cuts are added to the CSP relaxation. The numerical results obtained on 1410 instances with up to more than 250,000 cells, generated to reproduce two classes of real-world data, indicate that the algorithm is quite effective for both classes of instances and that it outperforms state-of-the-art algorithms for one of them.
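
The protection conditions behind such relaxations can be stated generically (the notation below is standard for CSP, not necessarily the paper's): given a suppression pattern S, an attacker who knows the published values and the table equations Mx = b can compute, for a sensitive cell s,

```latex
\begin{align*}
\overline{a}_s &= \max\ x_s \quad \text{s.t. } Mx = b,\ \ x_i = a_i \ (i \notin S),\ \ x \ge 0,\\
\underline{a}_s &= \min\ x_s \quad \text{s.t. } Mx = b,\ \ x_i = a_i \ (i \notin S),\ \ x \ge 0,
\end{align*}
```

and the pattern is safe only if \overline{a}_s \ge a_s + upl_s and \underline{a}_s \le a_s - lpl_s for given upper and lower protection levels; the row and column conditions used by the three-phase algorithm relax these attacker subproblems.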

10.
This paper deals with a central question of structural optimization, formulated as the problem of finding the stiffest structure that can be made when both the distribution of material and the material itself can be freely varied. We consider a general multi-load formulation and include the possibility of unilateral contact. The emphasis of the presentation is on numerical procedures for this type of problem, and we show that the problems after discretization can be rewritten as mathematical programming problems of special form. We propose iterative optimization algorithms based on penalty-barrier methods and interior-point methods, and show a broad range of numerical examples that demonstrate the efficiency of our approach. Supported by the project 03ZO7BAY of BMBF (Germany) and the GIF-contract 10455-214.06/95.

11.
Second-order cone programming (SOCP) problems are typically solved by interior point methods. As in linear programming (LP), interior point methods can, in theory, solve SOCPs in polynomial time and can, in practice, exploit sparsity in the problem data. Specifically, when cones of large dimension are present, the density that results in the normal equations solved at each iteration can be remedied in a manner similar to the treatment of dense columns in an LP. Here we propose a product-form Cholesky factorization (PFCF) approach, and show that it is more numerically stable than the alternative Sherman-Morrison-Woodbury approach. We derive several PFCF variants and compare their theoretical performance. Finally, we prove that the elements of L in the Cholesky factorizations LDL^T that arise in interior point methods for SOCP are uniformly bounded as the duality gap tends to zero, as long as the iterates remain in some conic neighborhood of the central path. Mathematics Subject Classification (1991): 90C25, 90C51, 15A23. Research supported in part by NSF Grants CDA 97-26385, DMS 01-04282, ONR Grant N000140310514 and DOE Grant GE-FG01-92ER-25126.
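
A sketch of the rank-one case of the product-form idea (generic notation; the paper's PFCF variants handle the higher-rank updates arising from large cones analogously): if the sparse part of the normal-equations matrix is already factored as M = LDL^T, a dense rank-one term can be absorbed without refactoring, since solving Lp = v gives

```latex
\begin{align*}
M + vv^T \;=\; L\,(D + pp^T)\,L^T, \qquad Lp = v,
\end{align*}
```

and D + pp^T, being diagonal-plus-rank-one, admits its own cheap factorization \tilde{L}\tilde{D}\tilde{L}^T, so the update is kept in product form (L\tilde{L})\,\tilde{D}\,(L\tilde{L})^T rather than folded into an explicit inverse as in the Sherman-Morrison-Woodbury approach.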

12.
We take a regression-based approach to the problem of induction, which is the problem of inferring general rules from specific instances. Whereas traditional regression analysis fits a numerical formula to data, we fit a logical formula to boolean data. We can, for instance, construct an expert system by fitting rules to an expert's observed behavior. A regression-based approach has the advantage of providing tests of statistical significance as well as other tools of regression analysis. Our approach can be extended to nonboolean discrete data, and we argue that it is better suited to rule construction than logit and other types of categorical data analysis. We find maximum likelihood and Bayesian estimates of a best-fitting boolean function or formula and show that Bayesian estimates are more appropriate. We also derive confidence and significance levels. We show that finding the best-fitting logical formula is a pseudo-boolean optimization problem, and finding the best-fitting monotone function is a network flow problem. The first and second authors gratefully acknowledge the partial support of NSF (Grant DMS 89-06870) and AFOSR (Grants 89-0512 and 90-0008), and the third author that of AFOSR (Grant 91-0287) and ONR (Grant N00014-92-J-1028).
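
A tiny, self-contained illustration of what "fitting a logical formula to boolean data" means - here an exhaustive, error-minimizing fit of a single conjunction, a toy stand-in for the paper's maximum likelihood and Bayesian estimators:

```python
from itertools import product

def fit_conjunction(X, y):
    """Exhaustively fit the best AND-of-literals rule to boolean data.

    Each feature may be required to be 1, required to be 0, or ignored
    (3^n candidate rules).  Returns the rule minimizing classification
    error -- illustrative only, not the paper's estimator.
    """
    n = len(X[0])
    best_rule, best_err = None, len(y) + 1
    for rule in product((1, 0, None), repeat=n):   # required value per feature
        preds = [all(r is None or xi == r for xi, r in zip(x, rule)) for x in X]
        err = sum(p != bool(t) for p, t in zip(preds, y))
        if err < best_err:
            best_rule, best_err = rule, err
    return best_rule, best_err

X = [(1, 0, 1), (1, 1, 1), (0, 1, 0), (1, 1, 0)]
y = [1, 1, 0, 0]
print(fit_conjunction(X, y))   # a zero-error rule: require feature 3 == 1
```

Replacing the error count with a likelihood (or posterior) score turns this enumeration into the kind of pseudo-boolean optimization the abstract refers to.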

13.
We present a method for finding exact solutions of Max-Cut, the problem of finding a cut of maximum weight in a weighted graph. We use a branch-and-bound setting that applies a dynamic version of the bundle method as the bounding procedure. This approach uses Lagrangian duality to obtain a “nearly optimal” solution of the basic semidefinite Max-Cut relaxation, strengthened by triangle inequalities. The expensive part of our bounding procedure is solving the basic semidefinite relaxation of the Max-Cut problem, which has to be done several times during the bounding process. We review other solution approaches and compare the numerical results with our method. We also extend our experiments to instances of unconstrained quadratic 0-1 optimization and to instances of the graph equipartition problem. The experiments show that our method nearly always outperforms all other approaches. In particular, for dense graphs, where linear programming-based methods fail, our method performs very well. Exact solutions are obtained in a reasonable time for any instance of size up to n = 100, independent of the density. For some problems of special structure we can solve even larger problem classes. We could prove optimality for several problems from the literature where, to the best of our knowledge, no other method is able to do so. Supported in part by the EU project Algorithmic Discrete Optimization (ADONET), MRTN-CT-2003-504438.
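
For reference, the basic semidefinite Max-Cut relaxation strengthened by triangle inequalities takes the standard form (L is the weighted graph Laplacian, e the all-ones vector):

```latex
\begin{align*}
\max\quad & \tfrac{1}{4}\,\langle L,\, X\rangle \\
\text{s.t.}\quad & \operatorname{diag}(X) = e, \qquad X \succeq 0,\\
& X_{ij} + X_{ik} + X_{jk} \ge -1, \quad
  X_{ij} - X_{ik} - X_{jk} \ge -1 \qquad \forall\, i<j<k,
\end{align*}
```

where a cut corresponds to the rank-one case X = yy^T with y ∈ {-1, 1}^n; dualizing the (cubically many) triangle inequalities with a bundle method, rather than feeding them all to an SDP solver, is what keeps the bound computable inside branch-and-bound.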

14.
Graphical models are widely used to describe conditional dependence relationships among interacting random variables. Among the statistical inference problems for graphical models, one of particular interest is utilizing the interaction structure to reduce model complexity. As an important approach to utilizing structural information, decomposition allows a statistical inference problem to be divided into sub-problems of lower complexity. In this paper, to investigate the decomposition of covariate-dependent graphical models, we propose some useful definitions of decomposition of covariate-dependent graphical models with categorical data in the form of contingency tables. Based on such a decomposition, a covariate-dependent graphical model can be split into sub-models, and the maximum likelihood estimation of this model can be factorized into the maximum likelihood estimations of the sub-models. Moreover, some necessary and sufficient conditions for the proposed definitions of decomposition are studied.

15.
Ariyawansa and Felt (2001, 2004) have recently created a test problem collection for testing software for stochastic linear programs. This freely available, web-based collection was originally created with 35 problem instances from 11 problem families representing a variety of application areas. The collection was created with plans for enriching it with problem instances based on different application areas from the research community. The work of Martel and Al-Nuaimi (1973) on manpower planning under uncertain demand represents an application area suitable for creating new problem instances to be added to the collection. The purpose of this paper is to describe the construction of a new family of stochastic programming test problems based on the work of Martel and Al-Nuaimi (1973). As part of our construction, we review the work of Martel and Al-Nuaimi (1973), leading to an extension of their models to which their solution procedure does not apply. The new test problems are based on this extension. We also present solutions to the test problems obtained using the software package CPA (2002) for stochastic programming developed by Ariyawansa, Felt and Sarich. Mathematics Subject Classifications (2000): 90C15, 90C90, 65K05. K. A. Ariyawansa: The work of this author was supported in part by the U.S. Army Research Office under Grant DAAD 19-00-1-0465.

16.
This work deals with the set cover with pairs problem (SCPP), a generalization of the set cover problem (SCP). In the SCPP, the elements have to be covered by specific pairs of objects instead of by a single object. We propose a new mathematical formulation using extended variables that is capable of consistently solving instances with up to 500 elements and 500 objects. We also develop an ILS heuristic that finds better solutions for several of the tested instances in less computational time.
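
A generic pair-variable formulation conveys the idea (this is standard SCPP notation, not necessarily the paper's extended model): x_i selects object i, y_{ij} activates the pair {i, j}, and P_e is the set of pairs that cover element e.

```latex
\begin{align*}
\min\quad & \sum_i c_i\, x_i \\
\text{s.t.}\quad & \sum_{\{i,j\} \in P_e} y_{ij} \ \ge\ 1
  && \text{every element is covered by some pair}\\
& y_{ij} \le x_i, \quad y_{ij} \le x_j
  && \text{a pair is active only if both objects are chosen}\\
& x_i,\ y_{ij} \in \{0,1\}
\end{align*}
```

The requirement that both objects of a covering pair be purchased is what distinguishes SCPP from plain set cover and makes its LP relaxation weaker.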

17.
Many significant advances have been made in recent years in solving unconstrained binary quadratic programs (UQP). As a result, the size of problem instances that can be efficiently solved has grown from a hundred or so variables a few years ago to 2000 or 3000 variables today. These advances have motivated new applications of the model which, in turn, have created the need to solve even larger problems. In response to this need, we introduce several new “one-pass” heuristics for solving very large versions of this problem. Our computational experience on problems of up to 9000 variables indicates that these methods are both efficient and effective for very large problems. The significance of problems of this size is not only that they open the door to solving a much wider array of real-world problems, but also that the standard linear mixed integer formulations of the nonlinear models involve over 40,000,000 variables and three times that many constraints. Our approaches can be used as stand-alone solution methods, or they can serve as procedures for quickly generating high-quality starting points for other, more sophisticated methods.
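
A minimal sketch of the one-pass idea under stated assumptions (the visiting order and acceptance rule are illustrative, not the authors' heuristics): each variable is examined exactly once and set to 1 when that improves the objective given the variables already decided.

```python
import numpy as np

def one_pass_uqp(Q, order=None):
    """One pass over the variables of max x^T Q x, x in {0,1}^n (Q symmetric).

    x[j] starts at 0 and is set to 1 iff the marginal gain
    Q[j, j] + 2 * sum_i Q[j, i] * x[i] is positive when j is visited.
    """
    n = Q.shape[0]
    x = np.zeros(n)
    for j in (order if order is not None else range(n)):
        gain = Q[j, j] + 2 * (Q[j, :] @ x)   # x[j] == 0 here, so no self term
        if gain > 0:
            x[j] = 1
    return x

rng = np.random.default_rng(0)
Q = rng.integers(-5, 6, size=(200, 200)).astype(float)
Q = (Q + Q.T) / 2                            # symmetrize the quadratic form
x = one_pass_uqp(Q)
print("objective:", x @ Q @ x)
```

A pass costs O(n^2), cheap enough to repeat with many variable orders and keep the best result, or to generate starting points for more sophisticated methods, as the abstract suggests.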

18.
It is known that Polyhedral Feasibility Problems can be solved via interior-point methods whose real number complexity is polynomial in the dimension of the problem and the logarithm of a condition number of the problem instance. A limitation of these results is that they do not apply to ill-posed instances, for which the condition number is infinite. We propose an algorithm for solving polyhedral feasibility problems in homogeneous form that is applicable to all problem instances, and whose real number complexity is polynomial in the dimension of the problem instance and in the logarithm of an “extended condition number” that is always finite.

19.
This paper investigates the construction of an automatic algorithm selection tool for the multi-mode resource-constrained project scheduling problem (MRCPSP). The research relies on the notion of empirical hardness models, which map problem instance features onto the performance of an algorithm. Using such models, the performance of a set of algorithms can be predicted. Based on these predictions, one can automatically select the algorithm that is expected to perform best given the available computing resources. The idea is to combine different algorithms into a super-algorithm that performs better than any of its components individually. We apply this strategy to the classic problem of project scheduling with multiple execution modes, and show that we can significantly improve on the performance of state-of-the-art algorithms when evaluated on a set of unseen instances. This matters when many instances have to be solved consecutively: many state-of-the-art algorithms perform very well on a majority of benchmark instances while performing worse on a smaller set, and one algorithm's performance can vary greatly across a set of instances on which another algorithm's performance does not vary at all. Knowing in advance, without spending scarce computational resources, which algorithm to run on a given problem instance can therefore significantly improve overall performance.
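
A compact sketch of the empirical-hardness-model pipeline (generic: the feature set, the regressor choice, and the synthetic data below are illustrative assumptions, not the paper's setup): one regressor per algorithm maps instance features to expected performance, and at run time the predicted-best algorithm is selected.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_train, n_feat = 200, 6
features = rng.random((n_train, n_feat))          # instance features
runtimes = {                                      # observed runtimes (fake data)
    "algo_A": features @ rng.random(n_feat) + 0.1 * rng.random(n_train),
    "algo_B": features @ rng.random(n_feat) + 0.1 * rng.random(n_train),
}

# One empirical hardness model per algorithm: features -> predicted runtime.
models = {name: RandomForestRegressor(n_estimators=50, random_state=0).fit(features, t)
          for name, t in runtimes.items()}

def select_algorithm(instance_features):
    preds = {name: m.predict(instance_features.reshape(1, -1))[0]
             for name, m in models.items()}
    return min(preds, key=preds.get)              # lower predicted runtime wins

print(select_algorithm(rng.random(n_feat)))
```

The selection itself costs only a few model evaluations, which is why the combined "super-algorithm" can beat each component without spending extra solving time.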

20.
A common way to produce a convex relaxation of a Mixed Integer Quadratically Constrained Program (MIQCP) is to lift the problem into a higher-dimensional space by introducing variables Y_ij to represent each of the products x_i x_j of variables appearing in a quadratic form. One advantage of such extended relaxations is that they can be efficiently strengthened by using the (convex) SDP constraint Y - xx^T ⪰ 0 and disjunctive programming. On the other hand, the main drawback of such an extended formulation is its huge size, even for problems for which the number of x_i variables is moderate. In this paper, we study methods to build low-dimensional relaxations of MIQCP that capture the strength of the extended formulations. To do so, we use projection techniques pioneered in the context of the lift-and-project methodology. We show how the extended formulation can be algorithmically projected to the original space by solving linear programs. Furthermore, we extend the technique to project the SDP relaxation by solving SDPs. In the case of an MIQCP with a single quadratic constraint, we propose a subgradient-based heuristic to efficiently solve these SDPs. We also propose a new eigen-reformulation for MIQCP, and a cut generation technique to strengthen this reformulation using polarity. We present extensive computational results to illustrate the efficiency of the proposed techniques. Our computational results have two highlights. First, on the GLOBALLib instances, we are able to generate relaxations that are almost as strong as those proposed in our companion paper, even though our computing times are about 100 times smaller on average. Second, on box-QP instances, the strengthened relaxations generated by our code are almost as strong as the well-studied SDP+RLT relaxations and can be solved in less than 2 s, even for large instances with 100 variables; the SDP+RLT relaxations for the same set of instances can take up to a couple of hours to solve using a state-of-the-art SDP solver.
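
The lifting and its SDP strengthening can be written compactly: relaxing the nonconvex condition Y = xx^T to Y ⪰ xx^T is, via a Schur complement, the linear matrix inequality

```latex
\begin{align*}
Y \succeq xx^T
\quad\Longleftrightarrow\quad
\begin{pmatrix} 1 & x^T \\ x & Y \end{pmatrix} \succeq 0,
\end{align*}
```

so each quadratic constraint x^T A x + b^T x \le c becomes the linear constraint \langle A, Y\rangle + b^T x \le c in the lifted (x, Y) space; the projection techniques studied in the paper recover the strength of this lifted set in the original x space instead of optimizing over the roughly n^2 additional Y variables.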
