首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
One of the main services of National Statistical Agencies (NSAs) for the current Information Society is the dissemination of large amounts of tabular data, which is obtained from microdata by crossing one or more categorical variables. NSAs must guarantee that no confidential individual information can be obtained from the released tabular data. Several statistical disclosure control methods are available for this purpose. These methods result in large linear, mixed integer linear, or quadratic mixed integer linear optimization problems. This paper reviews some of the existing approaches, with an emphasis on two of them: cell suppression problem (CSP) and controlled tabular adjustment (CTA). CSP and CTA have concentrated most of the recent research in the tabular data protection field. The particular focus of this work is on methods and results of practical interest for end-users (mostly, NSAs). Therefore, in addition to the resulting optimization models and solution approaches, computational results comparing the main optimization techniques - both optimal and heuristic - using real-world instances are also presented.  相似文献   

2.
Large gaps in one-dimensional cutting stock problems   总被引:1,自引:0,他引:1  
Its linear relaxation is often solved instead of the one-dimensional cutting stock problem (1CSP). This causes a difference between the optimal objective function values of the original problem and its relaxation, called a gap. The size of this gap is considered in this paper with the aim to formulate principles for the construction of instances of the 1CSP with large gaps. These principles are complemented by examples for such instances.  相似文献   

3.
Jordi Castro  Jordi Cuesta 《TOP》2013,21(1):25-47
The purpose of the field of statistical disclosure control is to avoid that confidential information could be derived from statistical data released by, mainly, national statistical agencies. Controlled tabular adjustment (CTA) is an emerging technique for the protection of statistical tabular data. Given a table with sensitive information, CTA looks for the closest safe table. In this work we focus on CTA for three-dimensional tables using the L 1 norm for the distance between the original and protected tables. Three L 1-CTA models are presented, giving rise to six different primal block-angular structures of the constraint matrices. The resulting linear programming problems are solved by a specialized interior-point algorithm for this constraints structure which solves the normal equations by a combination of Cholesky factorization and preconditioned conjugate gradients (PCG). In the past this algorithm was shown to be one of the most efficient approaches for some classes of block-angular problems. The effect of quadratic regularizations is also analyzed, showing that for three of the six primal block-angular structures the performance of PCG is guaranteed to improve. Computational results are reported for a set of large instances, which provide linear optimization problems of up to 50 million variables and 25 million constraints. The specialized interior-point algorithm is compared with the state-of-the-art barrier solver of the CPLEX 12.1 package, showing to be a more efficient choice for very large L 1-CTA instances.  相似文献   

4.
Rounding methods are common techniques in many statistical offices to protect sensitive information when publishing data in tabular form. Classical versions of these methods do not consider protection levels while searching patterns with minimum information loss, and therefore typically the so-called auditing phase is required to check the protection of the proposed patterns. This paper presents a mathematical model for the whole problem of finding a protected pattern with minimum loss of information, and proposes a branch-and-cut algorithm to solve it. It also describes a new methodology closely related to the classical Controlled Rounding methods but with several advantages. The new methodology is named Cell Perturbation and leads to a different optimization problem which is simpler to solve than the previous problem. This paper presents a cutting-plane algorithm for finding an exact solution of the new problem, which is a pattern guaranteeing the same protection level requirements but with smaller loss of information when compared with the classical Controlled Rounding optimal patterns. The auditing phase is unnecessary on the solutions generated by the two algorithms. The paper concludes with computational results on real-world instances and discusses a modification in the objective function to guarantee statistical properties in the solutions. Received: April, 2004  相似文献   

5.
The problem of data privacy is to verify that confidential information stored in an information system is not provided to unauthorized users and, therefore, personal and other sensitive data remain private. One way to guarantee this is to distort a knowledge base such that it does not reveal sensitive information. In the present paper we will give a universal definition of the problem of knowledge base distortion. It is universal in the sense that is independent of any particular knowledge representation formalism. We will then present a basic and general algorithm for knowledge base distortion to guarantee data privacy. This algorithm provides us with upper bounds for the complexity of the distortion problem. Moreover, we examine heuristics to improve its average performance.  相似文献   

6.
The safe dissemination of statistical tabular data is one of the main concerns of National Statistical Institutes (NSIs). Although each cell of the tables is made up of the aggregated information of several individuals, the statistical confidentiality can be violated. NSIs must guarantee that no individual information can be derived from the released tables. One widely used type of methods to reduce the disclosure risk is based on the perturbation of the cell values. We consider a new controlled perturbation method which, given a set of tables to be protected, finds the closest safe ones - thus reducing the information loss while preserving confidentiality. This approach means solving a quadratic optimization problem with a much larger number of variables than constraints. Real instances can provide problems with millions of variables. We show that interior-point methods are an effective choice for that model, and, also, that specialized algorithms which exploit the problem structure can be faster than state-of-the art general solvers. Computational results are presented for instances of up to 1000000 variables.AMS Subject Classification: 90C06, 90C20, 90C51, 90C90Jordi Castro: Partially supported by the EU IST-2000-25069 CASC project and by the Spanish MCyT project TIC2003-00997.  相似文献   

7.
This paper considers the problem of optimally sequencing different car models along an assembly line according to some contiguity constraints, while ensuring that the demands for each of the models are satisfied. This car sequencing problem (CSP) is a practical NP-hard combinatorial optimisation problem. The CSP is formulated as a nonlinear integer programming problem and it is shown that exact solutions to the problem are difficult to obtain due to the indefinite quadratic form of the CSP objective function. Two traditional heuristics (steepest descent and simulated annealing) are employed to solve the CSP approximately. Several Hopfield neural network (HNN) approaches are also presented. The process of mapping an optimisation problem onto a HNN is demonstrated explicitly, and modifications to the existing neural approaches are presented which guarantee feasibility of solutions. Further modifications are proposed to improve the solution quality by permitting escape from local minima in an attempt to locate the global optimum. Results from all solutions techniques are compared on a set of instances of the CSP, and conclusions drawn.  相似文献   

8.
This paper studies a statistical problem called instrumental variable quantile regression (IVQR). We model IVQR as a convex quadratic program with complementarity constraints and—although this type of program is generally NP-hard—we develop a branch-and-bound algorithm to solve it globally. We also derive bounds on key variables in the problem, which are valid asymptotically for increasing sample size. We compare our method with two well known global solvers, one of which requires the computed bounds. On random instances, our algorithm performs well in terms of both speed and robustness.  相似文献   

9.
We consider a budgeting problem where a specified number of projects from some disjoint classes has to be selected such that the overall gain is largest possible, and such that the costs of the chosen projects do not exceed a fixed upper limit. The problem has several application in government budgeting, planning, and as relaxation from other combinatorial problems. It is demonstrated that the problem can be transformed to an equivalent multiple-choice knapsack problem through dynamic programming. A naive transformation however leads to a drastic increase in the number of variables, thus we propose an algorithm for the continuous problem based on Dantzig–Wolfe decomposition. A master problem solves a continuous multiple-choice knapsack problem knowing only some extreme points in each of the transformed classes. The individual subproblems find extreme points for each given direction, using a median search algorithm. An integer optimal solution is then derived by using the dynamic programming transformation to a multiple-choice knapsack problem for an expanding core. The individual classes are considered in an order given by their gradients, and the transformation to a multiple-choice knapsack problem is performed when needed. In this way, only a dozen of classes need to be transformed for standard instances from the literature. Computational experiments are presented, showing that the developed algorithm is orders of magnitude faster than a general LP/MIP algorithm.  相似文献   

10.
The multiple depot ring-star problem (MDRSP) is an important combinatorial optimization problem that arises in optical fiber network design and in applications that collect data using stationary sensing devices and autonomous vehicles. Given the locations of a set of customers and a set of depots, the goal is to (i) find a set of simple cycles such that each cycle (ring) passes through a subset of customers and exactly one depot, (ii) assign each non-visited customer to a visited customer or a depot, and (iii) minimize the sum of the routing costs, i.e., the cost of the cycles and the assignment costs. We present a mixed integer linear programming formulation for the MDRSP and propose valid inequalities to strengthen the linear programming relaxation. Furthermore, we present a polyhedral analysis and derive facet-inducing results for the MDRSP. All these results are then used to develop a branch-and-cut algorithm to obtain optimal solutions to the MDRSP. The performance of the branch-and-cut algorithm is evaluated through extensive computational experiments on several classes of test instances.  相似文献   

11.
When solving the one-dimensional cutting stock problem (1D CSP) as an integer linear programming problem one has to overcome computational difficulties arising from the integrality condition and a huge number of variables. In the Gilmore–Gomory approach the corresponding continuous relaxation is solved via column generation techniques followed by an appropriate rounding of the in general non-integer solution. Obviously, there is no guarantee of obtaining an optimal solution in this way but it is extremely effective in practice. However, in two- and three-dimensional cutting stock problems the heuristics are not so good which necessitates the research of effective exact methods. In this paper we present an exact solution approach for the 1D CSP which is based on a combination of the cutting plane method and the column generation technique. Results of extensive computational experiments are reported.  相似文献   

12.
A parallel branch and bound algorithm that solves the asymmetric traveling salesman problem to optimality is described. The algorithm uses an assignment problem based lower bounding technique, subtour elimination branching rules, and a subtour patching algorithm as an upper bounding procedure. The algorithm is organized around a data flow framework for parallel branch and bound. The algorithm begins by converting the cost matrix to a sparser version in such a fashion as to retain the optimality of the final solution. Computational results are presented for three different classes of problem instances: (1) matrix elements drawn from a uniform distribution of integers for instances of size 250 to 10 000 cities, (2) instances of size 250 to 1000 cities that concentrate small elements in the upper left portion of the cost matrix, and (3) instances of size 300 to 3000 cities that are designed to confound neighborhood search heuristics.  相似文献   

13.
This paper deals with a new problem that is a generalization of the many to many pickup and delivery problem and which is motivated by operating self-service bike sharing systems. There is only one commodity, initially distributed among the vertices of a graph, and a capacitated single vehicle aims to redistribute the commodity in order to reach a target distribution. Each vertex can be visited several times and also can be used as a buffer in which the commodity is stored for a later visit. This problem is NP-hard, since it contains several NP-hard problems as special cases (the TSP being maybe the most obvious one). Even finding a tractable exact formulation remains problematic.This paper presents efficient algorithms for solving instances of reasonable size, and contains several theoretical results related to these algorithms. A branch-and-cut algorithm is proposed for solving a relaxation of the problem. An upper bound of the optimal solution of the problem is obtained by a tabu search, which is based on some theoretical properties of the solution, once fixed the sequence of the visited vertices. The possibility of using the information provided by the relaxation receives a special attention, both from a theoretical and a practical point of view. It is proven that to build a feasible solution of the problem by using the one obtained by the relaxation is an NP-hard problem. Nevertheless, a tabu search initialized with the optimal solution of the relaxation often shows that it is the optimal one.The algorithms have been tested on a set of instances coming from the literature, proving their effectiveness.  相似文献   

14.
National Statistical Agencies routinely release large amounts of tabular information. Prior to dissemination, tabular data needs to be processed to avoid the disclosure of individual confidential information. One widely used class of methods is based on the modification of the table cells values. However, previous approaches were not able to preserve the values of the marginal cells and the additivity relations for a general table of any dimension, size and structure. This void was recently filled by the controlled tabular adjustment and one of its variants, the quadratic minimum-distance controlled perturbation method. Although independently developed, both approaches rely on the same strategy: given a set of tables to be protected, they find the minimum-distance values to the original cells that make the released information safe. Controlled tabular adjustment uses the L1 distance; the quadratic minimum-distance variant considers L2. This work presents both approaches within an unified framework, and includes a new variant based on L. Among other benefits, the unified framework permits the simple comparison of the three distances, and a single general result about their disclosure risk. The three distances are evaluated with the unique standard library for tabular data protection currently available. Some of the complex instances were contributed by National Statistical Agencies, and, therefore, are good representatives of theirs real needs. Unlike alternative methods, the three distances were able to solve all the instances, requiring only few seconds for each of them on a personal computer using a general purpose solver. The results show that this class of methods are an effective and promising tool for the protection of large volumes of tabular data. All the linear and quadratic problems solved in the paper are delivered to the optimization community in MPS format.  相似文献   

15.
This paper evaluates variants of a simulated annealing algorithm which solve the total cost minimization problem in activity networks in the case that discrete time-cost execution modes are allowed on the project activities. This problem is a special case of the well known discrete time-cost trade-off problem (DTCTP). Based on a sample of randomly generated activity networks, formal tests of statistical significance are utilized to test both the quality of solutions and the time efficiency of algorithms versus problem factors. A procedure issued from the extreme values statistics is also applied on problem instances in order to determine, on the one hand, the confidence interval estimate of the optimum solution for each algorithm and, on the other hand, when to stop the running of an algorithm.  相似文献   

16.
The capacitated vehicle routing problem (CVRP) considered in this paper occurs when goods must be delivered from a central depot to clients with known demands, usingk vehicles of fixed capacity. Each client must be assigned to exactly one of the vehicles. The set of clients assigned to each vehicle must satisfy the capacity constraint. The goal is to minimize the total distance traveled. When the capacity of the vehicles is large enough, this problem reduces to the famous traveling salesman problem (TSP). A variant of the problem in which each client is visited by at least one vehicle, called the graphical vehicle routing problem (GVRP), is also considered in this paper and used as a relaxation of CVRP. Our approach for CVRP and GVRP is to extend the polyhedral results known for TSP. For example, the subtour elimination constraints can be generalized to facets of both CVRP and GVRP. Interesting classes of facets arise as a generalization of the comb inequalities, depending on whether the depot is in a handle, a tooth, both or neither. We report on the optimal solution of two problem instances by a cutting plane algorithm that only uses inequalities from the above classes.This work was supported in part by NSF grant DDM-8901495.  相似文献   

17.
Two-staged patterns are often used in manufacturing industries to divide stock plates into rectangular items. A heuristic algorithm is presented to solve the rectangular two-dimensional single stock size cutting stock problem with two-staged patterns. It uses the column-generation method to solve the residual problems repeatedly, until the demands of all items are satisfied. Each pattern is generated using a procedure for the constrained single large object placement problem to guarantee the convergence of the algorithm. The computational results of benchmark and practical instances indicate the following: (1) the algorithm can solve most instances to optimality, with the gap to optimality being at most one plate for those solutions whose optimality is not proven and (2) for the instances tested, the algorithm is more efficient (on average) in reducing the number of plates used than a published algorithm and a commercial stock cutting software package.  相似文献   

18.
An Augmented Lagrangian Algorithm for Large Scale Multicommodity Routing   总被引:1,自引:0,他引:1  
The linear multicommodity network flow (MCNF) problem has many applications in the areas of transportation and telecommunications. It has therefore received much attention, and many algorithms that exploit the problem structure have been suggested and implemented. The practical difficulty of solving MCNF models increases fast with respect to the problem size, and especially with respect to the number of commodities. Applications in telecommunications typically lead to instances with huge numbers of commodities, and tackling such instances computationally is challenging.In this paper, we describe and evaluate a fast and convergent lower-bounding procedure which is based on an augmented Lagrangian reformulation of MCNF, that is, a combined Lagrangian relaxation and penalty approach. The algorithm is specially designed for solving very large scale MCNF instances. Compared to a standard Lagrangian relaxation approach, it has more favorable convergence characteristics. To solve the nonlinear augmented Lagrangian subproblem, we apply a disaggregate simplicial decomposition scheme, which fully exploits the structure of the subproblem and has good reoptimization capabilities. Finally, the augmented Lagrangian algorithm can also be used to provide heuristic upper bounds.The efficiency of the augmented Lagrangian method is demonstrated through computational experiments on large scale instances. In particular, it provides near-optimal solutions to instances with over 3,600 nodes, 14,000 arcs and 80,000 commodities within reasonable computing time.  相似文献   

19.
Ship maintenance scheduling is a process to decide start times of maintenance activities that satisfy all precedence and resource constraints and optimize the ship availability. In this paper, ship maintenance scheduling is modelled as a constraint satisfaction problem (CSP). The variables of CSP are the start times and its domain values are the start and horizon of the schedule. To solve the ship maintenance scheduling problem in the Royal Malaysian Navy, we have adopted a constraint-based reasoning (CBR) which requires start times of the first activities of maintenance cycles to solve the problem by the CBR. Thus, we adopt a genetic algorithm (GA) to find the start times of the first activities. The simulation results showed the effectiveness of the present hybrid algorithm.  相似文献   

20.
We consider a bound for the maximum-entropy sampling problem (MESP) that is based on solving a max-det problem over a relaxation of the Boolean quadric polytope (BQP). This approach to MESP was first suggested by Christoph Helmberg over 15 years ago, but has apparently never been further elaborated or computationally investigated. We find that the use of a relaxation of BQP that imposes semidefiniteness and a small number of equality constraints gives excellent bounds on many benchmark instances. These bounds can be further tightened by imposing additional inequality constraints that are valid for the BQP. Duality information associated with the BQP-based bounds can be used to fix variables to 0/1 values, and also as the basis for the implementation of a “strong branching” strategy. A branch-and-bound algorithm using the BQP-based bounds solves some benchmark instances of MESP to optimality for the first time.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号