Why backtracking alone does not suffice to solve combinatorial problems: a CP perspective, evidence from empirical hardness models, and proof complexity results

Constraint Programming (CP) is a technique that is particularly strong at solving classes of combinatorial problems for which global constraints have been identified and exploited in the design of specialized algorithms. However, I have become convinced that CP is, above all, a general model for understanding combinatorial problems and their solution – using the term ‘model’ in a broader sense, as explained in a previous post and based on John Hooker’s article in CP Letters.

Regardless of the solving method used, the classic aim of CP is to tackle the Constraint Satisfaction Problem (CSP), which ultimately encodes any problem involving variables with discrete and finite domains as the intersection of the regular languages defined by each of its constraints. When a class of such languages exhibits special properties that allow strong and efficient inferences among those variables, we have found one of the so-called global constraints. Not surprisingly, the concept is in fact a formalization of the widespread practice of exploiting problem structure, which was also discussed in a previous post linking the implementation of local search operators to CP concepts.
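To make that idea concrete, here is a minimal Python sketch – with made-up domains and helper functions, purely for illustration – contrasting a decomposition of alldifferent into pairwise disequalities, which prunes nothing here, with the global view of the same constraint, which detects infeasibility at once:

```python
# A minimal sketch (not a real CP solver) contrasting local and global
# inference on an alldifferent constraint; domains are made up so that
# the instance is infeasible but no pair of variables can tell.

from itertools import combinations

domains = {"x": {1, 2}, "y": {1, 2}, "z": {1, 2}}

def pairwise_consistent(domains):
    # arc consistency on each x != y constraint taken in isolation
    for a, b in combinations(domains, 2):
        for u, w in ((a, b), (b, a)):
            for v in domains[u]:
                if not any(t != v for t in domains[w]):
                    return False
    return True

def alldifferent_feasible(domains):
    # global counting argument: |union of domains| >= number of variables
    # (a simple relaxation of Hall's marriage theorem)
    return len(set().union(*domains.values())) >= len(domains)

print(pairwise_consistent(domains))    # True: local reasoning prunes nothing
print(alldifferent_feasible(domains))  # False: globally infeasible
```

Three variables sharing only two values cannot be pairwise distinct, but no single disequality can see that; this gap in inference strength is exactly what global constraints exploit.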

In spite of such claimed ‘agnosticism’ regarding solving methods, CSP solving is often associated with backtracking search algorithms. The rationale is that, compared to other general solving techniques like dynamic programming, backtracking scales better in memory use with the size of the problem being solved. As a consequence, CP lectures and many lines of research are developed under the premise of backtracking-based search. In practice, however, that post on CP and local search also points out that the opposite holds as well: local search methods like variable neighborhood search have been successfully used in commercial CP solvers.

One such line, in which backtracking has been extensively studied, is the adaptive selection of algorithms, which aims to select the most suitable algorithm for a given instance according to predictive models of performance. From the characterization of CSP instances by means of properties like constrainedness – the ratio of pairwise constraints to variables – empirical hardness models were developed that also documented easy-hard-easy phase transitions. In short, it is easy to solve an underconstrained problem or to infer that an overconstrained one is infeasible, but those at the ‘feasibility frontier’ are much harder to solve. A short reference on the topic is the 2005 article by Carla Gomes and Bart Selman in Nature entitled ‘Can get satisfaction’, and a precise picture of such hardness can be seen in the graph on slide 6 of Kevin Leyton-Brown’s talk at the Workshop on Beyond Worst-Case Analysis.
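The transition is easy to reproduce at small scale. The Python experiment below – my own illustrative setup, not taken from the references above – generates random 3-SAT instances at different clause-to-variable ratios and checks them by brute force; around the often-cited threshold of roughly 4.26 clauses per variable, instances flip from mostly satisfiable to mostly unsatisfiable, and that is where search effort peaks:

```python
# A small illustrative experiment: random 3-SAT instances switch from
# mostly SAT to mostly UNSAT around a clause-to-variable ratio of ~4.26.

import random
from itertools import product

def random_3cnf(n_vars, n_clauses, rng):
    clauses = []
    for _ in range(n_clauses):
        trio = rng.sample(range(1, n_vars + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in trio])
    return clauses

def satisfiable(n_vars, clauses):
    # brute force over all assignments; fine for tiny n_vars
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

rng = random.Random(42)
n = 10
for ratio in (2.0, 3.0, 4.26, 5.0, 6.0):
    sat = sum(satisfiable(n, random_3cnf(n, round(ratio * n), rng))
              for _ in range(20))
    print(f"m/n = {ratio:.2f}: {sat}/20 satisfiable")
```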

Although this is a well-documented phenomenon, until very recently it was unclear to me why it happens. I was only convinced of it this week, when I came across some lecture notes on proof complexity by KTH Professor Jakob Nordstrom that discussed the resolution of formulas encoding the pigeonhole principle. Proof complexity is a discipline concerned with understanding the hardness of theorem proving, and comprises the study of the characteristics and relative strength of different proof systems as well as their equivalence. Among such systems is resolution, which is equivalent to solving through backtracking a CSP in which variables can only assume two values – true or false, or 0 or 1 – and constraints only use the logical operators NOT, OR, and AND in Conjunctive Normal Form (CNF): a set of clauses built with NOT and OR, joined by AND. This is the special form of the satisfiability problem, or simply SAT, for which backtracking search with only local inference corresponds to the well-known DPLL algorithm.
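For reference, here is a compact sketch of DPLL in Python, assuming the usual convention of a CNF formula as a list of clauses, each a list of integer literals:

```python
# A compact DPLL sketch: literal k > 0 means variable k is true,
# and -k means variable k is false.

def dpll(clauses):
    # unit propagation: the 'local inference' mentioned above
    units = [c[0] for c in clauses if len(c) == 1]
    while units:
        lit = units.pop()
        reduced = []
        for c in clauses:
            if lit in c:
                continue                      # clause already satisfied
            c = [l for l in c if l != -lit]   # drop the falsified literal
            if not c:
                return False                  # empty clause: conflict
            if len(c) == 1:
                units.append(c[0])
            reduced.append(c)
        clauses = reduced
    if not clauses:
        return True                           # every clause satisfied
    lit = clauses[0][0]                       # branch on an unassigned literal
    return dpll(clauses + [[lit]]) or dpll(clauses + [[-lit]])

print(dpll([[1, -2], [2, 3], [-3]]))  # True: e.g. 1=True, 2=True, 3=False
print(dpll([[1], [-1]]))              # False: no assignment satisfies both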

The pigeonhole principle states that it is not possible to assign n pigeons to m pigeonholes without putting two of them in the same pigeonhole if n > m. It turns out that encoding an instance of this problem with n = m+1 as a satisfiability problem of size polynomial in n and m – i.e., as a CSP with variable domains of size two and constraints built of logical operators in CNF – results in an unsatisfiable formula whose resolution refutations have exponential size. That happens because a valid refutation requires exploring a large number of combinations of assignments to prove that none of them satisfies the problem constraints. [check the erratum at the end of the post]
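To see what such an encoding looks like, here is a sketch of the standard CNF formulation, reusing the dpll function above (the variable indexing is my own choice): variable x(p, h) states that pigeon p sits in hole h, each pigeon must sit somewhere, and no two pigeons share a hole.

```python
from itertools import combinations

def php_cnf(m):
    # pigeonhole formula with n = m + 1 pigeons and m holes
    n = m + 1
    var = lambda p, h: p * m + h + 1      # x(p, h): pigeon p in hole h
    clauses = []
    for p in range(n):                    # every pigeon sits in some hole
        clauses.append([var(p, h) for h in range(m)])
    for h in range(m):                    # no hole hosts two pigeons
        for p, q in combinations(range(n), 2):
            clauses.append([-var(p, h), -var(q, h)])
    return clauses

print(dpll(php_cnf(3)))   # False: 4 pigeons do not fit into 3 holes
```

The formula has only O(m^3) clauses, yet Haken showed in 1985 that every resolution refutation of it has size exponential in m; since the trace of a DPLL run on an unsatisfiable formula is a tree-like resolution refutation, backtracking necessarily takes exponential time here.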

In short, backtracking really needs global constraints!

Erratum

According to the first comment on this post, there is an encoding for which the mentioned refutation is found efficiently. The discussion is continued in the sequel of this post.

Over-constrained problems, soft constraints and family holiday parties – and why some companies ask for O.R. support

People are having fewer children and families are becoming smaller, but some combinatorial problems involving them are becoming harder to solve. New families face harder planning and scheduling problems during Christmas and other holidays than their parents or grandparents ever did. Anyway, that’s an interesting way to explain what over-constrained problems and soft constraints are.

Suppose that you are the head of a family and you decide to run a party on Christmas Eve or a banquet on the day after. One or two generations ago, it was not that hard: people used to live closer and have lots of children (I mean, more than two at least). In that case, it would not be a disaster if five out of your nine children were not able to come over. It might be the case that families sharing common members would agree on celebrating at different times. Anyway, the other parties would be so close to yours that everybody would eventually stop by at some point.

However, with fewer children, people moving far away to pursue a career or to rest after retirement, and divorced parents and grandparents running concurrent parties (maybe four grandparents married to four step-grandparents sharing a single grandchild), we must agree that pleasing everybody might become impossible.

That’s roughly what happens when some companies look for the help of an Operations Research consultant: they have a set of resources to produce goods or deliver services to their customers, and they are not sure whether it has become impossible to meet the increasing demand with what they have or whether they are just having a harder time finding a solution.

It might be the case that some constraints are not as important as they appeared to be. An over-constrained problem is one upon which so many constraints are imposed that every possible solution is ruled out. Looking carefully at the set of constraints, one might realize that some of them represent desirable but not mandatory conditions, in which case they are actually what we call soft constraints.
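To give a flavor of the formalism, here is a tiny Python sketch – the parties, weights, and expectations are all invented for the example – where each soft constraint carries a penalty paid when it is violated, and we look for the decision of minimum total penalty instead of insisting that everything holds:

```python
# A tiny sketch of soft constraints as weighted penalties.
# Each violated soft constraint costs its penalty; we pick the
# decision of minimum total penalty.

choices = ["go_to_mom", "go_to_dad", "go_to_inlaws"]

# (description, penalty if violated, predicate over our choice)
soft = [
    ("mom expects us",     5, lambda c: c == "go_to_mom"),
    ("dad expects us",     5, lambda c: c == "go_to_dad"),
    ("in-laws expect us",  3, lambda c: c == "go_to_inlaws"),
]

def penalty(choice):
    return sum(w for _, w, holds in soft if not holds(choice))

best = min(choices, key=penalty)
print(best, "costs", penalty(best))   # go_to_mom costs 8 (tied with dad)
```

This is essentially the idea behind weighted constraint satisfaction and MaxSAT: when no assignment satisfies everything, the problem becomes optimizing how little we must give up.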

In the case of the families, what does a couple with no children and four parties in four cities at two different times do? At least in my case, we have to split up in order to cover them all. In the future, we aim to tackle this problem by running the party ourselves. 🙂