CNF Satisfiability Problem

Introduction

The Satisfiability Problem (SAT) is a classic combinatorial problem. Given a Boolean formula of \(n\) variables

(1)\[f(x_1,x_2,\dots,x_n),\]

this problem is to find such values of the variables, on which the formula takes on the value true.

The CNF Satisfiability Problem (CNF-SAT) is a version of the Satisfiability Problem, where the Boolean formula (1) is specified in the Conjunctive Normal Form (CNF), that means that it is a conjunction of clauses, where a clause is a disjunction of literals, and a literal is a variable or its negation. For example:

(2)\[(x_1\vee x_2)\;\&\;(\neg x_2\vee x_3\vee\neg x_4)\;\&\;(\neg x_1\vee x_4).\]

Here \(x_1\), \(x_2\), \(x_3\), \(x_4\) are Boolean variables to be assigned, \(\neg\) means negation (logical not), \(\vee\) means disjunction (logical or), and \(\&\) means conjunction (logical and). One may note that the formula (2) is satisfiable, because on \(x_1=\) true, \(x_2=\) false, \(x_3=\) false, and \(x_4=\) true it takes on the value true. If a formula is not satisfiable, it is called unsatisfiable, that means that it takes on the value false on any values of its variables.

Any CNF-SAT problem can be easily translated to a 0-1 programming problem as follows.

A Boolean variable \(x\) can be modeled by a binary variable in a natural way: \(x=1\) means that \(x\) takes on the value true, and \(x=0\) means that \(x\) takes on the value false. Then, if a literal is a negated variable, i.e. \(t=\neg x\), it can be expressed as \(t=1-x\). Since a formula in CNF is a conjunction of clauses, to provide its satisfiability we should require all its clauses to take on the value true. A particular clause is a disjunction of literals:

(3)\[t\vee t'\vee t''\vee\dots,\]

so it takes on the value true iff at least one of its literals takes on the value true, that can be expressed as the following inequality constraint:

(4)\[t+t'+t''+\dots\geq 1.\]

Note that no objective function is used in this case, because only a feasible solution needs to be found.

For example, the formula (2) can be translated to the following constraints:

\begin{alignat*}{8} &&x_1& &{}+{} &&x_2& & &&& & &&& &\geq 1\\ &&& & &(1-{}&x_2&) &{}+{} &&x_3& &{}+{} &(1-{}&x_4&) &\geq 1\\ &(1-{}&x_1&) & &&& & &&& &{}+{} &&x_4& &\geq 1 \end{alignat*}

with \(x_1, x_2, x_3, x_4\in\{0,1\}\).

Carrying out all constant terms to the right-hand side gives corresponding 0-1 programming problem in the standard format:

\begin{alignat*}{5} & &x_1 &{}+{} &x_2 & & & & &\geq &1\\ & & &{}-{} &x_2 &{}+{} &x_3 &{}-{} &x_4 &\geq -&1\\ &-&x_1 & & & & &{}+{} &x_4 &\geq &0\\ \end{alignat*}

with \(x_1, x_2, x_3, x_4\in\{0,1\}\).

In general case translation of a CNF-SAT problem results in the following 0-1 programming problem:

(5)\[\sum_{j\in J^+_i}x_j-\sum_{j\in J^-_i}x_j\geq 1-|J^-_i|,\quad i=1,\dots,m,\]
(6)\[x_j\in\{0,1\},\quad j=1,\dots,n,\]

where \(n\) is the number of variables, \(m\) is the number of clauses (inequality constraints), \(J^+_i\subseteq\{1,\dots,n\}\) is a subset of variables, whose literals in \(i\)-th clause do not have negation, and \(J^-_i\subseteq\{1,\dots,n\}\) is a subset of variables, whose literals in \(i\)-th clause are negations of that variables. It is assumed that \(J^+_i\cap J^-_i=\varnothing\) for all \(i\).

DIMACS CNF-SAT Format

Note

This material is based on the paper “Satisfiability Suggested Format”.

The DIMACS input file is a plain ASCII text file. It contains lines of several types described below. A line is terminated with an end-of-line character. Fields in each line are separated by at least one blank space.

Comment lines

Comment lines give human-readable information about the file and are ignored by programs. Comment lines can appear anywhere in the file. Each comment line begins with a lower-case character c.

c This is a comment line

Problem line

There is one problem line per data file. The problem line must appear before any clause lines. It has the following format:

p cnf VARIABLES CLAUSES

The lower-case character p signifies that this is a problem line. The three character problem designator cnf identifies the file as containing specification information for the CNF-SAT problem. The VARIABLES field contains an integer value specifying \(n\), the number of variables in the instance. The CLAUSES field contains an integer value specifying \(m\), the number of clauses in the instance.

Clauses

The clauses appear immediately after the problem line. The variables are assumed to be numbered from 1 up to \(n\). It is not necessary that every variable appears in the instance. Each clause is represented by a sequence of numbers separated by either a space, tab, or new-line character. The non-negated version of a variable \(j\) is represented by \(j\); the negated version is represented by \(-j\). Each clause is terminated by the value 0. Unlike many formats that represent the end of a clause by a new-line character, this format allows clauses to be on multiple lines.

Example

Below here is an example of the data file in DIMACS format corresponding to the CNF-SAT problem (2).

c sample.cnf
c
c This is an example of the CNF-SAT problem data
c in DIMACS format.
c
p cnf 4 3
1 2 0
-4 3
-2 0
-1 4 0
c
c eof

GLPK API Routines

Problem.read_cnfsat(str fname)

Read CNF-SAT problem data in DIMACS format

This class method reads the CNF-SAT problem data from a text file in DIMACS format and automatically translates the data to corresponding 0-1 programming problem instance (5)(6).

Parameters:fname (str) – file name
Returns:the problem read
Return type:Problem

Note

If the filename ends with the suffix ‘.gz’, the file is assumed to be compressed, in which case the routine decompresses it “on the fly”.

>>> p = Problem.read_cnfsat('examples/sample.cnf')
>>> p.get_num_cols() # the number of variables
4
>>> all(p.get_col_kind(j) == 'binary' for j in range(1, 5))
True
>>> p.get_num_rows() # the number of clauses
3
>>> [p.get_mat_row(i) for i in range(1,4)]
[{1: 1.0, 2: 1.0}, {2: -1.0, 3: 1.0, 4: -1.0}, {1: -1.0, 4: 1.0}]
Problem.check_cnfsat()

Check for CNF-SAT problem instance

This method checks if the specified problem object P contains a 0-1 programming problem instance in the format (5)(6) and therefore encodes a CNF-SAT problem instance.

Return type:bool
>>> p = Problem.read_cnfsat('examples/sample.cnf')
>>> p.check_cnfsat()
True
>>> clause = p.get_mat_row(1)
>>> clause[1] = 2
>>> p.set_mat_row(1, clause)
>>> p.check_cnfsat()
False
Problem.write_cnfsat(fname)

Write CNF-SAT problem data in DIMACS format

This method automatically translates the specified 0-1 programming problem instance (5)(6) to a CNF-SAT problem instance and returns the problem data in DIMACS format or writes it to a file.

Parameters:fname (str) – file name (omit to have data returned)
Returns:the problem data in DIMACS format (if argument is omitted)
Return type:str

Note

If the filename ends with suffix ‘.gz’, the file is assumed to be compressed, in which case the routine performs automatic compression on writing that file.

>>> p = Problem.read_cnfsat('examples/sample.cnf')
>>> p.set_prob_name('σαμπλε')
>>> print(p.write_cnfsat())
c σαμπλε
p cnf 4 3
1 2 0
-2 3 -4 0
-1 4 0
c eof
Problem.minisat1()

Solve CNF-SAT problem with MiniSat solver

This method is a driver to MiniSat, a CNF-SAT solver developed by Niklas Eén and Niklas Sörensson, Chalmers University of Technology, Sweden.

Returns:solution status, either 'optimal' (satistfiable) or 'no feasible' (unsatisfiable)
Return type:str

Note

It is assumed that the specified problem is a 0-1 programming problem instance in the format (5)(6) and therefore encodes a CNF-SAT problem instance.

>>> p = Problem.read_cnfsat('examples/sample.cnf')
>>> p.minisat1()
'optimal'
>>> {j: bool(p.mip_col_val(j)) for j in range(1, 5)}
{1: False, 2: True, 3: False, 4: False}
Problem.intfeas1(bool use_bound, int obj_bound)

Solve integer feasibility problem

This method is a tentative implementation of an integer feasibility solver based on a CNF-SAT solver (currently it is MiniSat; see minisat1).

Parameters:
  • use_bound (bool) – whether to require a solution with an objective function value not worse than the bound given in obj_bound
  • obj_bound (int) – specifies an upper (in case of minimization) or lower (in case of maximization) bound to the objective function
Returns:

solution status, either 'feasible' or 'no feasible'

Return type:

str

Note

The integer programming problem should satisfy to the following requirements:

  1. All variables (columns) should be either 'binary' (cf. Problem.get_col_kind) or 'fixed' (cf. Problem.get_col_type) at integer values
  2. All constraint and objective coefficients should be integer numbers in the range \([-2^{31}, +2^{31}-1]\).

Though there are no special requirements to the constraints, currently this method is efficient mainly for problems, where most constraints (rows) fall into the following three classes:

  1. Covering inequalities: \(\sum_{j}t_j\geq 1\), where \(t_j=x_j\) or \(t_j=1-x_j\), \(x_j\) is a binary variable.
  2. Packing inequalities: \(\sum_{j}t_j\leq 1\).
  3. Partitioning equalities (SOS1 constraints): \(\sum_{j}t_j=1\).
>>> p = Problem()
>>> p.add_cols(64) # consider the squares of a chess board
1
>>> square2col = lambda b_col, b_row: 8 * (b_col - 1) + b_row
>>> for col in range(1, 65):
...     # add variable encoding presence of a queen in a square
...     p.set_col_kind(col, 'binary')
...     # objective is the number of queens placed
...     p.set_obj_coef(col, 1)
>>> p.set_obj_dir('maximize') # place as many queens as possible
>>> for index in range(1, 9):
...     # at most one queen can be placed in each row
...     row = p.add_rows(1)
...     p.set_mat_row(row, {square2col(b_col, index): 1
...                         for b_col in range(1, 9)})
...     p.set_row_bnds(row, None, 1)
...     # at most one queen can be placed in each column
...     row = p.add_rows(1)
...     p.set_mat_row(row, {square2col(index, b_row): 1
...                         for b_row in range(1, 9)})
...     p.set_row_bnds(row, None, 1)
...     # at most one queen can be placed in each diagonal
...     row = p.add_rows(1)
...     p.set_mat_row(row, {square2col(index - b_row + 1, b_row): 1
...                         for b_row in range(1, index+1)})
...     p.set_row_bnds(row, None, 1)
...     # at most one queen can be placed in each cross diagonal
...     row = p.add_rows(1)
...     p.set_mat_row(row, {square2col(8 - diff, index - diff): 1
...                         for diff in range(0, index)})
...     p.set_row_bnds(row, None, 1)
>>> best = 1 # we know at least one queen can be put on the board
>>> while p.intfeas1(use_bound=True, obj_bound=best+1) == 'feasible':
...     best = p.mip_obj_val()
>>> best # the maximum number of queens
8.0
>>> {col for col in range(1, 65) if bool(p.mip_col_val(col))}
{4, 37, 41, 14, 51, 24, 26, 63}