Statistical Mechanics of Unsatisfiability and Glasses

Europe/Stockholm
Ferry Stockholm-Mariehamn and Hotel Arkipelag, Mariehamn, Åland

, , ,
Description

Summary

The meeting aims to extend the reach of statistical mechanics from satisfiability to unsatisfiability, and the eventual connection to glasses.

Background to the field

For several years, ideas from statistical mechanics have been used in developing novel algorithmic approaches for difficult optimization problems. The theoretical framework for this borrows heavily from the physics of glasses. Concepts that are of relevance range from non-equilibrium dynamics or protocols to energy landscapes to techniques of analysis such as replica theory and the cavity method. Uncharted territory exists in many cases where the groundstate energy is not known – “non-zero” – and when the structure of the energy landscape is crucial for the dynamical properties. The two main issues for this meeting are “unsatisfiability” or in computer science language max-K-SAT problems and their variants and glasses, where for the dynamical properties and real examples the low-lying energy landscape structure defines the physics. This event is meant to gather scientists interested in applications of statistical mechanics in this area. The event is also meant to gather participants from computer science working on similar ideas.

The venue was Hotel Arkipelag in downtown Mariehamn, the capital of the province of Åland, Finland. The Åland archipelago, lying between Sweden and mainland Finland, is easily reachable by ferry from Stockholm (Sweden), from Turku (Finland), and from Helsinki (Finland). In addition, there are flights from Sweden and Finland. Please note that the program starts on a ferry from Stockholm, so your presence on that is important information for the organizers.

The meeting is generously supported by NORDITA, the Aalto University Science Institute, ACCESS Linnaeus Centre, and the National Graduate School in Materials Physics (Finland) NGSMP, the National Graduate School in Computational Sciences (Finland) FICS, and the Helsinki Doctoral Programme in Computer Science (Finland) Hecse. Deadline for registration for scientists interested in participation is April 1. There is a maximum number of participants, 50, for capacity reasons.

Invited speakers include

 Dimitris Achlioptas CTI, Greece Toby Cubitt Madrid Uriel Feige Weizmann Institute of Science Silvio Franz U. Paris Sud Alexander Hartmann Oldenburg Jorge Kurchan ESPCI, Paris Elitza Maneva U. Barcelona Matteo Marsili ICTP Trieste Bart Selman Cornell U. Stefan Szeider TU Wien Zoltan Toroczkai U. Notre Dame, USA Martin Weigt U. Paris VI Jussieu Pan Zhang ESPCI, Paris Haijun Zhou Chinese Academy of Sciences, Beijing

Practical information about the conference start

The conference begins on Viking Line ferry leaving Stockholm harbour May 23 2012 at 16.45 Swedish time. We will travel on the Viking Line ferries going from Stockholm to Helsinki with a stop-over in Mariehamn (where we get off).

The trip to Mariehamn from Stockholm takes almost exactly five hours. We will have a lecture room and dinner on the ship.

It is recommended that you be in the ferry terminal at 16.00, at the latest. Someone from the organizing committee (Erik Aurell, Mikko Alava or Ralf Eichhorn) will be in the ferry terminal with the tickets from 15.30 at the latest.

Practical information about the return trip

The return trip from Mariehamn is the responsibility of the individual participants.

However, to simplify things, everyone except those who have explicitly indicated a contrary preference will be booked on Viking Line sailing from Mariehamn on Saturday May 26 at 14.25 (Finnish time), arriving in Stockholm at 18.55 (Swedish time). If you continue elsewhere by air travel from Stockholm that same evening you should count at least one hour from the ferry terminal to the airport (to be on the safe side).

• Wednesday, May 23
• 4:45 PM 8:00 PM
Ferry session
• 4:45 PM
Opening remarks 15m
• 5:00 PM
New Challenges in Inference Technology 45m
In recent years, we have seen tremendous progress in inference technologies. For example, in the area of Boolean satisfiability (SAT) and Mixed Integer Programming (MIP) solvers now enable us to tackle significant practical problem instances from applications in hardware and software verification, planning, and scheduling. Key to this success is the ability to strike the right balance between the expressiveness of the underlying representation formalism and the efficiency of the solvers. The next challenge is to extend the reach of these solvers to more complex tasks that lie beyond NP.
I will discuss our work on sampling, counting, probabilistic reasoning, and adversarial reasoning. In particular, I will discuss a sampling technique based on the so-called flat histogram method from statistical physics. The technique allows for fast probabilistic inference and learning in Markov Logic networks and other graphical models. In the area of adversarial reasoning, the UCT method, based on clever sampling strategies first developed for use in multi-armed bandit scenarios, provides a compelling alternative to traditional minimax search. The method has led to an exciting advance in the strength of GO programs. I'll discuss insights into the surprising effectiveness of the UCT technique.
Speaker: Prof. Bart Selman (Cornell University)
• 5:45 PM
Break 15m
• 6:00 PM
An Analog Approach to Boolean Satisfiability 45m
Boolean satisfiability (k-SAT) is one of the most studied optimization problems, as an efficient (polynomial-time) solution to k-SAT (for k > 2) implies efficient solutions to a very large number of hard optimization problems. Here we propose a mapping of k-SAT into a deterministic continuous-time dynamical system with a unique correspondence between its attractors and the k-SAT solution clusters. We show that beyond a constraint density threshold, the analog trajectories become transiently chaotic, and the boundaries between the basins of attraction of the solution clusters become fractal, signalling the appearance of optimization hardness. The system always finds solutions for satisfiable formulas and simulations indicate that it finds them in polynomial continuous-time even in the frozen regimes of random 3-SAT and of locked occupation problems; a property partly due to the system’s hyperbolic character. In terms of the number of discretization steps, however, the complexity of the computation via digital (Turing) machines is exponential for hard instances due to the strongly chaotic nature of the analog trajectories.

With: M. Ercsey-Ravasz
Reference: Nature Physics, vol. 7, 966-970 (2011).

Speaker: Prof. Zoltan Toroczkai (University of Notre Dame)
• Thursday, May 24
• 9:00 AM 5:00 PM
Unsatisfiability
• 9:00 AM
How to refute random nonsatisfiable formulas 45m
A 3SAT algorithm needs to correctly decide for every 3CNF formula whether it is satisfiable or not. We do not expect a 3SAT algorithm to run in polynomial time, because 3SAT is NP-hard. Nevertheless, a 3SAT algorithm may run in polynomial time on a substantial fraction of all 3SAT instances. Of particular interest to this talk are random instances of 3CNF formulas with very large density (clause to variable ratio). Such 3CNF formulas are unlikely to be satisfiable. 3SAT algorithms that terminate in polynomial time on most such instances (and always give the correct answer) are referred to as refutation heuristics. We shall survey some of the ideas used in the design of refutation heuristics.
Speaker: Prof. Uriel Feige (Weizmann Institute)
• 9:45 AM
Challenging SAT Solvers with Translated Answer-Set Programs 30m
Propositional satisfiability (SAT) solvers provide a promising computational platform for declarative problem solving. Over the years, a great variety of benchmark problems has emerged to help with the systematic evaluation of SAT solvers under development. Both highly structural and random benchmark instances are used and the performance of SAT solvers on such instances depends much on the underlying search paradigm. In this talk, I present a compact way to translate logic programs, subject to answer-set semantics proposed by Gelfond and Lifschitz, into set of clauses so that satisfying assignments correspond to answer sets. Thus, on one hand, SAT solvers can harnessed as black boxes to the task of computing answer sets. On the other hand, the translations provide an easy way to generate challeging structural (or partly randomized) benchmark instances for SAT solvers. Typical answer-set programs involve natural parameters that can be used for scaling the hardness of the resulting instances. In the experimental part, I discuss the potential of SAT solvers when solving NP-complete problems from the 2nd Answer Set Programming Competition.
Speaker: Prof. Tomi Janunen (Aalto University)
• 10:15 AM
Coffee break 30m
• 10:45 AM
Iterative algorithms and Sherali-Adams linear programming relaxations for graph (non-)isomorphism 45m
The Weisfeiler-Lehman (WL) algorithm is an iterative algorithm for certifying that two graphs are not isomorphic. At every iteration a label is generated for each vertex based on the mutliset of labels of its neighbors at the previous iteration. The starting label is the degree of the vertex. When a fixed point is reached the multisets of labels of the vertices of the two graphs are compared. These are different only if the graphs are not isomorphic. In the k level version of the WL algorithm a label is instead generated for every k- tuple of vertices. We compare the WL algorithm to a hierarchy of linear programming relaxations of graph isomorphism generated by the Sherali Adams lift-and-project method. It was already known that two graphs are fractionally isomorphic (i.e. there is a doubly stochastic matrix that converts one graph into the other) if and only if they are not distinguished by the first level of the WL algorithm. We show that, furthermore, the levels of the Sherali-Adams hierarchy interleave in power with the higher levels of the WL algorithm.
Speaker: Prof. Elitza Maneva (University of Barcelona)
• 11:30 AM
The Competition for Shortest Paths on Sparse Graphs 30m
Optimal paths connecting randomly selected node pairs, as well as multiple nodes and their fixed routers are studied analytically in the presence of non-linear overlap cost that penalizes congestion or encourages aggregation. Routing becomes more difficult as the number of selected nodes increases and exhibits interesting physical phenomena such as ergodicity breaking. The ground state of such systems reveals non-monotonic complex behaviors in average path- length and algorithmic convergence, depending on the network topology, and densities of communicating nodes and routers. Distributed linearly-scalable algorithms are also devised for various routing senarios based on the cavity and replica approach.
Speaker: Dr Chi Ho Yeung (Aston University)
• 12:00 PM
Lunch 2h
• 2:00 PM
Exponential lower bounds for DPLL algorithms on satisfiable random 3-CNF formulas 45m
We consider random CNF formulas consisting of 2- and 3-clauses, also known as the random (2+p)-SAT model. Such mixtures arise naturally in the execution of satisfiability algorithms on random 3-CNF formulas and their properties are closely connected to algorithmic performance. It is known that such mixtures are satisfiable as long as r_2 < 1 and r_3 < 2/3, where r_k is the ratio of k-clauses to variables. It has been conjectured that this is tight, i.e., that for every r_3 > 2/3, there exists r_2 < 1, such that the mixture is with high probability unsatisfiable. If true, the conjecture implies that the running time of many natural algorithms goes from linear to exponential around a critical, algorithm-specific density. We prove the statement of the conjecture with 2/3 replaced by 1, improving upon the previous bound requiring r_3 = 2.28… The proof is via the interpolation method of statistical physics applied to the ground state energy of the model. Our result implies that many well-known satisfiability algorithms require exponential time on random 3-CNF formulas with density in the provably satisfiable regime.
Joint work with Ricardo Menchaca-Mendez.
Speaker: Prof. Dmitris Achlioptas (CTI)
• 2:45 PM
Coffee break 30m
• 3:15 PM
The Parameterized Complexity of Propositional Satisfiability 45m
Parameterized complexity is a new framework for the theoretical analysis of computational problems. It considers -- in addition to the problem input size -- a secondary measurement, the parameter. The parameter can represent a qualitative aspect of the input and indicate how well the input is structured. This two- dimensional setting admits a more fine-grained worst- case complexity analysis, with the potential of being more realistic than the one-dimensional setting of classical complexity theory. The central notion of parameterized complexity is fixed-parameter tractability (FPT) which refers to solvability in time f(k)n^c, where n denotes the input size, f is some function of the parameter k, and c is a constant. Fixed-parameter tractability extends the classical notion of polynomial- time tractability. In this talk I will discuss some recent parameterized complexity results for the propositional satisfiability problem (SAT). In particular, I will focus on three groups of parameters: (i) parameters that represent the distance from a tractable subproblem; the distance can be measured in terms of the size of a smallest backdoor set, (ii) parameters that represent decomposability, such as the treewidth of graphical representations of formulas, and (iii) parameters that represent locality, such as the number of variables that can be flipped within a single step of a local search algorithm. Finally, I will discuss how parameterized complexity can provide performance guarantees for polynomial-time preprocessing, by means of the notion of "kernelization".
Speaker: Prof. Stefan Szeider (Vienna University of Technology)
• Friday, May 25
• 9:00 AM 5:00 PM
Glasses
• 9:00 AM
Phase transition for cutting-plane approach to vertex-cover problem 45m
We study the vertex-cover problem on Erdös-Reny random graphs, where previously a phase transition at connectivity c=e coinciding with an easy-hard transition of a branch- and-bound algorithm in connection with the leave-removal heuristic has been found. Hence, this algorithm works by partially exploring the space of feasible configurations. In this work, we consider an algorithm which has a completely different strategy: The problem is mapped onto a linear programming problem augmented by a cutting-plane approach, hence the algorithm operates in a space outside the space of feasible configurations until the final step, where a solution is found. Here we show that this type of algorithm also exhibits an easy--hard'' transition around c=e, which strongly indicates that the typical hardness of a problem is fundamental to the problem and not due to a specific representation of the problem.
Speaker: Prof. Hartmann Alexander (University of Oldenburg)
• 9:45 AM
On existence of equilibrium domains in non-equilibrium systems 30m
Equilibrium is a fundamental concept in statistical physics; it assumes that while the system dynamics is governed by microscopic interactions, some systems eventually reach a state where macroscopic observables remain unchanged. The evolution of many such systems is driven by the corresponding Hamiltonian energy function and their states converge to the equilibrium Gibbs-Boltzmann distribution, from which all macroscopic properties can be computed. However, the process governing the dynamics of many other systems cannot be derived from a Hamiltonian; such systems neither obey detailed balance nor converge to an equilibrium state. While many real systems, for example in the financial, social and biological areas, are inherently not in equilibrium, some of their constituents exhibit equilibrium-like behaviour in emerging localised or non- localised domains. Here we show such behaviour in model systems defined on densely and sparsely connected complex networks, as they provide a useful representation of many natural and technological systems. Equilibrium domains are shown to emerge either abruptly or gradually depending on the system parameters, for instance temperature, and disappear, becoming indistinguishable from the remainder of the system for other parameter values. Consequently, such domains may exist, under some conditions, within a non- equilibrium system but may be difficult to identify.
Speaker: Dr Alexander Mozeika (Aalto University)
• 10:15 AM
Coffee break 30m
• 10:45 AM
Analog approaches to hard optimization: from Sudoku to CNNs 45m
The analog dynamical system we recently designed to solve constraint satisfaction (Nature Physics 7, 966, 2011) opens potential new avenues to handle hard optimization problems. As an example I will present our Sudoku solver and show how it helps us define a type of "Richter scale" to characterize the hardness of Sudoku problems. These analog solvers are based on the one-to-one correspondence between the solutions of a problem and the stable fixed-points of the dynamical system. However, the question arises how can we identify unsatisfiable formulas. Here I will present some new directions for handling the MAX-k-SAT problem. Due to the chaotic dynamics appearing in these solvers for hard problems, these systems become costly to simulate on Turing machines. Thus, it would be desirable to design real physical, continuous time systems or devices that share the same dynamics as these analog solvers. To facilitate physical implementations we developed a new dynamical model similar to Cellular Neural Networks which can solve k-SAT without getting trapped in non-solution fixed points and which can be implemented by analog circuits.

The work was done together with Prof. Zoltan Toroczkai, and my graduate student Botond Molnar.

Speaker: Prof. Maria Magdolna Ercsey Ravasz (Babes-Bolyai University)
• 11:30 AM
Search methods for tile sets in patterned DNA self-assembly 30m
We consider the problem of finding, for a given 2D pattern of coloured tiles, a minimal set of tiles assembling this pattern (in the abstract Tile Assembly Model of Winfree (1998)). This Patterned self-Assembly Tile set Synthesis (PATS) problem was first introduced by Ma and Lombardi (2008), and subsequently studied by Goos and Orponen (2010), who presented an exhaustive partition-search branch-and-bound algorithm. Later we shifted to the problem of finding small (maybe not minimal) and reliable solutions, in a reasonable time interval, Lempiainen et al. (2011). We modified the basic partition search framework by using a heuristic to optimize the order in which the algorithm traverses its search space. The new algorithm was able to shorten considerably the time needed for finding small (minimal, or close to minimal) tile sets. In a recent breakthrough, we showed that the PATS problem is NP hard by providing a reduction from the 3SAT problem. Thus, local search algorithms for PATS remain the default solution for finding acceptable solutions in a reasonable time interval.
Speaker: Dr Eugen Czeizler (Aalto University)
• 12:00 PM
Lunch 2h
• 2:00 PM
From sequence co-evolution to protein (complex) structure prediction 45m
Many families of homologous proteins show a remarkable degree of structural and functional conservation, despite their large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach to link this variability (easy to observe) to structure (hard to obtain), i.e. to infer directly co-evolving residue pairs which turn our to form native contacts in the folded protein with high accuracy. The gained information is used to guide tertiary and quaternary structure prediction. As a specific example, I will discuss the auto- phosphorylation complex of histidine kinases, which are involved in the majority of signal transduction systems in the bacteria. Only a multidisciplinary approach integrating statistical genomics, biophysical protein simulation, and mutagenesis experiments, allows us to predict and verify the - so far unknown - active kinase structure.
Speaker: Prof. Martin Weigt (Université Pierre et Marie Curie)
• 2:45 PM
Coffee break 30m
• 3:15 PM
Inference: an algebraic perspective 45m
I will present a general framework which can be used to reconstruct probability distributions for strings of binary variables. While the problem of inference can be analytically controlled for small systems, a description of some of the regularization prescriptions needed to treat large systems will be provided, together with a discussion concerning their symmetries. Finally, I will present several possible applications of these methods, namely i) exact, fast inference for 1-D periodic systems, ii) exact, fast inference for tree-like graphs, iii) approximate, fast inference for generic graphs.
Speaker: Prof. Matteo Marsili (Abdus Salam ICTP)
• 4:00 PM
Replica analysis of sparse l1-reconstruction with concatenated L-orthogonal basis 30m
The sparse representation problem of recovering an N dimensional sparse vector x from M < N linear observations y = Dx given dictionary D is considered. The standard approach is to let the elements of the dictionary be independent and identically distributed (IID) zero-mean Gaussian and minimize the l1 norm of x under the constraint y = Dx. In this talk, we discuss the replica analysis of l1-reconstruction when the dictionary is a concatenation D = [O1 O2 ... OL] of L independent orthogonal random matrices. The results indicate that under some conditions, the compression threshold for such D is the same as for the IID case.
This is recent joint work with Y Kabashima and others partly available as [arXiv:1204.4065]
Speaker: Dr Mikko Vehkaperä (KTH and Aalto University)
• Saturday, May 26
• 9:00 AM 12:00 PM
Synthesis
• 9:00 AM
Critical fluctuations close to the clustering/dynamical transition: from glassy physics to computer science 45m
Speaker: Prof. Silvio Franz (Paris-Sud)
• 9:45 AM
Coffee break 30m
• 10:15 AM
Region-graph belief-propagation and survey propagation for lattice spin-glass models 45m
Graphical models for finite-dimensional spin glasses and real-world combinatorial optimization and satisfaction problems usually have an abundant number of short loops. The cluster variation method and its extension, the region graph method, are theoretical approaches for treating the complicated short-loop-induced local correlations. For graphical models represented by non-redundant or redundant region graphs, we construct approximate free energy landscapes through the mathematical framework of region graph partition function expansion. During this construction, we are able to derive the region graph belief- propagation equation and the region-graph survey propagation equation as conditions to ensure vanishing free-energy correction contributions from region subgraphs with dangling edges.

As a simple application of the general theory, we perform region graph belief-propagation simulations on the square- lattice ferromagnetic Ising model and the Edwards- Anderson model. Collective domains of different sizes in the disordered and frustrated square lattice are identified by the message-passing procedure. Such collective domains and the frustrations among them are responsible for the low- temperature glass like dynamical behaviors of the system.

J.-Q. Xiao, Haijun Zhou, "Partition function loop series for a general graphical model: free energy corrections and message passing equations", Journal of Physics A: Mathematical and Theoretical 44, 425001 (2011).

Haijun Zhou, Chuang Wang, Jing-Qing Xiao, Zedong Bi, "Partition function expansion on region-graphs and message-passing equations", Journal of Statistical Mechanics: Theory and Experiment, L12001 (2011).

Haijun Zhou, Chuang Wang, "Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results", submitted to Journal of Statistical Physics (2012), [arXiv:1204.1818].

Speaker: Prof. Haijun Zhou (Institute of Theoretical Physics, the Chinese Academy of Sciences, Beijing, China)
• 11:00 AM
Is physics (NP-)hard? 45m
The behaviour of any physical system is governed by its underlying dynamical equations. Much of physics is concerned with discovering these dynamical equations and understanding their consequences. It is perhaps surprising, therefore, that identifying the underlying dynamical equation from any amount of experimental data, however precise, is a computationally hard problem (more precisely, it is promise NP-complete). This is true both for classical and for quantum mechanical dynamics.
As a by-product, this result finally lays to rest &mdash& in a complexity-theoretic sense &mdash& the quantum and classical embedding problems, two long-standing open problems in mathematics. (The classical problem, in particular, dates back over 70 years.)
I will explain these results, and discuss some of the implications they might have for physics.