Statistical Mechanics of Inference

Trondheim, Norway

John Hertz (NORDITA), Sara Solla (Northwestern University), Yasser Roudi
Description



For several years, ideas from statistical mechanics have been used to develop inference techniques
for analyzing high-dimensional data and for reverse engineering complex biological systems. In recent
years, technological advances in neural multi-electrode and gene micro-array recordings have increased
the number of elements that can be observed simultaneously in biological systems,
making the development of appropriate statistical analysis tools a very active field. These theoretical
techniques can be applied to experimental data to infer properties of the underlying biological
network, e.g. the large-scale pattern of interactions among genes or among neurons.


This event is meant to bring together scientists interested in applying statistical mechanics to build
statistical inference techniques, and in using such techniques to analyse high-throughput
biological data.

The event is a follow-up to the workshop held last year in Mariehamn.

Venue


The meeting will take place in the seminar room on the 5th floor of
the MTFS building of the Faculty of Medicine of NTNU, located at
Olav Kyrres gate 9 (shown in this map).


Accommodation for invited speakers is booked at Comfort Hotel Park,
located at Prinsens gate 4A, Trondheim (shown in this map).


The hotel and the conference venue are both easily accessible from
Trondheim Airport. Take the airport bus and, for the hotel, ask
the driver to drop you at Comfort Hotel Park in Prinsens gate; if you
are going directly from the airport to the workshop venue, ask for
Samfundet (the student union, a round red building).

Confirmed invited speakers

Silvio Franz        Paris         Bert Kappen         Nijmegen
Peter Latham        London        Enzo Marinari       Rome
Matteo Marsili      Trieste       Manfred Opper       Berlin
David Saad          Birmingham    Ingve Simonsen      Trondheim
Peter Sollich       London        Mikko Alava         Helsinki
Håkon Tjelmeland    Trondheim     Gasper Tkacik       Vienna
Wilson Truccolo     Providence    Riccardo Zecchina   Torino
Michael Biehl       Groningen     Marc Mezard         Paris

Sponsors:

Kavli Institute for Systems Neuroscience

    • 09:30 10:30
      On the criticality of the inferred models 1h
      Advanced inference techniques allow one to reconstruct the pattern of interaction from high-dimensional data sets. We focus here on the statistical properties of inferred models and argue that inference procedures are likely to yield models which are close to a phase transition. On the one hand, we show that the reparameterization-invariant metric in the space of probability distributions of these models (the Fisher information) is directly related to the model's susceptibility. As a result, distinguishable models tend to accumulate close to critical points, where the susceptibility diverges in infinite systems. On the other hand, this region is the one where the estimate of inferred parameters is most stable. To illustrate these points, we discuss inference of interacting point processes with application to financial data and show that sensible choices of observation time-scales naturally yield models which are close to criticality.
      Speaker: Dr Matteo Marsili (Abdus Salam ICTP)
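      As a hedged illustration of the link between Fisher information and susceptibility mentioned in this abstract, the following is the standard identity for exponential-family models, not necessarily the derivation given in the talk. It assumes a model $p(s|g) = \exp(\sum_a g_a \phi_a(s)) / Z(g)$; the observables $\phi_a$, the couplings $g_a$ and the symbols $I_{ab}$ and $\chi_{ab}$ are notation introduced here.

```latex
\begin{align}
I_{ab}(g)
  &= \left\langle \frac{\partial \log p}{\partial g_a}\,
                  \frac{\partial \log p}{\partial g_b} \right\rangle
   = \langle \phi_a \phi_b \rangle - \langle \phi_a \rangle \langle \phi_b \rangle \\
  &= \frac{\partial^2 \log Z}{\partial g_a\, \partial g_b}
   = \frac{\partial \langle \phi_a \rangle}{\partial g_b}
   = \chi_{ab}(g)
\end{align}
```

      Under this assumption the Fisher metric equals the matrix of generalized susceptibilities, so a diverging susceptibility implies a diverging density of distinguishable models.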
    • 10:30 11:00
      coffee break 30m
    • 11:00 12:00
      Stochastic modeling of neuronal ensemble dynamics 1h
      I will review past and ongoing work on the statistical modeling of neuronal ensemble point processes based on conditional intensity functions. These functions incorporate intrinsic single-neuron and network dynamics, as well as extrinsic inputs. I will illustrate the approach and current challenges with applications to single and dual 96-microelectrode cortical-array recordings from human and non-human primates.
      Speaker: Wilson Truccolo
    • 12:00 13:00
      Statistical mechanics for a network of real neurons 1h
      In most areas of the brain, information is encoded in the correlated activity of large populations of neurons. Here we build probabilistic models of such population codes using the maximum entropy principle, based on new recordings of more than 100 retinal ganglion cells from a dense patch of the salamander retina. We illustrate how pairwise maximum entropy (Ising-like) models can be extended to better capture the experimental data. We analyze the qualitative features of these codes and report on their emerging critical behavior. These results can be put into context by a theoretical examination of information-maximizing codes for noisy spiking neurons.
      Speaker: Gasper Tkacik (IST, Austria)
    • 13:00 14:00
      Lunch 1h
    • 14:00 15:00
      Inference in an over-complete, ultra-sparse model of olfaction 1h
      The general problem faced by the sensory system is to translate from spikes at sensory receptors to stimuli in the outside world. Here we consider how the brain might do this for olfaction, possibly the simplest of the senses. Our starting point is a linear generative model in which the response of each odorant receptor neuron is a linear combination of odors in the outside world; the problem, then, is to find the probabilistic mapping from those responses to odors. This is equivalent to a sparse, overcomplete model, but inference is especially hard because we use an ultra-sparse prior in which odors are either absent or present. We discuss our current progress on this problem, prospects for future directions, and what olfaction tells us about other sensory systems.
      Speaker: Peter Latham (University College London)
    • 15:00 15:30
      coffee break 30m
    • 15:30 16:30
      Generalized Linear Models for neural activity 1h
      Generalized Linear Models provide a framework for the systematic description of neural activity. The formulation of these models is based on the exponential family of probability distributions; the Bernoulli and Poisson distributions are the members relevant to stochastic spiking. In this approach, the time-dependent firing rate of individual neurons is modeled in terms of experimentally accessible correlates of neural activity: patterns of activity of other neurons in the network, inputs provided through various sensory modalities or by other brain areas, and outputs such as muscle activity or motor responses. Model parameters are fit by maximizing a log-likelihood function that is everywhere concave. In this talk, I will present the theory of generalized linear models, derive equations for likelihood maximization, and briefly discuss applications of this approach to a variety of problems: the incorporation of refractory effects in Poisson models, the mapping of spatiotemporal receptive fields of individual neurons, the characterization of network connectivity through directed time-dependent pairwise interactions, and the monitoring of plasticity.
      Speaker: Sara A. Solla (Northwestern University)
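      A minimal sketch of likelihood maximization for a Poisson GLM, assuming an exponential link, spike counts in bins of width `dt`, and a design matrix `X` whose columns are the covariates. The function names and the use of Newton's method are illustrative assumptions, not the derivation presented in the talk.

```python
import numpy as np

def poisson_glm_loglik(w, X, y, dt=1.0):
    """Log-likelihood (up to a w-independent constant) of counts y
    when the expected count per bin is dt * exp(X @ w)."""
    eta = X @ w
    return float(np.sum(y * eta - dt * np.exp(eta)))

def fit_poisson_glm(X, y, dt=1.0, n_iter=50):
    """Maximize the log-likelihood with Newton's method.

    With the exponential link the log-likelihood is concave in w,
    so any stationary point is the global maximum.
    """
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        rate = dt * np.exp(X @ w)       # expected counts per bin
        grad = X.T @ (y - rate)         # gradient of the log-likelihood
        hess = -(X.T * rate) @ X        # Hessian (negative definite)
        w = w - np.linalg.solve(hess, grad)
    return w
```

      Starting from `w = 0` works for moderate firing rates; for very large counts a damped step or line search would be a safer choice.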
    • 09:30 10:30
      Fluxes and chemical potential distributions in biochemical networks 1h
      The analysis of non-equilibrium steady states of biochemical reaction networks relies on finding the configurations of fluxes and chemical potentials satisfying stoichiometric (mass balance) and thermodynamic (energy balance) constraints. Efficient methods to explore such states are crucial to predict reaction directionality, calculate physiologic ranges of variability, estimate correlations, and reconstruct the overall energy balance of the network from the underlying molecular processes. While different techniques for sampling the space generated by mass balance constraints are currently available, thermodynamics is generically harder to incorporate. Here we introduce a method to sample the free energy landscape of a reaction network at steady state. (in collaboration with Daniele De Martino, Matteo Figliuzzi, Andrea De Martino)
      Speaker: Enzo Marinari (University of Rome La Sapienza)
    • 10:30 11:00
      coffee break 30m
    • 11:00 12:00
      An approximate forward-backward algorithm 1h
      In the presentation we propose computationally feasible approximations to binary Markov random fields. The basis of the approximation is the forward-backward algorithm. This exact algorithm is computationally feasible only for fields defined on small graphs. The forward part of the algorithm computes a series of joint marginal distributions by summing out each variable in turn. We construct an approximate forward-backward algorithm by adapting approximation results for pseudo-Boolean functions. By using an approximation of the energy function which minimises the error sum of squares, we construct a forward-backward algorithm which is computationally viable. We also show how our approach gives us upper and lower bounds as well as an approximate Viterbi algorithm. Through examples we demonstrate the accuracy and flexibility of our approximate algorithm.
      Speaker: Håkon Tjelmeland (NTNU)
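      To illustrate what "summing out each variable in turn" means, here is a sketch of the exact forward pass on the simplest possible graph, a binary chain, where it is cheap. It is not the approximate algorithm of the talk. The parameterization $p(x) \propto \exp(\sum_i h_i x_i + \sum_i J_i x_i x_{i+1})$ with $x_i \in \{0,1\}$ and both function names are assumptions made for this example.

```python
import itertools
import numpy as np

def chain_log_partition(h, J):
    """Exact log Z of a binary chain by eliminating x_0, x_1, ... in turn."""
    states = np.array([0.0, 1.0])
    # msg[b] = log of the sum over all eliminated variables, given the
    # next variable takes value b
    msg = np.zeros(2)
    for i in range(len(h) - 1):
        # table[a, b] = log-weight for x_i = a, x_{i+1} = b
        table = (msg + h[i] * states)[:, None] + J[i] * np.outer(states, states)
        msg = np.logaddexp(table[0], table[1])   # sum out x_i
    return float(np.logaddexp.reduce(msg + h[-1] * states))

def brute_force_log_partition(h, J):
    """Reference value by enumerating all 2^n configurations (small n only)."""
    total = 0.0
    for x in itertools.product([0, 1], repeat=len(h)):
        x = np.array(x, dtype=float)
        total += np.exp(np.dot(h, x) + np.dot(J, x[:-1] * x[1:]))
    return float(np.log(total))
```

      On a chain each elimination step touches a 2 x 2 table; on a general graph the tables grow exponentially with the number of neighbours, which is why the exact algorithm is feasible only for small graphs.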
    • 12:00 13:00
      Disentangling individual and group properties in complex networks 1h
      Not all nodes in a network are created equal. Differences and similarities exist at both individual node and group levels. Disentangling single-node from group properties is crucial for network modelling and structural inference. Based on unbiased generative probabilistic exponential random graph models and employing distributive message passing techniques, we present an efficient algorithm that allows one to separate the contributions of individual nodes and groups of nodes to the network structure. This leads to improved detection accuracy of latent class structure in real-world data sets compared to models that focus on group structure alone. Furthermore, the inclusion of hitherto neglected group-specific effects in models used to assess the statistical significance of small subgraph (motif) distributions in networks may be sufficient to explain most of the observed statistics. We show the predictive power of such generative models in forecasting putative gene-disease associations in the Online Mendelian Inheritance in Man (OMIM) database. The approach is suitable for both directed and undirected unipartite networks as well as for bipartite networks. Reichardt J, Alamino R, Saad D (2011) The Interplay between Microscopic and Mesoscopic Structures in Complex Networks. PLoS ONE 6(8): e21282. doi:10.1371/journal.pone.0021282
      Speaker: David Saad (Aston University Birmingham)
    • 13:00 14:00
      Lunch 1h
    • 14:00 15:00
      The binary garrote 1h
      In this talk, I present a new model and solution method for sparse regression. The model introduces binary selector variables $s_i$ for the features $i$ in a way that is similar to the original garrote model. The posterior probability for $s_i$ is computed in the variational approximation. I refer to this method as the Variational Garrote (VG). The VG is compared numerically with the Lasso method and with ridge regression. Numerical results on synthetic data show that the VG yields more accurate predictions and more accurately reconstructs the true model than the other methods. The naive implementation of the VG requires the inversion of a modified covariance matrix, whose cost scales cubically in the number of features. We indicate how, for sparse problems, the solution can be computed in time linear in the number of features.
      Speaker: Bert Kappen (Radboud University)
    • 15:00 15:30
      coffee break 30m
    • 15:30 16:30
      Finding influential sets of nodes in dynamical processes over networks 1h
      Speaker: Prof. Riccardo Zecchina (Politecnico di Torino)
    • 09:30 10:30
      Unconventional and Adaptive Distance Measures: applications in life sciences 1h
      An introduction to distance-based classification of multi-dimensional data is given. The popular Learning Vector Quantization (LVQ) will serve as the main example in this talk. Here, typical representatives of the classes (prototypes) are determined from labelled example data in a supervised training process. In the working phase, the prototypes parameterize a classifier which can be applied to novel, unlabelled data. A key issue in LVQ and many related methods is the choice of a suitable similarity or distance measure. So-called relevance learning schemes employ parameterized distance measures which are optimized in the data-driven training process. The recently introduced Matrix Relevance LVQ, based on generalized Euclidean distances, will be discussed in greater detail. It is straightforward to extend the framework beyond Euclidean measures. As an important example, the use of statistical divergences in LVQ is introduced. Divergences can serve as generalized distances when data correspond to positive or normalized measures, as for instance in the histogram-based classification of image data. Matrix Relevance Learning and Divergence-based LVQ are illustrated in terms of a number of real-world applications from the biomedical domain. These include adrenal tumor classification based on steroid excretion values and the detection and classification of plant diseases using color histograms.
      Speaker: Michael Biehl (University of Groningen)
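      The prototype-based training described in this abstract can be sketched with the basic LVQ1 update rule and a plain squared Euclidean distance. This is only the simplest textbook variant, not the Matrix Relevance or divergence-based schemes of the talk; the function names, learning rate and number of epochs are assumptions chosen for illustration.

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, lr=0.05, epochs=20, seed=0):
    """Basic LVQ1: attract the winning prototype if its label matches
    the sample's label, repel it otherwise."""
    rng = np.random.default_rng(seed)
    W = np.array(prototypes, dtype=float)
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            dist = np.sum((W - X[i]) ** 2, axis=1)   # squared Euclidean
            k = int(np.argmin(dist))                 # winning prototype
            sign = 1.0 if proto_labels[k] == y[i] else -1.0
            W[k] += sign * lr * (X[i] - W[k])
    return W

def lvq_predict(X, W, proto_labels):
    """Working phase: assign each sample the label of its nearest prototype."""
    dist = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(proto_labels)[np.argmin(dist, axis=1)]
```

      Replacing the squared Euclidean distance in both functions with a parameterized or divergence-based measure is where relevance learning would enter.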
    • 10:30 11:00
      coffee break 30m
    • 11:00 12:00
      Cascading failures in complex networks: Power Blackouts and the Domino Effect 1h
      Our contemporary societies rely more and more on a steady and reliable power supply in order to function well. During the last few decades a number of large-scale power blackouts have been witnessed around the world, and this has caused major concerns among politicians and citizens. In this talk we will mention a few major power blackouts and discuss the sequence of events and why they occurred. These empirical examples show that major power blackouts are often the result of cascading failures (a "domino effect"). We will introduce a generic (random walk) model for the study of cascading failures in networks, and investigate the impact of transient dynamics caused by the redistribution of loads after an initial network failure (triggering event). It is found that considering instead the stationary states, as has been done in the past, may dramatically overestimate (by 80-95%) the robustness of the network. This is due to the transient oscillations or overshooting in the loads when the flow dynamics adjusts to the new (remaining) network structure. Consequently, additional parts of the network may be overloaded and therefore fail before the stationary state is reached. The dynamical effects are strongest on links in the neighborhood of broken links. This can cause failure waves in the network along neighboring links, while the evaluation of the stationary solution predicts a different order of link failures.
      Speaker: Ingve Simonsen (NTNU)
    • 12:00 13:00
      Approximate inference for continuous time Markov processes 1h
      Continuous-time Markov processes (such as jump processes and diffusions) play an important role in the modelling of dynamical systems in many scientific areas ranging from physics to systems biology. In a variety of applications, the stochastic state of the system as a function of time is not directly observed. One only has access to a set of noisy observations taken at discrete times. The problem is then to infer the unknown state path as well as possible. In addition, model parameters (like diffusion constants or transition rates) may also be unknown and have to be estimated from the data. Since Monte Carlo sampling approaches can be time-consuming, one is interested in efficient approximations. I will discuss variational approaches to this problem which are based on methods developed in statistical physics and machine learning and which also have interesting relations to stochastic optimal control. Applications to transcriptional regulation in systems biology will be given.
      Speaker: Manfred Opper (TU Berlin)
    • 13:00 14:00
      Lunch 1h
    • 14:00 15:00
      Mean field inference in asymmetric Ising systems 1h
      Speaker: Prof. Marc Mezard (Universite de Paris Sud)
    • 15:00 15:30
      coffee break 30m
    • 15:30 16:30
      Nonequilibrium network reconstruction 1h
      Speaker: Yasser Roudi (Kavli Inst./Nordita)
    • 16:30 16:50
      Network inference using an asynchronously updated kinetic Ising model 20m
      In this presentation, I will focus on reconstructing network structure from synthetic data produced by an asynchronously updated Ising model. The structure can be inferred by three different approaches: the naive mean-field (nMF) approximation, the Thouless-Anderson-Palmer (TAP) approximation and an exact learning method. I will briefly explain these approaches for the asynchronous case and give a preliminary conclusion on the applicability of these methods.
      Speaker: HongLi Zeng (Aalto University)
    • 16:50 17:10
      Dynamic cavity method and dynamic mean-field for diluted Ising systems 20m
      The stationary state of Ising models with Glauber dynamics is studied. In the case of fully connected networks, the naive mean-field approximation is used to describe the long-time limit of the magnetizations. For diluted networks, the dynamic cavity method is used for a wide range of parameters. The comparison between these two methods shows that dynamic BP outperforms naive mean field in diluted networks.
      Speaker: Hamed Mahmoudi (Aalto University)
    • 17:10 17:30
      Kinetic Ising models with delayed interactions for neural data 20m
      Speaker: Aree Witoelar (Kavli Inst./NTNU)
    • 09:30 10:30
      Can projection techniques be used to understand subnetwork dynamics? 1h
      In systems biology we are encouraged to think in terms of networks to try and understand the complex behaviour of cells. There is much uncertainty in the identification process of proteins, so often the complete network of reactions in a protein interaction network (PIN) is unknown. Even in cases where the whole network is known relatively accurately, it is typically very large and its dynamics therefore hard to analyse in detail; reliable information on the requisite reaction rates is also difficult to obtain. Finally, experimental measurements of the dynamics of protein concentrations, e.g. using optical techniques, are typically possible only for a limited number of protein species simultaneously. These considerations motivate us to try to find descriptions for the dynamics of moderately sized subnetworks of much larger ("bulk") reaction networks. Eventually we would like to solve the inverse problem, i.e. infer from observed subnetwork dynamics something about the bulk network properties. As a first step towards this, we address the question of what form the dynamical equations for a subnetwork should take. We propose to use the Mori-Zwanzig projection formalism, which allows one to derive a set of dynamical equations for selected variables from a network using information from the whole network. We point out some ways in which this differs from more familiar applications of projection techniques in mode coupling theory, and give some toy examples to illustrate the approach.
      Speaker: Peter Sollich (King's College London)
    • 10:30 11:00
      coffee break 30m
    • 11:00 12:00
      Classifying Mycobacterium Tuberculosis Complex by Message Passing 1h
      Speaker: Silvio Franz (University of Paris-Sud)
    • 12:00 13:00
      Introductory Lecture: Statistical inference on biological networks 1h
      Coupling large numbers of relatively simple elements often results in networks with complex computational abilities. Examples abound in biological systems, from genetic to neural networks, from metabolic networks to immune systems, and from networks of proteins to networks of economic and social agents. Recent and continuing increases in the experimental ability to simultaneously track the dynamics of many constituent elements within these networks challenge theorists to provide conceptual frameworks and develop theoretical tools for the analysis of such vast data. The subject poses great challenges, as the systems of interest are noisy and the available information is incomplete. The techniques and approaches of statistical physics have proven remarkably useful, but need to be further developed in their application to nonequilibrium dynamical systems. In this talk, I will review the currently available theoretical tools, describe a few applications to the analysis and characterization of biological networks, and discuss the limitations of these techniques and some of the directions along which novel approaches and implementations are needed.
      Speaker: Sara A. Solla
    • 13:00 14:00
      Lunch 1h