Estimation of distribution algorithms (EDAs) are a paradigm of evolutionary computation in which the traditional genetic operators (crossover and mutation) are replaced by the construction and sampling of a probabilistic model of the promising solutions found so far. My research on EDAs spans from their theoretical foundations to their application to challenging real-world problems. I am co-author of the monograph Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation (Springer, 2002), and my PhD dissertation was devoted to the study of EDAs.

Foundations of EDAs

Theoretical Foundations

EDAs bridge the gap between probabilistic modeling and evolutionary search. Rather than combining parent solutions through crossover and mutation operators, EDAs learn a probabilistic model of the best individuals in the population and then sample new individuals from this model. This approach makes the dependencies among variables in the problem explicit and exploits them to guide the search.

My early research focused on the theoretical properties of EDAs, analyzing their convergence behavior, the quality of the models they learn, and the relationship between problem structure and algorithm performance. This included the study of linkage learning and the design of EDAs capable of identifying and exploiting variable interactions.

Taxonomy of EDAs
EDAs can be classified according to the complexity of the probabilistic model used. Univariate EDAs (such as UMDA and PBIL) assume variable independence. Bivariate EDAs (such as MIMIC and BMDA) can capture pairwise dependencies. Multivariate EDAs (such as BOA, EBNA, and ECGA) use full probabilistic graphical models — Bayesian networks or Markov networks — to represent higher-order dependencies among variables.

Discrete and Combinatorial EDAs

Bayesian Network-based EDAs

The Estimation of Bayesian Network Algorithm (EBNA) and related variants use Bayesian networks as the probabilistic model in the EDA loop. The algorithm learns the structure and parameters of a Bayesian network from the selected population, then samples new candidate solutions from the learned network.

Research on Bayesian network EDAs has addressed the structural learning problem (finding a network that accurately represents the dependencies in the data), the computational cost of network learning and sampling, and the theoretical analysis of the models learned during the search process.

Markov Network-based EDAs
Research on using Markov networks (undirected graphical models) as the probabilistic model in EDAs. Markov network EDAs can represent symmetric dependencies between variables, which makes them particularly appropriate for problems with symmetric structure. This line of research includes the development of learning algorithms for Markov network EDAs and their evaluation on benchmark problems.
EDAs for Combinatorial Optimization
Application of EDAs to difficult combinatorial optimization problems, including permutation problems (such as the quadratic assignment problem, the traveling salesman problem, and scheduling problems), pseudo-Boolean optimization problems, and multi-objective combinatorial optimization. A key challenge in this domain is the design of appropriate probabilistic models for structured solution representations.

Continuous EDAs

Gaussian-based Continuous EDAs
Extension of EDAs to continuous optimization domains using Gaussian probability distributions. Univariate Gaussian EDAs model each variable independently with a Gaussian distribution. Multivariate Gaussian EDAs (such as EGNA) use multivariate Gaussian distributions to model dependencies between variables. Research has examined how the covariance structure of the Gaussian model relates to the structure of the fitness landscape.
Copula-based Continuous EDAs
Development of EDAs using copula functions to model the joint distribution of continuous variables. Copulas allow the marginal distributions and the dependency structure to be modeled separately, providing greater flexibility than standard Gaussian models. This line of research includes vine copula EDAs that model complex multivariate dependencies using sequences of bivariate copulas.
Parametric Optimization of Energy Systems
Application of continuous EDAs to the parametric optimization of geothermal power plants and other engineering systems. The goal is to optimize the operating parameters of complex physical systems to maximize efficiency or minimize costs. Continuous EDAs provide a principled approach to this type of black-box optimization problem.

Analysis of EDA Models

What EDAs Learn
A central research question in the study of EDAs is: what information is captured in the probabilistic models learned during the evolutionary search? Research has examined the relationship between the model learned by the EDA and the structure of the fitness landscape, the variable interactions in the problem, and the properties of the optimal solutions.
EDA Models as Problem Representations
Probabilistic models learned by EDAs can be seen as compressed representations of the structure of the optimization problem. Research has investigated how to extract useful information from these models beyond their role in generating new candidate solutions. This includes using EDA models to identify relevant variable interactions, detect problem symmetries, and warm-start future optimization runs.

Applications

EDAs for Machine Learning
Application of EDAs to machine learning tasks, including feature selection, neural architecture search, and the optimization of hyperparameters. The probabilistic model of an EDA naturally captures dependencies among the features or architecture components, allowing the search to exploit these dependencies for more efficient exploration.
EDAs for Scientific Computation
Application of EDAs to problems in scientific computation, including the estimation of parameters for stochastic differential equations modelling financial systems or physical processes, protein structure prediction, and the design of experiments. EDAs provide a principled approach to black-box optimization in these domains.

Selected Publications