Intelligent control for networked systems
The efficient solution of a planning, management, or operational control problem can be obtained by appropriate optimisation algorithms, which, in turn, rely on models both to explore the space of alternative solutions and to assess the potential impact of a proposed solution in a realistic simulated environment. The generation and evaluation of such solutions are suitably organised and orchestrated by systems based on a variety of algorithmic approaches.

Graph-based reinforcement learning in structured environments

The research concerns the study of the methodological advancements required to properly exploit graph-based neural representations in the context of reinforcement learning. The general objective is to design a theoretical framework to learn composable policies, i.e., computational units (node-level policies) that take part in the decision-making mechanism by processing information in an integrated fashion within a committee of policies, possibly sharing the same parameters and exchanging messages along the edges of a graph. The advantage of this modular approach lies in the possibility of exploiting the structure of the underlying system and of reusing the same modules to solve different problems. The key research tasks to be addressed are:

  • Theoretical understanding of the expressiveness of composable policies based on message passing. Existing methods for designing composable policies lack theoretical justification or guarantees; it is therefore necessary to correctly frame the problem so as to ground design choices in theory.
  • Spatio-temporal abstraction with hierarchical representations. Learning local actuator-level policies arguably makes spatio-temporal abstraction more difficult. We tackle this problem by introducing hierarchical, multi-layered, graph representations that naturally implement hierarchical reinforcement learning architectures.
  • Control of metamorphic agents. Most existing methods focus on settings where the structure of the agent is fixed and does not change over time. This research line allows agents to autonomously change their morphology and devises algorithms for automatic design and graph learning.
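As a concrete illustration of a composable, message-passing policy, the following is a minimal NumPy sketch: every actuator of a hypothetical 4-joint agent runs the same node-level policy (shared parameters), exchanges one round of messages with its graph neighbours, and emits a local action. The graph, dimensions, and parameterization are illustrative assumptions, not the actual architectures under study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-joint agent: edges connect physically adjacent actuators.
edges = [(0, 1), (1, 2), (2, 3)]
n_nodes, obs_dim, msg_dim = 4, 3, 5

# A single set of parameters is shared by every node-level policy.
W_msg = rng.normal(size=(obs_dim, msg_dim))      # message function
W_act = rng.normal(size=(obs_dim + msg_dim, 1))  # action head

def node_policies(obs):
    """One message-passing round, then a per-node action.

    obs: (n_nodes, obs_dim) local observations of each actuator.
    Returns an (n_nodes,) vector of local actions.
    """
    messages = np.tanh(obs @ W_msg)              # each node emits a message
    inbox = np.zeros((n_nodes, msg_dim))
    for i, j in edges:                           # symmetric exchange along edges
        inbox[i] += messages[j]
        inbox[j] += messages[i]
    h = np.concatenate([obs, inbox], axis=1)     # local obs + aggregated messages
    return np.tanh(h @ W_act).ravel()

actions = node_policies(rng.normal(size=(n_nodes, obs_dim)))
print(actions.shape)  # (4,)
```

Because the parameters are shared across nodes and tied only to the graph topology, the same module would apply unchanged to an agent with a different number of joints, which is the reuse property the modular approach targets.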

Combinatorial optimization and AI for intelligent planning and control

The design, planning, and management of networked systems require the solution of complex decision and control problems. The interplay between combinatorial optimization and AI is a promising research area: new algorithms can be developed to achieve more scalable and more accurate solutions to such problems. Newer applications arise from the interaction of networked systems, driven by the so-called electrification and digitalization of industrial processes and by the increasing awareness of, and willingness to mitigate, climate change.
Challenging problems are, for example:

  • The electrification of the logistic chain, interfacing the energy generation and distribution industry with the production and transportation industry.
  • The sustainability and circularity of manufacturing processes, calling for the optimization of new objectives and the integration of environmental and circularity constraints.
  • Planning of nation-wide communication infrastructures to support the digital era.

From a methodological point of view, AI can enable the automatic configuration of optimization algorithms, drive the exploration of search processes, and elicit the most promising solutions. Machine learning can automate the decomposition of large problems. AI-based regression techniques can approximate complex non-linear phenomena with little computational effort and great accuracy, opening the way for derivative-free techniques to tackle complex problems. Furthermore, fundamental problems in AI training processes can be tackled by exploiting combinatorial optimization: convex, robust, and mixed-integer optimization can provide better solutions to parameter tuning than classical heuristic approaches.
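The surrogate idea above can be sketched in a few lines: fit a cheap regression model to a small budget of expensive evaluations, then optimize the surrogate in place of the true function. The objective, sample budget, and polynomial surrogate below are illustrative stand-ins, not a specific method from this research line.

```python
import numpy as np

rng = np.random.default_rng(1)

# Expensive black-box objective (stand-in for a complex non-linear phenomenon).
def expensive(x):
    return (x - 0.3) ** 2 + 0.1 * np.sin(15 * x)

# 1. Spend a small budget of expensive evaluations.
X = rng.uniform(0, 1, size=8)
y = expensive(X)

# 2. Fit a cheap regression surrogate (here: a least-squares polynomial).
surrogate = np.poly1d(np.polyfit(X, y, deg=4))

# 3. Optimize the surrogate instead of the expensive function:
#    a dense, derivative-free grid search is affordable because
#    surrogate predictions cost almost nothing.
grid = np.linspace(0, 1, 2001)
x_best = grid[np.argmin(surrogate(grid))]
print(x_best)
```

In practice the surrogate would be refitted as new expensive evaluations are suggested, closing the loop between regression and search.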

Optimal control and self-tuning of industrial machines

In recent years, IDSIA has been active in the development of active-learning algorithms for the experiment-driven design of real-time optimal controllers and the automatic calibration of industrial machines. These problems require tuning a set of parameters in order to optimize the final performance. This tuning is typically performed manually by skilled domain experts through trial and error, and can thus be costly and time-consuming. A possible solution is to use active-learning algorithms, which perform optimization through experiments iteratively suggested to the user, with the final goal of finding the optimal parameters within a limited number of iterations.
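The iterative suggest-and-measure loop can be illustrated with a deliberately simple scheme: the algorithm proposes the next parameter vector, the (costly) experiment is run, and the search region contracts around the best configuration found so far. The objective and the shrinking-region rule are illustrative assumptions standing in for the actual active-learning algorithms.

```python
import numpy as np

rng = np.random.default_rng(2)

def run_experiment(theta):
    """Stand-in for a costly machine trial: returns the measured performance."""
    return -np.sum((theta - np.array([0.7, 0.2])) ** 2)

# Suggest-and-measure loop: each iteration proposes one experiment and the
# search region shrinks around the incumbent best parameters.
center, radius = np.array([0.5, 0.5]), 0.5
best_theta, best_perf = center, run_experiment(center)
for _ in range(30):
    theta = np.clip(center + rng.uniform(-radius, radius, size=2), 0, 1)
    perf = run_experiment(theta)               # one experiment per iteration
    if perf > best_perf:
        best_theta, best_perf = theta, perf
        center = theta
    radius *= 0.9                              # focus the search over time
print(best_theta)
```

The point of the more sophisticated algorithms (e.g., Bayesian Optimization) is to choose each suggested experiment far more sample-efficiently than this naive contraction rule.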

Besides using off-the-shelf algorithms for active learning (such as Bayesian Optimization), novel active-learning algorithms have been developed at IDSIA, such as optimization based on preferences [1], where the decision maker is asked to iteratively express only a qualitative pairwise preference (such as “this is better than that”) between two candidate decision vectors. This algorithm is particularly useful when an objective function is not quantifiable, either because it is qualitative in nature or because it involves several goals whose relative importance is not well defined. Indeed, it is well known that humans are better at comparing two options than at assessing the absolute “goodness” of a single option. A novel preferential Bayesian Optimization algorithm has also been developed [2]; it uses Skew-Gaussian Processes and has been shown to outperform state-of-the-art preference-based Bayesian Optimization algorithms.
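The preference-only setting can be sketched as follows: the algorithm never sees a numeric score, only the answer to “is candidate A better than candidate B?”. The latent quality function and the perturb-and-compare rule below are illustrative assumptions, much simpler than the surrogate-based methods of [1] and [2].

```python
import numpy as np

rng = np.random.default_rng(3)

def latent_quality(x):
    """Hidden quality driving the decision maker's answers; the optimizer
    never observes this value directly (illustrative stand-in)."""
    return -abs(x - 0.6)

def prefers(a, b):
    """The decision maker answers only: is candidate a better than b?"""
    return latent_quality(a) > latent_quality(b)

# Pairwise-preference search: the incumbent is challenged by perturbed
# candidates; only comparison outcomes, never numeric scores, are used.
start = incumbent = rng.uniform(0.0, 1.0)
for _ in range(40):
    challenger = float(np.clip(incumbent + rng.normal(scale=0.1), 0.0, 1.0))
    if prefers(challenger, incumbent):
        incumbent = challenger
print(incumbent)
```

The preferential Bayesian Optimization of [2] replaces this blind perturbation with a probabilistic surrogate fitted to the observed preferences, so each query to the decision maker is chosen to be maximally informative.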
These methodologies have been successfully applied in robot sealing [3] and assembly tasks [4]; calibration of hyper-parameters for efficient implementation of (embedded) model-predictive control laws [5]; and in several applied projects funded by Innosuisse and by the EU to calibrate machine parameters in high-power laser cutting (together with Bystronic AG); drilling and wire-cut electrical discharge machining (together with Georg Fischer Ltd); and ultra-short pulse laser for high-precision manufacturing and processing (together with CSEM, BHF and FEMTOPrint SA).
A Python library implementing the developed algorithms has been released, together with a user-friendly interface. The library is currently used by IDSIA researchers and by industrial partners collaborating with IDSIA on R&D projects.

[1] A. Bemporad and D. Piga. "Global optimization based on active preference learning with radial basis functions." Machine Learning, 2021

[2] A. Benavoli, D. Azzimonti, and D. Piga. "Preferential Bayesian optimisation with skew Gaussian processes." Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2021

[3] L. Roveda, B. Maggioni, E. Marescotti, A.A. Shahid, A.M. Zanchettin, A. Bemporad, and D. Piga. "Pairwise Preferences-Based Optimization of a Path-Based Velocity Planner in Robotic Sealing Tasks". IEEE Robotics and Automation Letters, 2021

[4] L. Roveda, M. Magni, M. Cantoni, D. Piga, G. Bucca. "Human–robot collaboration in sensorless assembly task learning enhanced by uncertainties adaptation via Bayesian Optimization". Robotics and Autonomous Systems, 2021

Relational spatio-temporal representations for prediction and control

This research line aims to contribute to relational representation learning for the prediction and control of complex systems where both spatial and temporal information is relevant.
The main research challenges are:

  • The design of learning agents with architectural inductive biases that enable learning spatio-temporal representations through sequential interaction with the environment in model-free reinforcement learning.
  • The study of relational spatio-temporal representations for time series analysis, with applications in high-dimensional multivariate time series forecasting and imputation. In this context, the framework of spatio-temporal graph neural processing will be extended to generic multivariate time series, e.g., those coming from sensor networks.
  • The combination of the two aforementioned approaches in model-based reinforcement learning, where the policy learning process is aided by a learned (graph-based) spatio-temporal model of the environment.
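The sensor-network setting above can be illustrated with a minimal NumPy sketch: each sensor's series is first pooled over time, then a spatial message-passing step mixes each sensor's state with its graph neighbours'. The line graph, window length, and fixed mixing weights are illustrative assumptions, not the actual spatio-temporal graph neural architectures under study.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical sensor network: 5 sensors, adjacency encodes spatial relations.
n_sensors, T = 5, 20
A = np.zeros((n_sensors, n_sensors))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
A /= np.maximum(A.sum(1, keepdims=True), 1)   # row-normalized neighbour mean

x = rng.normal(size=(T, n_sensors))           # multivariate time series

def st_step(history, w_self=0.6, w_spatial=0.4):
    """One spatio-temporal processing step: temporal pooling per sensor,
    then spatial message passing over the sensor graph."""
    temporal = history[-3:].mean(axis=0)      # pool the last 3 time steps
    spatial = A @ temporal                    # aggregate neighbour states
    return w_self * temporal + w_spatial * spatial

pred = st_step(x)                             # one state estimate per sensor
print(pred.shape)  # (5,)
```

In a learned model, the fixed pooling and mixing weights would be replaced by trainable temporal and message-passing operators, and the resulting model could serve as the environment model in the model-based reinforcement learning setting above.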

Computational biophysics

At IDSIA, a Computational Biophysics Unit (CBU) is also active. The CBU employs a wide spectrum of molecular and multiscale computational techniques to study complex biological systems. Activities include, in particular: the design and optimization of drug-delivery systems and nanoparticles; in silico structure- and ligand-based virtual screening, with investigation of drug mechanisms of action; and the self-assembly of biopolymers.