Graph-based reinforcement learning in structured environments
The research concerns the methodological advances required to properly exploit graph-based neural representations in reinforcement learning. The general objective is to design a theoretical framework for learning composable policies, i.e., computational units (node-level policies) that each take part in the decision-making mechanism by processing information jointly with a committee of other policies, possibly sharing the same parameters and exchanging messages along the edges of a graph. The advantage of this modular approach lies in the possibility of exploiting the structure of the underlying system and reusing the same modules to solve different problems. The key research tasks to be addressed are:
 Theoretical understanding of the expressiveness of composable policies based on message passing. Existing methods for designing composable policies lack theoretical justification or guarantees; it is therefore necessary to frame the problem correctly so that design choices are grounded in theory.
 Spatiotemporal abstraction with hierarchical representations. Learning local actuator-level policies arguably makes spatiotemporal abstraction more difficult. We tackle this problem by introducing hierarchical, multi-layered graph representations that naturally implement hierarchical reinforcement learning architectures.
 Control of metamorphic agents. Most existing methods focus on settings where the structure of the agent is fixed and does not change over time. The research aims to allow agents to autonomously change their morphology and to devise algorithms for automatic design and graph learning.
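The idea of a node-level policy that shares parameters and exchanges messages along edges can be illustrated with a minimal sketch. It is not the framework under study: the function names (init_shared_params, policy_step), the tanh layers, the mean aggregation and the random untrained weights are all assumptions made for illustration. What it shows is that one parameter set can act on graphs of different sizes.

```python
import math
import random


def init_shared_params(obs_dim, hidden_dim, seed=0):
    """Create ONE parameter set that every node-level policy reuses (untrained)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "encode": matrix(hidden_dim, obs_dim),         # local observation -> hidden state
        "update": matrix(hidden_dim, 2 * hidden_dim),  # [own state, messages] -> new state
        "act": matrix(1, hidden_dim),                  # hidden state -> scalar action
    }


def _apply(weights, vector):
    """Compute tanh(W v) for a list-of-lists matrix W."""
    return [math.tanh(sum(w * x for w, x in zip(row, vector))) for row in weights]


def policy_step(params, observations, edges, rounds=2):
    """Return one action per node.

    observations: dict mapping node -> list of floats (local observation)
    edges: list of undirected (u, v) pairs along which messages are exchanged
    """
    neighbors = {node: [] for node in observations}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    hidden_dim = len(params["encode"])
    state = {n: _apply(params["encode"], obs) for n, obs in observations.items()}
    for _ in range(rounds):
        new_state = {}
        for node, own in state.items():
            nbrs = neighbors[node]
            if nbrs:
                # Mean aggregation: the result does not depend on neighbour order.
                message = [sum(state[m][k] for m in nbrs) / len(nbrs)
                           for k in range(hidden_dim)]
            else:
                message = [0.0] * hidden_dim
            new_state[node] = _apply(params["update"], own + message)
        state = new_state
    return {node: _apply(params["act"], h)[0] for node, h in state.items()}


# The same parameters drive a 3-node chain and a 5-node star.
params = init_shared_params(obs_dim=2, hidden_dim=4)
chain = policy_step(params, {0: [0.1, 0.2], 1: [0.3, -0.1], 2: [0.0, 0.5]},
                    [(0, 1), (1, 2)])
star = policy_step(params, {i: [0.1 * i, -0.2] for i in range(5)},
                   [(0, i) for i in range(1, 5)])
```

Because the aggregation is a mean over neighbours, relabeling the nodes of a graph leaves each node's action unchanged, which is one of the structural properties a theoretical treatment of such policies would need to account for.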
Combinatorial optimization and AI for intelligent planning and control
The design, planning and management of networked systems require solving complex decision and control problems. The interplay between combinatorial optimization and AI is a promising research area: new algorithms can be developed to achieve more scalable and more accurate solutions to such problems. New applications arise from the interaction of networked systems, driven by the so-called electrification and digitalization of industrial processes and by the growing awareness of climate change and the willingness to act on it.
Challenging problems are, for example:
 The electrification of the logistics chain, which interfaces the energy generation and distribution industry with the production and transportation industry.
 Sustainability and circularity of manufacturing processes, which call for the optimization of new objectives and the integration of environmental and circularity constraints.
 Planning of nationwide communication infrastructures to support the digital era.
From a methodological point of view, AI can enable automatic configuration of optimization algorithms, drive the exploration of search processes, and elicit the most promising solutions. Machine learning can automate the decomposition of large problems. AI-based regression techniques can approximate complex nonlinear phenomena with little computational effort and great accuracy, opening the way for derivative-free techniques to tackle complex problems. Furthermore, fundamental problems in AI training processes can be tackled by exploiting combinatorial optimization: convex, robust and mixed-integer optimization can provide better solutions to parameter tuning than classical heuristic approaches.
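As a rough illustration of how a regression model can stand in for a costly nonlinear phenomenon and enable derivative-free search, the sketch below fits a Gaussian-kernel smoother (Nadaraya-Watson regression) to a handful of evaluations and then searches the cheap surrogate on a dense grid. The objective expensive_objective, the bandwidth and the grid sizes are hypothetical choices; any regression model and any derivative-free search could take their place.

```python
import math


def expensive_objective(x):
    """Hypothetical stand-in for a costly simulation or experiment."""
    return (x - 0.3) ** 2 + 0.1 * math.sin(15 * x)


def fit_kernel_surrogate(xs, ys, bandwidth=0.05):
    """Return a cheap predictor: Gaussian-kernel weighted average of the samples."""
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        total = sum(weights)
        if total == 0.0:  # query far from every sample: fall back to the nearest one
            nearest = min(range(len(xs)), key=lambda i: abs(x - xs[i]))
            return ys[nearest]
        return sum(w * y for w, y in zip(weights, ys)) / total
    return predict


# 1) Spend a small budget of expensive evaluations on a coarse grid.
samples = [i / 20 for i in range(21)]
values = [expensive_objective(x) for x in samples]

# 2) Fit the surrogate and search it without derivatives (dense grid search).
surrogate = fit_kernel_surrogate(samples, values)
candidates = [i / 1000 for i in range(1001)]
x_best = min(candidates, key=surrogate)

# 3) Confirm the proposal with a single extra expensive evaluation.
y_best = expensive_objective(x_best)
```

Only 22 expensive evaluations are used in total, while the 1001 candidate evaluations are made on the surrogate, which is where the computational saving comes from.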
Optimal control and self-tuning of industrial machines
In recent years, IDSIA has been active in developing active learning algorithms for the experiment-driven design of real-time optimal controllers and the automatic calibration of industrial machines. These problems require tuning a set of parameters in order to optimize the final performance. This tuning is typically performed manually by skilled domain experts through trial and error, and it can therefore be costly and time-consuming. A possible solution is to use active learning algorithms, which perform the optimization through experiments iteratively suggested to the user, with the final goal of finding the optimal parameters within a limited number of iterations.
Besides using off-the-shelf algorithms for active learning (such as Bayesian Optimization), novel active learning algorithms have been developed at IDSIA, such as optimization based on preferences [1], where the decision maker is asked to iteratively express only a qualitative pairwise preference (such as “this is better than that”) between two candidate decision vectors. This algorithm is particularly useful when an objective function is not quantifiable, either because it is qualitative in nature or because it involves several goals whose relative importance is not well defined. Indeed, it is well known that humans are better at comparing two options than at assessing the absolute “goodness” of a single option. A novel Preferential Bayesian Optimization algorithm has also been developed [2]; it uses skew-Gaussian processes and has been shown to outperform state-of-the-art preference-based Bayesian Optimization algorithms.
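The interaction pattern described above, in which the optimizer only ever receives pairwise answers, can be sketched as follows. This is a deliberately naive incumbent-versus-challenger search, not the radial-basis-function method of [1] or the skew-Gaussian-process method of [2]. The function hidden_quality plays the role of the expert's unquantified judgement, and the two parameters (named speed and power here) are hypothetical.

```python
import random


def hidden_quality(params):
    """Stand-in for the expert's judgement; never shown to the optimizer (hypothetical)."""
    speed, power = params
    return -((speed - 0.6) ** 2 + (power - 0.2) ** 2)


def expert_prefers(candidate, incumbent):
    """Pairwise oracle: True if the expert finds `candidate` better than `incumbent`."""
    return hidden_quality(candidate) > hidden_quality(incumbent)


def preference_search(oracle, dim, n_queries, step=0.2, seed=0):
    """Keep whichever of (incumbent, challenger) the oracle prefers.

    Only the boolean answer of `oracle` is used; no objective value is observed.
    Parameters are assumed to be normalized to the interval [0, 1].
    """
    rng = random.Random(seed)
    incumbent = [rng.random() for _ in range(dim)]
    for _ in range(n_queries):
        # Propose a challenger near the incumbent, clipped to the unit box.
        challenger = [min(1.0, max(0.0, x + rng.gauss(0.0, step))) for x in incumbent]
        if oracle(challenger, incumbent):
            incumbent = challenger
    return incumbent


best = preference_search(expert_prefers, dim=2, n_queries=40)
```

The methods cited above replace the random challenger with one chosen from a model fitted to all past preferences, which is what allows them to work within a limited number of queries.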
These methodologies have been successfully applied to robot sealing [3] and assembly tasks [4]; to the calibration of hyperparameters for the efficient implementation of (embedded) model-predictive control laws [5]; and in several applied projects funded by Innosuisse and by the EU to calibrate machine parameters in high-power laser cutting (together with Bystronic AG), drilling and wire-cut electrical discharge machining (together with Georg Fischer Ltd), and ultra-short pulse lasers for high-precision manufacturing and processing (together with CSEM, BHF and FEMTOPrint SA).
A Python library implementing these algorithms has been developed, together with a user-friendly interface. This library is currently used by IDSIA researchers and by industrial partners collaborating with IDSIA on R&D projects.
[1] A. Bemporad and D. Piga. "Global optimization based on active preference learning with radial basis functions." Machine Learning, 2021.
[2] A. Benavoli, D. Azzimonti, and D. Piga. "Preferential Bayesian optimisation with skew Gaussian processes." Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2021.
[3] L. Roveda, B. Maggioni, E. Marescotti, A.A. Shahid, A.M. Zanchettin, A. Bemporad, and D. Piga. "Pairwise Preferences-Based Optimization of a Path-Based Velocity Planner in Robotic Sealing Tasks." IEEE Robotics and Automation Letters, 2021.
[4] L. Roveda, M. Magni, M. Cantoni, D. Piga, and G. Bucca. "Human–robot collaboration in sensorless assembly task learning enhanced by uncertainties adaptation via Bayesian Optimization." Robotics and Autonomous Systems, 2021.
Relational spatiotemporal representations for prediction and control
This research line aims to advance relational representation learning for the prediction and control of complex systems in which both spatial and temporal information are relevant.
The main research challenges are:
 The design of learning agents with architectural inductive biases that enable learning spatiotemporal representations through sequential interaction with the environment in model-free reinforcement learning.
 The study of relational spatiotemporal representations for time series analysis, with applications in high-dimensional multivariate time series forecasting and imputation. In this context, the framework of spatiotemporal graph neural processing will be extended to the processing of generic multivariate time series, e.g., those coming from sensor networks.
 The combination of the two aforementioned approaches in model-based reinforcement learning, where the policy learning process is aided by the learned (graph) spatiotemporal model of the environment.
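To make the spatial and temporal ingredients of such representations concrete, the sketch below imputes missing sensor readings by combining the sensor's own previous value (temporal information) with the mean of its graph neighbours at the same time step (spatial information). It is a fixed-weight, hand-written baseline, not a learned spatiotemporal graph neural model; the function name impute, the weight temporal_weight and the three-sensor example are assumptions.

```python
def impute(series, edges, temporal_weight=0.5):
    """Fill missing readings (None) in a multivariate time series defined on a graph.

    series: dict mapping sensor -> list of readings, all of equal length
    edges: list of undirected (u, v) pairs between related sensors
    Entries with no usable temporal or spatial information stay None.
    """
    neighbors = {s: [] for s in series}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    length = len(next(iter(series.values())))
    filled = {s: list(values) for s, values in series.items()}
    for t in range(length):
        for s in series:
            if series[s][t] is not None:
                continue  # observed values are never overwritten
            estimates = []  # (weight, value) pairs
            if t > 0 and filled[s][t - 1] is not None:
                # Temporal term: previous (observed or imputed) value of this sensor.
                estimates.append((temporal_weight, filled[s][t - 1]))
            observed = [series[n][t] for n in neighbors[s] if series[n][t] is not None]
            if observed:
                # Spatial term: mean of the neighbours observed at this time step.
                estimates.append((1.0 - temporal_weight, sum(observed) / len(observed)))
            total = sum(w for w, _ in estimates)
            if total > 0:
                filled[s][t] = sum(w * v for w, v in estimates) / total
    return filled


readings = {"a": [1.0, 2.0, 3.0], "b": [1.0, None, 3.0], "c": [3.0, 4.0, 5.0]}
filled = impute(readings, edges=[("a", "b"), ("b", "c")])
```

A learned model would replace the fixed weight and the plain averages with trainable temporal and message-passing layers, but the data layout (one series per node plus an edge list) stays the same.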
Computational biophysics
At IDSIA, a Computational Biophysics Unit (CBU) is also active. The CBU employs a wide spectrum of molecular and multiscale computational techniques to study complex biological systems. Its activities include, in particular, the design and optimization of drug delivery systems and nanoparticles; in silico structure-based and ligand-based virtual screening, with investigation of drug mechanisms of action; and the study of the self-assembly of biopolymers.

Department of Innovative Technologies
Dalle Molle Institute for Artificial Intelligence USI-SUPSI
Polo universitario Lugano - Campus Est, Via la Santa 1
CH-6962 Lugano-Viganello
T +41 (0)58 666 66 66
info@idsia.ch