18 November 2019

Cancer is a disease characterized by the accumulation of alterations to the genome, which selectively make cancer cells fitter to survive, proliferate and move. The understanding of progression and evolution models that underlie these processes, i.e., the characterization of sequences of alterations that lead to the emergence of the disease, is a topic attracting much attention. Of course, the problem of reconstructing such models is not new; in fact, several methods for inferring progression models (and phylogenies) from cross-sectional samples have been developed since the late 90s. Recently, we have proposed a number of algorithms to reconstruct cancer clonal evolutionary models from a variety of cross-sectional data types: "ensemble", "bulk" or "single cell". We perform our reconstructions using a variety of algorithms based on a “probability raising” score that guarantees statistical dependencies on the inferred precedence relations. Our methods are complementary to traditional phylogeny reconstruction methods. Within this context, we have proven the correctness of our algorithms and characterized their performance. Our algorithms are collected in an R Bioconductor package, “TRanslational ONCOlogy” (TRONCO), which we have successfully used as part of our "Pipeline for Cancer Inference" (PiCnIc) to analyse Colorectal Cancer (CRC) data from TCGA. The newest addition to TRONCO is “Temporal oRder of Individual Tumors” (TRaIT), a new collection of algorithms that can be used for single-cell (and multi-region) progression analysis of cancers.
Room G1-201, Galleria 1, @12h00
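
The “probability raising” idea mentioned in the abstract above can be illustrated with a small sketch. It assumes Suppes' two conditions (an alteration a is a candidate predecessor of b if a is more frequent than b and P(b | a) > P(b | not a)), estimated by plain frequency counts. The function name and the toy data are hypothetical, and this is not the TRONCO implementation.

```python
def probability_raising(samples, a, b):
    """Return True if alteration `a` satisfies Suppes' two conditions w.r.t. `b`.

    samples: list of dicts mapping alteration name -> 0/1 (one dict per sample).
    """
    n = len(samples)
    n_a = sum(s[a] for s in samples)
    n_b = sum(s[b] for s in samples)
    n_ab = sum(s[a] and s[b] for s in samples)

    # Both conditional probabilities must be defined.
    if n_a == 0 or n_a == n:
        return False

    p_b_given_a = n_ab / n_a
    p_b_given_not_a = (n_b - n_ab) / (n - n_a)

    temporal_priority = n_a > n_b              # P(a) > P(b)
    raising = p_b_given_a > p_b_given_not_a    # P(b | a) > P(b | not a)
    return temporal_priority and raising


# Made-up cross-sectional data: 1 = alteration observed in the sample.
toy = [
    {"APC": 1, "KRAS": 1},
    {"APC": 1, "KRAS": 1},
    {"APC": 1, "KRAS": 0},
    {"APC": 1, "KRAS": 0},
    {"APC": 0, "KRAS": 0},
    {"APC": 0, "KRAS": 0},
]
print(probability_raising(toy, "APC", "KRAS"))
print(probability_raising(toy, "KRAS", "APC"))
```

On this toy matrix only the direction from the more frequent alteration to the less frequent one passes both checks, which is what makes the score usable for ordering events.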

14 November 2019

In this talk, we will discuss two different types of optical sensors based on luminescence: one is used for oxygen sensing, and the other is an «optical nose» for fingerprinting substances. The problem with these types of sensors is that the measured signal is influenced by many components (such as mirrors, lasers and electronics). Unfortunately, these dependencies cannot be modeled mathematically in a simple way. So, typically, complicated and empirical mathematical models are used, which are then fine-tuned for each sensor in what is called calibration. But do we need to model those effects? Or is there another way? We will describe how the use of neural networks can dramatically change how these sensors are built and used, without the need for any complicated mathematical model. Introducing neural networks in optical sensors typically does not require deep networks. However, several aspects differ substantially from classical neural network models and will be discussed here: one example above all is overfitting, which in this case requires completely different approaches. We will bring to the talk a prototype of a portable, low-cost sensor that we are currently developing at TOELT to demonstrate how a low-cost sensor can be built and used.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

30 October 2019

When building Bayesian networks with the help of domain experts, properties of monotonicity often arise, such as veterinarians expressing that “under more severe conditions, seeing the more severe symptoms becomes more likely”. It is well known that human decision makers will not use a network in their daily practice if such common properties of monotonicity are clearly violated, not even if the network shows overall high performance. In this talk, I will introduce some properties of monotonicity for Bayesian networks and further focus on the different roles that monotonicity has in engineering networks: in building them by hand, in learning them from data and in fine-tuning.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00
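
The veterinarians' statement quoted above can be turned into a mechanical check. The sketch below assumes one common formalization, first-order stochastic dominance between the rows of a conditional probability table; the function names and the tables are hypothetical, and the talk may use a different definition.

```python
from itertools import accumulate


def dominates(p_high, p_low, tol=1e-12):
    """True if distribution p_high first-order stochastically dominates p_low.

    Both are probability lists over the same outcomes, ordered from least to
    most severe.
    """
    cdf_high = list(accumulate(p_high))
    cdf_low = list(accumulate(p_low))
    # Dominance: the "high" CDF never lies above the "low" CDF.
    return all(h <= l + tol for h, l in zip(cdf_high, cdf_low))


def is_isotone(cpt):
    """cpt: rows P(symptom | condition), ordered by increasing condition severity."""
    return all(dominates(cpt[i + 1], cpt[i]) for i in range(len(cpt) - 1))


# Made-up tables; columns are mild, moderate and severe symptoms.
cpt_ok = [
    [0.7, 0.2, 0.1],  # mild condition
    [0.4, 0.4, 0.2],  # moderate condition
    [0.1, 0.3, 0.6],  # severe condition
]
cpt_bad = [
    [0.5, 0.3, 0.2],  # mild condition
    [0.6, 0.3, 0.1],  # severe condition, yet mild symptoms became more likely
]
print(is_isotone(cpt_ok), is_isotone(cpt_bad))
```

A check of this kind could be run on a learned table to flag the violations that, according to the abstract, make experts reject a network.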

22 October 2019

The field of Geometric Packing problems has attracted the attention of many researchers in recent years. Generally speaking, in this setting we are given a region in the two-dimensional plane and a set of rectangles, and the goal is to pack a subset of them inside the given region in such a way that they do not overlap and some given objective function is optimized. In this talk we will review our recent developments on two of these problems in the framework of Approximation Algorithms: Strip Packing and Geometric Knapsack. In the first problem, the goal is to pack all the rectangles into a strip of fixed width so as to minimize the final height of the packing, while in the second one the goal is to pack a subset of the rectangles of maximum profit into a rectangular region of fixed size. We will also discuss applications and open questions regarding the mentioned problems. The talk is meant to be accessible to non-experts.
Manno, Galleria 1, 2nd floor, room G1-204 @12h00
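
To make the Strip Packing problem described above concrete, here is a minimal sketch of a classical shelf heuristic, Next-Fit Decreasing Height (NFDH). It is a textbook baseline chosen for illustration, not one of the algorithms presented in the talk; rectangles are not rotated, and the function name is hypothetical.

```python
def nfdh_strip_packing(rects, strip_width):
    """Pack (width, height) rectangles into shelves.

    Returns (total_height, placements), where placements is a list of
    (x, y, width, height) with (x, y) the bottom-left corner.
    """
    if any(w > strip_width for w, _ in rects):
        raise ValueError("a rectangle is wider than the strip")

    placements = []
    shelf_y = 0.0       # bottom of the current shelf
    shelf_height = 0.0  # height of the current shelf
    x = 0.0             # next free horizontal position on the current shelf

    # Tallest first, so the first rectangle on a shelf fixes the shelf height.
    for w, h in sorted(rects, key=lambda r: r[1], reverse=True):
        if x + w > strip_width:
            # Does not fit: close this shelf and open a new one on top of it.
            shelf_y += shelf_height
            shelf_height = 0.0
            x = 0.0
        shelf_height = max(shelf_height, h)
        placements.append((x, shelf_y, w, h))
        x += w

    return shelf_y + shelf_height, placements


height, layout = nfdh_strip_packing([(5, 4), (5, 3), (4, 2), (6, 2), (3, 1)], 10)
print(height)
for placement in layout:
    print(placement)
```

The heuristic never reopens a closed shelf, which keeps it simple but wastes the space to the right of and above short rectangles; approximation algorithms for the problem aim to bound how much is lost this way.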

16 October 2019

Glyco-bioinformatics is an emerging subfield of bioinformatics aimed at expediting research in the field of glycomics. Unfortunately, the development of sugar-based virtual structures is made difficult by some structural features of sugars, such as their high charge density, conformational flexibility and the torsional angles between glycosidic bonds. As a consequence, automated prediction of the binding poses of long sugars with proteins (a pivotal aspect of many biological processes) has remained elusive so far, also because of the solvation/desolvation effects, weak surface complementarity and large electrostatic interactions that characterize sugar/protein binding. My PhD activity is aimed at overcoming these limits. To this aim, I have implemented a new computational method based on incremental docking that has so far been successfully applied to two important biological interactions: that of heparin with the HIV-1 p17 matrix protein in the field of AIDS, and that of VEGF with its VEGF receptor-2 in the field of tumor neovascularization. Prospective developments include an algorithm able to automate the developed computational methods and their application to other sugar/protein interactions of biological importance.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

25 September 2019

This is an introductory level crash-course in Financial Mathematics. We review some key concepts of Financial product pricing and show how they can be applied to price Options. We present a Reinforcement Learning approach to replicate the theoretical prices.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00
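
The abstract above does not say which model supplies the “theoretical prices”. As a minimal sketch, assuming the standard Black-Scholes formula for a European call on a non-dividend-paying asset, such a reference price can be computed as follows (function and parameter names are illustrative):

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(spot, strike, rate, vol, maturity):
    """Price of a European call on a non-dividend-paying asset."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    return spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)


# At-the-money call, 5% interest rate, 20% volatility, one year to maturity.
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
print(round(price, 4))
```

A learned hedging or pricing strategy could then be judged by how closely it reproduces values like this one.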

18 September 2019

In this talk we are going to highlight some major recent breakthroughs in the field of Natural Language Processing (NLP) and their impact on the definition of conversational models. But instead of talking about “chit-chat” conversation like “Alexa play my favorite Italian song” and the competition for passing the Turing test, we will focus on the automation of conversations in the context of contact centers, and we will specifically address the need to define “goal-oriented or intent-driven” conversational models. The ultimate objective is to identify what is needed to create personalized conversation in a framework where Artificial Intelligence meets Human Interactions.
Room G1-201, Galleria 1, @12h00

17 September 2019

In this talk I will first give a short overview of the research performed at the Robotics and Perception Group. I will present recent research from our lab in the areas of vision-based navigation of micro-aerial vehicles, aggressive flight and machine learning. After this I will focus on autonomous agile flight and present our research on drone racing, where we combined a learned perception system with a classical control pipeline to teach a drone to race through a track. Finally, I will describe our current project on autonomous acrobatic flight, explain the challenges that arise when pushing vision-based platforms to aggressive maneuvers, and outline our approaches to integrating learning more deeply into the control and estimation pipeline.
Galleria 1, @12h00

25 July 2019

Data fusion strategies for precision medicine and drug repurposing. Over the last few years, biomedical research and clinical practice have experienced an incredible growth in both the amount and the heterogeneity of data being collected and leveraged for different types of analysis. This data explosion represents a great opportunity to increase our knowledge about many biological mechanisms as well as to improve medical processes (i.e., diagnosis, prognosis, therapy). However, not all big data are created equal. The downside of data heterogeneity is that it complicates integrative analysis. For example, clinical record data is highly heterogeneous, sparsely annotated, and contains several measurement types and unstructured text fields with ambiguous statements and varying levels of certainty, whereas genomic and imaging data are crisp and densely annotated, with a low cardinality of distinct variables. Integrating these data is particularly challenging when the molecular measurements are not conducted on individual subjects. In order to take full advantage of the wide spectrum of biomedical data available, advanced data integration tools need to be applied. In this context, I will present data fusion strategies for precision medicine and drug repositioning from my own research. These methods will include an approach for the prediction of potential multi-target drug repurposing strategies and its performance when applied to triple negative breast cancer. A second method that will be presented computes patient similarities by integrating patient-specific genomic data and public biomedical knowledge through a matrix tri-factorization approach. Finally, I will present a network-based approach integrating genomic and drug data with Gene Ontology-based information theoretic semantic similarities for the suggestion of new drug repurposing candidates. These examples show the potential of developing new research hypotheses and conducting predictive and data interpolation operations.
IDSIA Meeting Room, Galleria 1, @14h30

26 June 2019

When time series are organized into hierarchies, the forecasts have to satisfy some summing constraints. Forecasts which are independently generated for each time series (base forecasts) do not satisfy the constraints. Reconciliation algorithms adjust the base forecasts in order to satisfy the summing constraints; in general they also improve the accuracy. We present a novel reconciliation algorithm based on Bayes' rule; we discuss under which assumptions it is optimal and we show in extensive experiments that it compares favorably to the state-of-the-art reconciliation methods.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00
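
The summing constraint described above is easy to see on a two-level hierarchy (one total, several parts). The sketch below uses a plain least-squares projection as the reconciliation step; this is a simple baseline chosen for illustration, not the Bayesian algorithm of the talk, and the function name is hypothetical.

```python
def reconcile_ols(total, parts):
    """Project base forecasts onto the set where total == sum(parts).

    Returns (reconciled_total, reconciled_parts). The orthogonal projection
    changes the forecasts as little as possible in Euclidean distance.
    """
    gap = total - sum(parts)       # violation of the summing constraint
    step = gap / (len(parts) + 1)  # spread the gap evenly over all series
    return total - step, [p + step for p in parts]


# Independently produced base forecasts: 100 for the total, 40 and 51 for the parts.
new_total, new_parts = reconcile_ols(100.0, [40.0, 51.0])
print(new_total, new_parts, sum(new_parts))
```

Every series is adjusted by the same amount here; methods that weight the series by the reliability of their base forecasts differ from this baseline precisely in how the gap is distributed.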

12 June 2019

Untargeted steroid identification represents an important analytical challenge due to the chemical similarity of the molecules. Moreover, new experimental technologies such as two-dimensional gas chromatography (GCxGC) coupled with high-resolution time-of-flight mass spectrometry (HRMS-TOF) have been shown to offer superior separation power, especially for the discrimination of isomeric compounds. Unfortunately, few molecules are generally annotated, thus limiting the understanding of steroid metabolism in its complexity. To overcome this limitation, in-silico retention time predictions represent an interesting option. In this work, several machine learning and deep learning algorithms were used to develop retention time prediction models for GCxGC. Starting from a three-dimensional molecular representation, convolutional neural networks (CNNs) showed the best prediction performance compared to classical machine learning models based on handcrafted molecular descriptors. Moreover, CNNs were shown to recognize chiral information and to solve an important issue for steroid identification without the need for manual feature engineering. The final prediction model is applied to a real clinical case study. In combination with the MS information, retention time predictions allowed the untargeted annotation of 12 steroids in the urine of newborns.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

11 June 2019

At very small scale, of the order of the nanometer, classical physics becomes insufficient for describing matter, because quantum effects emerge prominently. Studying this physics can be a very challenging task, both experimentally and theoretically, because of the complexity of matter itself. For this reason, in 1982, Richard Feynman proposed not to study matter directly, but to simulate it, using a so-called quantum simulator [Int. J. Theor. Physics, 21:467]. This would amount to studying some “simple” experimental quantum systems, which can be mapped onto more complex ones. More than twenty years later, thanks to great technological advances, the existence of quantum simulators was made possible. In this talk we will focus on a specific class of quantum simulators, namely cold atoms in optical lattices, which constitute a highly controllable experimental setup for the study of quantum effects. Despite their controllability, due to their quantum nature, an exact mathematical study of these systems remains inaccessible to the computational capacity of current technology. For this reason most approaches rely either on approximations or on stochastic methods. We will present a new computational approach based both on an approximation and a stochastic (Monte Carlo) method. At the end of the talk we will also briefly review the relations between quantum physics and machine learning.
IDSIA meeting room @10:00

7 June 2019

Nano-size unmanned aerial vehicles (UAVs), a few centimeters in diameter and with a total power budget below 10 Watts, have so far been considered incapable of running sophisticated vision-based autonomous navigation software without external aid from base stations, ad-hoc local positioning infrastructure, and powerful external computation servers. In this talk, we present what is, to the best of our knowledge, the first 27g nano-UAV system able to run onboard an end-to-end, closed-loop visual pipeline for autonomous navigation based on a state-of-the-art deep-learning algorithm, built upon the open-source Crazyflie 2.0 nano-quadrotor. Our visual navigation engine is enabled by the combination of an ultra-low power computing device (the GAP8 system-on-chip) with a novel methodology for the deployment of deep convolutional neural networks (CNNs). We enable onboard real-time execution of the state-of-the-art DroNet deep CNN at 6 frames per second within 64mW, and at up to 18fps while still consuming on average just 3.5% of the power envelope of the deployed nano-aircraft. Field experiments demonstrate that the system's high responsiveness prevents collisions with unexpected dynamic obstacles up to a flight speed of 1.5m/s. In addition, we demonstrate that our visual navigation engine is capable of fully autonomous indoor navigation on a 113m previously unseen path.
Manno, Galleria 1, 2nd floor @ 12:00

28 May 2019

InferPy is an open-source library for deep probabilistic modeling written in Python and running on top of Edward 2 and TensorFlow. Other existing probabilistic programming languages have the drawback that they are difficult to use, especially when defining deep neural networks and probability distributions over multidimensional tensors. This means that their final goal of broadening the number of people able to code a machine learning application may not be fulfilled. InferPy tries to address these issues by defining a user-friendly API which trades off model complexity against ease of use. In particular, this library allows users to: prototype hierarchical probabilistic models with a simple and user-friendly API inspired by Keras; define probabilistic models with complex constructs containing deep neural networks; create computationally efficient batched models without having to deal with complex tensor operations; and run seamlessly on CPUs and GPUs by relying on TensorFlow.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

30 April 2019

Learning technologies are becoming increasingly important in today's education. This includes game-based learning and simulations, which produce high volume output, and MOOCs (massive open online courses), which reach a broad and diverse audience at scale. The users of such systems often have very different backgrounds, for example in terms of age, prior knowledge, and learning speed. Adaptation to the specific needs of the individual user is therefore essential. In this talk, I will present two of my contributions on modeling and predicting student learning in computer-based environments with the goal of enabling individualization. The first contribution introduces a new model and algorithm for representing and predicting student knowledge. The new approach is efficient and has been demonstrated to outperform previous work regarding prediction accuracy. The second contribution introduces models that take into account not only the accuracy of the user but also the user's inquiry strategies, improving the prediction of future learning. Furthermore, students can be clustered into groups with different strategies, and targeted interventions can be designed based on these strategies.
IDSIA Meeting Room, Galleria 1, Manno

11 April 2019

We are experiencing once again a period of enthusiasm in the AI research field, fired above all by the successes of the technology of deep neural networks or deep machine learning. In this talk we draw attention to what we take to be serious problems underlying current views of artificial intelligence encouraged by these successes, especially in the domain of language processing. We then show an alternative approach to language-centric AI, and illustrate this approach in relation to a specific example in the field of claims management.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

3 April 2019

A Groebner basis is a set of multivariate polynomials that has desirable algorithmic properties. Every set of polynomials can be transformed into a Groebner basis. This process generalizes three familiar techniques (and more): 1) Gaussian elimination for solving linear systems of equations, 2) the Euclidean algorithm for computing the greatest common divisor of two univariate polynomials, and 3) the simplex algorithm for linear programming. In this talk I'll give a gentle introduction to Groebner bases. No prior knowledge is required.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00
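
The second special case listed above can be shown in a few lines. For two polynomials f and g in a single variable, the reduced Groebner basis of the ideal they generate consists of gcd(f, g) alone, so the Euclidean algorithm below computes it. This is a minimal sketch with exact rational arithmetic; the coefficient-list representation and the function names are choices made for this example.

```python
from fractions import Fraction


def trim(p):
    """Drop zero leading coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_mod(a, b):
    """Remainder of the polynomial division a / b (b must be non-zero)."""
    a = trim(Fraction(c) for c in a)
    b = trim(Fraction(c) for c in b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        # Subtract factor * x^shift * b, which cancels the leading term of a.
        for i, coeff in enumerate(b):
            a[i + shift] -= factor * coeff
        a = trim(a)
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor via the Euclidean algorithm."""
    a = trim(Fraction(c) for c in a)
    b = trim(Fraction(c) for c in b)
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a] if a else a


f = [-1, 0, 1]   # x^2 - 1       = (x - 1)(x + 1)
g = [2, -3, 1]   # x^2 - 3x + 2  = (x - 1)(x - 2)
print(poly_gcd(f, g))
```

Each remainder step cancels a leading term, which is the same kind of reduction that Groebner basis algorithms apply to polynomials in several variables.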

21 March 2019

Suppose that I give you a square and a collection of rectangles of different shapes. How many rectangles can you pack into the square (so that they do not overlap)? This and related problems are NP-hard. In this talk I will present approximation algorithms to efficiently pack a number of rectangles close to the optimum. The talk is meant to be accessible to non-experts.
Manno, Galleria 1, 2nd floor, room G1-204 @12h00

15 February 2019

Many complex systems are characterized by multi-level properties that make the study of their dynamics and of their emerging phenomena a daunting task. The huge amount of data available in modern sciences can be expected to support great progress in these studies, even though the nature of the data varies. Given that, it is crucial to extract as many features as possible from data, including qualitative (topological) ones. The goal of the TOPDRIM project has been to provide methods driven by the topology of data for describing the dynamics of multi-level complex systems. To this end, the project has developed new mathematical and computational formalisms accounting for topological effects. To pursue these objectives, the project brought together scientists from many diverse fields, including topology and geometry, statistical physics and information theory, computer science and biology. The proposed methods, obtained through concerted efforts, covered different aspects of the science of complexity, ranging from foundations to simulations through modelling and analysis, and constituted the building blocks for a new generalized theory of complexity. This seminar introduces the fundamentals of topological data analysis and, through some applications developed in both the biomedical and financial fields, presents the TOPDRIM methodology for going beyond the concept of networks by considering simplicial complexes instead.
Manno, Galleria 1, 2nd floor, room G1-201 @12:00

12 February 2019

Non-stationarity in data can arise due to changes in various unobserved influencing factors. One way to account for non-stationarity is to employ models with time-varying parameters. Such models can be parametric or non-parametric depending on the underlying assumptions they impose. The presented non-stationary approach identifies the optimal number of hidden regimes in data and the (a priori unknown) regime-switching dynamics without employing restrictive parametric assumptions about the data-generating process. Within each regime, data is modelled using a Maximum Entropy density, where the optimal number of density parameters is inferred via the Lasso regularization technique. The resulting non-parametric methodology provides simultaneously the simplest and the least biased description of the data.
IDSIA meeting room @10:00

31 January 2019

Oscillations are a fundamental property of life, and oscillatory activity is observed throughout the central nervous system at all levels of organization. They are observed across vastly different time scales, ranging from year-long cycles to milliseconds. Oscillations interact across different time scales in a complex manner, with one of the emerging principles being that lower level (faster) oscillations are embedded in higher level (slower) oscillations. Sleep is a prototype for such a multilevel oscillation. On the one hand, sleep is part of the slower oscillation of the circadian cycle, which interacts with the homeostatic oscillator to regulate the timing and the intra-sleep dynamics. On the other hand, sleep itself is made up of ultradian cycles, that is, the 90 to 120 minute cycle between non-rapid-eye-movement (NREM) sleep and REM sleep. NREM sleep in turn consists of several sleep stages, currently labeled N1 to N3, which denote the succession from lighter to deeper, slow wave sleep. Each sleep stage is characterized by the predominance of brain activity oscillations that include several EEG frequency bands. Bridging the gap between the EEG frequency bands and the ultradian cycle, recent research has begun to characterize a further oscillation of brain activity that is even slower than the slow waves and represents multi-second oscillations with periodicities between 10 and 100 s. These oscillations are called infra-slow oscillations (ISO) and include synchronous oscillations of motor (periodic leg movements), autonomic and cortical activity during sleep. The physiological as well as the pathological meaning of the motor component of ISO has become a hot topic in sleep research.
Manno, Galleria 1, 2nd floor, room G1-201

17 January 2019

This talk describes the combination of machine learning with microscopy techniques for investigating the mechanisms of the immune system. After giving an overview of the applications of machine learning to in-vivo imaging, the capabilities of a graph-based, semi-supervised clustering algorithm will be presented. In more detail, the immune system involves a complex network of cellular interactions. This network can be described as a system whose output can be either protective (i.e. from pathogens and tumors) or pathogenic (i.e. leading to autoimmune diseases). In-vivo video microscopy (IVM) is a recently developed method to investigate the behavior of the immune system in living animals. IVM acquires 4D videos capturing the migration of cells, which correlates with their spatiotemporal interaction patterns. However, classical automatic analysis methods for this type of data require cell segmentation and tracking, which are challenging due to the high plasticity, lack of textures and frequent contacts between cells. To this end, we present a semi-supervised clustering algorithm for segmentation and tracking that groups voxels according to a trainable grouping criterion. Moreover, we present novel analysis methods that require neither segmentation nor tracking.
IDSIA, Galleria 1