Highlights
Events
8 February 2018

As an optimization problem, clustering exhibits a striking phenomenon: It is generally regarded as easy in practice, while theory classifies it among the computationally intractable problems. To address this dichotomy, research has identified a number of conditions a data set must satisfy for a clustering to be (1) easily computable and (2) meaningful. In this talk we show that all previously proposed notions of struturedness of a data set are fundamentally local properties, i.e. the global optimum is in well defined sense close to a local optimum. As a corollary, this implies that the Local Search heuristic has strong performance guarantees for both the tasks of recovering the underlying optimal clustering and obtaining a clustering of small cost.
Room 222 - Galleria 1 @12:00

22 February 2018

The study of online algorithms and competitive analysis provide a solid foundation for studying the quality of irrevocable decision making when the data arrives in an online manner. While in some scenarios the decisions are indeed irrevocable, there are many practical situations when changing a previous decision is not impossible, but simply expensive. In this work we formalize this notion and introduce the consistent k-clustering problem. With points arriving online, the goal is to maintain a constant approximate solution, while minimizing the number of reclusterings necessary. We prove a lower bound, showing that Ω(k log n) changes are necessary in the worst case, for a wide range of objective functions. On the positive side, we give an algorithm that needs only O(k^2 log^4 n) changes to maintain a constant competitive solution. This is an exponential improvement on the naive solution of reclustering at every time step. Finally, we show experimentally that our approach performs much better than the theoretical bound, with the number of changes growing approximately as O(log n). Joint work with Sergei Vassilvitskii.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

7 March 2018 - 7 March 2018

Many systems in the world are vast networks consisting of autonomous nodes that communicate with each other in order to jointly solve a task. One common feature of these complex networks is that due to their size it is impossible for an individual node to have a global view. Instead, nodes have to base their decisions on local information about nearby nodes only. Despite this intrinsic locality, the network as a whole is supposed to arrive at a global solution. Understanding capabilities and limitations of such local algorithms is the key challenge of distributed graph algorithms. The LOCAL model---the standard synchronous message-passing model of distributed computing introduced by Linial in 1987---is designed to abstract the pure concept of locality. In this talk, we discuss the maximal matching problem in the LOCAL model. In particular, we introduce a new rounding technique that leads to a deterministic O(log^2 Delta log n)-round algorithm on any n-node graph with maximum degree Delta. This is the first improvement in almost 20 years over the celebrated O(log^4 n)-round algorithm by Hanckowiak, Karonski, and Panconesi.
Galleria 1 - 2nd floor, room G1-204

21 March 2018

Recently, machine learning methods have started to enter the field of photonics, ranging from quantum mechanics, nanophotonics, optical communication and optical networks. Though machine learning offers many powerful techniques, linking it to optical communication may not be trivial, due to the inherent peculiarities of optical technologies. In this talk, we will discuss open challenges and benefits that machine learning methods can bring to optical communications, especially in the field of optical networks design and management.
Galleria 1, 2nd floor, room G1-204

28 March 2018 - 28 March 2018

A Bayesian network (BN) is a versatile and well-known probabilistic graphical model with applications in a variety of fields. Their graphical nature makes BNs excellent models for representing the complex probabilistic relationships existing in many real problems ranging from bioinformatics to law, from image processing to economic risk analysis. In this talk, we will present the difficult task of learning their dependency graph, also known as their structure, from data.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

29 March 2018 - 29 March 2018

Semantic networks are one of the most valuable tools in today's NLP. They power the intelligence behind conversational interfaces, help search engines answer queries, and are one the most used ways to represent the knowledge present in natural language. In this talk, I will present some of my work related to them. Firstly, we will see an example of a simple use case for semantic networks within opinion mining. I will show how we used them to build a model able to detect political disaffection in Twitter messages in Italian language. Semantic networks here helped, e.g., in distinguishing general political disaffection from a sentiment against a specific political party. We applied this model to 35 millions tweets, and – in order to validate the quality of the generated time-series – we compared our results to opinion surveys. Secondly, I will illustrate a theoretical model for semantic networks. In a semantic network, each node is usually tagged with different categories. How does the presence or absence of such categories interplay with the network link structure? I will present a model that is able to describe complex interactions between categories and links, while at the same time being simple enough to derive scalable algorithms. Finally, I will show a practical application for this model on semantic networks, presenting an algorithm to mine surprising links.
Manno, Galleria 1, room G1-204

9 April 2018 - 9 April 2018

This talk concerns the problem of quickly computing distances and shortest paths on distributed networks (the CONGEST model). There have been many developments for this problem in the last few years, resulting in tight approximation schemes. This left widely open whether exact algorithms can perform equally well. In this talk, we will discuss some recent progress in answering this question. The talk will focus on surveying the state of the art, and will discuss some recent algorithms in details as time permits. Based on joint work with Chien-Chung Huang and Thatchaphol Saranurak (FOCS 2017) and with Sebastian Krinninger (work in progress).
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

13 April 2018 - 13 April 2018

This talk will present the current state and the proposed next efforts in the field of natural human-robot interfaces for mobile robotic systems based on body motion. The intuitive mapping system is data-driven from the user's body motion when they are asked to naturally move as they were in control of the machine performing a predefined set of manoeuvres . Previous studies demonstrated that this kind of interface presents a steeper learning curve in terms of teleoperation performance with respect to a classic remote controller. Further developments of the interface will consist in a personalised mapping, adapted ad-hoc on the user, the introduction of a machine learning-based nonlinear regressor to increase the decoding power of the gestural interface, an online adaptation paradigm and a shared control system finalised to the optimisation of estimated trajectories.
IDSIA Meeting Room - Galleria 1 - 11:00 am

23 April 2018 - 23 April 2018

Plagiarism is increasingly becoming a major issue in the academic and educational domains. Automated and effective plagiarism detection systems are direly required to curtail this information breach, especially in tackling idea plagiarism. The proposed approach is aimed to detect such plagiarism cases, where the idea of a third party is adopted and presented intelligently so that at the surface level, plagiarism cannot be unmasked. The work aims to explore syntax-semantic concept extractions with genetic algorithm in detecting cases of idea plagiarism. The work mainly focuses on idea plagiarism where the source ideas are plagiarized and represented in a summarized form. Plagiarism detection is employed at both the document and passage levels by exploiting the document concepts at various structural levels. Initially, the idea embedded within the given source document is captured using sentence level concept extraction with genetic algorithm. Document level detection is facilitated with word-level concepts where syntactic information is extracted and the non-plagiarized documents are pruned. A combined similarity metric that utilizes the semantic level concept extraction is then employed for passage level detection. The proposed approach is tested on PAN 2014 plagiarism corpus for summary obfuscation data, which represents a challenging case of idea plagiarism. The results obtained are found to exhibit significant improvement over the compared systems and hence reflects the potency of the proposed syntax-semantic based concept extractions in detecting idea plagiarism.
IDSIA Meeting Room Galleria 1 @9:30

25 April 2018 - 25 April 2018

In an attempt to explain what Logic (and a logic) is, in this talk, thorough a brief glimpse into the history of the concerned discipline (spanning from Chrysippus of Soli and Aristotle to Gödel and Turing, and beyond), I will provide an overview of some of its fundamental concepts and results underpinning the connections with researches at the Institute. In guise of conclusion, I will present ongoing works related to statistical relational languages and imprecise probabilities.
IDSIA, Room G1-204 - Galleria 1 - 12:00-13:00

3 May 2018 - 3 May 2018

A short journey through quantum entanglement from 1909 to 2018, Ok, so maybe not that short.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

23 May 2018

This presentation brainstorms on how creativity can be algorithmically forged and eased through Design. A multi-dimensional parallel between semantics and deep learning outlines the complementarity of these AI approaches. Assuming an architecture which eases the communication between the two paradigms, I propose ways in which semantics could solve three fundamental flaws of Deep Learning: narrow focus, opacity and incapacity to build on what we already know.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

1 June 2018

We consider two types of searching models, where the goal is to design an adaptive algorithm that locates an unknown vertex in a graph by repeatedly performing queries. In the vertex-query model, each query points to a vertex v and the response either admits that v is the target or provides a neighbor of von a shortest path from v to the target. This model has been introduced for trees by Onak and Parys [FOCS 2006] and by Emamjomeh-Zadeh et al. [STOC 2016] for arbitrary graphs. In the edge-query model, each query chooses an edge and the response reveals which endpoint of the edge is closer to the target, breaking ties arbitrarily. Our goal is to analyze solutions to these problems assuming that some responses may be erroneous. We develop a scheme for tackling such noisy models with the following line of arguments: For each of the two models, we analyze a generic strategy that assumes a fixed number of lies and give a precise bound for its length via an amortized analysis. From this, we derive bounds for both a linearly bounded error rate, where the number of errors in T queries is bounded by r*T for some r<1/2, and a probabilistic model in which each response is incorrect with some probability p<1/2. The bounds for adversarial case turn out to be strong enough for non-adversarial scenarios as well. We obtain thus a much simpler strategy performing fewer vertex-queries than one by Emamjomeh-Zadeh et al. For edge-queries, not studied before for general graphs, we obtain bounds that are tight up to log=CE=94 factors in all error models. Applying our graph-theoretic results to the setting of edge-queries for paths, we obtain a number of improvements over existing bounds for searching in a sorted array in the presence of errors, including an exponential improvement for the prefix-bounded model in unbounded domains.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

7 June 2018

This talk will be about the implementation of Bayesian networks in the multipurpose software OpenCossan. After a brief introduction about the software and its origins, we will show the different methods available in the software, the different types of data supported and types of output produced, together with some applications to risk analysis. Finally, we will discuss the reasons for implementing an algorithm for approximate inference in credal networks within our software and discuss a possible collaboration between IDSIA and the IRU of the University of Liverpool.
Manno, Galleria 1, 2nd floor, room G1-201

7 June 2018 - 7 June 2018

The dominant logical theory of belief revision, AGM (Alchourron, Gärdenfors and Makinson), imposes an amount of idealisation on cognitive agents revising their beliefs in the light of new information: their belief states are perfectly consistent, closed under the full force of classical logic, and they know all logical truths. Humans are not like that: they can hold inconsistent beliefs; their belief states are not classically closed; and they can be subject to framing effects, revising their beliefs in different ways when presented with logically equivalent options (‘If you apply for the job, you have 40% chances of making it’ vs ‘… you have 60% chances of failing’). Behavioural economics has shown the momentous consequences of framing in social choice and decision theory, where presenting the same situation positively (‘glass half full’) or negatively (‘glass half empty’) can lead to dramatically different beliefs and choices. In this talk, I present a semantics for a belief revision operator which is hyperintensional, that is, capable of modelling distinctions more fine-grained than classical logical equivalence, and in particular, framing effects and inconsistent beliefs. Unlike most approaches to belief revision for non-ideal agents, my semantics does not resort to non-classical logics. The framework combines, instead, a standard semantics for propositional modal logic with a simple mereology of contents.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

13 June 2018 - 13 June 2018

Since the pioneristic work of Seymour Papert, Mindstorms: Children, Computers, and Powerful Ideas (1980), the idea of using robots in education has grown steadily and is nowadays a very important interdisciplinary field of research, at the crossing between educational sciences and robotics, known under the name of Educational robotics. At the beginning of the present academic year, a group of teachers - researchers of IDSIA and of the Dipartimento Formazione e Apprendimento of SUPSI (university of teacher education of Southern Switzerland) has been established, with the aim of developing the competences of SUPSI in this field. In few months, the group, in collaboration with the EPFL, has already obtained two research projects: (i) a project of the SNF program Agora, named Introducing People to Research in Robotics through an Extended Peer Community in Southern Switzerland (the project proposal has been awarded with the Optimus Agora Award) and a project, named Robotics Teacher Community (ROTECO), founded by the Swiss Academies of Sciences. Both projects are based on the use of the educational roboto Thymio, that has been developed by the EPFL. During this PIDSIA, we will present briefly the field of educational robotics, the members of our group, the two projects that will start in september 2018 and the educational robot thymio.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

27 June 2018

Information are essential in our everyday life. With them, languages represent the basis to assert or deny facts. Like people speak about something also the surrounding nature tell us something with is own language, the “chemical language”. As "Wittgenstein” reported in the Tractatus Logico-Philosophicus (4.002) “Everyday language is a part of the human organism and is no less complicated than it.” we will have a walk inside the chemical language and we will discover how we can infer biological knowledges from molecular structures.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

28 June 2018

Most challenging problems in the life sciences, natural sciences and engineering involve disentangling complex networks of probabilistic and causal relationships among heterogeneous, large data sources. This endeavour poses three challenges: devising algorithms that can scale to high dimensions; formulating models that are auditable and interpretable, but also flexible; and balance interpretability against predictive accuracy. Bayesian networks strive to address these concerns, and can be learned efficiently from both big data and high-dimensional data. In this talk I will discuss some of my recent research on computational and information-theoretic aspects of structure learning, touching on the challenges of incomplete and correlated observations and more in general on the nature of prediction in machine learning models.
Meeting Room @ IDSIA, Galleria 1, 11h00

31 August 2018

Resources such as labeled corpora are necessary to train automatic models within the natural language processing (NLP) field. Historically, a large number of resources regarding a broad number of problems are available mostly in English. One of such problems is known as Personality Identification where based on a psychological model (e.g. The Big Five Model), the goal is to find the traits of a subject’s personality given, for instance, a text written by the same subject. In this presentation I will talk about a new corpus in Spanish called Texts for Personality Identification (TxPI) and I'll show some basic baselines for text classification.
IDSIA, Galleria 1, Meeting Room @11:00

4 September 2018 - 4 September 2018

Stockouts are a menace. Despite decades of investments and research, shelves are still empty, online orders not filled. Simply because stuff isn’t there. That is because inventory management is notoriously difficult for retailers in their multi-tier supply chains. Stocking too much causes inefficiency, markdowns, and waste. Stocking too little causes lost sales, dissatisfies shoppers, and diminishes loyalty. Finding out where, when and why what happened that caused items to be missing and at what cost is a tedious, manual, and time consuming task. So it rarely gets done. This presentation gives an overview of the causes and costs of stockouts, and invites the audience to gather ideas for the automated identification and classification of stockouts and their causes from data - in order to create and submit a collaborative funding / research proposal.
IDSIA, Galleria 1, Room G1-204 @12:00

21 September 2018 - 22 September 2018

The topic of the 2018 Meeting will be Logic and Quantum Physics: As it requires a radical revision of the common view of the nature of physical reality but also of how to handle information processes, since its birth quantum physics has been a rich source of challenges and inspirations for philosophy, logic and computer science. In recent times there has been a growing, exciting body of researches in the foundations of quantum theory, stemming particularly from quantum logic and quantum information, that deserves to be shared and discussed. The aim of this event is therefore to bring together some of the experts in quantum logic, quantum information and philosophy of physics, to provide with a general overview of the main problematics to a wide audience, but also to stimulate interactions and discussions based on some of the latest developments in the concerned fields with both established researchers and graduate and post graduate students.
Lugano - USI Campus

26 September 2018

Coresets are one of the central methods to facilitate the analysis of large data sets. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show a negative result, namely, that no strongly sublinear sized coresets exist for logistic regression. To deal with intractable worst-case instances, we introduce a complexity measure $\mu(X)$, which quantifies the hardness of compressing a data set for logistic regression. $\mu(X)$ has an intuitive statistical interpretation that may be of independent interest. For data sets with bounded $\mu(X)$-complexity, we show that a novel sensitivity sampling scheme produces the first provably sublinear $(1\pm\eps)$-coreset. Our algorithms are viable in practise, comparing favorably to uniform sampling as well as to state of the art methods in the area. Joint work with Alexander Munteanu, Christian Sohler, and David Woodruff. To appear at NIPS 2018.
Manno, Galleria 1, 2nd floor, room G1-204 @12:00

3 October 2018

Retail trades greatly benefit from price promotions (promos), i.e., temporary price reductions. The marketing literature on the topic is vast, mainly under the heading “Trade Promotion Optimization”, but not much has been produced on the optimization of the schedule of promotions of brands or items on long time horizons. There is a rich offer of commercial packages to support these decisions, despite the rather poor coverage in the optimization literature of the many, intertwining operational constraints that characterize actual applications. This work proposes a model for retailer chains, focused beyond transactional trade promotion management. It considers both manufacturers, who provide products to sell, and retailers, who are responsible for sales to the consumers. Input data to the model are derived from statistic analytics based on historical data, and yields the expected baseline and the uplift (ratio between the average sales volume with and without a promotion) for each promo in each time period, together with the different contributions to the uplift: cannibalization, halo, promotional dip, forward buying, etc. Building on this, we propose a mathematical model of the effectiveness of a promotion plan in the horizon of interest. (Joint work with Marco Antonio Boschetti)
Galleria 1, 2nd floor, IDSIA meeting room @16h30

15 November 2018

We discuss PSAT, a probabilistic extension of the classical satisfiability (SAT) problem. This is achieved by assigning weights to the clauses of a SAT instance. The PSAT instance is satisfiable if and only if a probability mass function over the literals and consistent with the weights exists . We present two algorithms for PSAT based, respectively, on column generation and integer linear programming, both showing evidence of phase transition. PSAT solves inferences in a recently proposed probabilistic logic (CCL, in [Antonucci & Facchini, 2018]). This allows to perform machine learning with logical constraints under relaxed independence assumptions over probabilistic facts. As an application, we consider label ranking and show that existing solvers can be used to solve practical ranking tasks.
Manno, Galleria 1, 2nd floor, room G1-204 @12:h00

27 November 2018

Policy optimization is an effective Reinforcement Learning approach to solve continuous control tasks. Recent achievements have shown that alternating online and offline optimization is a successful choice for efficient trajectory reuse. However, deciding when to stop optimizing and collect new trajectories is non-trivial, as it requires to account for the variance of the objective function estimate. In this talk, we propose a novel, model-free, policy search algorithm, POIS, applicable in both action-based and parameter-based settings. We first derive a high-confidence bound for importance sampling estimation; then we define a surrogate objective function, which is optimized offline whenever a new batch of trajectories is collected. Finally, the algorithm is tested on a selection of continuous control tasks, with both linear and deep policies, and compared with state-of-the-art policy optimization methods.
Galleria 1, 2nd floor, room G1-204 @12:00