ENBIS-17 in Naples

9 – 14 September 2017; Naples (Italy) Abstract submission: 21 November 2016 – 10 May 2017

The following abstracts have been accepted for this event:

  • Construction of Two-Level Factorial and Fractional Factorial Designs with Runs in Blocks of Size Two

    Authors: Janet Godolphin (University of Surrey, UK)
    Primary area of focus / application: Design and analysis of experiments
    Secondary area of focus / application: Education & Thinking
    Keywords: Factorial effect, Isomorphic, Confounding, Replicate generator, Fraction generator
    Submitted at 13-Mar-2017 12:13 by Janet Godolphin
    Accepted
    12-Sep-2017 15:10 Construction of Two-Level Factorial and Fractional Factorial Designs with Runs in Blocks of Size Two
    In many experiments involving factorial and fractional factorial designs, attention is focused on estimation of all main effects and two-factor interactions. The presentation concerns design construction when, due to practical constraints, runs are arranged in blocks of size two.

    For p factors, and M at least as large as a given function of p, a construction approach is provided which generates all designs in which M replicates are arranged in blocks of size two so that all main effects and two-factor interactions are estimable. The method incorporates recognition of isomorphic designs to avoid double counting. A design ranking that prioritises estimation of main effects is proposed to guide design selection. This is useful in practice since for some (p, M) combinations the number of designs is large (for example, for p=8 and M=4 there are 343 designs) and there can be considerable variation in the quality of estimation between designs.

    The full factorial designs can be used as a source of root designs for construction of designs in fractional replicates, again in blocks of size two. The method is illustrated by examples with up to p=15 factors.
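
    The estimability requirement above can be illustrated with the standard confounding argument for two-level designs. The sketch below is not the author's construction, ranking or isomorphism check. It assumes each replicate is split into blocks {x, x XOR g} by a single generator g, with runs, generators and effects encoded as bitmasks over the p factors. An effect is then confounded with blocks in a replicate exactly when it shares an even number of factors with g. All function names are hypothetical.

```python
from itertools import combinations

def blocks_of_size_two(p, g):
    """Split the 2**p runs (bitmasks) into blocks {x, x XOR g} for a nonzero generator g."""
    seen, blocks = set(), []
    for x in range(2 ** p):
        if x not in seen:
            y = x ^ g
            seen.update((x, y))
            blocks.append((x, y))
    return blocks

def sign(effect, run):
    """Contrast coefficient (+1 or -1) of a factorial effect at a run."""
    return -1 if bin(effect & run).count("1") % 2 else 1

def confounded(effect, g):
    """True if the effect contrast is constant within every block generated by g."""
    return bin(effect & g).count("1") % 2 == 0

def mains_and_2fi_estimable(p, generators):
    """True if every main effect and two-factor interaction is unconfounded
    in at least one replicate (one generator per replicate)."""
    mains = [1 << i for i in range(p)]
    two_fi = [(1 << i) | (1 << j) for i, j in combinations(range(p), 2)]
    return all(any(not confounded(e, g) for g in generators)
               for e in mains + two_fi)
```

    For example, under these assumptions `mains_and_2fi_estimable(3, [0b101, 0b110])` checks a candidate pair of replicates for p=3 factors.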
  • Nonnegative Matrix Factorization with Side Information for Time Series Recovery and Prediction

    Authors: Jiali Mei (EDF Lab & Université Paris-Sud), Yohann De Castro (Université Paris-Sud), Yannig Goude (EDF Lab & Université Paris-Sud), Jean-Marc Azaïs (Institut de Mathématiques Université de Toulouse), Georges Hébrail (EDF Lab)
    Primary area of focus / application: Other: French SFdS session on Computer experiments and energy
    Keywords: Electricity consumption, Matrix factorization, Optimization, Time Series analysis
    Submitted at 16-Mar-2017 14:39 by Jiali Mei
    Accepted (view paper)
    12-Sep-2017 12:10 Nonnegative Matrix Factorization with Side Information for Time Series Recovery and Prediction
    Motivated by the reconstruction and prediction of electricity consumption, we extend Nonnegative Matrix Factorization (NMF) to take into account outside features.
    We consider Nonnegative Matrix Factorization in general linear measurement schemes, and propose a general framework which models non-linear relationships between features and the response variables.
    We extend previous theoretical results in NMF to obtain a sufficient condition on the identifiability of matrix factorization.
    Based on the classical Hierarchical Alternating Least Squares (HALS) algorithm, we propose a new algorithm (HALSX, or Hierarchical Alternating Least Squares with eXogeneous variables) which estimates the factorization model.
    The algorithm is validated on both simulated and real electricity consumption data, to show its performance in reconstruction and prediction.
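
    For readers unfamiliar with the classical HALS algorithm mentioned above, a minimal sketch for plain NMF follows. It is not HALSX: it assumes a fully observed matrix V, uses no side information and no linear measurement operators, and the function name and parameters are hypothetical. Each column of W and each row of H is updated in turn by its closed-form least-squares minimizer, then clipped to stay nonnegative.

```python
import numpy as np

def hals_nmf(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a nonnegative n x m matrix V by W @ H with W, H >= 0 (classical HALS)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        # Update the columns of W one at a time, with H fixed.
        VHt = V @ H.T
        HHt = H @ H.T
        for k in range(rank):
            # Unconstrained minimizer for column k given all other columns.
            num = VHt[:, k] - W @ HHt[:, k] + W[:, k] * HHt[k, k]
            # Clip at a small positive floor to keep W nonnegative.
            W[:, k] = np.maximum(num / max(HHt[k, k], eps), eps)
        # Update the rows of H one at a time, with W fixed.
        WtV = W.T @ V
        WtW = W.T @ W
        for k in range(rank):
            num = WtV[k, :] - WtW[k, :] @ H + H[k, :] * WtW[k, k]
            H[k, :] = np.maximum(num / max(WtW[k, k], eps), eps)
    return W, H
```

    The one-factor-at-a-time updates are what distinguish HALS from alternating schemes that solve for all of W or all of H in a single constrained least-squares step.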
  • Solving Kalai-Smorodinski Equilibria Using Gaussian Process Regression

    Authors: Victor Picheny (INRA), Mickael Binois (Chicago Booth School of Business), Abderrahmane Habbal (Université Côte d'Azur)
    Primary area of focus / application: Other: ISBA session on Bayesian Optimization
    Secondary area of focus / application: Other: Bayesian Optimisation
    Keywords: Multi-objective optimization, Gaussian process, Game theory, Bayesian optimization
    Submitted at 20-Mar-2017 11:23 by Victor Picheny
    Accepted (view paper)
    12-Sep-2017 12:10 Solving Kalai-Smorodinski Equilibria Using Gaussian Process Regression
    Game theory arose from the need to model economic behavior, where the presence of multiple decision makers with antagonistic goals is a natural feature. Nowadays it finds a broad range of applications in machine learning and engineering. In this context, the Kalai-Smorodinski (KS) equilibrium is a particularly attractive concept, as it mixes game theory concepts with multi-criteria decision ones (in particular, Pareto-optimality). However, in a derivative-free, expensive, noisy black-box context (e.g. computer experiments), there is no algorithmic solution available to find KS equilibria. Here, we propose a novel Gaussian-process-based approach for finding KS equilibria, in the form of a Bayesian optimization algorithm, with sequential sampling decisions based on acquisition functions. Our approach is evaluated on several synthetic game problems with varying numbers of players and decision space dimensions, including a finite-element model for a Cauchy problem. We show that equilibria can be found reliably for a fraction of the cost (in terms of black-box evaluations) compared to classical, derivative-based algorithms, and illustrate how the KS solution is an attractive alternative to multi-objective optimization.
  • Analysing Ordered Categorical Data with the Generalized Taguchi’s Statistic

    Authors: Pietro Amenta (Department of Law, Economics, Management and Quantitative Methods, University of Sannio), Luigi D’Ambra (Department of Economics, Management and Institutions, University of Naples), Antonello D’Ambra (Department of Economics, University of Campania “Luigi Vanvitelli”), Anna Crisci (Department of Law and Economic Sciences, Pegaso Telematic University)
    Primary area of focus / application: Modelling
    Secondary area of focus / application: Mining
    Keywords: Ordered categorical data, Taguchi’s statistic, Data mining, Quantification process, Logistic model
    Submitted at 30-Mar-2017 18:24 by Pietro Amenta
    Accepted
    12-Sep-2017 09:20 Analysing Ordered Categorical Data with the Generalized Taguchi’s Statistic
    In industrial experiments for quality improvement the output often consists of categorical data with a clear ordering in the categories. This is due to the inherent nature of the quality characteristic or to the convenience of the measurement technique. A well-known example is the study of a polysilicon deposition process by Phadke (1989). Several techniques have been proposed for the analysis of ordered categorical data for quality improvement in industrial settings. We recall the proposals of Taguchi (1974), Nair (1986), Jeng & Guo (1996), Asiabar & Ghomi (2006), and Wu & Yeh (2006).
    A generalization of Taguchi’s statistic, which measures the association between a nominal explanatory variable and an ordered categorical response variable, is proposed here for analysing ordered categorical data in quality engineering. This new measure, which is also based on a quantification process for the ordered categories, is named the “Generalized Cumulative Chi-Squared Statistic” (GCCS), and a class of GCCS-type tests is also introduced. GCCS allows a graphical investigation of the optimal combination that takes into account both the ordinal nature of the variable and the quantification process of the ordered categories. We highlight that including the quantification process within the analysis is an often-overlooked aspect in the statistical literature.
    An empirical study based on industrial quality-improvement experiments has been carried out. It uses a strategy based on the joint use of the Generalized Taguchi’s statistic and the Logistic Model, which makes it possible to identify an optimal combination of factors, highlighting the levels that improve process quality.
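
    As background, the classical (non-generalized) Taguchi cumulative chi-squared statistic can be sketched as follows. This is not the GCCS proposed in the abstract and includes no quantification step. It assumes an I x J table of counts whose columns are the ordered response categories, and uses the usual weights 1/(d_s(1 - d_s)), where d_s is the cumulative column proportion up to category s. The function name is hypothetical.

```python
import numpy as np

def taguchi_statistic(table):
    """Classical Taguchi cumulative chi-squared statistic for an I x J count table
    with ordered columns. Assumes every row total, and the first and last column
    totals, are positive."""
    N = np.asarray(table, dtype=float)
    row_tot = N.sum(axis=1)
    n = N.sum()
    # Cumulative counts per row up to category s, for s = 1..J-1.
    Z = np.cumsum(N, axis=1)[:, :-1]
    # Cumulative column proportions d_s, for s = 1..J-1.
    d = np.cumsum(N.sum(axis=0))[:-1] / n
    w = 1.0 / (d * (1.0 - d))
    dev = Z / row_tot[:, None] - d[None, :]
    return float(np.sum(w[None, :] * row_tot[:, None] * dev ** 2))
```

    With these weights the statistic equals the sum of the Pearson chi-squared statistics of the J-1 tables of size I x 2 obtained by collapsing the response at each cut point.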
  • Training Data Scientists: Challenges and Issues

    Authors: Gilbert Saporta (CNAM)
    Primary area of focus / application: Education & Thinking
    Keywords: Data scientists, Shortage of talents, Life long learning, Teaching, Skills
    Submitted at 2-Apr-2017 12:11 by Gilbert Saporta
    Accepted (view paper)
    12-Sep-2017 12:20 Training Data Scientists: Challenges and Issues
    The Data Revolution leads to the creation of a large number of jobs, in particular those involved in big data analytics, namely the data scientists who are the heirs of statisticians and data miners. We first give some numerical information on the shortage of talents based on various recent sources.
    In a second part, we report the impact on training programs of the emergence of these new jobs, according to the qualities required. Finally, we address the training challenge: despite a remarkable development, initial training by universities will not be enough to provide quickly the thousands of specialists that are needed. Besides other solutions (bootcamps, online courses, etc.) we advocate lifelong training of scientists and engineers already in service in order to respond to the mass demand for data scientists. A particular experience is presented. The conclusion calls for learned societies to be concerned about the certification of data scientists.
  • Elevator Pitches & Co – Effectively Presenting Myself and my Work

    Authors: Kristina Lurz (prognostica GmbH), Andrea Ahlemeyer-Stubbe (Ahlemeyer-Stubbe Data Mining and More), Anja Zernig (KAI GmbH)
    Primary area of focus / application: Other: Young Statisticians Session
    Keywords: Effective presentation, Elevator pitch, Interactive session, Soft skills
    Submitted at 2-Apr-2017 20:39 by Kristina Lurz
    Accepted
    12-Sep-2017 11:40 Special Session: Young Statisticians Session
    As data scientists and statisticians, we are working in an interdisciplinary field that requires more than solely excellent knowledge about statistics. A crucial part of our work is the necessity to present ourselves and our work to other people - our colleagues during a talk at a conference, a person interested in one of our posters, a professor we would like to have as co-advisor for a PhD thesis or a future employer to whom we have to explain the advantages of employing a data scientist or statistician.
    In circumstances like these, the time available to make a lasting impression is often limited. Imagine that, at the beginning of a client meeting, we need to present ourselves and our work appealingly and efficiently. If we do not come to the point, everyone gets bored and loses interest. It is up to each of us to use our time in the spotlight as efficiently as possible.
    This Young Statisticians Session is an interactive session that consists of two interconnected parts: a presentation by an experienced professional statistician and consultant, Andrea Ahlemeyer-Stubbe, who talks about her experiences with respect to the do’s and don’ts in presenting oneself, as well as short example presentations by several colleagues. Together, we will develop ideas and rules to improve our presentations/posters/elevator pitches, using the framework of a so-called world café.
    If you would like to actively participate in the session by means of preparing a 3-minute presentation, please email us before the conference (kristina.lurz@prognostica.de). With your contribution, we are looking forward to a lively and informative session.