ENBIS-17 Pre-Conference on Big Data in Business and Industry - ECAS-ENBIS Summer Course

9 – 10 September 2017; Scuola per l’Alta formazione of the University of Naples "L’Orientale"

9/09/2017 (9.00-13.00)
Instructor: G. Barcaroli (Italian National Statistical Institute-Istat, Italy)
Course 1: Improving the Quality of Official Statistics by Using Alternative Data Sources: The Istat Experience

The goal of this course is to illustrate, with reference to a current survey (ICT usage and e-commerce in enterprises), how the use of new sources of data (Internet data) in combination with survey data can increase the quality of the estimates produced by the survey and enrich the information available in a Business Register. Advanced techniques for web scraping, text processing and machine learning will be presented.


9/09/2017 (14.00-17.00)
Instructors: M. Vives-Mestres (University of Girona, Spain) and R. Kenett (KPA Group and University of Turin, Italy)
Course 2: Association Rules Analysis Using Compositional Data Methods

In this course, we will illustrate the application of compositional data analysis methods to text analysis. To provide context, the course will start with an introduction to compositional data methods and a review of other applications, such as the analysis of survey data and geophysical data.


10/09/2017 (9.00-13.00)
Instructors: R. Kenett (KPA Group and University of Turin, Italy) and Marco Reis (University of Coimbra, Portugal)
Course 3: Big Data and Industrial Statistics

In this course, we will introduce the application of big data analysis methods within the context of industrial statistics. In terms of applications, special attention will be given to manufacturing 4.0 environments and other sensor-driven processes. The methodologies covered include multivariate process control and Bayesian network applications.


10/09/2017 (14.00-17.00)
Instructor: A. Fassò (University of Bergamo, Italy)
Course 4: Multivariate Spatio-Temporal Methods for Large Datasets

The course will introduce concepts and modelling tools for large vector-valued and function-valued spatio-temporal datasets. In particular, for data in continuous space (geostatistics), spatial covariance functions for large datasets are considered, including tapering and dimension reduction techniques. Using these, spatio-temporal models for data in continuous space and discrete time will be developed using the Dynamic Coregionalization Model (DCM), with application to human exposure to air pollution. Functional data analysis is then introduced as a powerful tool for handling advanced technological data. Recent advanced applications will be discussed, including functional control charts with applications to technological data in the form of profiles. Moreover, a functional DCM will be developed, and kriging for functional data indexed in space and time will be illustrated for earth observing systems used in climate monitoring.