ENBIS-17 in Naples, 9 – 14 September 2017; Naples (Italy). Abstract submission: 21 November 2016 – 10 May 2017
The following abstracts have been accepted for this event:
Limits to the Reliability of the Rasch Psychometric Model
Authors: Leslie Pendrill (RI:SE Research Institutes of Sweden)
Primary area of focus / application: Other: Reliability of Subjective Measurement Systems
Keywords: Metrology, Qualitative, Social science, Rasch, Psychometrics, Reliability, Entropy
Submitted at 1-Feb-2017 14:42 by Leslie Pendrill
Accepted
Measuring Uncertainties Through Uncertainties: A Theoretical Approach
Authors: Filomena Maggino (Università di Firenze), Carolina Facioni (Istituto Nazionale di Statistica - ISTAT), Isabella Corazziari (Istituto Nazionale di Statistica - ISTAT)
Primary area of focus / application: Other: Design of Experiment for Product Quality and Sustainability in Agri-Food Systems
Keywords: Measuring trends, Measuring change, Dynamic factor analysis, Futures studies
Submitted at 1-Feb-2017 17:47 by Filomena Maggino
Of course, they are, in a very particular and complex way, which involves many sciences. For example, outlining possible future trends is a widely practiced exercise in statistics. It belongs to the field of inferential statistics, which aims to establish knowledge from data while taking into account the error associated with them. This kind of knowledge allows statistical forecasts and predictions to be made. One of the logical and instrumental concepts that allow trends to be read is that of change - and, of course, its opposite, stability - which is far from easy to define and manage through observed data (Maggino, Facioni, 2015). This is particularly true in the presence of complex phenomena, such as those that define and compose, e.g., the quality-of-life topic.
Considerable help in understanding complexity and trends comes from the body of methods for analyzing multi-way or multi-mode data, developed extensively in the 1980s and 1990s. One of those methods, applied in many different fields (social, demographic, economic, environmental), is Dynamic Factor Analysis (DFA) (Coppi and Zannella, 1979; Corazziari, 1999), a method for multi-way data based on the joint application of a factorial analysis and a regression over time. DFA considers a quantitative data array classified according to three criteria: statistical unit, quantitative variable and time of data collection.
In Europe, the Futures Studies approach finds its theoretical basis in the philosophical reflections of the French thinker Bertrand de Jouvenel (de Jouvenel, 1964). We can find a link between the philosophical theory of futures and its translation into the practice of social research in de Jouvenel’s theorization of possible, probable, and desirable futures. How can we understand whether one probable future is more - or less - probable than a different hypothesized future? Answering such a difficult question requires a multidisciplinary approach, in which statistical models and the methodology of social science are enhanced in their ability to express change - and sometimes the risk that change itself implies.
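As a rough illustration of the three-way data layout and of the regression-over-time ingredient only, the sketch below stores the data as data[unit][variable][time], averages each variable over units at every time point, and fits a least-squares linear trend to those averages. The function names, the toy numbers and the choice of a linear trend on the time means are assumptions made for illustration; the factorial-analysis step is omitted, and this is not the DFA formulation of the cited authors.

```python
def time_means(data, variable):
    """Mean of one variable across units, for each time point.

    data is a nested list indexed as data[unit][variable][time]
    (hypothetical layout chosen for this sketch).
    """
    n_units = len(data)
    n_times = len(data[0][variable])
    return [
        sum(data[u][variable][t] for u in range(n_units)) / n_units
        for t in range(n_times)
    ]


def linear_trend(values):
    """Least-squares intercept and slope of values against time 0, 1, 2, ...

    Requires at least two time points.
    """
    n = len(values)
    times = list(range(n))
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_bar - slope * t_bar
    return intercept, slope


# Toy array: 2 units, 2 variables, 3 time points (made-up numbers).
data = [
    [[1.0, 2.0, 3.0], [5.0, 5.0, 5.0]],
    [[3.0, 4.0, 5.0], [7.0, 7.0, 7.0]],
]
means_var0 = time_means(data, 0)
intercept0, slope0 = linear_trend(means_var0)
```

In this toy case the first variable grows steadily over time while the second stays flat, which is the kind of change-versus-stability contrast the trend component is meant to expose.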
Mining Operational Shipping Data for Insight into Fuel Consumption and Emissions
Authors: Shirley Coleman (ISRU, Newcastle University), Kayvan Pazouki (MAST, Newcastle University), Rose Norman (MAST, Newcastle University), Ibna Zaman (MAST and ISRU, Newcastle University)
Primary area of focus / application: Other: Big Data in Shipping
Keywords: Natural variation, Demographics, Open data, Statistical Process Control, Weather, Business improvement, Data extraction
Submitted at 4-Feb-2017 18:55 by Shirley Coleman
Accepted
Aggregation Using Input-Output Tradeoff with Application to Wind Modeling
Authors: Aurélie Fischer (Université Paris Diderot), Mathilde Mougeot (Université Paris Diderot)
Primary area of focus / application: Other: Modeling, forecasting and risk evaluation of wind energy production
Keywords: Classification, Regression estimation, Aggregation, Nonlinearity, Consistency, Wind energy
Submitted at 8-Feb-2017 16:29 by Aurélie FISCHER
Accepted
In these approaches, some agreement condition between estimators has to be satisfied for all individual estimators, which could lead to problems if there is a bad initial estimator. In practice, a few disagreements are allowed; for establishing the theoretical results, the proportion of estimators satisfying the agreement condition is required to tend to 1.
Here, we propose a modified procedure involving both these consensus ideas and the Euclidean distance between entries. This may be seen as an alternative approach that reduces the effect of a possibly bad estimator in the initial list.
We show the consistency of this new strategy and present an experimental study with application to wind modeling.
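One possible reading of such a procedure can be sketched as follows. A training point contributes to the prediction at a query x only if it is close to x in a combined sense: the squared Euclidean distance between inputs, scaled by alpha, plus the squared differences between the estimators' predictions, scaled by beta, must not exceed 1. The function name, the parameters alpha and beta, the 0/1 weighting and the fallback rule are all assumptions made for illustration, not the authors' exact rule.

```python
import math


def aggregate(x, train_x, train_y, estimators, alpha, beta):
    """Average the responses of training points close to x in inputs and outputs.

    alpha scales the distance between inputs, beta scales the disagreement
    between each estimator's prediction at the training point and at x.
    Both are hypothetical tuning parameters for this sketch.
    """
    preds_x = [f(x) for f in estimators]
    selected = []
    for xi, yi in zip(train_x, train_y):
        input_term = (math.dist(xi, x) / alpha) ** 2
        output_term = sum(
            ((f(xi) - p) / beta) ** 2 for f, p in zip(estimators, preds_x)
        )
        if input_term + output_term <= 1.0:
            selected.append(yi)
    if not selected:
        # Assumed fallback: plain average of the individual predictions.
        return sum(preds_x) / len(preds_x)
    return sum(selected) / len(selected)


# Toy example: one reasonable estimator and one poor constant estimator.
train_x = [(0.0,), (0.1,), (1.0,), (2.0,)]
train_y = [0.0, 0.2, 2.0, 4.0]
estimators = [lambda x: 2.0 * x[0], lambda x: 5.0]
prediction = aggregate((0.05,), train_x, train_y, estimators, alpha=0.5, beta=1.0)
```

In the toy example the constant estimator carries no information, and the input-distance term is what keeps distant training points out of the average.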
Biau, G., Fischer, A., Guedj, B. & Malley, J. (2016). COBRA: A combined regression strategy, Journal of Multivariate Analysis, 146, 8-28.
Mojirsheibani, M. (1999). Combining classifiers via discretization, Journal of the American Statistical Association, 94, 600-609.
Mojirsheibani, M. (2000). A kernel-based combined classification rule, Statistics & Probability Letters, 48, 411-419.
Mojirsheibani, M. (2002). An almost surely optimal combined classification rule, Journal of Multivariate Analysis, 81, 28-46.
Mojirsheibani, M. (2002). A comparison study of some combined classifiers, Communications in Statistics - Simulation and Computation, 31, 245-260.
High-Dimensional Copulas for Solving Unbalanced Classification Problems in Industrial Risk Mitigation
Authors: Nicolas Bousquet (EDF R&D), Bertrand Iooss (EDF R&D)
Primary area of focus / application: Mining
Secondary area of focus / application: Reliability
Keywords: Machine learning, Unbalanced classification problems, Parzen method, Copulas, Weak signals, Production reliability
Submitted at 9-Feb-2017 10:11 by Nicolas Bousquet
Expert knowledge on those events is expressed only through the choice of covariates, which are assumed to reflect a piece of true information about the regular phenomenon (e.g., a tide within an estuary), and through the assumption that the outlier is "explained" by the same physics, for objective reasons. The lack of knowledge and the need for industrial mitigation can lead to building a prognostic mechanism (computing, for instance, a probability of occurrence) based on the statistical classification of feared and non-feared events. In such unbalanced situations, usual learning methods (SVM, random forests, etc.) provide a result that remains biased because of the very low number of feared events, and very simple empirical techniques (Parzen ratio) can do as well as these more elaborate approaches. Nonetheless, several techniques can be used to improve the ROC curve that characterizes a classification algorithm. The incorporation of expert information can be addressed by using high-dimensional copulas, which can outperform other methods. A running example, the massive clogging problem of French and English production plants, highlights the benefits of this approach.
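For readers unfamiliar with the Parzen ratio, a minimal one-dimensional sketch is given below. Each class density is estimated with a Gaussian kernel (Parzen window) estimator, and the two estimates are combined with the empirical class proportions into a probability of a feared event. The function names, the single covariate, the shared bandwidth and the toy numbers are assumptions made for illustration; the abstract does not specify these details, and the copula-based approach is not shown.

```python
import math


def parzen_density(x, sample, bandwidth):
    """Gaussian-kernel (Parzen window) density estimate at x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
    )


def parzen_posterior(x, feared, regular, bandwidth):
    """Estimated probability that x belongs to the feared class.

    Uses the empirical class proportions as priors; if both density
    estimates underflow to zero, the prior is returned.
    """
    prior_feared = len(feared) / (len(feared) + len(regular))
    num = prior_feared * parzen_density(x, feared, bandwidth)
    den = num + (1.0 - prior_feared) * parzen_density(x, regular, bandwidth)
    return num / den if den > 0.0 else prior_feared


# Toy unbalanced sample: 2 feared events against 8 regular ones (made-up values).
feared = [5.0, 5.5]
regular = [0.0, 0.2, -0.1, 0.1, 0.3, -0.3, 0.05, -0.2]
p_high = parzen_posterior(5.2, feared, regular, bandwidth=0.5)
p_low = parzen_posterior(0.0, feared, regular, bandwidth=0.5)
```

With so few feared events, the estimate depends heavily on the bandwidth, which is one way to see why such classifiers remain fragile in unbalanced settings.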
Data-Driven DoE: A Case Study
Authors: Volker Kraft (SAS Institute / JMP Division)
Primary area of focus / application: Other: Software
Keywords: Data-driven DoE, LCD, Live demonstration, Software
Submitted at 9-Feb-2017 12:20 by Volker Kraft
The case study will show an example of how Data-driven DoE was applied to improve a manufacturing process for liquid crystal displays (LCDs). Historically, the pigment milling step had caused many problems, with long mill times that were also extremely variable. Even though only a small and messy data set of historical process measurements was available, a significant improvement could be achieved in just one cycle of learning: situation appraisal, designed experiment, modeling and optimization. A live demonstration will show the challenges and solutions during all these steps.