ENBIS-17 in Naples

9 – 14 September 2017; Naples (Italy)
Abstract submission: 21 November 2016 – 10 May 2017

The following abstracts have been accepted for this event:

  • Limits to the Reliability of the Rasch Psychometric Model

    Authors: Leslie Pendrill (RI:SE Research Institutes of Sweden)
    Primary area of focus / application: Other: Reliability of Subjective Measurement Systems
    Keywords: Metrology, Qualitative, Social science, Rasch, Psychometrics, Reliability, Entropy
    Submitted at 1-Feb-2017 14:42 by Leslie Pendrill
    Accepted (view paper)
    12-Sep-2017 16:00 Limits to the Reliability of the Rasch Psychometric Model
    Metrological assurance of qualitative evaluations, such as those made by a person acting as a measurement instrument (e.g. in person-centred care), seems to be possible using a generalised linear model based specifically on the Rasch psychometric approach. Classical test theory and many of the usual tools of statistics cannot work reliably on the ordinal or nominal scales typical of person responses, rating questionnaires or other qualitative evaluations. This talk will recall how to establish (i) metrological references (item banks of task difficulty, for example) and (ii) uncertainty budgets for categorical data (using informational entropy). Thereafter, it will be demonstrated how the reliability of subjective measurement systems and the Rasch model can be evaluated with respect to measurement scale shift and scale stretching, using novel tools that analyse rating scores and logistic regression residuals. Examples will range from industrial parts inspection to cognitive assessment.
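The abstract above gives no formulas, so the following is only a minimal sketch of the two ingredients it names, under stated assumptions: the standard dichotomous Rasch model, in which the probability of a successful response depends on the difference between person ability and item difficulty (both in logits), and Shannon entropy as one possible reading of "informational entropy". The function names `rasch_probability` and `binary_entropy` are illustrative and do not come from the talk.

```python
import math


def rasch_probability(ability, difficulty):
    """Success probability under the dichotomous Rasch model.

    Assumption: ability and difficulty are on the same logit scale,
    so only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def binary_entropy(p):
    """Shannon entropy, in bits, of a binary outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


if __name__ == "__main__":
    # Hypothetical person/item pairs, chosen only for illustration.
    for ability, difficulty in [(0.0, 0.0), (1.5, 0.0), (-2.0, 1.0)]:
        p = rasch_probability(ability, difficulty)
        print(ability, difficulty, round(p, 3), round(binary_entropy(p), 3))
```

In this sketch the entropy is largest when ability equals difficulty, which is where a single response is least predictable.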
  • Measuring Uncertainties Through Uncertainties: A Theoretical Approach

    Authors: Filomena Maggino (Università di Firenze), Carolina Facioni (Istituto Nazionale di Statistica - ISTAT), Isabella Corazziari (Istituto Nazionale di Statistica - ISTAT)
    Primary area of focus / application: Other: Design of Experiment for Product Quality and Sustainability in Agri-Food Systems
    Keywords: Measuring trends, Measuring change, Dynamic factor analysis, Futures studies
    Submitted at 1-Feb-2017 17:47 by Filomena Maggino
    Accepted
    11-Sep-2017 10:50 Measuring Uncertainties Through Uncertainties: A Theoretical Approach
    What defines a work as “scientific”? The definition of “scientific” is far from unproblematic, and behind it lies a long debate that is perhaps not yet concluded; still, we can accept that a scientific work is characterized by a systematic, controlled, empirical, and critical approach. However, when our aim is to outline the possible developments of future events, we face a practical obstacle: we cannot have any empirical experience of the future. Must we therefore infer that forecasting, exploring the future or, better, exploring futures (Barbieri Masini, 2000) and anticipating futures (Arnaldi, Poli, 2012) are not to be considered scientific activities?
    Of course they are scientific, in a very particular and complex way that involves many sciences. For example, outlining possible future trends is a widely practiced exercise in statistics. It belongs to the field of inferential statistics, which aims to establish knowledge from data while taking into account the error associated with them. This kind of knowledge allows statistical forecasts and predictions to be made. One of the logical and instrumental concepts that allow trends to be read is that of change (and, of course, its opposite, stability), which is far from easy to define and manage through observed data (Maggino, Facioni, 2015). This is particularly true in the presence of complex phenomena, such as those that define and compose quality of life.
    Great help in understanding complexity and trends comes from the body of methods for analyzing multi-way or multi-mode data, developed extensively in the 1980s and 1990s. One of those methods, applied in many different fields (social, demographic, economic, environmental), is Dynamic Factor Analysis (DFA) (Coppi and Zannella, 1979; Corazziari, 1999), a method for multi-way data based on the joint application of a factorial analysis and regression over time. DFA considers a quantitative data array classified according to three criteria: statistical unit, quantitative variable and time of data collection.
    In Europe, the Futures Studies approach finds its theoretical basis in the philosophical reflections of the French thinker Bertrand de Jouvenel (de Jouvenel, 1964). The link between the philosophical theory of futures and its translation into the practice of social research can be found in de Jouvenel’s theorization of possible, probable, and desirable futures. How can we tell whether one probable future is more or less probable than a different hypothesis of the future? Answering such a difficult question requires a multidisciplinary approach, in which statistical models and the methodology of social science are enhanced in their ability to express change, and sometimes the risk that change itself implies.
  • Mining Operational Shipping Data for Insight into Fuel Consumption and Emissions

    Authors: Shirley Coleman (ISRU, Newcastle University), Kayvan Pazouki (MAST, Newcastle University), Rose Norman (MAST, Newcastle University), Ibna Zaman (MAST and ISRU, Newcastle University)
    Primary area of focus / application: Other: Big Data in Shipping
    Keywords: Natural variation, Demographics, Open data, Statistical Process Control, Weather, Business improvement, Data extraction
    Submitted at 4-Feb-2017 18:55 by Shirley Coleman
    Accepted (view paper)
    12-Sep-2017 14:30 Mining Operational Shipping Data for Insight into Fuel Consumption and Emissions
    Vast quantities of shipping data are generated by fleets of vessels and contain valuable operational information about the performance of each vessel. The shipping data can be integrated with observational reports from the crew, loading information, timings and global positioning system (GPS) details. The enriched data can then yield insight into the natural variation in fuel consumption, providing a basis for decisions about efficiencies, routes and emissions. This paper reports the process of extracting insight from multiple ferry journeys in the north of the UK and rationalises the systematic variation underlying the random variation in the datasets.
  • Aggregation Using Input-Output Tradeoff with Application to Wind Modeling

    Authors: Aurélie Fischer (Université Paris Diderot), Mathilde Mougeot (Université Paris Diderot)
    Primary area of focus / application: Other: Modeling, forecasting and risk evaluation of wind energy production
    Keywords: Classification, Regression estimation, Aggregation, Nonlinearity, Consistency, Wind energy
    Submitted at 8-Feb-2017 16:29 by Aurélie FISCHER
    Accepted (view paper)
    12-Sep-2017 10:10 Aggregation Using Input-Output Tradeoff with Application to Wind Modeling
    We introduce a new learning strategy based on an idea of Mojirsheibani (1999, 2000, 2002a, 2002b): this author proposed a method for combining several classifiers, relying on a consensus notion, which has been recently extended to the context of regression in Biau et al. (2016).
    In these approaches, an agreement condition between estimators has to be satisfied for all individual estimators, which can cause problems if one of the initial estimators is bad. In practice, a few disagreements are allowed; for the theoretical results to hold, the proportion of estimators satisfying the agreement condition is required to tend to 1.
    Here, we propose a modified procedure involving both these consensus ideas and the Euclidean distance between entries. This may be seen as an alternative approach that reduces the effect of a possibly bad estimator in the initial list.
    We show the consistency of this new strategy and propose an experimental study with application to wind modeling.

    Biau, G., Fischer, A., Guedj, B. & Malley, J. (2016). COBRA: A combined regression strategy, Journal of Multivariate Analysis, 146, 8-28.
    Mojirsheibani, M. (1999). Combining classifiers via discretization, Journal of the American Statistical Association, 94, 600-609.
    Mojirsheibani, M. (2000). A kernel-based combined classification rule, Statistics & Probability Letters, 48, 411-419.
    Mojirsheibani, M. (2002a). An almost surely optimal combined classification rule, Journal of Multivariate Analysis, 81, 28-46.
    Mojirsheibani, M. (2002b). A comparison study of some combined classifiers, Communications in Statistics - Simulation and Computation, 31, 245-260.
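The consensus idea described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: a calibration point is kept when at least a fraction `alpha` of the initial estimators give it a prediction within `eps` of their prediction at the query point, and the optional `radius` argument adds a hard Euclidean threshold on the inputs as one hypothetical way of combining the input and output conditions. The names `consensus_predict`, `eps`, `alpha` and `radius` are all introduced here for illustration.

```python
import math


def consensus_predict(x, X_cal, y_cal, estimators, eps, alpha=1.0, radius=None):
    """Average the responses of calibration points that 'agree' with x.

    x          : query point (sequence of floats)
    X_cal      : calibration inputs, one sequence of floats per point
    y_cal      : calibration responses
    estimators : callables mapping an input point to a prediction
    eps        : agreement tolerance on predictions
    alpha      : minimum fraction of estimators that must agree
    radius     : optional maximum Euclidean distance between inputs
                 (hypothetical hard-threshold variant)
    Returns None when no calibration point qualifies.
    """
    preds_x = [r(x) for r in estimators]
    kept = []
    for xi, yi in zip(X_cal, y_cal):
        # Count the estimators whose predictions at xi and x are close.
        n_agree = sum(abs(r(xi) - px) <= eps for r, px in zip(estimators, preds_x))
        if n_agree / len(estimators) < alpha:
            continue
        if radius is not None and math.dist(xi, x) > radius:
            continue
        kept.append(yi)
    return sum(kept) / len(kept) if kept else None


if __name__ == "__main__":
    # Toy data and toy estimators, chosen only for illustration.
    X_cal = [[0.0], [1.0], [2.0], [3.0]]
    y_cal = [0.0, 1.0, 4.0, 9.0]
    estimators = [lambda v: v[0], lambda v: 2.0 * v[0]]
    print(consensus_predict([2.0], X_cal, y_cal, estimators, eps=2.5))
    print(consensus_predict([2.0], X_cal, y_cal, estimators, eps=2.5, radius=0.5))
```

Setting `alpha` below 1 tolerates a few disagreeing estimators, which mirrors the practical relaxation mentioned in the abstract; the distance condition limits the influence of an estimator that agrees for the wrong reasons.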
  • High-Dimensional Copulas for Solving Unbalanced Classification Problems in Industrial Risk Mitigation

    Authors: Nicolas Bousquet (EDF R&D), Bertrand Iooss (EDF R&D)
    Primary area of focus / application: Mining
    Secondary area of focus / application: Reliability
    Keywords: Machine learning, Unbalanced classification problems, Parzen method, Copulas, Weak signals, Production reliability
    Submitted at 9-Feb-2017 10:11 by Nicolas Bousquet
    Accepted
    12-Sep-2017 10:50 High-Dimensional Copulas for Solving Unbalanced Classification Problems in Industrial Risk Mitigation
    Environmental risks that typically affect highly protected production plants include water shortages, floods, and clogging by plants or animals, among others. Such events are rare and, unfortunately, in many situations there is no clear understanding of the physical reasons why a particular phenomenon is an outlier or not.
    Expert knowledge of those events is expressed only through the choice of covariates, which are assumed to reflect a piece of true information about the regular phenomenon (e.g., a tide within an estuary), and through the assumption that the outlier is "explained" by the same physics, for objective reasons. The lack of knowledge and the need for industrial mitigation can lead to building a prognostic mechanism (computing, for instance, a probability of occurrence) based on the statistical classification of feared and non-feared events. In such unbalanced situations, the usual learning methods (SVM, random forests, etc.) give results that remain biased because of the very low number of feared events, and very simple empirical techniques (Parzen ratio) can do as well as these more elaborate approaches. Nonetheless, several techniques can be used to improve the ROC curve that characterizes a classification algorithm. Expert information can be incorporated by using high-dimensional copulas, which can outperform other methods. A running example, the massive clogging problem of French and English production plants, highlights the benefits of this approach.
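The abstract does not define its "Parzen ratio", so the sketch below shows only one plausible reading, labeled as an assumption: each class density is estimated with a Gaussian Parzen window, and the two estimates are combined with the empirical class frequencies to give a probability that an observation belongs to the feared class. It is one-dimensional for brevity, and the names `parzen_density`, `feared_probability` and `bandwidth` are illustrative.

```python
import math


def parzen_density(x, sample, bandwidth):
    """Gaussian Parzen-window density estimate at x from a 1-D sample."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm


def feared_probability(x, feared, regular, bandwidth):
    """Estimated probability that x belongs to the feared class.

    Assumption: class priors are the empirical class frequencies.
    Falls back to the prior when both density estimates underflow to zero.
    """
    prior = len(feared) / (len(feared) + len(regular))
    num = prior * parzen_density(x, feared, bandwidth)
    den = num + (1.0 - prior) * parzen_density(x, regular, bandwidth)
    return num / den if den > 0.0 else prior


if __name__ == "__main__":
    # Toy covariate values: few feared events, many regular ones.
    feared = [5.0, 5.2]
    regular = [-0.3, -0.1, 0.0, 0.1, 0.2, 0.4, 0.6, 0.9]
    for x in (0.0, 2.5, 5.1):
        print(x, feared_probability(x, feared, regular, bandwidth=0.5))
```

With so few feared events the estimate depends strongly on the bandwidth, which illustrates why the abstract looks to expert information for improvement.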
  • Data-Driven DoE: A Case Study

    Authors: Volker Kraft (SAS Institute / JMP Division)
    Primary area of focus / application: Other: Software
    Keywords: Data-driven DoE, LCD, Live demonstration, Software
    Submitted at 9-Feb-2017 12:20 by Volker Kraft
    Accepted
    11-Sep-2017 12:30 Data-Driven DoE: A Case Study
    Using statistically designed experiments (DoE) is the best approach to learning from data, since it has the potential to be both efficient and effective. However, realizing this potential relies on clearly articulating what you already know, and what you wish to learn. If they exist, observational data can help to establish what is ‘known’, but handling such data appropriately can be difficult. However, if these difficulties can be addressed, one can exploit more coherent cycles of learning that leverage data to the full - ‘Data-driven DoE’ for short.

    The case study will show an example of how Data-driven DoE was applied to improve a manufacturing process for liquid crystal displays (LCDs). Historically, the pigment milling step had caused many problems, with long mill times that were also extremely variable. Even though only a small and messy data set of historical process measurements was available, a significant improvement could be achieved in just one cycle of learning: situation appraisal, designed experiment, modeling and optimization. A live demonstration will show the challenges and solutions during all these steps.