ENBIS-8 in Athens, 21 – 25 September 2008
Abstract submission: 14 March – 11 August 2008
Keynote by Jeroen de Mast: Exploratory data analysis in quality improvement projects
23 September 2008, 14:00 – 14:50
The keynote will be given by Jeroen de Mast of the Institute for Business and Industrial Statistics of the University of Amsterdam (IBIS UvA).
- Submitted by
- Jeroen de Mast
- IBIS UvA
Compared to the vast literature on confirmatory data analysis (hypothesis testing, estimation, modeling), the literature on exploratory data analysis (EDA) is far less elaborate, both in the sheer volume of texts devoted to the subject and in the precision and depth of its theoretical development. Sometimes, EDA is even described as an art rather than a science.
In this presentation I will present a number of explicit principles for EDA that can be taught to practitioners and statisticians to help them master this art faster. The framework is developed on the basis of a large number of real-life applications. The purpose and process of EDA are defined and contrasted with the purpose and process of confirmatory data analysis and descriptive data analysis.
In the process of EDA, three steps are discerned: display the data, identify salient features, and interpret salient features. The details of each of these steps are elaborated, and I will present the underlying principles, such as Shewhart’s assignable causes, the maximum entropy principle, abduction, and explanatory coherence. Furthermore, the roles of probabilistic reasoning and automatic statistical procedures in EDA are discussed. Finally, I will place EDA in the wider context of hypothesis and idea generation, a discipline that is studied in philosophy of science (discovery), the cognitive sciences (problem solving), and the medical sciences (diagnosis). We will study what approaches for hypothesis generation there are besides EDA, and we will analyse how EDA compares to these other approaches.
The resulting framework provides structure and practical advice that facilitate the teaching of EDA to practitioners and statisticians alike. The precise definitions, delineations and references to relevant scientific disciplines help further the theoretical understanding and development of EDA.