ENBIS-20 Online Conference

28 September – 1 October 2020; Online

Young Statistician Award

29 September 2020, 15:30 – 16:00



Raffaele Vitale, PhD, Postdoctoral Associate at KU Leuven, Belgium


Principal Component Analysis (PCA) and Partial Least Squares regression (PLS) are recognised as some of the most powerful approaches for analysing and interpreting multivariate data, especially in the field of industrial processes. However, strong non-linear relationships among objects and/or variables can be difficult to model with these methods. In such cases, a good alternative is offered by so-called kernel-based techniques, which have already been widely used in, e.g., chemistry and biology. Although kernel-based approaches cope easily with strong non-linearities in data, their main disadvantage is that information about the importance of the original variables in the final model is lost. Recently, the principles of non-linear bi-plots and so-called pseudo-sample projection, originally described by Gower and Harding in 1988, have been extended to overcome this limitation. Here, they will be adapted and exploited to enable kernel model interpretation. More specifically, this work will evaluate the power of kernel-based methodologies coupled with pseudo-sample projection in two scenarios of paramount importance for manufacturing industries: batch process monitoring and the analysis of mixture designs of experiments. The case studies presented will highlight how such a combination can be particularly useful in contexts where huge amounts of complex information are routinely collected (as in modern manufacturing) and can readily be used for a wide range of applications. Particular attention will be paid to some new intuitive graphical tools, based on the concept of pseudo-sample projection, implemented to support users in the complicated task of kernel model assessment, thus facilitating and accelerating decision making and troubleshooting. This provides a striking advantage over classical machine-learning techniques, which still suffer from the drawback of being fully black-box methodologies. This talk is based on joint work with Daniel Palací-López, Onno de Noord and Alberto Ferrer.
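
As a rough illustration of the idea of pseudo-sample projection, the sketch below fits a kernel PCA model and then projects, for one variable at a time, a set of artificial samples in which only that variable varies across its observed range while all others are held at their mean. The resulting trajectories in score space indicate how each original variable acts in the kernel model. This is not the speaker's implementation: the RBF kernel, the value of `gamma`, the synthetic data and all function names are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Squared Euclidean distances between every row of A and every row of B
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def fit_kernel_pca(X, gamma, n_components):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                            # kernel centered in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    alphas = eigvecs[:, order] / np.sqrt(eigvals[order])
    return {"X": X, "K": K, "gamma": gamma, "alphas": alphas,
            "scores": Kc @ alphas}

def project(model, X_new):
    # Kernel between new and training samples, centered like the training kernel
    K, K_new = model["K"], rbf_kernel(X_new, model["X"], model["gamma"])
    K_new_c = (K_new - K_new.mean(axis=1, keepdims=True)
               - K.mean(axis=0)[None, :] + K.mean())
    return K_new_c @ model["alphas"]

def pseudo_sample_trajectory(model, X, j, n_points=20):
    # Assumes X is mean-centered, so zero represents the average of a variable
    P = np.zeros((n_points, X.shape[1]))
    P[:, j] = np.linspace(X[:, j].min(), X[:, j].max(), n_points)
    return project(model, P)

# Synthetic, autoscaled data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)

model = fit_kernel_pca(X, gamma=0.5, n_components=2)
trajectories = [pseudo_sample_trajectory(model, X, j) for j in range(X.shape[1])]
```

Plotting each entry of `trajectories` on top of `model["scores"]` gives one curve per original variable, in the spirit of a non-linear bi-plot; the length and direction of a curve suggest how strongly, and in which direction, that variable drives the scores.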


Dr. Raffaele Vitale graduated in Analytical Chemistry in 2011 (Università di Roma “La Sapienza”, Italy) and obtained his Ph.D. in Statistics and Optimization in 2017 (Universitat Politècnica de València, Spain) with a thesis entitled “Novel chemometric proposals for advanced multivariate data analysis, processing and interpretation”. He is currently working as a Postdoctoral Associate at KU Leuven in Belgium in the framework of the ADGut project, which investigates the developmental mechanisms of Alzheimer’s disease. Raffaele is the author of 23 peer-reviewed publications and was awarded the Best Italian Master’s Thesis in Analytical Chemistry prize in 2012, the International Association of Spectral Imaging student prize in 2016, the V Siemens Process Analytics Prize for Young Scientist in 2017 and the III Jean-Pierre Huvenne Award for the Best Ph.D. Thesis in Chemometrics in 2019.
