ENBIS-17 in Naples

9 – 14 September 2017; Naples (Italy) Abstract submission: 21 November 2016 – 10 May 2017

My abstracts


The following abstracts have been accepted for this event:

  • Estimation of Variance Components and Use of Tolerance Interval for Accuracy Measure on Assay Qualification and Validation

    Authors: Dan Lin (GSK), Bernard Francq (GSK), Walter Hoyer (GSK)
    Primary area of focus / application: Quality
    Secondary area of focus / application: Modelling
    Keywords: Assay qualification and validation, Intermediate precision, Accuracy, Total error, Tolerance interval, Graphical user interface
    Submitted at 6-Mar-2017 11:48 by Dan Lin
    11-Sep-2017 16:20 Estimation of Variance Components and Use of Tolerance Interval for Accuracy Measure on Assay Qualification and Validation
    During development of a vaccine, different analytical methods need to be developed for determining the antigen concentration, the (relative) potency, or the level of impurities in the produced vaccine batches. In this paper, we focus on evaluating two aspects of the desired method performance during assay development and validation: precision and accuracy. Precision is a measure of the variability in a series of measurements obtained from repeated samplings within and between assay runs. Historically, repeatability (intra-assay variability only) and intermediate precision (combined inter- and intra-assay variability) have been studied separately, see e.g. ICH guideline Q2(R1), for a number of samples spanning the intended working range of the analytical method. More recently, regulatory authorities expect a comprehensive approach including a variance decomposition that clearly distinguishes the different contributions to the total variability, as described in United States Pharmacopoeia chapter <1033>. A linear mixed model across all samples is used to estimate the variance components and to construct the total variability with its confidence interval. The particular emphasis of this presentation is the effect of pooling some factors in the statistical model, e.g. including a single factor “session” (i.e., the combination of “Day” and “Operator”) as opposed to the factors “Day”, “Operator”, and their interaction. Simulation results illustrate the underestimation of the intermediate precision in case of model misspecification. Accuracy is defined as the closeness between individual test results and the accepted reference value. Its measure should take into account the systematic error (namely trueness, i.e., the ratio between the mean test result and the true theoretical value) and the random error (precision).
    For this purpose, the tolerance interval for the accuracy measure will be used to capture the total error (trueness and precision) from the linear mixed model. An automated tool with a graphical user interface will be presented, in which a user can upload their validation data and receive a complete PDF report within seconds.
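The variance-decomposition idea can be illustrated with a minimal sketch. It assumes a balanced one-way random-effects design with a single "session" factor and uses ANOVA (method-of-moments) estimates rather than the linear mixed model of the abstract; the coverage factor k = 2 is a placeholder, not a proper tolerance-interval coefficient. All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
a, n = 100, 10                      # sessions, replicates per session (illustrative)
s2_between, s2_within = 2.0, 1.0    # true variance components
mu = 100.0                          # accepted reference value

sessions = rng.normal(0, np.sqrt(s2_between), size=a)
y = mu + sessions[:, None] + rng.normal(0, np.sqrt(s2_within), size=(a, n))

# ANOVA (method-of-moments) estimates of the variance components
grand = y.mean()
msb = n * ((y.mean(axis=1) - grand) ** 2).sum() / (a - 1)       # between-session MS
msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (a * (n - 1))
s2_w = msw                          # repeatability (within-session variance)
s2_b = max((msb - msw) / n, 0.0)    # between-session variance
s2_total = s2_b + s2_w              # intermediate precision, on the variance scale

# crude interval for the accuracy measure; k = 2 is a placeholder,
# not a proper tolerance-interval coefficient
k = 2.0
lo, hi = grand - k * np.sqrt(s2_total), grand + k * np.sqrt(s2_total)
```

Pooling "Day" and "Operator" into one "session" factor corresponds to collapsing two variance components into `s2_b` above; the abstract's point is that the wrong pooling can underestimate `s2_total`.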
  • Design of Experiments Used in Engineering Using Virtual Simulations

    Authors: Bryan Dodson (SKF Group Six Sigma), Giacomo Landi (SKF USA Inc.), Rene Klerx (SKF Group Six Sigma)
    Primary area of focus / application: Design and analysis of experiments
    Secondary area of focus / application: Six Sigma
    Keywords: Design of Experiments, Simulations, Engineering, Second order polynomials
    Submitted at 6-Mar-2017 12:12 by Rene Klerx
    11-Sep-2017 10:30 Design of Experiments Used in Engineering Using Virtual Simulations
    In recent years, the use of Design of Experiments (DOE) in engineering has increased steadily. While test rig experiments remain costly, advances in computer hardware and software have made virtual simulations a real possibility for many problems that could not be modelled before. The bearing industry is a notable example. Nowadays, many variables can be considered in a study, generating large amounts of data. When the underlying physical equations are not known, it is common practice to interpolate the DOE results assuming a second-order polynomial. While this is a reasonable assumption, it can also be misleading, and valuable information may be lost. In this paper, different interpolations are tested for common engineering equations, including higher-order polynomials, Fourier series and splines. Advantages and disadvantages of each alternative are discussed and guidelines are suggested.
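A toy example of why the second-order assumption can mislead: fitting a periodic response (a stand-in for an unknown engineering equation, not one of the paper's cases) with a quadratic versus a higher-order polynomial. The function and degrees are illustrative only.

```python
import numpy as np

# a periodic response as a stand-in for an unknown engineering equation
x = np.linspace(0, 2 * np.pi, 41)
y = np.sin(x)

# fit a second-order and a higher-order polynomial, compare fit quality
rmse = {}
for deg in (2, 8):
    coef = np.polyfit(x, y, deg)
    rmse[deg] = np.sqrt(np.mean((np.polyval(coef, x) - y) ** 2))
```

The quadratic leaves a large residual on such a response, while the higher-order fit captures it; splines or a Fourier basis would do even better here, which is the trade-off the paper examines.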
  • Clustering Time Series from Call Networks to Predict Churn

    Authors: María Óskarsdóttir (KU Leuven), Tine Van Calster (KU Leuven), Bart Baesens (KU Leuven), Wilfried Lemahieu (KU Leuven), Jan Vanthienen (KU Leuven)
    Primary area of focus / application: Mining
    Secondary area of focus / application: Business
    Keywords: Social network analytics, Time Series analysis, Time Series clustering, Call detail records, Churn
    Submitted at 6-Mar-2017 12:20 by María Óskarsdóttir
    13-Sep-2017 10:30 Clustering Time Series from Call Networks to Predict Churn
    Accurately predicting potential churners is important in fast-moving and saturated markets such as the telecommunication industry. Being able to identify behavioral patterns that lead to churn is equally important, because it allows the organization to make arrangements for retention in a timely manner. Moreover, previous research has shown that the decision to leave one operator for another is often influenced by the customer’s social circle. Therefore, features representing the churn status of a customer’s connections are usually good predictors of churn when it is treated as a binary classification problem, which is the traditional approach.
    We propose a method to discover common behavior among churn-prone telecom customers and to distinguish them from loyal customers. More precisely, we use call detail records (CDR) of the customers of a telecommunication provider to build call networks on a weekly basis over a period of six months. From each network, we extract features based on each customer’s connections within the network, resulting in individual time series of link-based measures. In order to identify common behavior, we then apply time series clustering techniques and assign the customers to different groups at subsequent time points. Finally, we analyze how the customers move between the clusters to identify frequent patterns, especially amongst churners.
    Our approach offers the possibility to discover behavioral patterns of potential churners, depending on the temporal aspect of phone usage as well as individual call networks. The result, once the patterns have been extracted, is a model that is simple to deploy and easily extensible.
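The clustering step can be sketched minimally with synthetic weekly "degree" series and a hand-rolled k-means; the actual study uses CDR-derived network features and more elaborate time-series clustering techniques, so everything below is an illustrative stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic weekly "degree" series: loyal customers stay flat,
# churn-prone customers show a declining number of connections
loyal = 10 + rng.normal(0, 0.5, size=(30, 26))
churn = 10 - 0.3 * np.arange(26) + rng.normal(0, 0.5, size=(30, 26))
X = np.vstack([loyal, churn])

# minimal k-means (k = 2) on the raw series, seeded with one
# customer from each group for a deterministic illustration
centers = X[[0, 30]].copy()
for _ in range(20):
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    centers = np.array([X[labels == j].mean(axis=0) for j in range(2)])
```

Tracking how customers move between such cluster labels at subsequent time points is what yields the behavioral patterns described above.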
  • Self-Starting Control Charts and their Accurate Run-Length Distributions

    Authors: Yoshiaki Kunikawa (Tokyo University of Science), Seiichi Yasui (Tokyo University of Science)
    Primary area of focus / application: Process
    Keywords: Short run process, Average run length, Standard deviation of run length, Recursive control limits
    Submitted at 6-Mar-2017 12:37 by Seiichi Yasui
    Accepted
    13-Sep-2017 10:10 Self-Starting Control Charts and their Accurate Run-Length Distributions
    When monitoring processes with control charts, it is necessary to estimate control limits. However, in small-lot production it is difficult to obtain sufficient data for accurate estimates of the control limits, because monitoring must start quickly. Thus, self-starting control charts have been proposed; many variants exist, including Shewhart, CUSUM, and EWMA types. The plotted values in these charts are constructed by applying the inverse cumulative normal distribution function to the probability integral transformation of the standardized residuals, which are obtained from the accumulated data that have been judged as in-control. We propose two new self-starting control charts in which the statistics for subgroups are plotted directly, without any transformation, as in conventional control charts. However, the control limits in our charts are obtained as a weighted average of the previous control limits.
    In general, the occurrence of an out-of-control signal for a given plotted point is not independent of the signals for previous points in self-starting control charts. The reason is that the data judged as in-control are incorporated into the latest control limits used to judge the next point. As a result, although simulation is commonly used to determine appropriate control limits, we are able to calculate the control limits of the proposed charts and of Q-type control charts analytically.
    In this paper, we demonstrate how to derive exact run-length distributions for the proposed control charts and for Q-type control charts (start-up Shewhart Xbar charts), which allows for proper control of the desired in-control ARL, and we evaluate their performance.
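For contrast with the self-starting case: when the control limits are known exactly, successive points signal independently and the run length is geometric, so the in-control ARL and SDRL follow in closed form. The sketch below computes these textbook values for a 3-sigma Shewhart chart; it is the baseline that fails for self-starting charts, where the dependence between signals makes the exact run-length derivation of the paper necessary.

```python
import math

# false-alarm probability of a 3-sigma Shewhart chart with known limits
p = 1 - math.erf(3 / math.sqrt(2))        # P(|Z| > 3), about 0.0027

# with independent points the run length RL is geometric:
# P(RL = n) = (1 - p)**(n - 1) * p
arl = 1 / p                               # in-control average run length
sdrl = math.sqrt(1 - p) / p               # standard deviation of run length
```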
  • Statistical Standards and Open Source Software: Synergies and Challenges

    Authors: Emilio L. Cano (University of Castilla-La Mancha), Matías Gámez (University of Castilla-La Mancha), Noelia García (University of Castilla-La Mancha)
    Primary area of focus / application: Economics
    Secondary area of focus / application: Quality
    Keywords: Standardisation, Quality control and improvement, Statistical software, ISO standards, SPC, Good practice
    Submitted at 6-Mar-2017 14:33 by Emilio L. Cano
    Accepted
    13-Sep-2017 09:40 Statistical Standards and Open Source Software: Synergies and Challenges
    Industry works with standards, rather than with textbooks. Thus, statistical software developed for industry should always follow industry standards. The statistical software R has seen an impressive increase in use during recent years, not only in research and academic environments but also in practical applications. However, commercial software is still dominant for quality control and improvement statistical methods. Even so, thanks to a handful of R-enthusiastic industry practitioners, some companies are starting to use R for those methods.
    Statistical methods for quality control and improvement are basically general statistical methods applied to industry. Thus, the use of base R in industrial environments should be natural. However, the lack of a seamless graphical user interface for standardized tasks is a barrier to the use of R by non-statistically-skilled professionals. We know that this is not optimal, but it is also true that, in many cases, the statistical tools are just standardized procedures, and for the sake of the business it is enough to interpret the outputs given the correct input.
    In this work, standards from ISO Technical Committee 69 (Applications of statistical methods) and from the AIAG are reviewed. An R infrastructure using the language and procedures of industry is proposed. The main result shows how Free and Open Source Software such as R and its ecosystem can fill the existing gaps, leading to innovation in business and industry. Furthermore, international standards, well known and accepted by industry, prove to be a catalyst for adopting R.
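As an example of the kind of standardized procedure such an infrastructure would wrap, here is a sketch of Xbar–R control limits using the usual tabulated constants for subgroups of five. It is written in Python purely for illustration (the abstract proposes an R infrastructure), and the data are synthetic.

```python
import numpy as np

# tabulated Shewhart constants for subgroup size 5 (standard SPC tables)
A2, D3, D4 = 0.577, 0.0, 2.114

rng = np.random.default_rng(1)
x = rng.normal(50, 2, size=(25, 5))             # 25 synthetic subgroups of 5

xbar = x.mean(axis=1).mean()                    # grand mean
rbar = (x.max(axis=1) - x.min(axis=1)).mean()   # mean subgroup range

# Xbar-chart and R-chart control limits
ucl_x, lcl_x = xbar + A2 * rbar, xbar - A2 * rbar
ucl_r, lcl_r = D4 * rbar, D3 * rbar
```

Because the procedure is fully standardized, a practitioner only needs to supply the subgroup data and interpret the resulting limits, which is the point made above about standardized tasks.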
  • Statistical Modelling, Semiconductor Process Optimization and Control: The Integrate Experience

    Authors: Giuseppe Garozzo (STMicroelectronics)
    Primary area of focus / application: Other: Monitoring and optimization of semiconductor processes
    Keywords: European project, Semiconductor factory, Process control, Process modelling, Dry etching
    Submitted at 6-Mar-2017 15:37 by Giuseppe Garozzo
    11-Sep-2017 17:30 Statistical Modelling, Semiconductor Process Optimization and Control: The Integrate Experience
    INTEGRATE is the name of a European project focused on factory operation methodologies, data acquisition and analysis concepts, factory information and control systems, and process data analysis from heterogeneous samples. STMicroelectronics was the national coordinator of the Italian consortium. The partners involved came from both academia and industry. From the academic partners, the semiconductor industry required expertise on (i) data management; (ii) statistical analysis; and (iii) the physical modelling underlying statistical inference.
    We report the main goals achieved through this synergy, and in particular our experience in process optimization (i.e. data sampling reduction, cycle time improvement) and control (i.e. out-of-specification reduction, yield enhancement). Finally, we present a case study where statistical inference and physical modelling interact to improve efficiency in process development time and cost.