Overview of all Abstracts

The following PDF contains the abstract book as it will be handed out at the conference. It is provided here for browsing and for later reference. All abstracts as PDF


The following abstracts have been accepted for this event:

  • The ENBIS papers database

    Authors: Christopher McCollin (Nottingham Trent University, Nottingham, UK)
    Submitted at 25-Jun-2007 10:09
    Accepted
    All the details of ENBIS papers, such as author, organisation, title and main
    content, were compiled into a single Excel spreadsheet to derive some preliminary
    results on the final take-up of presented papers, the main authors, the main
    subject headings, and so on. These results will be presented together with the
    present state of the database and the scope for future work.
  • Six Sigma, the good, the bad and the very bad

    Authors: Jonathan Smyth-Renshaw
    Submitted at 28-Jun-2007 13:13
    Accepted
    This presentation will examine my personal view of Six Sigma. On my last visit to an ENBIS conference I heard a very negative presentation on Six Sigma. This was a concern, since, at long last, business is starting to wake up to the power of data, the use of statistics and the importance of data management. I wish to present my art gallery of Six Sigma images: the good, the bad and the very bad.
  • Analysis of Repeated Measures Data that are Autocorrelated at Lag(k)

    Authors: Serpil Aktas, Melike Kaya (Hacettepe University, Ankara, Turkey)
    Submitted at 28-Jun-2007 13:34 by Serpil Aktas
    Accepted
    In repeated measures analysis, several measurements are taken on the same
    experimental unit. The subjects are assumed to be drawn as a random sample from a
    homogeneous population, and observations of a variable are repeated, usually over
    time. When data are taken in sequence, they tend to be serially correlated; that
    is, current measurements are correlated with past measurements. In a repeated
    measures design, within-subject measurements are likely to be correlated, whereas
    between-subject measurements are likely to be independent.
    Suppose that Y1, Y2, ..., Yt are random variables taken at t successive time
    points. Serial dependence can occur between Yt and Yt-1. The corresponding
    correlation coefficients are called autocorrelation coefficients, and the distance
    between the observations that are so correlated is referred to as the lag. The
    covariance structure of repeated measures involves both between-subject and
    within-subject components. Usually, the between-subject errors are assumed
    independent and the within-subject errors are assumed correlated. When the
    analysis of variance finds significant differences between the factors, multiple
    comparison tests are used. In these procedures the standard error of the mean is
    estimated by dividing MSwithin from the entire ANOVA by the number of observations
    in the group and taking the square root of that quantity; however, the standard
    error of the mean needs an autocorrelation correction when the data are
    autocorrelated. In this study, a simulation study was performed to illustrate the
    behavior of post hoc procedures when data are lag(k) autocorrelated, and the
    results were compared to the usual procedures.
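
    As a concrete illustration of the correction discussed above, the following Python
    sketch (illustrative only: the function names and the truncation lag are
    assumptions, and the single-series variance stands in for the MSwithin of the
    ANOVA setting) compares the usual standard error of the mean with a lag-k
    autocorrelation-corrected version:

        import numpy as np

        def sample_autocorr(y, k):
            # Sample autocorrelation of the series y at lag k (k >= 1).
            y = np.asarray(y, float)
            yc = y - y.mean()
            return (yc[:-k] * yc[k:]).sum() / (yc ** 2).sum()

        def sem_naive(y):
            # Usual standard error of the mean: sqrt(s^2 / n).
            y = np.asarray(y, float)
            return np.sqrt(y.var(ddof=1) / y.size)

        def sem_autocorrected(y, max_lag):
            # Variance of the mean of n serially correlated observations:
            # Var(ybar) = (s^2 / n) * [1 + 2 * sum_{k=1}^{K} (1 - k/n) * rho_k],
            # which reduces to the usual s^2 / n when all rho_k are zero.
            y = np.asarray(y, float)
            n = y.size
            infl = 1.0 + 2.0 * sum((1 - k / n) * sample_autocorr(y, k)
                                   for k in range(1, max_lag + 1))
            return np.sqrt(y.var(ddof=1) / n * infl)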
  • Robust elimination of atypical data points in small samples and high dimensions

    Authors: Florian Sobieczky, Birgit Sponer and Gerhard Rappitsch
    Submitted at 6-Jul-2007 16:46 by Gerhard Rappitsch
    Accepted
    A method of eliminating observations with low statistical depth is proposed, leading to improved affine-invariant location estimation. The technique particularly addresses the situation of small samples and high dimensionality of the estimation space, a setting in which the conventional notion of an outlier is not appropriate. Removal of atypical observations is achieved by pruning the longest branches of a spanning tree of the sample; the tree depends on the statistical depth of the observations. If halfspace depth is chosen as the relevant statistical depth function, the algorithm inherits the characteristic robustness and high-breakdown properties [see D. Donoho and M. Gasko] while being highly efficient in high dimensions [see P. J. Rousseeuw and A. Struyf]. However, it goes beyond the depth-trimming discussed recently in the literature [see Y. Zuo] and thereby gains the essential feature for successfully processing small samples. The proposed method is validated by testing a set of multivariate distributions (e.g. multinormal and t-distribution) and comparing the higher-order moments before and after elimination. The impact of the proposed methodology is shown for industrial examples in a production environment where early elimination of atypical observations is important for further statistical post-processing.

    In particular, we demonstrate the improvement in the case of correlation
    estimation for various multivariate distributions. For this application, special
    attention has to be paid to the influence of atypical observations on the geometry
    of the estimated contour lines of the underlying density. Further applications are
    shown from the semiconductor industry to investigate the correlation of
    electrically measured performance parameters after fabrication (e.g. threshold
    voltage) and inline measurements of process parameters (e.g. oxide thickness).

    D. L. Donoho, M. Gasko: "Breakdown properties of location estimates based on
    halfspace depth and projected outlyingness", Annals of Statistics, Vol. 20, No. 4,
    pp. 1803-1827, 1992.

    P. J. Rousseeuw, A. Struyf: "Computing location depth and regression depth in
    higher dimensions", Statistics and Computing, Vol. 8, pp. 193-203, 1998.

    Y. Zuo: "Multidimensional trimming based on projection depth", Annals of
    Statistics, Vol. 34, No. 5, pp. 2211-2251, 2006.
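
    The following Python sketch conveys the general idea only, not the authors'
    implementation: halfspace depth is approximated by random projections, a spanning
    tree is grown with depth-inflated edge lengths, and the leaves on the longest
    branches are pruned. The depth approximation, the depth-to-weight mapping and all
    names are assumptions made for illustration.

        import numpy as np
        from scipy.spatial.distance import pdist, squareform
        from scipy.sparse.csgraph import minimum_spanning_tree

        def approx_halfspace_depth(X, n_dir=500, seed=0):
            # Random-projection approximation of Tukey (halfspace) depth: for
            # each direction, the fraction of points at or beyond each point's
            # projection; the depth is the minimum over directions.
            rng = np.random.default_rng(seed)
            n, d = X.shape
            depth = np.ones(n)
            for _ in range(n_dir):
                r = (X @ rng.normal(size=d)).argsort().argsort()  # ranks 0..n-1
                depth = np.minimum(depth, (n - r) / n)
            return depth

        def prune_atypical(X, n_remove=2, n_dir=500):
            # Inflate distances at shallow (low-depth) points so that branches
            # reaching atypical observations become the longest, then cut the
            # leaves attached by the heaviest edges of the spanning tree.
            depth = approx_halfspace_depth(X, n_dir)
            D = squareform(pdist(X)) / np.minimum.outer(depth, depth)
            T = minimum_spanning_tree(D).toarray()
            T = T + T.T                                   # symmetrise
            leaves = np.where((T > 0).sum(axis=1) == 1)[0]
            worst = leaves[np.argsort(-T[leaves].max(axis=1))][:n_remove]
            keep = np.ones(len(X), bool)
            keep[worst] = False
            return X[keep], worst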

  • A comparison of neural network and control charting for monitoring profiles in manufacturing processes

    Authors: M. Pacella (1) and Q. Semeraro (2)
    Submitted at 10-Jul-2007 18:22 by Massimo Pacella
    Accepted
    The issue of monitoring profiles has been identified as one of the most promising areas of research in statistical process control. One immediate difficulty is how to characterize a profile: identifying a suitable statistical model may prove more difficult than expected, and thus becomes an obstacle to introducing profile monitoring in actual applications. For example, when a profile represents the physical dimensions of a machined surface, as arises in manufacturing applications, measurement data often exhibit complex spatial correlation.
    The aim of this work is to explore a different approach to monitoring profiles, based on the Adaptive Resonance Theory (ART) neural network. The implementation of this neural network rests on a set of profiles which are representative of the process in its natural, or in-control, state.
    Throughout the paper, a real case study of profile data obtained from a common machining process is used. With reference to Phase II of profile monitoring, the performance of the proposed approach is compared with that of multivariate control charting of the parameter vector. Although the proposed neural network does not always outperform the control chart, it presents comparable performance in several cases. The main advantage of the approach is that the model of the profile data is “autonomously” derived by the neural network, without requiring any further intervention by the quality practitioner. This feature may create an important bridge between profile monitoring and the quality monitoring of several specifications in actual applications.

    Affiliations:

    (1) Università del Salento, Dipartimento di Ingegneria dell'Innovazione, Lecce, Italy
    (2) Politecnico di Milano, Dipartimento di Meccanica, Milano, Italy
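
    A heavily reduced Python sketch of the ART principle the approach builds on (not
    the network used in the study; profiles are assumed pre-scaled to [0, 1] and all
    parameters are illustrative): prototypes of in-control profiles are learned by
    resonance, and a new profile that matches no prototype at the vigilance level is
    signalled.

        import numpy as np

        class ARTProfileMonitor:
            def __init__(self, vigilance=0.9, learning_rate=0.5):
                self.rho = vigilance
                self.beta = learning_rate
                self.prototypes = []

            @staticmethod
            def _code(profile):
                # Complement coding of a profile scaled to [0, 1].
                p = np.asarray(profile, float)
                return np.concatenate([p, 1.0 - p])

            def _match(self, x, w):
                # Fuzzy-ART match function: |min(x, w)| / |x|.
                return np.minimum(x, w).sum() / x.sum()

            def train(self, profiles):
                # Learn prototypes from in-control profiles only.
                for p in profiles:
                    x = self._code(p)
                    for i, w in enumerate(self.prototypes):
                        if self._match(x, w) >= self.rho:
                            # Resonance: move the prototype towards the input.
                            self.prototypes[i] = (self.beta * np.minimum(x, w)
                                                  + (1 - self.beta) * w)
                            break
                    else:
                        self.prototypes.append(x)

            def in_control(self, profile):
                x = self._code(profile)
                return any(self._match(x, w) >= self.rho for w in self.prototypes)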

  • Software reliability growth models: systematic descriptions and implementations

    Authors: Ed Brandt, Isaac Corro Ramos (corresponding author), Alessandro Di Bucchianico and Rob Henzen
    Submitted at 30-Jul-2007 15:01 by Isaac Corro Ramos
    Accepted
    We present a systematic approach to software reliability models based on best practices from statistics. Basic steps in a statistical analysis of software reliability data should include data collection, trend tests, model selection, model estimation, model validation and model interpretation. Several problems arise when we try to meet these standards. Normally, the assumptions of independent and identically distributed observations are broken by software reliability models, so standard results from statistics cannot be used, although this is often done. Imprecise mathematical descriptions of the models are usually found in the literature. Even if the model description is correct, we often find a lack of attention to numerical instabilities in parameter estimation.

    We also report on the status of a new tool that we are developing to support our systematic approach. Existing tools for software reliability analysis like Casre and Smerfs3 do not make full use of state-of-the-art statistical methodology or do not conform to best practices in statistics. Our tool uses well-documented state-of-the-art algorithms and encourages applying best practices from statistics. Moreover, it can easily be extended to incorporate new models. We decided to use Java for the interface (platform independent) and the statistical programming language R (see www.r-project.org) for the statistical computations. We pay special attention to convergence issues and apply specific algorithms that avoid standard numerical problems.
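
    As a flavour of the trend-test step, here is a sketch of one standard trend test
    for software failure data, the Laplace test (written in Python for brevity; this
    is not code from the authors' Java/R tool):

        import numpy as np

        def laplace_trend(times, T=None):
            # Laplace trend test for failure times 0 < t_1 < ... < t_n observed
            # over (0, T] (time-truncated form; if T is omitted, t_n is used as
            # an approximation). Under a homogeneous Poisson process the
            # statistic is approximately N(0, 1); markedly negative values
            # indicate reliability growth, positive values deterioration.
            t = np.asarray(times, float)
            n = t.size
            T = t[-1] if T is None else float(T)
            return (t.mean() - T / 2.0) / (T * np.sqrt(1.0 / (12.0 * n)))

        # Lengthening inter-failure gaps (reliability growth) give a negative value.
        print(laplace_trend(np.cumsum(np.linspace(1.0, 10.0, 25))))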
  • Analysis of CUSUM and EWMA control charts for Poisson data under parameter estimation

    Authors: Murat Caner Testik (Hacettepe University, Ankara, Turkey)
    Submitted at 9-Aug-2007 09:31
    Accepted
    Cumulative Sum (CUSUM) and Exponentially Weighted Moving Average (EWMA) control charts are common in industry since they are easy to implement and yet powerful. In order to monitor count data, such as the number of nonconformities in a unit from a repetitive production process, CUSUM and EWMA charts have been developed under the assumption of a Poisson distribution.
    Although the Poisson distribution may be an appropriate model for such processes, the in-control process parameters may be unknown in practice and may be replaced with estimates from a reference sample. Due to the additional variability introduced by parameter estimation, the operational performance of a control chart may differ from the expected performance when the parameters are known.
    In this research, the effect of an estimated process mean on the performance of CUSUM and EWMA control charts for Poisson data monitoring is discussed.

    Key words: Attributes control chart, CUSUM, EWMA, Poisson distribution, Markov chain, statistical process control, estimated parameters.

    Specifics: Submitted for consideration as a talk (not a poster presentation).
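
    For illustration, a minimal Python sketch of a one-sided Poisson CUSUM in which
    the in-control mean is replaced by a reference-sample estimate, the situation
    studied here (the parameter values and names are illustrative, not from the
    paper):

        import numpy as np

        def poisson_cusum(counts, mu0, mu1, h):
            # One-sided (upward) Poisson CUSUM with the classical reference
            # value k = (mu1 - mu0) / (log mu1 - log mu0); the chart signals
            # at the first sample where S_i = max(0, S_{i-1} + X_i - k) >= h.
            k = (mu1 - mu0) / (np.log(mu1) - np.log(mu0))
            s = 0.0
            for i, x in enumerate(counts):
                s = max(0.0, s + x - k)
                if s >= h:
                    return i
            return None

        # The estimation effect studied in the abstract: mu0 is replaced by the
        # mean of a reference sample, so k (and hence the run-length behaviour)
        # inherits the sampling variability of the estimate.
        rng = np.random.default_rng(1)
        mu0_hat = rng.poisson(4.0, size=50).mean()
        signal = poisson_cusum(rng.poisson(4.0, size=200), mu0_hat, mu0_hat + 2.0, h=8.0)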
  • Testing randomness for the gaming industry: tackling the multiple testing issue

    Authors: Dr Neil H. Spencer, University of Hertfordshire, U.K.
    Submitted at 16-Aug-2007 09:22
    Accepted
    In recent years, the gaming industry has experienced considerable growth. In the face of increasing competition, companies have sought to appeal to players by portraying themselves as trustworthy. One way of doing this is to allow the random number generation processes used to be scrutinised by independent bodies. The author has carried out work in this area for organisations including Camelot (operators of the U.K. National Lottery), Gamesys (providers of online gaming) and Active Game Design (providers of “fruit machines”). Whatever the situation, a batch of numbers is obtained from the random number generator and a battery of statistical tests is applied. However, applying a number of tests causes problems. Assessing each test at the 5% level of significance is problematic because one would expect 5% of the tests to be “significant” even if everything is random. For independent tests, adjustments can be made to allow for the fact that multiple tests are being conducted. However, when testing for randomness, it is usually the case that the tests are correlated. Existing approaches to overcoming this problem involve simulating an empirical distribution for the smallest p-value obtained from the tests. However, this discards information provided by the results of the other tests, which may also indicate whether or not the random number generator is producing what would be expected from a random process. This paper presents an alternative method for dealing with the multiple testing problem for non-independent tests, in which none of the information provided by the tests is discarded.

    PLEASE NOTE: I WILL NOT BE ABLE TO ARRIVE IN DORTMUND BEFORE LUNCHTIME ON MONDAY 24TH SEPTEMBER, SO I WOULD BE OBLIGED IF THIS WAS TAKEN INTO ACCOUNT WHEN SCHEDULING MY TALK (IF ACCEPTED). MANY THANKS IF THIS IS POSSIBLE. NEIL SPENCER.
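
    The existing min-p approach described above can be sketched as follows (an
    illustration, not the author's new method; the two-test battery and all names
    are hypothetical):

        import numpy as np
        from scipy import stats

        def ks_uniform(u):
            # Kolmogorov-Smirnov test of U(0, 1) uniformity.
            return stats.kstest(u, "uniform").pvalue

        def mean_test(u):
            # z-test that the batch mean is 1/2 (the variance of U(0, 1) is 1/12).
            z = (u.mean() - 0.5) / np.sqrt(1.0 / (12.0 * u.size))
            return 2.0 * stats.norm.sf(abs(z))

        def min_p_null(battery, n_sim=2000, batch=10000, seed=0):
            # Null distribution of the smallest p-value across a battery of
            # (possibly correlated) tests, simulated by running the battery on
            # batches from a trusted uniform generator.
            rng = np.random.default_rng(seed)
            return np.array([min(t(rng.random(batch)) for t in battery)
                             for _ in range(n_sim)])

        # A 5%-level decision that respects the correlation between tests:
        # flag the generator if its observed smallest p-value falls below
        # np.quantile(min_p_null([ks_uniform, mean_test]), 0.05).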
  • Data mining of a mail order customer database for Kansei Engineering

    Authors: Kathryn Smith and Shirley Coleman (University of Newcastle upon Tyne, Newcastle upon Tyne, UK)
    Submitted at 22-Aug-2007 13:31
    Accepted
    The emotional responses customers have toward a company’s products can be revealed using Kansei Engineering (KE). Analysis of these responses yields insight into the importance of design factors and the relationship they share with the emotional responses. KE typically produces three-dimensional data, with customers, products and emotional responses (via semantic scales) as the dimensions. These relationships are the key to the importance of KE in the design process and in providing a broad portfolio of products. As KE is expensive to do properly, data mining can provide an alternative way of assessing which design factors are important to which types of customers; it can also prepare the groundwork for KE. This paper investigates what information can be obtained from data mining sales data as a precursor to KE. The sales data from a mail order catalogue company were mined in order to detect any differences between customer segments (defined by recency of last purchase, frequency of purchases and value of sales) in their geographical region, product choices and product characteristics, including size and colour. Interestingly, the differences between customer segments in the colour and style of products were not very pronounced. However, customers from the ‘higher worth’ segment purchased more large-size items. Product size is apparently an important factor and is related to customer worth. Understanding differences like these within the customer database is critical for informed design choices and for future KE investigations.
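
    As an illustration of the segmentation step, a short pandas sketch of
    recency/frequency/value segmentation and a segment-by-size cross-tabulation
    (the transaction schema, column names and tercile scoring are assumptions,
    not the company's actual data layout):

        import pandas as pd

        def rfm_segments(tx, asof):
            # tx: one row per order line with columns customer_id,
            # order_date (datetime), amount and product_size (assumed schema).
            g = tx.groupby("customer_id")
            rfm = pd.DataFrame({
                "recency": (asof - g["order_date"].max()).dt.days,
                "frequency": g.size(),
                "value": g["amount"].sum(),
            })
            # Score each dimension into terciles (recent purchases score high)
            # and label customers by total score.
            score = (pd.qcut(-rfm["recency"], 3, labels=False, duplicates="drop")
                     + pd.qcut(rfm["frequency"], 3, labels=False, duplicates="drop")
                     + pd.qcut(rfm["value"], 3, labels=False, duplicates="drop"))
            rfm["segment"] = pd.cut(score, bins=[-1, 2, 4, 6],
                                    labels=["lower", "middle", "higher"])
            return rfm

        # Compare product characteristics across segments, e.g. the size mix:
        # m = tx.merge(rfm_segments(tx, asof), left_on="customer_id", right_index=True)
        # pd.crosstab(m["segment"], m["product_size"], normalize="index")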