# ENBIS: European Network for Business and Industrial Statistics

## Submitted abstracts

**For the Fourth Annual ENBIS Conference**

*More information on this conference can be found on the events page.*

*In particular: see the conference programme*.

### Sessions

Session 1A: Six Sigma and quality improvement

Session 1B: DOE, special responses

Session 1C: Process monitoring and charting

Session 2A: Statistical consulting

Session 2B: Reliability and safety

Session 2C: Process modelling, cases

Session 3A: DOE, general

Session 3B: Process modelling, multivariate

Session 3C: Business and economics

Session 4A: Six Sigma, process capability

Session 4B: Process models and controllers

Session 4C: Business and economics

Session 5A: DOE cases

Session 5B: Reliability and safety, cases

Session 5C: Statistical modelling in pharma

Poster session: Presentation and exhibition of posters

Session 6A: DOE methodology

Session 6B: Statistical modelling

Session 6C: Measurement processes

### 1A. Six Sigma and quality improvement

**7. Statistical Engineering**

*Authors:* Stefan Steiner (University of Waterloo) and R. Jock MacKay (University of Waterloo) *Keywords:* problem solving, variation reduction *Format:* presentation (Six Sigma and quality improvement) *Contact:* shsteine@uwaterloo.ca

Statistical Engineering is a problem-solving algorithm and quality improvement system developed at the Institute for Improvement in Quality and Productivity at the University of Waterloo. The algorithm is useful for addressing chronic excess variation in high volume manufacturing. The algorithm is structured to be as prescriptive as possible, focusing on “how to” solve problems. The goal of this talk is to provide an overview of the Statistical Engineering algorithm. The focus of the seminar will be on the seven possible variation reduction approaches: 1. fix the obvious using knowledge of a dominant cause of variation; 2. desensitize the process to variation in a dominant cause; 3. feed-forward control based on a dominant cause; 4. feedback control; 5. make the process robust to noise variation; 6. 100% inspection; 7. move the process center closer to the target. Consideration of these different approaches drives the choices in the algorithm. The ideas will be illustrated with numerous examples from our consulting work.

**11. Linking improvement programs into a coherent framework**

*Authors:* Itziar Ricondo (University of Navarra) and Elisabeth Viles (University of Navarra) *Keywords:* quality improvement *Format:* presentation (Six Sigma and quality improvement) *Contact:* iricondo@tecnun.es

The aim of this work is to create a framework that locates the different improvement programs, so that organizations can use it as a roadmap for quality improvement and excellence. Organizations are becoming increasingly aware of the importance of quality and quality improvement. However, there is evidence of the difficulties they face in choosing their improvement program or programs, due to the wide range of possibilities and the fact that they have no clear idea of what aspects different programs cover: planning, improvement, control, assessment… There is a wealth of literature on the subject of quality, improvement and the broader concept of business improvement: TQM, Reengineering, Lean, Learning Organizations and Knowledge Management, EFQM, Six Sigma… These different improvement programs have been studied in isolation, because they differ in origin and time frame. Furthermore, they have been presented as opposing concepts, such as TQM vs Reengineering or Six Sigma vs TQM. However, everybody agrees that none of these improvement programs is a panacea for all business problems. Despite the literature, organizations don't have a holistic picture of the scope of each program and how to engage them in their own journey to quality. This work attempts to bridge this gap by analysing the objective, principles, methodology, tools & techniques, human resources and success factors associated with each of them, and by relating them to provide a coherent reference framework for organizations.

**19. Hypothesis generation in improvement projects: how to identify possible causes?**

*Author:* Jeroen de Mast (IBIS UvA) *Keywords:* Problem-solving; diagnosis; exploratory data analysis; discovery *Format:* presentation (Six Sigma and quality improvement) *Contact:* jdemast@science.uva.nl

In quality improvement projects – such as Six Sigma projects – an exploratory phase can be discerned, during which possible causes, influence factors or variation sources are identified. In a later, confirmatory phase the effects of these possible causes are experimentally verified. Whereas the confirmatory phase is well understood, in both the statistical sciences and philosophy of science, the exploratory phase is poorly understood. This paper aims to provide a framework for the type of reasoning in the exploratory phase by reviewing relevant theories in philosophy of science, artificial intelligence and medical diagnosis. Furthermore, the paper provides a classification and description of approaches that could be followed for the identification of possible causes.

**83. Reducing the Stops in a Powder Detergent Line Using the Six Sigma Methodology**

*Authors:* Lluis Marco (Technical University of Catalonia (UPC)) and Ulisse Balteo (Procter & Gamble); Xavier Tort-Martorell (Technical University of Catalonia, UPC) *Keywords:* Six Sigma Methodology, Gage R&R Study, Logistic Regression, Powder Detergent *Format:* presentation (Six Sigma and quality improvement) *Contact:* lluis.marco@upc.es

The purpose of this presentation is to explain our experience in a multinational company that produces laundry detergents, among other products. Quality was an important issue in the company and they already had a quality improvement program based on teams. However, managers were interested in introducing the Six Sigma DMAIC methodology to evaluate it and see first hand how it differed from the one they were using. Too many stops in one of the powder detergent lines due to feed clogs was one of the biggest chronic problems. So after a short period of training on Six Sigma (basic concepts, organization, DMAIC steps and a refresher on the statistical tools) a team was formed with the purpose of reducing stops in that line. It involved technicians from the feed line, the chief line operator, an expert in the material used for the carton boxes, a specialist from the main carton supplier and a university statistician skilled in the Six Sigma methodology, acting as a Black Belt. Besides the main objective of reducing line stops, other goals of the project were opening the “operating window” for the feed (being able to use less restrictive settings for the machine) and reducing material costs. Quantitative figures for these objectives were specified in the Project Charter. The presentation will describe how the project progressed, following the DMAIC steps. The Measure phase turned out to be a very important one, since many unknown issues were discovered thanks to a Gage R&R study. Both historical data and data collected during the Measure phase were used in the Analyze phase, but no clear relationships emerged from that analysis. It was decided to experiment by bringing the line to extreme conditions. The response was a dichotomous variable and therefore the results were analyzed through a logistic regression. This yielded a model for the proper working area of the line. In our final conclusion we will also report how team members and management felt about the project, as well as thoughts on how the project could have been done better.
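The logistic regression step for a yes/no response can be sketched as follows. This is a minimal illustration and not the authors' analysis: the single coded machine setting, the run counts and the stop counts are all hypothetical, and the fit is a plain Newton-Raphson iteration written out by hand.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient (g) and negative Hessian (h) of the log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system h * delta = g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: 10 runs at each of five coded settings, with
# 1, 2, 5, 8 and 9 line stops respectively (1 = stop, 0 = no stop).
settings = [-2, -1, 0, 1, 2]
stops = [1, 2, 5, 8, 9]
x, y = [], []
for s, k in zip(settings, stops):
    x += [s] * 10
    y += [1] * k + [0] * (10 - k)

b0, b1 = fit_logistic(x, y)

def stop_probability(setting):
    """Fitted probability of a line stop at a given coded setting."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * setting)))
```

Settings for which `stop_probability` stays below a chosen threshold would then outline a working area for the line; with several machine settings the same idea extends to a multi-factor model.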

**99. Inflator Control Plan Project - A 6-Sigma Approach for Airbag Development Programs**

*Author:* Anja Schleppe (Autoliv GmbH) *Keywords:* Inflatable Curtain Performance, Propagation of Variability, 6-Sigma *Format:* presentation (Design of experiments) *Contact:* anja.schleppe@autoliv.com

This Inflator Control Plan project was started to understand how to tune inflators for Inflatable Curtain performance. The project was started with more than 15 engineers from 3 companies, 2 of them located in the USA and the 3rd one located in Germany. Inflators are main components of airbag modules. This means, inflator outputs become module inputs, and variability of inflator outputs propagates into module outputs. The interdisciplinary team had the task to understand this link between inflator outputs and final product outputs to reduce the number of development tests in future airbag development programs.

### 1B. DOE, special responses

**27. Designed experiments and longitudinal response data: what's a good model?**

*Authors:* Jan Engel (CQM) and S. A. Ortega (TU/e) *Keywords:* Factorial experiments, Longitudinal data, statistical model *Format:* presentation (Design of experiments) *Contact:* Engel@cqm.nl

In many industrial problems the judgement of product quality is performed on longitudinal data. These involve time and temperature curves measured in product prototyping, and studies of the behaviour of human beings when they use consumer products. Our starting point will be the modelling of longitudinal response data from personal care experiments where treatments are applied to subjects. Analysis of the experimental data will be two-fold: 1) exploratory analysis, including graphics and PCA, and 2) inferential analysis by model estimation and testing. In the spirit of the latter, we applied five models from the statistical literature. These models are fitted to the data from the experiment and the estimation and testing results are compared. A simulation study investigates the power of testing the main effect of a treatment, and its interaction with time. Conclusions are drawn on the usability of these models for this type of industrial problem.

**35. Optimal block designs for conjoint experiments**

*Authors:* Roselinde Kessels (Katholieke Universiteit Leuven) and Martina Vandebroek, Peter Goos *Keywords:* conjoint analysis, optimal design, blocked experiments *Format:* presentation (Design of experiments) *Contact:* roselinde.kessels@econ.kuleuven.ac.be

In marketing, conjoint experiments are frequently carried out to measure consumer preferences for the attributes of various products or services. A specific format of a conjoint experiment is to present respondents with a set of alternatives or profiles that are defined as combinations of different attribute levels. Respondents are then requested to rate each of the alternatives on a scale. Most often, a different set of alternatives is given to each respondent. With these experiments, the relative values respondents attach to a product or service can be determined, or more precisely, the price they are willing to pay for it. In this presentation, we will investigate the relationship between the number of respondents needed and the number of alternatives submitted to the respondents. For example, if there are 30 observations available, the literature does not say whether it is statistically more efficient to submit 10 alternatives to 3 respondents or 3 alternatives to 10 respondents. To solve this problem, we build on the algorithm of Goos and Vandebroek (2001) for the design of blocked experiments when the block effects are random. In this algorithm, the set of alternatives that are presented to one respondent is treated as a block. For several instances, we compute the optimal block designs and infer some general findings.

**37. Analysing experimental design results when the response is a curve: a case study in polymers R&D.**

*Author:* Bernadette Govaerts (Institut de Statistique - Université Catholique de Louvain) *Keywords:* Experimental design, functional analysis *Format:* presentation (Design of experiments) *Contact:* govaerts@stat.ucl.ac.be

In many industrial experiments, the response observed is a curve as, for example, when a rheometer is used to analyse the hardness of a product as a function of its temperature. In such a context, when a set of (designed) experiments is performed, the polynomial regression approach usually used to predict the response as a function of the experimental factors should be adapted to the functional character of the response. This talk will review several methods available to analyse the results of a designed experiment when the response is a curve and compare them on a case study coming from the polymer industry. The different approaches proposed are parametric, semi- or non-parametric and are inspired by the functional analysis literature in statistics and the PLS literature in chemometrics. They all aim to predict the functional response as a function of the design factor settings. A bootstrap procedure is also investigated to build prediction intervals around the predicted curve and test the significance of model parameters.

**41. LISREL: A unified alternative to ANOVA, regression and Principal Components in Designed Experiments when the outcome is multidimensional**

*Authors:* Erik Mønness (Hedmark Univ. College and ISRU, University of Newcastle) and Shirley Coleman (ISRU, University of Newcastle) *Keywords:* Designed Industrial Experiments, Multivariate techniques, LISREL *Format:* presentation (Statistical modelling) *Contact:* erik.monness@hihm.no

In designed industrial experiments, the outcome is often multidimensional. Performing a regression/ANOVA on each single outcome may not take into account the correlation structure of the outcome, and thereby not have an optimal parsimonious information value. One solution to the multivariate response problem is to first perform a principal component or a factor analysis on the outcome. Then a regression/ANOVA analysis can be done using the factor scores as the experimental output (two-stage analysis). In industrial applications, one is usually interested both in establishing a cause-effect relation and in estimating the actual size of the impact. To do so in a two-stage analysis, one has to combine the factor loadings with the regression effects into the estimated impact. Factor analysis issues such as using covariance or correlation become crucial when the goal is estimation. The LISREL model (Karl Jöreskog and Dag Sörbom) may be a unified one-step solution in these cases. We will use data from a testing experiment with high precision breathing apparatus, to be used by fire-fighters, to explore several models. The data come from a 2^(5-1) design with 5 replicates, giving 16*5=80 runs. There are 7 result variables: 3 measurements of static pressure and 4 measurements of pressure when breathing through the apparatus. We will compare prediction from ordinary regression/ANOVA, two-stage analysis and LISREL. Model reduction will also be explored.
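To make the run count concrete, the sketch below builds a 16-run half fraction of a five-factor two-level design and replicates it five times. The generator E = ABCD is an assumption chosen for illustration (it is the usual resolution V choice); the abstract does not say which fraction was actually run.

```python
from itertools import product

def half_fraction_five_factors():
    """16-run half fraction: full factorial in A-D, with E = A*B*C*D (assumed generator)."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        runs.append((a, b, c, d, a * b * c * d))
    return runs

design = half_fraction_five_factors()
# Five replicates of every design point.
replicated = [run for run in design for _ in range(5)]
print(len(design), len(replicated))  # 16 80
```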

**63. Optimal Two-level Factorial Designs for Binary Response Data**

*Authors:* Roberto Dorta Guerra (Universidad de La Laguna) and González-Dávila, E. (Universidad de La Laguna); Ginebra, J. (Universitat Politècnica de Catalunya) *Keywords:* D-optimal design, generalized linear model, factorial and fractional factorial design. *Format:* presentation (Design of experiments) *Contact:* rodorta@ull.es

Two-level factorial experiments are very useful in the early screening stages of an investigation and as building blocks for response surface exploration. Under first order normal linear models, the amount of information gathered through these experiments, as measured through the determinant of their information matrix, depends neither on where the experiment is centered nor on how it is oriented relative to the contour lines of the surface, and balanced allocations are always more informative than unbalanced ones with the same number of runs. As a consequence, when planning two-level factorial experiments for continuous responses and any number of factors, the only thing that matters is the range of variation of the factors involved. By contrast, the issues involved when planning two-level factorial experiments for binary responses are a lot more complicated, because none of these properties hold. This paper searches for the designs that maximize the determinant of the information matrix, within the class of two-level factorial experiments centered at a given point, for first order binomial models with either one, two or three factors. That allows one to explore how the performance of these experiments on binary responses depends on their location, orientation, range, and relative allocation of the total number of runs to each one of the support points.
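The dependence on location and range can be seen in the simplest one-factor case. The sketch below computes the determinant of the information matrix for a two-point design under a normal linear model and under a logistic model. The logistic link, the parameter values `b0 = 0` and `b1 = 1`, and the run counts are assumptions made for illustration, not values from the paper.

```python
import math

def logistic_weight(eta):
    """Binomial variance function p * (1 - p) at linear predictor eta."""
    p = 1.0 / (1.0 + math.exp(-eta))
    return p * (1.0 - p)

def det_information(center, half_range, n_low, n_high,
                    b0=0.0, b1=1.0, binary=True):
    """Determinant of the 2x2 information matrix for the model b0 + b1*x
    with runs at center - half_range and center + half_range."""
    m00 = m01 = m11 = 0.0
    for x, n in ((center - half_range, n_low), (center + half_range, n_high)):
        # Normal linear model: unit weight. Logistic model: weight p(1-p).
        w = logistic_weight(b0 + b1 * x) if binary else 1.0
        m00 += n * w
        m01 += n * w * x
        m11 += n * w * x * x
    return m00 * m11 - m01 * m01

# Normal model: moving the center leaves the determinant unchanged.
normal_at_0 = det_information(0.0, 1.0, 10, 10, binary=False)
normal_at_3 = det_information(3.0, 1.0, 10, 10, binary=False)

# Logistic model: the same move changes the determinant, and a wider
# range is not automatically better.
binary_at_0 = det_information(0.0, 1.0, 10, 10)
binary_at_3 = det_information(3.0, 1.0, 10, 10)
binary_wide = det_information(0.0, 5.0, 10, 10)
```

Because the weights depend on the unknown parameters, a design found this way is only locally optimal for the assumed `b0` and `b1`.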

### 1C. Process monitoring and charting

**13. A Variable Sampling Interval S² EWMA Control Chart for Monitoring the Process Variance**

*Authors:* Philippe CASTAGLIOLA (Ecole des Mines de Nantes) and S. Fichera (Università di Catania); F. Giuffrida (Università di Catania); G. Celano (Università di Catania) *Keywords:* EWMA Control Chart, Variable Sampling Interval, Sample Variance, Markov Chain *Format:* presentation (Process modelling and control) *Contact:* philippe.castagliola@emn.fr

Control charts represent a strategic tool to monitor the state of statistical control of a manufacturing process. Traditionally, control charts based on the monitoring of individual measurements and the sample mean have been investigated in depth in the literature. Control charts based on statistics for the process variance, on the other hand, have received less attention. In this paper an adaptive VSI EWMA control chart monitoring the sample variance S² is considered. This chart works by considering a statistic based on a logarithmic transformation of the sample variance, which allows the considered variable to be approximately normally distributed with parameters (0,1). The statistical properties of the proposed EWMA chart are improved by introducing an adaptive scheme that chooses between two different sampling frequencies, depending on the position of the last plotted point on the chart. The introduction of the adaptive selection of the sampling frequency requires a re-formulation of the Markov chain needed to evaluate the ARLs and ATSs of the chart. An analysis of the statistical performance of the proposed chart, through comparison with a Fixed Sampling Interval EWMA, confirms the efficiency of the proposed statistical tool. An illustrative example completes the research.

**34. Probability Estimation for Mixture Distributions and their Application to Statistical Process Control**

*Authors:* András Zempléni (Eötvös Loránd University, Budapest) and Csilla Hajas (Eötvös Loránd University, Budapest); Belmiro Duarte (Instituto Superior de Engenharia de Coimbra, Portugal); Pedro Saraiva (Department of Chemical Engineering, University of Coimbra, Portugal) *Keywords:* cost function, maximum likelihood, process control, shift *Format:* presentation (Process modelling and control) *Contact:* zempleni@ludens.elte.hu

In previous ENBIS conferences [1], [2], the authors have presented work regarding the application of Markov Chains for optimal definition of SPC charts. Extending that work, we consider the case of process monitoring where one has to face frequent changes in the mean value of the monitored variable. The corresponding control charts are optimised with respect to different, realistic cost functions associated with sampling, false alarms and non-detected changes. This new approach allows one to incorporate different losses, due to delays or other effects of unnecessary alarms. We also present alternative methods for addressing shift-intensity and magnitude-estimation, based on data observed before the chart is actually designed, through maximum likelihood estimation techniques. The asymptotic distribution of these estimators, as well as their small-sample properties, are also given. The practical results obtained through the application of our approaches to industrial data collected from a Portuguese paper mill are also described, showing the potential benefits derived from their use in real environments for achieving adequate statistical process control and monitoring.

References

[1] Zempleni, A., Hajas, Cs., Duarte, B. and Saraiva, P. “Optimal Cost Control Charts for Shift Detection”, presented at the third European Network for Business and Industrial Statistics (ENBIS) conference, Barcelona, Spain (2003).

[2] Zempleni, A., Véber, M., Duarte, B. and Saraiva, P. “Control Charts: a cost optimization approach via Bayesian statistics”, presented at the second European Network for Business and Industrial Statistics (ENBIS) conference, Rimini, Italy (2002).

Acknowledgement: This work was prepared by members of the Pro-ENBIS consortium, supported by European Commission 5th Framework Programme, Contract No. G6RT-CT-2001-05059.

**45. Testing and utilising the Poisson nature of clinical data in the NHS**

*Authors:* Shirley Coleman (ISRU) and Oystein Evandt (IMPRO, Oslo, Norway), Chris Pritchett (South Tyneside District Hospital, Tyne and Wear, UK) *Keywords:* breast cancer, random occurrences, Poisson, control chart, dispersion test *Format:* presentation (Process modelling and control) *Contact:* shirley.coleman@ncl.ac.uk

The Poisson distribution arises when events occur at random with a constant rate. Many examples of Poisson type data arise in medical as well as industrial contexts. In the medical scenario, however, patients are often the underlying units and the numbers sampled are very variable because of organisational factors, seasonality and human factors including medical staff and the patients themselves. The number of malignancies found, and the corresponding number of patients investigated, in a weekly one-stop symptomatic breast clinic held in a district general hospital in the North East of England from January 2000 to February 2004 are presented as an example. It is useful to be able to model the data to aid understanding of the process behind the occurrences, to make forecasts and to benefit from statistical process control (SPC). Standard tests for Poisson data include the Dispersion test comparing observed and theoretical variance. The Dispersion test is recommended when the sample sizes are fairly constant, which in the example in question means that the number of patients investigated each week is fairly constant. A test borrowed from industrial statistics methodology can be used with varying sample sizes, i.e. varying numbers of patients attending per clinic session, and is described for this application. SPC charts are increasingly being used in the UK National Health Service (NHS). An SPC chart can be used to monitor the weekly number of malignancies found in the clinics. Exceptional numbers of occurrences exceed the control limit and may signal a change in the population, or in the referral pattern and hence the performance of the system. The position of the c-chart control limits needs to be chosen carefully. The limits recommended in standard texts are liberal in that it is less likely to exceed the control limits than quoted. This effect is greater for low rate of occurrence data such as that found in the breast clinic. 
A new set of control limits for the c-chart can be developed which is more appropriate to the situation where a more conservative approach is preferred. A number of alternative charts will be presented.
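Two of the calculations mentioned above can be sketched in a few lines: the dispersion statistic for roughly constant sample sizes, and the exact Poisson probability of a count falling above the usual three-sigma c-chart limit. The weekly counts below are made up for illustration and are not the clinic data; the three-sigma limits are the textbook ones, not the new limits the authors propose.

```python
import math

def dispersion_statistic(counts):
    """Index-of-dispersion statistic, referred to a chi-square
    distribution with len(counts) - 1 degrees of freedom."""
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / mean

def c_chart_limits(mean_count):
    """Textbook three-sigma limits for a c-chart (lower limit floored at 0)."""
    half_width = 3.0 * math.sqrt(mean_count)
    return max(0.0, mean_count - half_width), mean_count + half_width

def poisson_cdf(k, rate):
    """P(X <= k) for X ~ Poisson(rate)."""
    return sum(math.exp(-rate) * rate ** i / math.factorial(i)
               for i in range(k + 1))

def prob_above_ucl(rate):
    """Exact P(X > UCL) when counts really are Poisson(rate)."""
    _, ucl = c_chart_limits(rate)
    return 1.0 - poisson_cdf(math.floor(ucl), rate)

# Hypothetical weekly counts of malignancies.
weekly_counts = [2, 0, 1, 3, 1, 2, 0, 1, 2, 3]
d_stat = dispersion_statistic(weekly_counts)  # compare with chi-square, 9 d.f.
tail = prob_above_ucl(1.0)                    # exact upper-tail probability
```

Comparing `tail` with the nominal one-sided normal value of about 0.00135 for a range of rates shows how far the quoted and actual false-alarm probabilities can differ when the rate is low.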

**53. Monitoring infrequent failures of high-volume production processes**

*Authors:* Alessandro Di Bucchianico (Eindhoven University of Technology/ Department of Mathematics) and G.D. Mooiweer (Sara Lee/DE), E.J.G. Moonen (Eindhoven University of Technology) *Keywords:* control chart, high-volume production process, destructive testing *Format:* presentation (Process modelling and control) *Contact:* a.d.bucchianico@tue.nl

We consider the problem of monitoring a high-volume production process with a low percentage of defective products. Testing these products is difficult and is only possible in a destructive way with a defect/no-defect result. An example of such a process is the production process of vacuum coffee packs. In such a case, it is sensible to monitor Y, the number of non-defective items checked between two defective items. An approach due to Nelson is to consider a power transformation of Y to achieve approximately normal observations. This implicitly assumes continuous-time modelling. We suggest discrete-time modelling and avoid transforming Y, since the approximation may be poor in the tails of the distribution. We study various monitoring procedures using standard techniques in mathematical statistics. Our study involves both theoretical performance calculations with different sampling strategies, as well as practical constraints and deliverables from a case study.
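One simple discrete-time procedure, sketched here under assumptions and not necessarily one of those studied by the authors, treats Y as geometric with in-control defect probability `p` and places probability limits directly on Y without any transformation. The values `p = 0.001` and `alpha = 0.0027` are hypothetical.

```python
import math

def geometric_cdf(k, p):
    """P(Y <= k), where Y counts non-defective items before the next defective."""
    return 1.0 - (1.0 - p) ** (k + 1) if k >= 0 else 0.0

def probability_limits(p, alpha):
    """Largest `lower` with P(Y <= lower) <= alpha/2 and smallest `upper`
    with P(Y >= upper) <= alpha/2. A value lower == -1 means that no
    lower limit exists at this alpha."""
    log_q = math.log1p(-p)
    lower = math.floor(math.log1p(-alpha / 2.0) / log_q) - 1
    upper = math.ceil(math.log(alpha / 2.0) / log_q)
    return lower, upper

lower, upper = probability_limits(p=0.001, alpha=0.0027)
# Signal a deterioration if Y <= lower, an improvement if Y >= upper.
```

With a defect rate this low, the lower limit sits at or near zero, so a single very short run carries little evidence; this is one reason to look at procedures that combine several consecutive runs.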

**93. A modified EWMA control chart for AR(2)-processes**

*Authors:* Thijs Vermaat (IBIS UvA) and Ronald Does (University of Amsterdam); Søren Bisgaard (University of Massachusetts-Amherst and University of Amsterdam) *Keywords:* EWMA control chart, autocorrelated data *Format:* presentation (Process modelling and control) *Contact:* tvermaat@science.uva.nl

In the time of Shewhart an analyst went to the production line, grabbed a sample and measured this sample in his laboratory. This could be repeated at most a couple of times a day. Nowadays online registration systems are installed on production lines. This online measuring results in very high sampling rates. As a consequence of the short sampling interval, autocorrelation appears in the observed measurements, especially in the chemical and food industries. In this presentation we discuss the effects of autocorrelation for real-life examples. We develop a modified EWMA control chart which is adapted for autocorrelation. Using a real-life example, different aspects of this modified EWMA control chart will be treated. It is demonstrated that this modified EWMA control chart works very well in practice. Some theoretical aspects of this control chart will also be derived.

### 2A. Statistical consulting

**1. An approach to ensure sustainable improvement**

*Author:* Jonathan Smyth-Renshaw (Jonathan Smyth-Renshaw & Associates Limited) *Keywords:* improvement, sustainability *Format:* presentation (Statistical consulting) *Contact:* smythrenshaw@btinternet.com

I would like to discuss the following model, which combines both improvement and sustainability with a decision rule to determine whether you should be positioned in the improvement or the sustainability loop. The model is shown below. A summary of each step follows.

Step 1: What is the current situation? (Sustain loop) The starting point is to assess the current position. This can be done in a number of ways; however, I favour the Kano Model, which is an excellent model for linking customer satisfaction and a business's ability to deliver products/services.

Step 2: Current standard (Sustain loop) The best approach here is to map the actual process on paper, highlighting the current controls, and to use photographs as evidence of deviation from the standard. If the quality is acceptable, you move to Step 6; if it is not, you move to Step 3.

Step 3: Problem statement (Improvement loop) Before you begin to solve a problem you need to define it. This step is therefore only about collecting data and producing a problem statement.

Step 4: Problem solving (Improvement loop) Once the problem is defined an appropriate problem solving technique can be used. Experience has shown that if you define the problem clearly, having collected the correct data the solution is often clear. However, more complex methods are available if required. I believe this approach stops you from moving straight into using a complex approach to solve what is a simple problem.

Step 5: Solution implementation (Improvement loop) Once you have a solution you need to ensure the solution is built into the current standard. Furthermore, you need to ensure all potential users are educated in the change to the standard.

Step 6: Maintenance of standards (Sustain loop) Most business operations are undertaken in Step 6, i.e. the quality is acceptable. The key task is to define the activities that are necessary to ensure sustainability.

**6. Workshop on communication barriers: Tom Lang's puzzle**

*Author:* Antje Christensen (Novo Nordisk) *Keywords:* communication, consultancy *Format:* (Statistical consulting) *Contact:* antc@novonordisk.com

Conducting a valid statistical analysis is one thing, communicating the results to the client is quite another. Communication skills and knowledge of communication barriers are therefore important for a statistical consultant. Medical writer and communication teacher Tom Lang has devised a laboratory example of the communication process in technical writing. It involves a puzzle and requires participation from the audience. The example will be presented and the outcome discussed. The points addressed include tacit assumptions, anticipation of the reader's needs for information, and design for ease of communication.

**51. The role of statistics in a strategy of information management for large logistic networks**

*Authors:* Sonja Kuhnt (Department of Statistics) and Thomas Fender, Silvia Kuhls *Keywords:* information management, logistic networks *Format:* presentation (Data mining) *Contact:* kuhnt@statistik.uni-dortmund.de

Large logistic networks exist wherever a large number of different items are transported via several nodes. Modelling and simulation of such networks require valid input data in the right quantity and granularity. We develop a model for the procedure of information management in this context. This includes a basic application-oriented method kit classifying and combining methods from data collection, statistical analysis and visualization techniques. Concerning the statistical aspect, types of similar problems within the framework of logistic networks need to be identified and translated into a taxonomy by which statistical methods can be classified. Due to the high dimensionality and heterogeneity of the data, methods of modern robust statistics are often needed. We will demonstrate the developed notions, including details of the accomplished statistical analysis, on an example concerning the simulation of international freight flow.

### 2B. Reliability and safety

**82. Information in Type II right-censored life tests**

*Authors:* Lourdes Rodero (Technical University of Catalonia) and Josep Ginebra (Technical University of Catalonia) *Keywords:* Information measures, Experiments in reliability, Type II censored data *Format:* presentation (Reliability and safety) *Contact:* lourdes.rodero@upc.es

A life test in which a total of n items are placed on test, but where instead of continuing until all n items have failed, the test is terminated at the time that a pre-specified number of failures, r, happens, is what is known as a Type II right-censoring experiment. Such tests save time and money, since it could take a long time for all items to fail, but that is at the expense of missing some of the information that would have been available, had the whole sample been available, instead of just the r smallest observations. Thus the importance of measuring the amount of information about the parameters of a lifetime distribution, contained in an experiment observing only the r smallest order statistics of an i.i.d. sample from that distribution. That is an issue closely related to the question of which part of the sample contains what part of the information, and properly addressed can help decide on the value of r needed to make sure that the life test will provide enough information. This question is most often tackled using the Fisher information as the information measure, but that approach has the handicap that Fisher information can be matrix valued, and that, other than for location distributions, Fisher information depends on the unknown parameters of the lifetime distribution. Instead, in our presentation we will defend the Bayesian approach as the most natural way to deal with this problem, we will discuss some of the information measures that apply in that context, and we will illustrate their use on simple lifetime distributions.
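As a small worked illustration of the Fisher-information approach, consider exponential lifetimes with rate $\lambda$. This distribution is chosen here only because the calculation is short; the abstract does not single it out. Writing $x_{(1)} \le \dots \le x_{(r)}$ for the observed failure times, the likelihood and the resulting information are:

$$
L(\lambda) \propto \lambda^{r} \exp(-\lambda T_r), \qquad T_r = \sum_{i=1}^{r} x_{(i)} + (n-r)\,x_{(r)},
$$

$$
I_r(\lambda) = -E\left[\frac{\partial^2 \log L(\lambda)}{\partial \lambda^2}\right] = \frac{r}{\lambda^2}, \qquad \frac{I_r(\lambda)}{I_n(\lambda)} = \frac{r}{n}.
$$

In this case the censored test keeps a fraction $r/n$ of the information in the full sample, while the information itself still depends on the unknown $\lambda$, which is the handicap noted above.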

**48. Early reliability prediction in the field**

*Authors:* Roxana Alice Ion (Technische Universiteit Eindhoven) and Peter Sander *Keywords:* Failure probability; Fast fieldback *Format:* presentation (Reliability and safety) *Contact:* r.a.ion@tm.tue.nl

Nowadays the required short time-to-market means that it is hardly possible to develop a product and test its fitness for use over at least the warranty period. As a result, companies have to accept the risk that newly developed products will have reliability problems. In order to reduce this risk, manufacturers need fast, reliable information about the field behaviour of their innovative products. If information about the field reliability is available soon after product launch, it can be used when the question arises whether field complaints should lead to a product recall. The field reliability determines to some extent the warranty cost, and companies like to have a prediction of the warranty costs as early as possible. In this talk we focus on the fast estimation of the failure probability during the warranty period using field data. We discuss a method that is able, within three months after market introduction, to estimate the probability that a product fails during the warranty period. We will focus on complex systems such as professional systems. Due to their high cost, these are repairable systems. This paper discusses the estimation of the failure probability during the warranty period given the limitations of the field failure data. Real field data will enrich the paper.

**10. FUNCTIONAL ANALYSIS MAINTENANCE: A TOOL FOR THE OPTIMIZATION OF THE CORRECTIVE MAINTENANCE. Application of FAM to a train electronic control system.**

*Authors:* Elisabeth Viles (University of Navarra) and David Puente *Keywords:* Reliability, Corrective maintenance *Format:* presentation (Reliability and safety) *Contact:* eviles@tecnun.es

Although it is certain that reliability theory and the use of RAMS parameters are more effective when applied in the earlier stages of the product development cycle, it is also important to look for tools that optimise the exploitation stage and register the knowledge generated, at least during the guarantee period. Failures occur because a perfect product either does not exist or is not economically viable, so maintenance work becomes necessary in order to satisfy the running time required by customers. While most approaches at this stage aim at failure prevention (RCM, TPM, ...) and the reliability of forecasts (RCFA, ...), a tool named "Functional Analysis for Maintenance" (FAM) has been developed which seeks to reduce and optimise the costs related to corrective maintenance through a simple, fast and efficient treatment of the failure, which may be conducted by any person and requires only basic knowledge of the product. The more complex the product, the greater the cost reduction achieved. Furthermore, FAM may be a key factor for the competitiveness of products with inefficient maintenance techniques, as has been demonstrated by the application of FAM to a train electronic control and supervision system.

**25. RELIABILITY ESTIMATION OF REPAIRABLE SYSTEMS USING NONHOMOGENEOUS POISSON PROCESS**

*Authors:* Ilia Frenkel (Negev Academic College of Engineering) and Ilya Gertsbakh (Ben-Gurion University), Lev Khvatskin (Negev Academic College of Engineering) *Keywords:* Nonhomogeneous Poisson process, computer-intensive procedure, goodness-of-fit tests *Format:* presentation (Reliability and safety) *Contact:* iliaf@nace.ac.il

We consider a nonhomogeneous Poisson process (NHPP) with log-linear and power-form intensity functions, which is used to estimate the reliability of repairable systems. Parameter estimation is carried out by the maximum likelihood method. When the intensity function is known, the hypothesis that a given sample path is a realization of an NHPP can be tested using the fact that, under the NHPP model, the values of the mean value function computed at the ordered failure times are the event times of a homogeneous Poisson process (HPP) with constant intensity one, and the intervals between events in the HPP form a sample of i.i.d. standard exponential random variables. Thus it is possible to use standard goodness-of-fit tests to check the exponentiality of the process. We propose a computer-intensive procedure for testing the hypothesis that the given sample path belongs to an NHPP without assuming that the intensity function is known; instead, it is estimated from the sample path. We demonstrate our method on failure data for repairable systems from the literature and on our own failure data for the Schlosser Vibration Machine. We also demonstrate several methods for generating families of stochastic processes with known probabilistic structure, including both NHPP and non-NHPP processes, and check how well our method recognizes the underlying process. These processes were used to test the power of different goodness-of-fit tests.
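
The time-transformation argument for a known intensity can be sketched as follows. This is a minimal illustration, not the authors' computer-intensive procedure: it assumes a power-form mean value function Λ(t) = a·t^b with known a and b (both names are assumptions), maps the failure times through Λ, and measures how far the resulting gaps are from a standard exponential sample with a Kolmogorov-Smirnov distance.

```python
import math


def power_law_mean(t, a, b):
    """Mean value function of a power-form NHPP (assumed form: a * t**b)."""
    return a * t ** b


def transformed_gaps(failure_times, a, b):
    """Map ordered failure times through the mean value function and
    return the gaps, which are i.i.d. Exp(1) if the NHPP model holds."""
    gaps = []
    previous = 0.0
    for t in failure_times:
        u = power_law_mean(t, a, b)
        gaps.append(u - previous)
        previous = u
    return gaps


def ks_distance_to_exp1(gaps):
    """Kolmogorov-Smirnov distance between the gaps and the Exp(1) CDF."""
    xs = sorted(gaps)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-x)
        distance = max(distance, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return distance


if __name__ == "__main__":
    times = [0.9, 1.6, 2.1, 2.4, 2.9, 3.1, 3.4]  # made-up failure times
    gaps = transformed_gaps(times, a=1.0, b=1.5)
    print("KS distance:", ks_distance_to_exp1(gaps))
```

When a and b are estimated from the same sample path, the standard critical values for this distance no longer apply, which is presumably why a computer-intensive procedure is needed in that case.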

**32. On the Availability Maximization of Protective Equipment**

*Authors:* Lev Khvatskin (Negev Academic College of Engineering - NACE) and Ilya Gertsbakh (Ben-Gurion University), Ilia Frenkel (Negev Academic College of Engineering) *Keywords:* Availability, Protective Equipment, Inspections, Optimal Schedule *Format:* presentation (Reliability and safety) *Contact:* khvat@nace.ac.il

Protective equipment (e.g. fire extinguishers) is kept in storage for a long time. Its state (up or down) can be established only during inspections. If the equipment fails between inspections, its state is diagnosed at the nearest inspection and the equipment is completely restored (replaced) toward the end of the inspection. If the inspection reveals that the equipment is in the up state, it is left in storage until the next inspection. Our purpose is to find the optimal schedule for inspections which would provide maximal availability of the system. We consider various lifetime distributions and take into account the possibility that the inspection itself may increase the equipment failure rate. For the case of keeping in storage of a large number N of similar units, we consider an inspection policy based on a checkout of a sample of k, k<

**38. On the use of Bayesian Belief Networks in modelling human and organisational causes of maritime accidents**

*Authors:* Fabrizio Ruggeri (CNR-IMATI) and Mauro Pedrali (D'Appolonia); Paolo Trucco (Politecnico di Milano); Enrico Cagno (Politecnico di Milano) *Keywords:* Human and Organisational Factors, Bayesian Belief Networks, Risk Analysis *Format:* presentation (Reliability and safety) *Contact:* fabrizio@mi.imati.cnr.it

This paper presents a novel approach for integrating human and organisational factors into risk analysis. This approach has been developed and applied to a case study in the marine industry, but it can be utilised in other industrial sectors. The approach consists of a BBN model of the maritime transport system that has been developed by taking into account the different actors of the maritime transport system (i.e., ship-owner, shipyard, port, and the regulator) and their mutual influences. These influences have been modelled through a set of variables whose combinations express the relevant functions performed by each actor.

### 2C. Process modelling, cases

**16. Minimax Estimation of a Bounded Binomial Parameter**

*Authors:* Nahid Sanjari Farsipour (Shiraz University) and Ghazvininejad (Shiraz Univ.) *Keywords:* Minimaxity, Binomial distribution, bounded parameter, Balanced loss function *Format:* presentation (Statistical consulting) *Contact:* sanjari_n@yahoo.com

In this paper, we consider minimax estimation of a binomial parameter p when it is bounded above by a known constant. Some numerical results are derived.

**29. Stochastic modeling of space time nucleation process in two powder transformations : thermal decomposition of calcium carbonate and dehydration of the monohydrate lithium sulfate**

*Authors:* Celine Helbert (Ecole Nationale Supérieure des Mines de Saint-Etienne) and Celine Helbert; Laurent Carraro *Keywords:* nucleation; process modelling; Markov chain; space and time Poisson point process; Monte Carlo simulation *Format:* presentation (Process modelling and control) *Contact:* helbert@emse.fr

Many powder transformations proceed by nucleation and growth. In the case of lime fabrication, the transformation proceeds by nucleation at the surface of the grain and growth inward (Mampel assumptions). At high temperature and low pressure, fitting Monte Carlo simulations to measurements shows a high uncertainty in the estimate of the nucleation parameter, whereas the growth parameter is well determined. This observation and physical considerations lead us to propose a new stochastic model for nucleation. Up to now nucleation was modeled by a space-time Poisson point process and was determined by a single parameter representing the mean number of nuclei per unit of time and per unit of surface. The new model is intended to be more realistic since it takes into account the two main steps of nucleus formation: the appearance of defects and their migration on the grain surface. A nucleus is then a cluster of n defects. The model contains three parameters: the first characterizes the appearance of defects, the second the global mobility of the defects on the surface, and the third the attraction between defects. On a discretised grain surface, the temporal evolution of the spatial distribution of defects follows a Markov chain. This chain can be evaluated by Monte Carlo simulation. From these simulations we can obtain the distribution of a well-known quantity: the induction time. This quantity is defined as the stopping time of the chain at which the first nucleus forms, and it can easily be observed in experiments. The influence of the different factors (model parameters, nucleus shape, etc.) on the induction time is studied via an experimental design. As real measurement data on the induction time have become available for the dehydration of lithium sulfate monohydrate, current research concerns the fitting of our model to these data.

**52. Spiralling in BTA deep-hole drilling -- How to model varying frequencies**

*Authors:* Nils Raabe (Universität Dortmund) and Winfried Theis, Oliver Webber *Keywords:* varying frequencies, process improvement, deep-hole drilling *Format:* presentation (Process modelling and control) *Contact:* raabe@statistik.uni-dortmund.de

One serious problem in deep-hole drilling is the formation of a dynamic disturbance called spiralling, which causes holes with several lobes. Since such lobes are a severe impairment of the bore hole, the formation of spiralling has to be prevented. Gessesse et al. (1994) explain spiralling by the coincidence of bending modes and multiples of the rotary frequency. They derive this from an elaborate finite element model of the process. We observed spiralling in several experiments and found that the most prominent frequencies of the bending moment converge toward the rotary frequency and its harmonics. We therefore propose a method to estimate a slow change of frequencies over time from spectrogram data. This significantly simplifies the practical use of this explanation of spiralling, because the finite element model has to be correctly modified for each machine and tool assembly, while the statistical method uses observable measurements. Estimating the variation of the frequencies as well as possible opens up the opportunity to prevent spiralling by, e.g., changing the rotary frequency. Gessesse, Y.B., Latinovic, V.N., and Osman, M.O.M. (1994): "On the problem of spiralling in BTA deep-hole machining", Transaction of the ASME, Journal of Engineering for Industry 116, pp. 161-165.

**20. A Statistical Method for MOS Transistor Mismatch Analysis and its Application during Semiconductor Process Development**

*Authors:* Gerhard Rappitsch (austriamicrosystems AG) and Eduard Schirgi, Hubert Enichlmair *Keywords:* Mismatch, Semiconductor, Process, Parameter Extraction *Format:* presentation (Process modelling and control) *Contact:* gerhard.rappitsch@austriamicrosystems.com

Device mismatch is defined as the local variation of semiconductor devices with identical layout. This mismatch is caused by the random nature of process steps (e.g. ion implantation or dopant diffusion) where the correlation distance is smaller than the active area of the device. This leads to a local variation of important electrical parameters of the device, such as the threshold voltage or the gain factor of MOS transistors. Since the functionality of many analog designs (e.g. A/D converters, bandgap reference circuits) relies on good device matching, it is important to analyse the mismatch during the process development phase and start process improvements as early as possible. A statistical method is presented that allows for an accurate determination of device mismatch parameters. To this end, an appropriate linear variance model is introduced based on an analytical model of the MOS transistor. Based on measurements of the MOS transistor drain current for several matched pairs on a test chip, the sensitivities of this variance model are calculated by parameter extraction. Afterwards, the variances of the threshold voltage and the gain factor are determined by optimisation. This procedure is carried out for different device sizes, and the mismatch of the device parameters is related to the device area using Pelgrom's law. The proposed method is used as an analysis tool during process development. Mismatch parameter extraction makes it possible to analyse the effect of process improvements on device matching.
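
The final step, relating mismatch to device area, can be illustrated with the area term of Pelgrom's law, σ(ΔP) = A_P / √(W·L). The sketch below is an assumption-laden illustration rather than the extraction procedure of the abstract: it fits the matching coefficient A_P by least squares through the origin from (area, standard deviation) pairs, and the numbers in the usage example are invented.

```python
import math


def fit_pelgrom_coefficient(areas, sigmas):
    """Least-squares fit (through the origin) of A in sigma = A / sqrt(area).

    areas  : gate areas W*L of the matched pairs (any consistent unit)
    sigmas : measured standard deviations of the parameter difference
    """
    xs = [1.0 / math.sqrt(area) for area in areas]
    numerator = sum(x * s for x, s in zip(xs, sigmas))
    denominator = sum(x * x for x in xs)
    return numerator / denominator


def predicted_sigma(coefficient, area):
    """Mismatch standard deviation predicted for a given gate area."""
    return coefficient / math.sqrt(area)


if __name__ == "__main__":
    areas = [1.0, 4.0, 16.0]   # invented gate areas
    sigmas = [2.0, 1.0, 0.5]   # invented mismatch standard deviations
    a_p = fit_pelgrom_coefficient(areas, sigmas)
    print("fitted coefficient:", a_p)
    print("predicted sigma at area 9:", predicted_sigma(a_p, 9.0))
```

Comparing the fitted coefficient before and after a process change is one simple way to quantify whether matching has improved.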

### 3A. DOE, general

**81. A Practical Framework for Robust Design using Computer Experiments**

*Authors:* Ron Bates (London School of Economics) and Daniele Romano *Keywords:* Robust design, computer experiments, statistical modelling *Format:* presentation (Statistical modelling) *Contact:* r.a.bates@lse.ac.uk

Robust design (RD) concerns the introduction of noise into the design improvement / optimization problem and is the subject of an increasing amount of research. Traditionally, RD methods employ Response Surface Methodology (RSM) to find solutions robust to noise by conducting physical experiments. However, if computer experiments are used, some typical constraints of RD can be relaxed. In this case, controlling noise factors in the experiments may be easier and less costly than in physical experimentation. Provided that the noise factors are inputs of the computer code, changing factor levels is also inexpensive, further reducing the cost of experimentation. This allows a better exploitation of the RD potential and also induces some relevant modifications in the existing Robust Design procedures. More noise factors can be studied, there is no need to rely on parsimonious polynomial models for fitting the mean and variance of system responses, the experimental designs may not be classical factorial or RSM designs and noise factors may have either fixed or random levels in the experiments. In this environment, Parameter and Tolerance design problems can be naturally integrated, increasing the number of options available for structuring a RD study. In the paper we outline available options and evaluate them in terms of the trade-off between accuracy and cost. The proposed framework is supported by evidence collected in selected case studies.

**86. Challenges in the Application and Implementation of Experimental Designs**

*Author:* Ekkehard Glimm (Aicos Technologies AG) *Keywords:* experimental design, software, visualization *Format:* presentation (Design of experiments) *Contact:* eglimm@aicos.com

Users of statistical experimental designs are often scientists with some basic knowledge of statistics, but a somewhat tenuous grasp of both the power and the limitations of this approach to experimentation. While some researchers are over-confident of analysis results, say from a screening experiment, and carry its interpretation too far, others are over-anxious. This talk is intended to illustrate some of the demands that researchers have on experimental designs. It is an overview of requests we got from the users of our experimental design software Stavex. It shows that some researchers tend to specify over-complicated models (i.e. mixture models, where this is not really necessary), but that, on the other hand, there is a need for - designs that allow for the fact that groups of factors are mutually exclusive alternatives, - designs that allow for a subsample of feasible design points where the underlying factors are continuous, but only certain combinations of factor values are admissible. The talk will illustrate how such designs can be set up. In addition, some errors and pitfalls in the interpretation of analysis results from experimental designs will be discussed. Visualization can assist both the selection of an appropriate design and the interpretation of the analysis results. So-called "4D plots" can be used to display the relative variance (at the design selection stage) or the modelled response (at the analysis stage) as a function of 3 factors simultaneously. It is demonstrated how such plots guard against flawed interpretations.

**92. An Overview of Composite Designs Run as Split-Plots**

*Authors:* Geoff Vining (Virginia Tech) and Geoff Vining and Scott Kowalski *Keywords:* Design of Experiments; Response Surface Methodology; Ordinary Least Squares; Generalized Least Squares *Format:* presentation (Design of experiments) *Contact:* vining@vt.edu

Many industrial experiments involve factors that are hard to change as well as factors that are easy to change, which naturally leads to split-plot experiments. Unfortunately, the literature on second-order response surface designs traditionally assumes a completely randomized design. Vining, Kowalski, and Montgomery (2004) outline the general conditions for response surface experiments under which the ordinary least squares estimates of the model are equivalent to the generalized least squares estimates. Vining and Kowalski (2004) use this result to derive conditions for exact tests of most of the model parameters and Satterthwaite's procedure for the other parameters. This paper summarizes the results of Vining, Kowalski, and Montgomery (2004) and Vining and Kowalski (2004). It illustrates how to modify standard central composite designs to accommodate the split-plot structure. It concludes with a fully analyzed example.

### 3B. Process modelling, multivariate

**31. Multivariate quality control procedures for known heteroscedasticity.**

*Authors:* Daniel Nel (Centre for Statistical Consultation) and Inet Kemp, University of the Free State *Keywords:* Quality Control, heteroscedasticity, multivariate. *Format:* presentation (Six Sigma and quality improvement) *Contact:* dgnel@sun.ac.za

In multivariate quality control procedures (Alt (1985)) it is often found that the covariances among the variables concerned have changed between the in-control or setting-up phase and the eventual testing phase. This may be due to environmental factors which the researcher cannot control but which are known to exist, hence the name "known heteroscedasticity". The approximate degrees of freedom solution of Nel and van der Merwe (1986) to the multivariate Behrens-Fisher problem is used to construct testing procedures for multivariate quality control when such known heteroscedasticity is present. Adaptations of the Shewhart and CUSUM (Sparks (1992)) SQC procedures to provide for such known heteroscedasticity will be discussed. The Takemura decomposition of Hotelling's T² statistic is extended to a decomposition of the Behrens-Fisher statistic for "known heteroscedasticity" under the assumption that a Common Principal Components model (Flury (1988)) can be used to describe the nature of this heteroscedasticity between the in-control and the testing phases. This decomposition can be used to recognize and identify variables giving an out-of-control signal. References: Alt, F.B. (1985): Multivariate Quality Control. In Encyclopedia of Statistical Sciences, Vol. 6, eds. N.L. Johnson and S. Kotz, John Wiley. Flury, B. (1988): Common Principal Components and Related Models. John Wiley. Nel, D.G. and Van der Merwe, C.A. (1986): A solution to the multivariate Behrens-Fisher Problem. Communications in Statistics, A, Theor. Meth., 15, 3719–3736. Sparks, R. (1992): Quality Control with Multivariate Data. Australian Journal of Statistics, 34, 375–390.

**54. Synthesis Maps for Multivariate Ordinal Variables**

*Authors:* Vicario Grazia (Dip. Matematica-Politecnico di Torino) and G. Brondino, F. Franceschini, M. Galetto-Politecnico di Torino *Keywords:* Multivariate Ordinal Variables, Multidimensional Ordinal Structures, Synthesis Maps, Ranking and Ordering, Statistical Process Control, MultiCriteria Analysis *Format:* presentation (Six Sigma and quality improvement) *Contact:* grazia.vicario@polito.it

Many quality characteristics of products or services are commonly evaluated on ordinal scales. Systematic analysis of categorical variables collected over time can be very useful for process management strategy. In order to measure customer satisfaction or quality improvement in a process, two or more quality characteristics are often jointly measured and the multivariate ordinal data summarised by suitable indexes. A common practice is to evaluate a synthetic index by mapping every realization of a multivariate ordinal variable into numbers and ranking it according to a synthesis of its components. This procedure is not always legitimate from the point of view of measurement theory. In this paper an alternative approach based on the algebraic theory of ordered sets is proposed, which avoids mapping the multivariate components into numbers. The information concerning the components of the multivariate ordinal variable is synthesised by ordering the multivariate sample space. The ordering criterion is defined on the basis of specific priorities and requirements as expressed by the process control management. An analogous approach is suggested for Statistical Process Control when ordinal variables are involved. In this case the statistical analysis, usually performed by exploiting an arbitrary numerical conversion of the rating scales and by applying traditional synthesis operators (sample mean and variance), is carried out using a new sample scale. This is obtained by ordering the original variable sample space according to some specific "dominance criterion". Practical effects of using these methods are shown in a series of application examples.

### 3C. Business and economics

**49. Measuring advertising effectiveness using logistic models**

*Author:* Kristina Birch (Center for Statistics, Copenhagen Business School) *Keywords:* logistic regression, conditional logistic regression, single-source data, advertising effectiveness *Format:* presentation (Statistical modelling) *Contact:* kristina@cbs.dk

Measuring the effectiveness of market activities is crucial when it comes to media planning. In this case, pure single-source data have become available which gives rise to a broader class of analyses. In order to analyse and measure the effects of advertising on consumer level using single-source binary data, logistic regression models can produce very misleading results if the variation between consumers is ignored. Various solutions to this problem are proposed. The conditional logistic regression model is presented, which unlike the conventional logistic regression model, takes into account the fact that the same respondent is observed several times. The conditioning taking place is on the total number of positive responses, i.e. on the total number of purchases of the brand in question. Alternatively, a random effect model can be used. However, these models only deal with the within-respondent effects. The between-respondent effect can be explored by using an over-dispersion model. Here, the models are briefly presented and used to analyse the exposure effects of advertising for various brands from the British Adlab database.

**50. Confirmatory Factor Analysis and Models of Structural Equations for Customer Satisfaction: a Case Study on Local Public Transport**

*Authors:* Ennio Davide Isaia (Dep. Statistics & Mathematics) and Alessandra DUrio *Keywords:* Customer Satisfaction, Factorial Analysis, Structural Equations *Format:* presentation (Statistical modelling) *Contact:* isaia@econ.unito.it

In this paper we focus on a structural equations model in order to analyse the satisfaction of young students who use local public buses daily. Our starting point is a recent survey conducted by the Department of Statistics and Mathematics "Diego de Castro" of Torino with the collaboration of five local transport enterprises working on the same line. Based on the evaluations expressed by the 359 subjects on some crucial aspects of perceived Quality (such as safety of travel, punctuality and comfort of the service, staff's professional level and/or courtesy, ...), our aim is to confirm the causal relationships between the latent variables of the model and, at the same time, to provide a measure of the Quality perceived by the customers. In order to identify the latent variables and the measurement groups, we resort to a preliminary confirmatory Factorial Analysis, which also gives us some information on the covariance between the latent variables themselves. The model is primarily based on the graphical representation of path analysis for the study of causal models, while the coding of the resulting graph in the structural equations system has been carried out according to the standard notation of the Lisrel methodology. To estimate the parameters of the model according to the maximum likelihood criterion, we used the Lisrel software implemented on SPSS interface (AMOS 3.6). Validation of the model will be proposed in terms of Chi-Square (CS), Goodness of Fit Index (GFI) and Root Mean Squared Residuals (RMR).

**68. Discrete Methodology in the Analysis of Likert Scales**

*Authors:* Rainer Göb and Christopher Mccollin (Nottingham Trent University), Maria Fernanda Ramalhoto (Instituto Superior Técnico, Lisboa) *Keywords:* Likert scales, multinomial model *Format:* presentation (Statistical modelling) *Contact:* goeb@mathematik.uni-wuerzburg.de

Likert scales are widely used in survey studies to assess people's opinions, attitudes, and preferences by questionnaires. In particular, the questionnaires propagated by the Servqual approach are based on Likert scales. Though Likert scales are discrete in nature, they are often evaluated with techniques designed for continuous measurements. The present paper considers evaluation techniques under the proper discrete understanding, in particular the use of simultaneous confidence intervals for multinomial probabilities. Approximate and exact confidence intervals are considered.
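
One simple approximate construction of simultaneous intervals for multinomial probabilities is the Bonferroni-adjusted Wald interval, sketched below. It is shown only to make the idea concrete; the abstract does not say which approximate or exact intervals the authors consider, and the function name and the response counts are assumptions.

```python
from statistics import NormalDist


def bonferroni_multinomial_intervals(counts, alpha=0.05):
    """Approximate simultaneous confidence intervals for category probabilities.

    Each of the k categories gets a Wald interval at level 1 - alpha / k,
    so that the joint coverage is at least about 1 - alpha.
    """
    n = sum(counts)
    k = len(counts)
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * k))
    intervals = []
    for count in counts:
        p_hat = count / n
        half_width = z * (p_hat * (1.0 - p_hat) / n) ** 0.5
        # Clip to [0, 1], since a probability cannot lie outside that range.
        intervals.append((max(0.0, p_hat - half_width),
                          min(1.0, p_hat + half_width)))
    return intervals


if __name__ == "__main__":
    # Invented counts for a five-point Likert item.
    responses = [10, 20, 30, 25, 15]
    for level, (low, high) in enumerate(bonferroni_multinomial_intervals(responses), 1):
        print(f"category {level}: [{low:.3f}, {high:.3f}]")
```

The normal approximation behind these intervals is poor for sparsely used categories, which is one reason exact intervals are of interest.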

**69. On the design of sample surveys**

*Author:* Ron Kenett (KPA Ltd.) *Keywords:* surveys process, questionnaire design, household surveys *Format:* presentation (Six Sigma and quality improvement) *Contact:* ron@kpa.co.il

Statistical analysis is a science that relies on a transformation of reality into dimensions that lend themselves to quantitative analysis. Surveys rely on structured questions used to map out reality, using samples from a population frame, into data that can be statistically analyzed. In drawing a sample, several sampling schemes can be applied. They range from probability samples such as cluster, stratified, systematic or simple random sampling, to non-probability samples such as quota, convenience, judgment or snowball sampling. The survey process consists of four main stages: 1) Planning, 2) Collecting, 3) Analyzing and 4) Presenting. Today's surveys are conducted using a wide variety of techniques including phone interviews, self-report paper questionnaires, email questionnaires, internet-based surveys, SMS-based surveys, face-to-face interviews, videoconferencing, etc. (Kaplan, Kenett and Raanan, 2003). In this paper we focus on the planning and design of surveys and point out non-trivial differences between individual surveys, household surveys and establishment surveys. In many cases practitioners neglect to account for such differences, creating problems in the data that even sophisticated data analysis cannot overcome. The paper makes the distinction between these types of surveys by focusing on the planning and design stage.

### 4A. Six Sigma, process capability

**18. Practical problems in applying Capability Indices for Non-Normal data.**

*Authors:* Rakhi Baj (GSK Cork) and Neil Walker, Max Porter *Keywords:* Non-Normal, Capability Indices *Format:* presentation (Six Sigma and quality improvement) *Contact:* rakhi.x.baj@gsk.com

The quality of the product has become one of the most important elements in the selection of competing products and services in any industry. As a result, improving quality is a key factor leading to business growth and success. One of the more common ways to express information about the performance or capability of a process is through the calculation and use of Process Capability Indices. These determine how the inherent variability in a process compares with the specification or the customer requirements for the product. The underlying assumptions are that (1) there is sufficient data available to obtain reasonable estimates and (2) the available data are Normally distributed. In practice these assumptions are not always met. In this talk, we will discuss the issues that arise when dealing with real data. We will then use case studies to review the existing approaches for calculating process capability indices for non-Normal data and their effectiveness. Finally, we will describe a process that overcomes these issues and provides a reasonably good estimate of the process capability.

**72. Process Capability Assessment Using Nonconformity Ratio Based Desirability**

*Authors:* Ramzi Telmoudi (Dortmund University) and Claus Weihs, Franz Hering *Keywords:* Process Capability, Desirability, Box Cox Transformation *Format:* presentation (Six Sigma and quality improvement) *Contact:* telmoudi@statistik.uni-dortmund.de

The ability of a process to meet the customer requirements is connected to the proportion of nonconforming items. Process capability indices are used to summarize this ability. However, process capability indices fail to compare the capability of several processes when the normal distribution hypothesis does not hold. The purpose of this work is to present a desirability function which is considered as a metric for capability assessment in the univariate case. In the multivariate case the presented desirability function is used to compute the desirability index. Moreover, an algorithm is presented that uses the desirability index to assess the potential process capability and to determine the optimal operating conditions that minimize the nonconformity ratio. The validity of this approach is evaluated through a simulation study using nonnormal distributions.

**96. A GRAPHICAL TOOL USEFUL IN CAPABILITY ANALYSIS**

*Author:* Kerstin Vännman (Luleå University of Technology) *Keywords:* capability index, process capability plots, confidence regions, graphical methods. *Format:* poster (Six Sigma and quality improvement) *Contact:* kerstin.vannman@ltu.se

When measuring the capability of a manufacturing process some form of process capability index is often used, designed to quantify the relation between the actual performance of the process and its specified requirements. To assess the capability, using a random sample, it is common to apply confidence intervals or hypothesis tests for the process capability index. Here an alternative approach is presented. Usually a process is defined to be capable if the capability index exceeds a stated threshold value, e.g. Cpm > 4/3. This inequality can be expressed graphically as a region in the plane defined by the process parameters (m, s), obtaining a process capability plot. In this plot a safety region, similar to a confidence region for (m, s), can be plotted to test for capability, taking into consideration the uncertainty introduced by the random sample. The region is constructed so that it can be used, in a simple graphical way, to draw conclusions about the capability at a given significance level. This graphical approach is at the same time helpful when trying to understand whether it is the variability, the deviation from target, or both that need to be reduced to improve the capability. Furthermore, with this method it is possible to monitor, in the same plot, several characteristics of a process. Under the assumption of normality two different regions, a rectangular region and a circular region, are investigated for the capability index Cpm. The suggested regions are compared with respect to power. Examples are presented.
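As a rough illustration of how the inequality Cpm > 4/3 defines a region in the parameter plane, the sketch below uses the standard definition Cpm = (USL − LSL) / (6·sqrt(s² + (m − T)²)). The function names, the specification limits and the target are illustrative assumptions; the sketch works with known parameters only and does not reproduce the sample-based safety regions proposed in the abstract.

```python
import math

def cpm(m, s, lsl, usl, target):
    """Capability index Cpm for process mean m and standard deviation s."""
    return (usl - lsl) / (6.0 * math.sqrt(s ** 2 + (m - target) ** 2))

def is_capable(m, s, lsl, usl, target, threshold=4.0 / 3.0):
    """True if (m, s) lies inside the region Cpm > threshold.

    Rearranging the inequality gives s^2 + (m - target)^2 < radius^2,
    i.e. a semicircle around the target in the (m, s) plane.
    """
    half_width = (usl - lsl) / 2.0
    radius = half_width / (3.0 * threshold)
    return s ** 2 + (m - target) ** 2 < radius ** 2

# Hypothetical specification: LSL = 4, USL = 16, target = 10
on_target = is_capable(10.0, 1.0, 4.0, 16.0, 10.0)
off_target = is_capable(11.5, 1.0, 4.0, 16.0, 10.0)
```

Plotting the boundary of this semicircle gives the kind of capability plot described above: a point can leave the region either because the spread is too large or because the mean has drifted from the target.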

### 4B. Process models and controllers

**8. Monte Carlo Markov Chain Analysis of Time-Changed Levy Processes of Stock Return Dynamics**

*Authors:* Long Yu (Cornell University) and Haitao Li, Martin Wells *Keywords:* Bayes Factor, Levy Process, Log Stable Model, Monte Carlo Markov Chain, Poisson Jump, Variance Gamma Model, Volatility *Format:* presentation (Business and economics) *Contact:* ly34@cornell.edu

We develop Monte Carlo Markov Chain techniques for estimating time-changed Levy processes of stock return dynamics. The models exhibit stochastic volatility and jumps. Unlike Poisson jumps considered in most existing studies, jumps in our models follow Levy-type distributions, such as the Variance Gamma and Log Stable distributions. While Poisson jumps are typically large and happen rarely, Levy jumps can be both large and small and can happen all the time. Special techniques are needed for estimating Levy processes because for certain models the probability density does not have an analytic form and certain moments do not exist. The MCMC methods developed in our paper have excellent performance in estimating Levy processes. Empirically we show that for the S&P 500 and Nasdaq 100 indices, stochastic volatility models with jumps following Variance Gamma and Log Stable distributions perform much better than stochastic volatility models with Poisson jumps. Bayes factor analysis shows that the improvements are much more significant than that of the model in Eraker, Johannes and Polson (2003), which also allows Poisson jumps in stochastic volatility. In fact, once Levy jumps are included, jumps in stochastic volatility play a much less significant role.
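To make the notion of a time-changed process concrete, the following minimal sketch simulates Variance Gamma increments as a Brownian motion with drift evaluated at a gamma-distributed random time. It only illustrates the jump model; it is not the authors' MCMC estimation procedure, and the parameter names (`theta`, `sigma`, `nu`) and values are assumptions chosen for illustration.

```python
import numpy as np

def simulate_vg_increments(n, dt, theta, sigma, nu, seed=0):
    """Simulate n Variance Gamma increments over time steps of length dt.

    Each increment is theta * G + sigma * sqrt(G) * Z, where G is a gamma
    time change with mean dt and variance nu * dt, and Z is standard normal.
    """
    rng = np.random.default_rng(seed)
    g = rng.gamma(dt / nu, nu, size=n)   # random business time
    z = rng.standard_normal(n)
    return theta * g + sigma * np.sqrt(g) * z

# Hypothetical parameters, not taken from the paper
increments = simulate_vg_increments(200_000, 1.0, theta=-0.1, sigma=0.2, nu=0.5)
```

Under these assumptions the increments have mean `theta * dt` and variance `(sigma**2 + theta**2 * nu) * dt`, which gives a quick sanity check on the simulation.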

**76. Application of DMC controllers to a continuous polymerization process**

*Authors:* Susana Barceló (Universidad Politécnica de Valencia) and Barcelo-Cerdá, S.; Sanchis-Saez, J.; Ferrer-Riquelme, A. *Keywords:* multivariate transfer function model; engineering process control; model predictive control; dynamic matrix control *Format:* presentation (Process modelling and control) *Contact:* sbarcelo@eio.upv.es

In this talk a case study of the application of DMC (Dynamic Matrix Control) to a continuous polymerization process is presented. The manipulated variables of this multivariable process are reactor temperature (T) and ethylene flow (E), whose changes represent negligible cost when compared to off-target viscosity or low productivity. The controlled variables are the key quality characteristic, that is polymer viscosity, which is measured by melt index (MI), and a productivity index (APRE), worked out by energy balance. Model Predictive Control (MPC) refers to a class of control algorithms that use an explicit process model to predict the future response of a plant to be controlled. The DMC algorithm (Cutler and Ramaker, 1979) represents the first generation of MPC algorithms developed in industry. Key features of the DMC control algorithm include a linear model for the plant; the optimization of a quadratic performance objective over a finite prediction horizon; the prediction of the future plant output behavior; and the calculation of the manipulated variables as the solution to a least-squares problem that tries to follow the setpoint as closely as possible. In this paper, a discrete linear model that characterizes the process dynamics is first identified from data collected in closed loop operation. Furthermore, the effect of DMC design parameters on closed loop performance is studied by experimental design. The performance of the designed controller is compared to other regulation strategies developed in previous research.
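The least-squares step at the core of DMC can be sketched for a single input and a single output as follows. This is a simplified, unconstrained illustration under stated assumptions: the step-response coefficients, the horizon lengths and the optional `move_penalty` are hypothetical, and the multivariable controller of the case study is not reproduced.

```python
import numpy as np

def dynamic_matrix(step_response, horizon, n_moves):
    """Build the horizon x n_moves dynamic matrix from unit step-response coefficients."""
    a = np.zeros((horizon, n_moves))
    for i in range(horizon):
        for j in range(min(i + 1, n_moves)):
            a[i, j] = step_response[i - j]
    return a

def dmc_moves(step_response, free_response, setpoint, n_moves, move_penalty=0.0):
    """Unconstrained DMC: control moves minimising the predicted squared error.

    free_response is the predicted output over the horizon if no further
    moves are made; step_response must cover at least the same horizon.
    """
    free_response = np.asarray(free_response, dtype=float)
    horizon = len(free_response)
    a = dynamic_matrix(step_response, horizon, n_moves)
    error = setpoint - free_response
    # Normal equations of: min ||error - a @ du||^2 + move_penalty * ||du||^2
    lhs = a.T @ a + move_penalty * np.eye(n_moves)
    return np.linalg.solve(lhs, a.T @ error)

# Hypothetical first-order plant with unit gain
step = [1.0 - 0.5 ** (k + 1) for k in range(10)]
moves = dmc_moves(step, np.zeros(10), setpoint=1.0, n_moves=3)
```

In a receding-horizon implementation only the first element of `moves` would be applied before the calculation is repeated at the next sampling instant.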

**77. Process Improvement in the Microelectronic Industry by State Space Modelling**

*Authors:* Kostas Triantafyllopoulos (University of Newcastle) and Ed Godolphin (Royal Holloway, University of London, UK) *Keywords:* Statistical process control; process improvement; quality control; state space models; time series; dynamic models; Kalman filtering. *Format:* presentation (Process modelling and control) *Contact:* kostas.triantafyllopoulos@ncl.ac.uk

In this paper for oral presentation we discuss novel aspects of feedback adjustment for process improvement. The exponentially weighted moving average (EWMA) model has been applied to a process controlling the thickness of nitride layers in the manufacture of microelectronic devices, involving the standard use of the EWMA control chart, EWMA full adjustment and EWMA deadband adjustment charts for process monitoring and process improvement, as described in Box and Luceno (1997). We suggest that a dynamic step forward in process improvement is gained by considering basic state space models, which are known as local level models. Such models are documented in West and Harrison (1997) and are illustrated in Godolphin (2001) for process control in a seasonal context. Local level models have been applied with success to network security and software engineering, see Triantafyllopoulos and Pikoulas (2002). Since the EWMA predictor is a limiting form of the forecast of the local level model, we are able to propose replacing the EWMA by local level models and thus develop relevant feedback adjustment schemes. This proposed forecasting scheme is found to perform better than the usually applied EWMA, especially when only a small amount of data is available. A benefit of the proposed model is that the entire forecast distribution is obtained easily, thus providing further insights into process control with state space models. A detailed development of the proposed state space adjustment scheme is given for the nitride layers process and a number of conclusions and recommendations are made. References Box, G.E.P. and Luceno, A. (1997) Statistical Control by Monitoring and Feedback Adjustment. Wiley, New York. Godolphin, E.J. (2001) Observable trend projecting state space models. J. Appl. Stat., 28, 379-389. Triantafyllopoulos, K. and Pikoulas, J. (2002) Multivariate Bayesian regression applied to the problem of network security. J. Forecast, 21, 579-594. West, M. and Harrison, P.J. (1997) Bayesian Forecasting and Dynamic Models, 2nd edition. Springer-Verlag, New York.
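The link between the local level model and the EWMA can be illustrated with a short Kalman filter sketch. It assumes the textbook local level model y_t = mu_t + v_t, mu_t = mu_{t-1} + w_t with known variances; the variable names, the diffuse starting values and the variances used below are illustrative assumptions, not the settings of the nitride layer study.

```python
def local_level_filter(y, obs_var, level_var, m0=0.0, c0=1e6):
    """Kalman filter for the local level model.

    Returns the one-step-ahead forecasts and the adaptive gains.
    """
    m, c = m0, c0
    forecasts, gains = [], []
    for obs in y:
        r = c + level_var            # prior variance of the level
        forecasts.append(m)          # forecast made before seeing obs
        gain = r / (r + obs_var)     # time-varying smoothing weight
        m = m + gain * (obs - m)     # same form as an EWMA update
        c = gain * obs_var           # posterior variance of the level
        gains.append(gain)
    return forecasts, gains

def limiting_gain(obs_var, level_var):
    """Steady-state gain, i.e. the weight of the equivalent EWMA."""
    q = level_var / obs_var
    return (-q + (q * q + 4.0 * q) ** 0.5) / 2.0

# Hypothetical short series and variances
series = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3]
forecasts, gains = local_level_filter(series, obs_var=1.0, level_var=0.25)
```

With few observations the gain is still far from its limit, so the filter weights early data differently from a fixed-weight EWMA; as more data arrive the gain settles at `limiting_gain` and the two forecasts coincide.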

### 4C. Business and economics

**5. Methods to Collect and Analyze Organizational Change Management Data; The BEST Approach**

*Authors:* Henrik Buhl (BYG·DTU – Department of Civil Engineering, Technical University of Denmark) and Ron S. Kenett (KPA Ltd); Sebastiano Lombardo (SINTEF); Nel Wognum (University of Twente) *Keywords:* Enterprise Systems Implementation, scoring methods, statistical analysis *Format:* presentation (Business and economics) *Contact:* hb@byg.dtu.dk

Statistical analysis is a data-based problem-solving methodology. Quantifying a problem is a non-trivial task of translating reality into numbers. In this talk we will describe a methodology developed to handle the complexity of Enterprise System Implementations. We will focus on how to capture and analyze this complexity using the concept of patterns that lend themselves to statistical analysis. For a related approach to map patterns in statistical consulting see Kenett and Shade (2001). Enterprise Systems like ERP (Enterprise Resource Planning), CRM (Customer Requirement Management), and PDM (Product Data Management) have gained great significance for most companies on an operational as well as a strategic level. The implementation of such systems is a major effort for companies in all industrial, government and service areas. Davenport has described ERP as “the most important development in the corporate use of information technology in the 1990s” (Davenport 2000). However, despite the high promise, approximately one-half of all ERP projects fail to achieve the anticipated benefits (Appleton 1997), potentially putting a whole company at risk. The major challenge today is to manage the process of change in the ongoing Enterprise System Implementation (ESI). In order to develop a better understanding of the change processes of ESI, a European FP5 project, Better Enterprise SysTem implementation (BEST), was launched in 2002. The BEST methodology is based on social science research, case studies and critical incident methods and includes identification of issues formulated as a chain of Cause-Event-Action-Outcome. In this way, knowledge of what really happened in an implementation project is documented. CEAO chains are analysed in terms of people, organisation and technological aspects. Recurring themes are identified, while differences between characteristic contexts are taken into account. 
Using a tailored scoring method the methodology produces a graphical and numerical evaluation of gaps between the organization's current status and benchmark data. These gaps generate applicable CEAO chains to support consultants in formulating intervention plans designed to address the problems identified in the gap analysis. Collecting and classifying the CEAO issues has been a challenging task, posing non-standard problems in data collection and data analysis. We will illustrate the approach developed in BEST to gather knowledge on the process and dynamics of ESI with examples. General conclusions on the applicability of research methodologies and statistical techniques to organizational change management will be presented with an emphasis on opportunities for new research areas. References: Appleton, E. L. (1997): How to survive ERP; Datamation. Boudreau, M.; Gefen, D.; et al (2001): Validation in information systems research: A State-Of-The-Art Assessment; MIS Quarterly 25(1): pp. 1-16. Davenport, T.: Mission Critical: Realizing the promise of Enterprise Systems; Harvard Business School Press, Boston, 2000. Kenett, R.S. and Shade, J. (2001), “The Statistical Consulting Patterns Models”, ENBIS Report, Statistical Consulting Working Group.

**21. Some Aspects of Teaching Quality to Business Students**

*Authors:* Christopher McCollin (The Nottingham Trent University) and Shirley Coleman, Oystein Evandt *Keywords:* Quality, QFD, Design of Experiments, Teaching Business students *Format:* presentation (Business and economics) *Contact:* Christopher.McCollin@ntu.ac.uk

Some aspects of the NTU degree scheme for BA Business and Quality Management are described with reference to the similarities within the taught methodologies of QFD and DOE. Issues arising from student case studies in QFD are presented detailing where strengths and weaknesses lie.

**67. 'ANTECEDENTS OR CONSEQUENCES OF INNOVATION ACTIVITIES: A CAUSALITY TEST'**

*Authors:* Giuliana Battisti (Aston University) and M. Colombo, L. Rabbiosi (University Politecnico of Milano) *Keywords:* Technology, Innovations, Causality test, Markov process *Format:* presentation (Business and economics) *Contact:* g.battisti@aston.ac.uk

In this paper, we test the existence of complementarities between the adoption of technologically advanced equipment (computer aided design and manufacture equipment, CAD) and an innovative managerial practice (the establishment of joint design teams with customers and suppliers, JOD) using the Granger causality test of binary time series recently developed by Mosconi and Seri (2004). Using a panel data set composed of 438 Italian metalworking firms observed from 1970 up to 1997, we estimate the factors affecting the adoption decision of CAD and JOD and the likelihood of their joint adoption. The main findings suggest that complementarities exist; however, 1) there are substantial differences in the factors affecting the adoption decision of the two innovations under scrutiny; 2) there is no clear precedence in the adoption sequence; and 3) simultaneous adoption is unlikely.

### 5A. DOE cases

**23. Process Optimization in Sheet Metal Spinning by Adaptive, Sequential Design of Experiments**

*Authors:* Roland Göbel (University of Dortmund, Chair of Forming Technology) and Nadine Henkenjohann (University of Dortmund, Department of Statistics); Matthias Kleiner (University of Dortmund, Chair of Forming Technology) *Keywords:* Adaptive Design of Experiments, Process Optimization, Metal Spinning *Format:* presentation (Design of experiments) *Contact:* goebel@lfu.mb.uni-dortmund.de

When optimizing technical processes, the stable region of the process is usually clearly restricted. Especially when setting up new processes with a large number of parameters, it is difficult to foresee whether all parameter combinations are within the stable region of the process. In the worst case, the design cannot be used in the end due to an unacceptably high number of missing values. As a consequence, the center point has to be shifted and the range of the factor levels has to be reduced. But for this a large number of preliminary experiments is necessary, which afterwards often cannot be used within the experimental design. Further, the final design only covers a small region of the interesting parameter space. This contribution focuses on a newly developed approach for a sequential, adaptive design of experiments that avoids the above-mentioned problems. In a first step, a space-filling design based on a Latin hypercube with a minimum number of runs is constructed, comprising existing experiments in the interesting region of the parameter space. The factor-response relationships are then modelled using a Gaussian stochastic process. After analysing the results, regions with bad results are excluded, and regions that are either expected to be near the optimum or that have a great uncertainty as to the result are refined. This adaptation is carried out sequentially until a stop criterion is fulfilled. Thus an optimized space-filling design with maximum information near the optimum and reduced information at the boundary regions can be generated. This method has been successfully applied to the sheet metal spinning process. For this, the approach has been implemented in a process planning system that allows a very efficient combination of knowledge-based pre-set-up and statistically based optimization of the spinning process. An example of setting up a process for a new demonstrator workpiece using these tools is presented.

**24. A sequential approach to constrained optimization - Incorporating experiments stored in databases**

*Authors:* Nadine Henkenjohann (Universität Dortmund) and Roland Göbel, Chair of Forming Technology, Universität Dortmund *Keywords:* Sequential design, gaussian stochastic process, database information *Format:* presentation (Design of experiments) *Contact:* henkenjo@statistik.uni-dortmund.de

The classical approach of Response Surface Methodology (RSM) is a very powerful tool in process optimization. Despite its popularity there are many situations where the use of classical RSM is limited. Especially in the field of mechanical engineering, classical RSM may not be an appropriate choice because input-output relationships may be highly nonlinear or multiple constraints restrict the design space. Furthermore, missing observations due to workpiece failures may have a severe impact on the experiment, which may result in a loss of estimability of some factors. In this paper, a flexible sequential approach is presented which makes it possible to deal with the problems stated above. Additionally, this approach permits experiments stored in databases to be incorporated in the analysis. In the first step, an initial design is constructed which gives a first impression of the process. We decided to use a space-filling design because it permits complex relationships to be fitted and multiple constraints to be included. The higher uncertainty of the database points is taken into account by a suitable indicator function which allocates a higher prediction variance to the database points. This indicator function decreases monotonically with increasing distance from these points. A Gaussian stochastic process is then fitted to guarantee smooth, data-faithful approximations of the unknown response surface. In the next step, k design points are added sequentially based on a criterion which balances the need to exploit the approximating surface with the need to improve the approximation. Hence, a new design point is chosen that either optimizes the predicted response or that possesses a high uncertainty. In each step, the new data point is included in the analysis and the response surface model is refitted. A simulation study shows the good performance of this combined approach. References Jones, D. R.; Schonlau, M. and Welch, W. (1998). Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization, 13, 455-492.
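The idea of choosing the next design point by balancing a good predicted response against high uncertainty can be sketched with the expected improvement criterion of the cited Efficient Global Optimization paper. This is a minimal one-dimensional illustration with a zero-mean, unit-variance Gaussian process and fixed hyperparameters; the kernel, the `length_scale` and the toy data are assumptions, and the abstract's indicator function for database points and its constraint handling are not included.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential correlation between two sets of 1-D inputs."""
    diff = np.asarray(a, dtype=float)[:, None] - np.asarray(b, dtype=float)[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_predict(x_train, y_train, x_new, length_scale=1.0, nugget=1e-8):
    """Posterior mean and standard deviation at x_new."""
    y_train = np.asarray(y_train, dtype=float)
    k = rbf_kernel(x_train, x_train, length_scale) + nugget * np.eye(len(x_train))
    k_star = rbf_kernel(x_train, x_new, length_scale)
    mean = k_star.T @ np.linalg.solve(k, y_train)
    var = 1.0 - np.sum(k_star * np.linalg.solve(k, k_star), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def expected_improvement(mean, std, best):
    """Expected improvement over the current best value (minimisation)."""
    ei = np.zeros(len(mean))
    for i, (m, s) in enumerate(zip(mean, std)):
        if s > 0.0:
            z = (best - m) / s
            cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
            pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
            ei[i] = (best - m) * cdf + s * pdf
        else:
            ei[i] = max(best - m, 0.0)
    return ei

def next_design_point(x_train, y_train, candidates, length_scale=1.0):
    """Candidate with the largest expected improvement."""
    mean, std = gp_predict(x_train, y_train, candidates, length_scale)
    ei = expected_improvement(mean, std, float(np.min(y_train)))
    return candidates[int(np.argmax(ei))]

# Hypothetical data: three runs of a response to be minimised
x_obs, y_obs = [0.0, 1.0, 2.0], [1.0, 0.0, 1.0]
grid = [0.25 * i for i in range(13)]   # candidate settings from 0 to 3
x_next = next_design_point(x_obs, y_obs, grid)
```

After the experiment at `x_next` is run, the new observation would be appended to the data and the model refitted, repeating until a stopping criterion is met.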

**60. Using evolutionary operation for improving yield in biotechnological processes**

*Authors:* Trine Kvist (Novozymes A/S) and Peter Thyregod (Novozymes) *Keywords:* EVOP, experimentation in running production *Format:* presentation (Design of experiments) *Contact:* tkv@novozymes.com

In the biotechnological industry, production is often characterised by relatively few larger batches. In the design stages of a new process, use of statistical methods for experimentation can provide invaluable information about the process. However, it is frequently found that optimum conditions in lab or pilot plant give lower yields when transferred to full scale production. This is due to scale-up effects and to the large inherent variations when dealing with biological material and processes such as fermentation. In full scale production there is not the same freedom to experiment as in the lab scale. Simple one-factor-at-a-time trials are predominant when attempting to improve the yield in production scale. In this way it is possible to control the outcome such that it still meets the requirements. However, this is a very inefficient way to perform experiments. The requirements for an alternative experimental procedure are that it should be robust to non-controllable variations, it should contain automatic safeguards to ensure that unsatisfactory material is not manufactured and it should be possible to make decisions during the trial. Furthermore, it is important that the planning stage is short and the interpretation straightforward. The method of evolutionary operation (EVOP) suggested by George Box fulfils these requirements. We present how EVOP was implemented in a major Danish biotechnological company. An example will be presented where the method was used in the fermentation process of an industrial enzyme. In this particular example process yield was improved by 45%.
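A typical EVOP phase runs a small two-level factorial design with a centre point around the current operating conditions and updates the estimated effects after each cycle. The sketch below shows the effect calculation for two factors from the running average yields; the data layout, the function name and the numbers are illustrative assumptions, not the scheme used in the case study.

```python
def evop_effects(corner_means, center_mean):
    """Effects from a 2x2 EVOP design with a centre point.

    corner_means maps coded settings (a, b) with a, b in {-1, +1}
    to the running average yield at that operating condition.
    """
    lo_lo = corner_means[(-1, -1)]
    hi_lo = corner_means[(1, -1)]
    lo_hi = corner_means[(-1, 1)]
    hi_hi = corner_means[(1, 1)]
    effect_a = 0.5 * (hi_lo + hi_hi - lo_lo - lo_hi)
    effect_b = 0.5 * (lo_hi + hi_hi - lo_lo - hi_lo)
    interaction = 0.5 * (hi_hi + lo_lo - hi_lo - lo_hi)
    # Average yield over all five conditions minus yield at the reference condition
    change_in_mean = (lo_lo + hi_lo + lo_hi + hi_hi + center_mean) / 5.0 - center_mean
    return effect_a, effect_b, interaction, change_in_mean

# Hypothetical average yields after a few cycles
means = {(-1, -1): 10.0, (1, -1): 14.0, (-1, 1): 12.0, (1, 1): 18.0}
effect_a, effect_b, interaction, change_in_mean = evop_effects(means, center_mean=13.0)
```

In practice each effect would be compared with error limits based on the cycle-to-cycle variation before the operating conditions are moved.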

**75. Robust calibration of automotive OBD systems combining physical and simulated experiments**

*Authors:* Stefano Barone (University of Palermo) and Pasquale Erto (University of Naples), Alessandro Riegel (ELASIS) *Keywords:* On-Board Diagnostics, Robust Design, Experimental Calibration *Format:* presentation (Design of experiments) *Contact:* stbarone@dtpm.unipa.it

On-Board Diagnostic (OBD) systems, installed on new motor vehicles, assess the state of health of critical components and inform the driver of any malfunction in real time. The main manufacturers' concern is to minimise the risks of erroneous detections. Hence, during development phases, the OBD systems must be finely calibrated in order to ensure reliable functioning. This article presents a robust calibration approach, aiming to make the OBD system as insensitive as possible to external and internal sources of variation occurring during the real use of the vehicle. This approach combines physical and simulated experiments, by using specific software reproducing the OBD logic. Adopting this extensive and integrated experimentation, both the optimal calibration and reduced risks of erroneous detection are obtained, at a very limited experimental cost. An applicative example concerning a new car model, during its development phase, is presented.

**87. Experimental Design On A Reflow Soldering Process Measuring Qualitative And Quantitative Responses**

*Authors:* Shane ONeill (Institute of Technology, Sligo) and John Donovan *Keywords:* Design of Experiments, Reflow Soldering, Lead-free *Format:* presentation (Design of experiments) *Contact:* oneill.shane@itsligo.ie

The impending introduction of lead-free solder in the manufacture of electrical and electronic equipment has presented the electronics industry with many challenges. Manufacturing processes need to be investigated to evaluate lead-free solder against tin-lead solder, the standard solder alloy of use. This paper investigates whether the quality of ball grid array (BGA) solder joints manufactured using lead-free solder is comparable with BGA solder joints manufactured using tin-lead solder. The study compares the two solders by conducting an experimental design on a reflow soldering process. The reflow soldering process parameters were selected as the experimental factors and the experiment was conducted using a tin-lead solder and a lead-free solder. Two responses, a qualitative response and a quantitative response, were analysed. A quick method of assessing solder joint quality was developed using qualitative data. This involved collecting qualitative data through x-ray analysis and cross section analysis of certain solder joint quality characteristics. The characteristics included solder defects, solder joint formation, wetting, void frequency, void area and solder joint alignment. The data was scored and weighted using a technique developed in this study. A more detailed mechanical test was then conducted to assess the solder joint quality of the tin-lead and lead-free solders. This test used a mechanical deflection system that provided quantitative data on the quality of the solder joints. Experimental techniques were used to assess the data. This paper intends to show correlation between the two sets of results and therefore correlation between the two test techniques.

### 5B. Reliability and safety, cases

**55. Signature analysis of motor current**

*Authors:* Talia Figarella (EURANDOM) and Alessandro Di Bucchianico (EURANDOM), H.P. Wynn (EURANDOM and London School of Economics, U.K.), Wicher Bergsma (EURANDOM), Vladimir Kulikov (EURANDOM) *Keywords:* Signature analysis; motor current; condition monitoring; statistical analysis *Format:* presentation (Reliability and safety) *Contact:* t.r.f.g.figarella@tue.nl

Information on the condition and degradation of electrical appliances, like digital copiers, can be obtained from signature analysis of the motor current. The main problems, however, are extracting different characteristics from the current signal and relating them to the machine's (or component's) performance. We discuss statistical methods to identify these characteristics in the current signal, highlighting methods to distinguish several machine conditions.

**65. Evaluating the Reliability of a University Course Through Quantitative FTA**

*Author:* Laura Grassini (Dipartimento di Statistica, Università di Firenze, Italy) *Keywords:* FTA, reliability *Format:* presentation (Reliability and safety) *Contact:* grassini@ds.unifi.it

In this paper, the concepts and tools of quantitative FTA are applied for measuring the performance of the Economics and Business Administration Course (University of Florence). More specifically, we are concerned with a non-physical system which is composed of the six first year exams, which represent the basic events of our simplified system. The failure of the system is identified as the non-completion of the six exams within the established duration of the course (four years). The failure of a single component is represented by the event: “not passing the exam within four years”. In the evaluation of the components' unreliability, one must consider that the outcome of an exam depends both on its specific features and on the individual student's characteristics. And, even if we succeeded in identifying the exam's unreliability, the problem of dependent failures still remains due to similarities among exams (for example: math and statistics both deal with mathematical and numerical methods). In the paper the use of a subgroup of students who are more homogeneous in terms of skill and ability is tested on current data. Additional variables related to students' skill are used to build up homogeneous groups of units. Moreover, the problem of dependent failures is discussed. References Høyland A., M. Rausand (1994), System Reliability Theory, Wiley & Sons, New York. Kovalenko I.N., N.Y.Kuznetsov, P.A.Pegg (1997), Mathematical Theory of Reliability of Time Dependent Systems with Practical Applications, Wiley & Sons, England.
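Because the system fails as soon as any one of the six exams is not passed, the top event is an OR gate over the basic events. The sketch below computes the top-event probability under the simplifying assumption of independent basic events, together with the elementary lower bound that holds for any dependence structure. The exam failure probabilities are made up for illustration and are not taken from the paper.

```python
def top_event_independent(failure_probs):
    """P(at least one basic event occurs), assuming independent basic events."""
    p_none = 1.0
    for q in failure_probs:
        p_none *= 1.0 - q
    return 1.0 - p_none

def top_event_lower_bound(failure_probs):
    """The union of events is at least as likely as the most likely single event."""
    return max(failure_probs)

# Hypothetical probabilities of not passing each of six exams within four years
exam_failure = [0.1] * 6
p_independent = top_event_independent(exam_failure)
p_lower = top_event_lower_bound(exam_failure)
```

With positively dependent exam outcomes, the true top-event probability lies between the two values, which is why the independence assumption can overstate the system's unreliability.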

**44. A Performance Measure Independent of Adjustment (PerMIA) to evaluate the estimation methods for the quantitative FTA**

*Author:* Rossella Berni (Department of statistics- University of Florence) *Keywords:* Quantitative FTA, PerMIA, top-event *Format:* presentation (Reliability and safety) *Contact:* berni@ds.unifi.it

Quantitative FTA (Fault Tree Analysis) is a reliability technique for the analysis of an engineering system. Our aim is to use this method to analyse the Italian university system: we examine two cohorts of students recruited in 1990-91 and 1991-92 for the Economic and Business Administration Course (University of Florence). One of the main features to be dealt with is the construction of the complex university system and the computation of the top-event, which in general represents the system's failure and is defined here as the probability of not taking the six exams of the first year within four years. This paper is focused on the use of off-line quality control techniques as a possible tool to evaluate methodological aspects and problems tied to this specific situation. In particular, a main issue is the estimation method for the top-event, because in this case the assumption of independence between events, where the event is the single exam, will lead to unrealistic results. Therefore two alternative estimation methods (β-method and square-root method) are evaluated by defining a Performance Measure Independent of Adjustment (PerMIA), which is created ad hoc for establishing the best estimation method for FTA. This evaluation is performed jointly with the students' characteristics. Students are stratified according to two variables: the high school final score and the final score of the university course. This last variable takes care of the score and also of the “non-graduation” case. The variable related to the high school situation is considered as a noise factor. Notably, in this case, the main aim is the evaluation of the best estimation method for the FTA given the particular system analysed.

### 5C. Statistical modelling in pharma

**9. Setting specifications for drug products**

*Authors:* Henrik Melgaard (Novo Nordisk A/S) and Jørgen Iwersen (Novo Nordisk A/S) *Keywords:* specifications, stability, statistical modelling *Format:* presentation (Statistical modelling) *Contact:* hmel@novonordisk.com

In the pharmaceutical industry a drug must conform to certain limits through its shelf life period. The purpose is to ensure high quality, low variability and predictability of the products. To ensure compliance in practice we need manufacturing processes to be robust and in control, measurements systems to be in control and the measurements must be traceable. Storage conditions must be under control. In this paper we discuss the practical implications involved in setting and maintaining specifications for drugs in the pharmaceutical industry. These include statistical process control limits, release limits, shelf life limits and in-use limits. The challenge here is to make this chain of limits consistent and at the same time be practical for use. The scientific approach to establishing a chain of specifications involves normal linear mixed models and Arrhenius model, a kinetic model, describing e.g. the temperature dependence of drug degradation. These models are applied to data from stability studies as well as data from batch release.
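The temperature dependence mentioned above is commonly written as the Arrhenius equation k(T) = A·exp(−Ea / (R·T)). The following minimal sketch evaluates the rate constant and the acceleration factor between two storage temperatures; the pre-exponential factor and the activation energy are hypothetical values chosen for illustration and are not taken from the paper.

```python
import math

GAS_CONSTANT = 8.314  # J / (mol K)

def arrhenius_rate(temp_kelvin, pre_factor, activation_energy):
    """Degradation rate constant k(T) = A * exp(-Ea / (R * T))."""
    return pre_factor * math.exp(-activation_energy / (GAS_CONSTANT * temp_kelvin))

def acceleration_factor(temp_stress, temp_storage, activation_energy):
    """Ratio k(temp_stress) / k(temp_storage); the pre-factor cancels."""
    return math.exp(
        activation_energy / GAS_CONSTANT * (1.0 / temp_storage - 1.0 / temp_stress)
    )

# Hypothetical activation energy of 80 kJ/mol; storage at 5 C, stress test at 25 C
factor = acceleration_factor(298.15, 278.15, 80_000.0)
```

A factor of this kind is what allows stability data collected at an elevated temperature to inform limits at the intended storage temperature, under the assumption that the same degradation mechanism applies at both.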

**64. Explorative use of statistical methods. A case study on statistical evaluation of the particle size distribution**

*Author:* Heli Rita (Orion Pharma) *Keywords:* Non-pharma, statistical evaluation *Format:* presentation (Statistical modelling) *Contact:* heli.rita@orionpharma.com

In the development of tablet products in the pharmaceutical industry, it is often noticed that the particle size distribution of the raw material of the active ingredient, as well as that of the mass to be compressed, strongly affects the quality of the final product. To unveil the presence or absence of the desired distributional properties, both the response variables and the statistical methods require careful consideration. This need applies both to the results of traditional measurements using different sized screens and to modern measurements based e.g. on laser diffraction. Benefits and drawbacks of some statistical approaches which have been used in the development process are evaluated.

**89. Statistical Evaluation of a Microarray Platform**

*Author:* Edwin van den Heuvel (N.V. Organon) *Keywords:* Analysis of variance, variance components, validation, gene expression *Format:* presentation (Statistical modelling) *Contact:* edwin.vandenheuvel@organon.com

The pharmaceutical industry and medical institutes are performing microarray experiments to learn about genes that are associated with diseases and treatments. These experiments are performed in laboratories and measure the gene expression of thousands of genes simultaneously. The gene expression is positively correlated with the amount of DNA copies active in a particular biological sample. Like any measurement process, the microarray platform introduces technological variation (additional to the biological variation present in the biological material). A microarray experiment for one-channel microarray chips was set up to estimate the contribution of several technological sources of variation. A statistical analysis of this experiment will be presented. The data will be described by a complex mixed effects analysis of variance model. The goodness-of-fit of this model will be discussed too. The literature describes statistical normalization methods to remove or reduce the influence of technological variation. The influence of the scaling normalization procedure will be discussed for this particular microarray experiment. The consequences of the proposed statistical model on the selection of genes in experiments with treatments will be discussed too.

### Poster presentations

**22. Kansei: a methodology for translating emotions into design**

*Authors:* Carolyn van Lottum (Industrial Statistics Research Unit) and Shirley Coleman (ISRU), Erik Monness (Hedmark University College, Norway), Lluís Marco (UPC, Technical University of Catalunya, Spain), Joe Chan (University of Newcastle upon Tyne), Maggie Q. Ren (University of Newcastle upon Tyne) *Keywords:* Kansei Engineering, Factor Analysis *Format:* poster (Statistical consulting) *Contact:* c.e.vanlottum@btopenworld.com

The Kansei Engineering methodology is widely used in Japan but is less known in Europe. It is a technique that attempts to incorporate consumers' emotional feelings into the process of product design. Central to Kansei is the analysis of consumer opinion through the use of statistical tools. The aim of Kansei Engineering is to link consumers' emotional responses to actual design elements. For example, what makes a design appear "fresh" or "comfortable"? To achieve this, the semantic universe of descriptors relating to a product must be collected and narrowed down to a representative set. Semantic Differential Scales are created from this reduced semantic universe and are used to collect consumer opinion on existing product designs. Factor Analysis of the resulting data provides an insight into how consumers interpret descriptive words and how the existing designs are perceived. By breaking the product down into design elements it is possible to investigate the relationship between individual elements and the consumer's responses recorded on the semantic scales. Through this process, design elements are mapped to the individual words and phrases. Armed with this information, manufacturers should be able to develop prototype products that evoke specific feelings in the consumer, i.e. products with instant customer appeal. In this poster we present an overview of the progress of KENSYS, a European research project into the application of Kansei Engineering in SMEs, currently underway at ISRU at the University of Newcastle upon Tyne.

**39. Experimental Strategy for Screening Factors in LTA Zeolite Synthesis**

*Authors:* Anthony Cossari (Dept. of Economics and Statistics - University of Calabria) and Paolo Cozzucoli (Dept. of Economics and Statistics - University of Calabria) *Keywords:* active factors, Bayesian analysis, follow-up design *Format:* poster (Design of experiments) *Contact:* a.cossari@unical.it

In this paper the design and the analysis of a chemical experiment from Katovic et al. (2000) are reconsidered. The purpose of the experiment was to study the influence of four candidate two-level factors on the synthesis of LTA (Linde Type A) zeolite, a material with extensive industrial use. The design chosen was an unreplicated 2^(4-1) fractional factorial augmented with 3 center points, to be used to estimate error. A conventional formal analysis suggested, rather dubiously, that one of the factors, the crystallization time, had no influence on the content of zeolite. The 8-run design was reanalysed using the Bayesian analysis of Box and Meyer (1993). The marginal posterior probabilities that each factor is active confirm the ambiguity about the activity of the crystallization time, suggesting that a small follow-up design is needed to resolve this ambiguity. We suggest choosing a 3-run design as proposed, within the Bayesian construct, by Meyer et al. (1996), being confident that this 3-run design, suitable for factor screening, will help to clearly identify the active factors. We believe that the use of such follow-up designs, combined with the Bayesian analysis, is a better alternative to the practice of replicated center points, providing much more evidence of factor activity than a conventional formal analysis based on an estimate of error with too few degrees of freedom.

**42. CONTROL CHARTS FOR THE PARETO DISTRIBUTION**

*Authors:* Stelios Psarakis (Athens University of Economics and Business - Dept of Statistics) and K. Atsarou *Keywords:* Pareto distribution, control chart, ARL *Format:* poster (Process modelling and control) *Contact:* psarakis@aueb.gr

The Pareto distribution has a variety of applications, among them personal incomes, the occurrence of natural resources, and error clustering in communication circuits. Interest in the Pareto distribution rests not only on these applications but also on the fact that a mixture of exponential distributions with parameter θ^(-1) having a gamma distribution and with origin at zero gives rise to a Pareto distribution. Given this variety of applications, it is important to construct control charts that detect shifts in the mean and variability under the assumption that the quality characteristic under study follows the Pareto distribution. In this paper, Shewhart control charts with a fixed sampling interval for controlling the mean and variability are developed under that assumption. The case of a variable sampling interval is also examined. CUSUM procedures are developed for testing the mean and variance in the case of the Pareto distribution. In addition, the Average Run Length (ARL) is computed in order to evaluate the performance of the new charts.
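The mixture property mentioned in this abstract can be checked by simulation. The sketch below is not the authors' code. It reads the exponential parameter as a rate, which is an assumption about the abstract's notation, and the gamma shape `a` and rate `b` are illustrative values. Under that reading the mixture has survival function (1 + x/b)^(-a), a Pareto distribution with origin at zero (often called Lomax or Pareto type II).

```python
import random

def sample_gamma_exponential_mixture(a, b, n, seed=0):
    """Draw n values X with X | rate ~ Exponential(rate), rate ~ Gamma(shape=a, rate=b)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        rate = rng.gammavariate(a, 1.0 / b)  # gammavariate takes (shape, scale)
        draws.append(rng.expovariate(rate))
    return draws

def lomax_survival(x, a, b):
    """P(X > x) for the Pareto distribution with origin at zero."""
    return (1.0 + x / b) ** (-a)

a, b = 3.0, 2.0  # illustrative values, not taken from the paper
sample = sample_gamma_exponential_mixture(a, b, n=100_000)
empirical = sum(v > 1.0 for v in sample) / len(sample)
print(round(empirical, 3), round(lomax_survival(1.0, a, b), 3))
```

The two printed numbers should agree up to simulation error. Control limits for a Pareto-distributed characteristic would then be set from quantiles of this distribution rather than from normal theory.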

**46. Intercalibration study of length measurements on device components**

*Author:* Birger Madsen (Novo Nordisk) *Keywords:* Intercalibration, generalised linear models, weighted linear model *Format:* poster (Statistical modelling) *Contact:* bsm@novonordisk.com

An intercalibration study on length measurements on device components was performed. Two laboratories were involved, one in Ireland and one in Denmark. Four device components were initially selected. For each component 3-4 batches were selected. For each batch a number of items were selected, typically approximately 200. Items were taken so that all cavities of the producing machine were represented. Each item was then measured once in each laboratory. The purpose of the intercalibration study was to establish: 1. If items in each laboratory were measured with the same variance. 2. If items in each laboratory were measured with the same mean. Question 1 was investigated using a generalised linear model with gamma distribution and log link. It was found that the variance depends significantly on batch as well as cavity. In order to investigate question 2, pair-wise differences between the two laboratories were taken. It was then investigated whether the variances of these pair-wise differences were constant. Since this was not the case, a weighted linear model was used to investigate which factors influence the mean difference of measurements. In the resulting linear model, least squares means were calculated expressing the mean differences between measurements from each laboratory. Calculations were done using SAS procedures GENMOD and GLM.

**57. Statistical Thinking, Statistical Education, Statistical Consulting: Invitation to Broad Discussion**

*Authors:* Vladimir Shper (REI) and Adler Yu. *Keywords:* Statistical thinking, future, statisticians and non-statisticians *Format:* (Statistical consulting) *Contact:* shper@vei.ru

*Note: the poster will be introduced on Monday, at 10:30-10:40*

On the one hand we consider this work the natural continuation of our presentations in Rimini and Barcelona. On the other hand we see it as a first step in a large international project. The goal of the project is to discuss the future of the notions named in the title of this paper from different points of view. To be more exact, we would like to initiate at this Conference a broad discussion of a very simple question: how can the statistical community hasten the day when the prophetic words of H. G. Wells, "Statistical thinking will one day be as necessary for effective citizenship as ability to read and write", become a reality?

To this end we have prepared a version of an Ishikawa diagram presenting our understanding of the problem. If the organizing committee supports the idea, we plan to hang this diagram (only as a starting point) in the hall of the Conference and to ask all participants to add any arrows they consider necessary.

Then, after returning from the Conference, we will process this picture and send it to all participants in order to collect everybody's ranks for each arrow. This will give us expert estimates of what we should do from the viewpoint of the statistical society. Simultaneously we plan to present this diagram at some Russian plants and companies in order to get analogous estimates from non-statisticians. We hope that some of our colleagues will be able to do the same at companies in other European (and maybe not only European) countries. At the next Conference we will then be able to present the results of this analysis and discuss them with all who are interested. Our hypothesis for this work is as follows: in a new and quickly changing world we need new approaches to make statistical thinking part of common sense, and to this end we must find out the opinions of both professionals and non-professionals as completely as possible.

**58. Modelling of non-linear reference standard curves**

*Author:* Kim Westi Jørgensen (Novo Nordisk A/S) *Keywords:* Non-linear regression, 4 Parameter Logistic, Weibull, New models, ELISA assays *Format:* poster (Statistical modelling) *Contact:* kwj@novonordisk.com

Data from three analytical methods were evaluated. The methods were all ELISA assays, i.e. methods based on binding of antigens to antibodies. The extent of the binding is detected via a spectroscopic analysis. To obtain quantitative data, signal-response calibration curves are applied to translate detector output into concentration of the analysed agent. These curves are non-linear, and the reliability of the results generated depends on the quality of the model used for the non-linear regression. Five mathematical models were evaluated with respect to their ability to fit the ELISA calibration curves. For the goodness-of-fit evaluation, residual analyses were performed on the fit (observed data, y=response, x=concentration) and the back fit (x=response and result=corresponding concentration). Furthermore, a "design of experiment approach" was introduced for testing model robustness. Results: The model evaluation revealed that the commonly used model, 4PL (a four-parameter logistic model), was superior for only one of the three assays, whereas the Weibull distribution and a self-constructed model gave the best fit for the two other assays. Robustness did not differentiate the models essentially, but the disposition of the points on the calibration curve had a major impact on the robustness.
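To make the fit and back-fit idea concrete, the sketch below evaluates a 4PL calibration curve and inverts it to back-calculate concentrations. It uses one common parameterisation of the 4PL model; the parameter values and standard concentrations are hypothetical and do not come from the three assays in the abstract.

```python
def four_pl(x, a, b, c, d):
    """Four-parameter logistic response at concentration x.

    a: response at zero concentration, d: response at infinite concentration,
    c: concentration at the inflection point, b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)

def four_pl_inverse(y, a, b, c, d):
    """Back-calculated concentration for response y (y strictly between a and d)."""
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)

params = dict(a=0.05, b=1.2, c=40.0, d=2.5)  # hypothetical curve
standards = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
for conc in standards:
    response = four_pl(conc, **params)
    back = four_pl_inverse(response, **params)
    print(f"{conc:7.1f}  {response:.4f}  {back:7.1f}")
```

With real data the responses carry noise, so the back-calculated concentrations differ from the nominal ones. Those differences are the back-fit residuals referred to in the abstract.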

**59. Training statistical super users at a pharmaceutical production site**

*Authors:* Antje Christensen (Novo Nordisk) and Charlotte Verdier, Christian Fangel, Uffe Clausen, Ellinor Marina Damgaard (Novo Nordisk) *Keywords:* teaching, statistical skills in production, pharmaceutical industry *Format:* poster (Statistical consulting) *Contact:* antc@novonordisk.com

Novo Nordisk's "Statistical super user course" is a training concept for spreading the use of statistics among chemists working in the production environment of a pharmaceutical company. The concept has been described earlier at an ENBIS annual meeting. In this presentation, a recent implementation at a specific production site is reviewed. The concept and training syllabus are described, and examples from the teaching materials are given. Part of the training consists of practical projects conducted by the participants. Several of the projects are presented briefly.

**61. Optimal 2^2 Factorial Designs for Binary Response Data with and without the Presence of Interactions**

*Authors:* Enrique González-Dávila (Universidad de La Laguna) and Dorta-Guerra, R. (Universidad de La Laguna); Ginebra, J. (Universitat Politècnica de Catalunya) *Keywords:* D-optimal designs, Logistic model, Probit model, Factorial designs. *Format:* poster (Design of experiments) *Contact:* egonzale@ull.es

Two-level factorial experiments are very useful in the early screening stages of an investigation and as building blocks for response surface exploration. Under normal linear models, the amount of information gathered through these experiments, as measured by the determinant of their information matrix, depends neither on where the experiment is centered nor on how it is oriented relative to the contour lines of the surface. Balanced allocations are always more informative than unbalanced ones with the same number of runs, and including interaction terms in the model does not lead to an alternative choice of factorial experiment. As a consequence, when planning two-level factorial experiments for continuous responses and any number of factors, the only thing that matters is the range of variation of the factors involved. For binary responses, by contrast, none of these properties holds, and planning two-level factorial experiments is therefore not so easy. In particular, this paper searches for the designs that maximize the determinant of the information matrix, within the class of 2^2 factorial experiments centered at a given point, for binomial models with main effects and second order interaction. That allows one to explore how the performance of these experiments on binary responses depends on whether one is interested in estimating that interaction term or not.
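The contrast between continuous and binary responses can be illustrated numerically. The sketch below is a simplified illustration, not the authors' search procedure. It computes the determinant of the information matrix for a balanced 2^2 design with main effects only, once under a normal linear model with unit variance and once under a logistic model. The coefficient vector `beta` and the design centers are hypothetical.

```python
import math
from itertools import product

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def information_det(center, half_range, beta=None):
    """det of the information matrix, balanced 2^2 design, main effects only.

    beta=None gives the normal linear model (unit variance); otherwise a
    logistic model with coefficients beta = (b0, b1, b2).
    """
    info = [[0.0] * 3 for _ in range(3)]
    for s1, s2 in product((-1, 1), repeat=2):
        x = (1.0, center[0] + s1 * half_range, center[1] + s2 * half_range)
        if beta is None:
            w = 1.0
        else:
            p = 1.0 / (1.0 + math.exp(-sum(bk * xk for bk, xk in zip(beta, x))))
            w = p * (1.0 - p)  # binomial variance weight
        for row in range(3):
            for col in range(3):
                info[row][col] += w * x[row] * x[col]
    return det3(info)

beta = (0.0, 1.0, 1.0)  # hypothetical logistic coefficients
for center in [(0.0, 0.0), (2.0, 2.0)]:
    print(center, information_det(center, 1.0), information_det(center, 1.0, beta))
```

The linear-model determinant is the same at both centers, while the logistic determinant changes with the center because the weights p(1-p) depend on where the design points fall on the response curve.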

**66. Justification of dissolution specification limits for solid dosage forms**

*Author:* Ken Sejling (Novo Nordisk A/S) *Keywords:* Simulation, specification limits, two-stage acceptance *Format:* poster (Statistical modelling) *Contact:* KeSe@NovoNordisk.com

The acceptance criterion for dissolution consists of a scheme having two stages. In the first stage six tablets are analyzed and evaluated individually. If either of them is below an acceptance value, another six tablets are analyzed. Acceptance on the second stage is obtained if none of the 12 dissolution results is below a lower acceptance value, and the average of the 12 results is not below another acceptance value. Evaluation of the probabilities of acceptance on either of the two levels is done by simulation. The simulation is based on a model describing the variation among batches, analysis runs in the laboratory and among the individual dissolution results. These variance components have been estimated on data from 21 batches.
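A minimal Monte Carlo sketch of the two-stage scheme described above follows. All numbers (mean, standard deviations, acceptance values) are hypothetical and are not the estimates from the 21 batches. The assumption that the second six tablets are analysed in a separate laboratory run is also ours.

```python
import random

def acceptance_probabilities(mean, sd_batch, sd_run, sd_tablet,
                             q1, q_low, q_mean, n_sim=20_000, seed=1):
    """Estimate (P(accept at stage 1), P(accept at stage 2)) by simulation.

    q1: stage-1 limit for each of the first six tablets,
    q_low: stage-2 limit for each of the 12 tablets,
    q_mean: stage-2 limit for the average of the 12 results.
    """
    rng = random.Random(seed)
    stage1 = stage2 = 0
    for _ in range(n_sim):
        batch_level = rng.gauss(mean, sd_batch)
        run1 = batch_level + rng.gauss(0.0, sd_run)
        first = [rng.gauss(run1, sd_tablet) for _ in range(6)]
        if min(first) >= q1:
            stage1 += 1
            continue
        run2 = batch_level + rng.gauss(0.0, sd_run)  # assumed: new analysis run
        both = first + [rng.gauss(run2, sd_tablet) for _ in range(6)]
        if min(both) >= q_low and sum(both) / 12 >= q_mean:
            stage2 += 1
    return stage1 / n_sim, stage2 / n_sim

p1, p2 = acceptance_probabilities(mean=85.0, sd_batch=2.0, sd_run=1.5,
                                  sd_tablet=3.0, q1=80.0, q_low=65.0, q_mean=75.0)
print(f"stage 1: {p1:.3f}, stage 2: {p2:.3f}, total: {p1 + p2:.3f}")
```

Repeating the calculation over a grid of candidate acceptance values shows how the specification limits trade off against the probability of rejecting a batch from the modelled process.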

**70. Short term load forecasting in distribution electric systems**

*Author:* Antonio Pievatolo (CNR-IMATI) *Keywords:* Statistical forecasting; power demand management; power distribution *Format:* poster (Process modelling and control) *Contact:* marco@mi.imati.cnr.it

The demand for electricity is continuously increasing, introducing instability in the electric distribution and transmission systems. Apart from building new power plants and transmission lines, instability can be prevented by managing the demand. For example, new electronic power meters could make it possible to remotely limit the energy consumption of each individual customer. For an efficient management action, it is useful to have methods that estimate and predict the load absorbed by groups of customers. We review recent literature on demand management at the distribution level.

**78. Using estimated values of capability indices for batch acceptance**

*Authors:* Poul Thyregod (Technical Univ of Denmark) and Camilla Madsen, Ditlef Bucher (Novo Nordisk) *Keywords:* SPC, acceptance sampling *Format:* poster (Process modelling and control) *Contact:* pt@imm.dtu.dk

With the increased focus on SPC in production, it has become a widespread practice among suppliers to use SPC-data like Cp and Cpk for batch acceptance - without taking the sampling uncertainty into account. In the paper we relate this practice to theories for acceptance sampling by variables, and discuss determination of batch acceptance rules that are based upon Cp and Cpk and with a specified risk.
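The sampling uncertainty mentioned above is easy to see in a small simulation. The sketch below is only an illustration, not the acceptance rules discussed in the paper. It draws repeated samples from a normal process with known capability and records the spread of the estimated Cpk. The process parameters, specification limits and sample size are hypothetical.

```python
import random
import statistics

def cpk(values, lsl, usl):
    """Estimated Cpk from a sample, using the sample mean and standard deviation."""
    m = statistics.fmean(values)
    s = statistics.stdev(values)
    return min(usl - m, m - lsl) / (3.0 * s)

def simulate_cpk(true_mean, true_sd, lsl, usl, n, n_sim=5000, seed=2):
    """Sorted Cpk estimates from n_sim samples of size n from a normal process."""
    rng = random.Random(seed)
    return sorted(
        cpk([rng.gauss(true_mean, true_sd) for _ in range(n)], lsl, usl)
        for _ in range(n_sim)
    )

# Hypothetical process with true Cpk = 4/3 and samples of 30 items per batch
estimates = simulate_cpk(true_mean=0.0, true_sd=1.0, lsl=-4.0, usl=4.0, n=30)
low = estimates[int(0.05 * len(estimates))]
high = estimates[int(0.95 * len(estimates))]
print(f"true Cpk = {4 / 3:.2f}, 5%-95% range of estimates: {low:.2f} to {high:.2f}")
```

A rule that accepts a batch whenever the estimated Cpk exceeds a fixed threshold therefore carries a risk that depends strongly on the sample size, which is the point of linking the practice to acceptance sampling by variables.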

**79. On the sensitivity of questionnaires for measuring customer satisfaction**

*Author:* Poul Thyregod (Technical Univ of Denmark) *Keywords:* Customer satisfaction, Likert Scale, Logistic regression *Format:* poster (Business and economics) *Contact:* pt@imm.dtu.dk

Customer satisfaction is often assessed using questionnaires with a five-point qualitative scale for the response. Because respondents differ in their attitudes towards using extreme statements, one will not expect perfect agreement among all respondents. The paper reports a study of customer responses to a question on their satisfaction with the punctuality of the bus. In the study some of the buses were delayed. We show that a logistic model for an ordered categorical response gives an adequate description of the variation in responses as a function of the delay, and we illustrate how this model may be used to characterize the change in customer satisfaction with increasing delay.
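One common logistic model for an ordered categorical response is the proportional-odds (cumulative logit) model. The sketch below assumes that form, which the abstract does not state explicitly. The cutpoints and slope are hypothetical values, not estimates from the bus study. Category 1 is the least satisfied and category 5 the most satisfied.

```python
import math

def category_probabilities(delay_min, cutpoints, slope):
    """Proportional-odds model: P(Y <= k) = logistic(cutpoints[k] + slope * delay)."""
    cumulative = [1.0 / (1.0 + math.exp(-(a + slope * delay_min))) for a in cutpoints]
    cumulative.append(1.0)  # P(Y <= 5) = 1
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

cutpoints = (-3.0, -1.5, 0.0, 1.5)  # hypothetical, increasing
slope = 0.3                         # hypothetical effect per minute of delay
for delay in (0, 5, 10):
    print(delay, [round(p, 3) for p in category_probabilities(delay, cutpoints, slope)])
```

A positive slope moves probability mass towards the lower categories as the delay grows, so the same five-point scale yields a full response distribution for each delay rather than a single score.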

**84. Computer Aided Modelling and Pollution Control in Cement Plants**

*Authors:* Florin Popentiu (University of Oradea) and Poul Thyregod (Technical University of Denmark), Grigore Albeanu (University of Oradea) *Keywords:* measurement management, statistical modeling, process modelling and optimization, pollution control *Format:* presentation (Process modelling and control) *Contact:* Florin.Popentiu@lis.jussieu.fr

Cement production requires intensive use of natural raw materials and energy, which results in emissions to both the atmosphere and the soil. Controlling both the cement production process and the pollution level is difficult. This work presents a computer aided modelling and pollution control tool with the following functionalities: (a) obtaining representative mathematical models for both the environmental pollution process and the environment regeneration process in the absence of the pollution factors; (b) monitoring the production process with pollution influence; (c) searching for optimal production planning solutions that minimize pollution effects. Three principal modules (statistical modelling module, measurement management module and optimization module) are integrated in a reliable software architecture. The modelling module assists the end user in experimentally determining the parameters of the environmental pollution. In addition, a logistic module provides reports for monitoring the pollution level and the contribution of each specific item of equipment to the pollution level. Based on the management principles of large systems, the optimization module implements algorithms to minimize the production costs while fulfilling the production plan under process restrictions. A large collection of nonlinear models and constrained optimization techniques is investigated in order to select the most suitable procedures for the cement industry case. Practical experience in designing and implementing such a system for a cement plant will be reported.

**88. Statistical process control in clinical practice**

*Authors:* Henrik Spliid (IMM, DTU) and Søren Lophaven, Søren Merser, Arne Borgwardt *Keywords:* quality control, exponential smoothing, hospital procedures *Format:* poster (Process modelling and control) *Contact:* hs@imm.dtu.dk

A control system for monitoring the quality of hospital procedures, based on methods from statistical process control, is presented. The system was developed at Informatics and Mathematical Modelling, Technical University of Denmark, and implemented at Frederiksberg Hospital, where it is used for monitoring the frequency of complications, e.g. infections, bleeding and a number of other deviations from normal conditions, occurring in connection with knee and hip operations. All complications are registered in a database, and for each type of registration a control chart, based on an exponential smoothing adaptation, is computed. Furthermore, a scoring system that identifies the most likely cause of a complication is implemented. This scoring system should be used to support the corrective actions taken. The advantage of this quality system compared to many other systems is the simultaneous registration of complications, computation of frequency and identification of most likely causes. This makes fast implementation of corrective activities possible. The first results are promising and suggest that the system can be utilized within a wider range of clinical practices.
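The abstract does not give the details of the exponential smoothing adaptation, so the sketch below shows a standard one-sided EWMA chart for a complication proportion as a stand-in. The baseline rate `p0`, the number of operations per period, the smoothing constant and the monthly proportions are all hypothetical.

```python
import math

def ewma_chart(proportions, n_per_period, p0, lam=0.2, width=3.0):
    """Return (smoothed values, upper limit, indices of periods that signal)."""
    sigma = math.sqrt(p0 * (1.0 - p0) / n_per_period)
    upper = p0 + width * sigma * math.sqrt(lam / (2.0 - lam))  # asymptotic EWMA limit
    z, smoothed, signals = p0, [], []
    for t, x in enumerate(proportions):
        z = lam * x + (1.0 - lam) * z  # exponential smoothing step
        smoothed.append(z)
        if z > upper:
            signals.append(t)
    return smoothed, upper, signals

# Hypothetical monthly complication proportions, 100 operations per month
monthly = [0.04, 0.05, 0.03, 0.06, 0.05, 0.09, 0.10, 0.11, 0.12]
smoothed, upper, signals = ewma_chart(monthly, n_per_period=100, p0=0.05)
print(f"upper limit {upper:.4f}, signalling periods {signals}")
```

Because each smoothed value pools recent periods, a sustained rise in the complication rate is flagged after a few periods even when no single month looks extreme.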

**94. Estimating 4 different prevalence values of pig carcasses from pooled samples**

*Authors:* Helle M. Sommer (Danish Institute for Food and Veterinary Research, DFVF) and Jens Strodl Andersen *Keywords:* Binary data, pooled samples, maximum likelihood function *Format:* poster (Statistical modelling) *Contact:* hms@dfvf.dk

A risk assessment on Salmonella DT104 in slaughter pigs was conducted in 2003. For this risk assessment, prevalence values of carcasses originating from different herd groups (level I, II, III and IV) had to be obtained. Due to the sampling procedure (surveillance data), the samples were pooled independently of original herd levels. The challenge was to obtain estimates of level-specific prevalences (one for each level) from the pooled samples. Each pooled sample consisted of 5 swab samples taken from 5 different pig carcasses at the end of the slaughter line. The pooled samples were analyzed for the presence of Salmonella. For each pooled sample the levels of the 5 individual swab samples were known. In a pooled sample the 5 swab samples come from one or several different herd levels. For example, a pool-combination could be samples from level I, I, I, II, IV. In total, 5,130 pooled samples with different combinations of levels were available. A maximum likelihood function consisting of binomial frequency functions was set up and used to estimate the 4 prevalence values simultaneously. The covariance values of the parameter estimates were found from the inverse Hessian matrix.
For each pool-combination a frequency function was given by

$$f_j(y_j, p_i) = \left(1 - (1-p_{\mathrm{I}})^{n_{\mathrm{I},j}} \cdot (1-p_{\mathrm{II}})^{n_{\mathrm{II},j}} \cdot (1-p_{\mathrm{III}})^{n_{\mathrm{III},j}} \cdot (1-p_{\mathrm{IV}})^{n_{\mathrm{IV},j}}\right)^{y_j} \cdot \left((1-p_{\mathrm{I}})^{n_{\mathrm{I},j}} \cdot (1-p_{\mathrm{II}})^{n_{\mathrm{II},j}} \cdot (1-p_{\mathrm{III}})^{n_{\mathrm{III},j}} \cdot (1-p_{\mathrm{IV}})^{n_{\mathrm{IV},j}}\right)^{N_j - y_j}$$

and the likelihood function was given by

$$L(\theta) = \prod_{j=1}^{m} f_j(y_j, \theta)$$

where $\theta$ is a parameter vector ($\theta = p_i$); $i$ is an index for the levels ($i$ = I, II, III, IV); $j$ is an index for the different pool-combinations ($j = 1, \ldots, m$); $m$ is the number of pool-combinations ($m = 26$); $y_j$ is the number of positive pooled samples with the pool-combination $j$; $N_j$ is the number of pooled samples with the pool-combination $j$; and $n_{i,j}$ is the number of individual samples in a pool-combination $j$ that originate from level $i$.
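The likelihood above translates directly into a log-likelihood function that a numerical optimiser could maximise. The sketch below is not the authors' implementation. The pool-combinations, counts and prevalence values in the example are hypothetical.

```python
import math

LEVELS = ("I", "II", "III", "IV")

def log_likelihood(p, pools):
    """Log of the product of the frequency functions over pool-combinations.

    p: dict mapping level -> prevalence (strictly between 0 and 1).
    pools: list of (counts, N_j, y_j), where counts maps level -> number of
    swabs from that level in pool-combination j, N_j is the number of pooled
    samples with that combination and y_j the number of positive ones.
    """
    total = 0.0
    for counts, n_pools, n_positive in pools:
        # probability that all swabs in the pool are negative
        all_negative = math.prod((1.0 - p[lv]) ** counts.get(lv, 0) for lv in LEVELS)
        total += n_positive * math.log(1.0 - all_negative)
        total += (n_pools - n_positive) * math.log(all_negative)
    return total

# Hypothetical data: three pool-combinations, each pool holding 5 swabs
pools = [
    ({"I": 5}, 40, 3),
    ({"I": 3, "II": 2}, 25, 4),
    ({"I": 3, "II": 1, "IV": 1}, 10, 3),
]
p = {"I": 0.01, "II": 0.03, "III": 0.05, "IV": 0.10}
print(round(log_likelihood(p, pools), 3))
```

Maximising this function over the four prevalences, and inverting the Hessian of its negative at the maximum, gives the estimates and their covariance matrix in the way the abstract describes.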

**97. Linkage Pattern Differences between European Countries and the Relationship with Innovation**

*Authors:* Shirley Coleman (ISRU) and Kim Pearce; Dave Stewardson *Keywords:* Innovation, co-operation *Format:* poster (Data mining) *Contact:* shirley.coleman@ncl.ac.uk

Analysis of a major database arising from Eurostat's second European Community Innovation Survey (CIS II) for the period 1994-1996 has uncovered many interesting findings as regards innovation characteristics of enterprises throughout Europe. This poster illustrates the apparent differences between certain countries as regards innovation co-operation in the manufacturing sector and demonstrates the link between co-operation and an enterprise's innovative strength. Enterprise size differences are also taken into consideration. It appears that innovation co-operation is indeed beneficial to the enterprise. The extent of co-operation differs between countries. Mediterranean countries (Italy, Spain and Portugal) and Germany co-operate the least for innovation and yet they are generally still as successful in innovation as countries which co-operate to a greater extent (i.e. Sweden, Ireland, France and Norway). It is found that Germany is very 'internalised' and is quite different from Sweden, which co-operates for innovation much more than other countries. This is also observed when small, medium and large enterprises are analysed separately. It is found that Germany has very weak links with the USA as regards innovation partners; Sweden has, by far, the most innovative enterprises with specific partners located in the USA; Ireland also has strong links with the USA. When size of enterprise is taken into consideration, it is demonstrated that larger enterprises are generally more 'successful' as regards innovation. However, there are notable exceptions, which are detailed for Germany and Ireland.

### 6A. DOE methodology

**43. Sixteen run designs of high projectivity for factor screening.**

*Authors:* John Tyssedal (The Norwegian University of Science and Technology) and George E. P. Box *Keywords:* Orthogonal array, factorial design, projectivity, screening design *Format:* presentation (Design of experiments) *Contact:* tyssedal@stat.ntnu.no

There are five sixteen-run orthogonal arrays, as discovered by Hall (1961). One of them is the commonly used Plackett-Burman design with orthogonal main effect and interaction columns. The four others are in the class of non-geometric designs. In this work we investigate and discuss the projective properties of these designs as screening designs. In particular we look into how they project into three and four dimensions. For the 16 run Plackett-Burman design it is well known that it is possible to obtain a resolution four design, and hence a projectivity P=3 design, for eight factors. For the four others we show that designs of projectivity P=3 for 12 factors can be obtained from three of them, and that the fourth even allows fourteen of its columns to be used as a projectivity P=3 design.
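Projectivity can be checked by brute force: a two-level design has projectivity P if every choice of P columns contains the full 2^P factorial. The sketch below demonstrates the check on two small regular fractions rather than on the Hall arrays, whose run tables are not given in the abstract.

```python
from itertools import combinations, product

def has_projectivity(design, p):
    """True if every choice of p columns contains all 2^p sign combinations."""
    n_cols = len(design[0])
    full = set(product((-1, 1), repeat=p))
    return all(
        {tuple(row[c] for c in cols) for row in design} == full
        for cols in combinations(range(n_cols), p)
    )

# 2^(4-1) fraction with D = ABC (resolution IV, 8 runs)
res4 = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]
# 2^(3-1) fraction with C = AB (resolution III, 4 runs)
res3 = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

print(has_projectivity(res4, 3), has_projectivity(res4, 4))
print(has_projectivity(res3, 2), has_projectivity(res3, 3))
```

The same function applies unchanged to any sixteen-run array supplied as a list of rows with entries -1 and +1, restricted to the columns chosen for the factors.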

**47. Balanced Asymmetrical Nearly Orthogonal Designs for First and Second Order Effects Estimation**

*Authors:* Alberto Lombardo (University of Palermo) and Stefano Barone (University of Palermo) *Keywords:* Balancing, Interaction estimability, Asymmetrical (Mixed-Level) Designs, Nearly Orthogonal Arrays, Optimality, Two and Three-Level Designs *Format:* presentation (Design of experiments) *Contact:* lombardo@dtpm.unipa.it

Experimental practice sometimes suggests the use of balanced designs in order to optimise experimental resources, especially when factors have different numbers of levels. Furthermore, researchers are often interested in a deeper knowledge of the phenomenon under study than that obtainable by main effects estimation only. In this paper the authors propose a method for constructing asymmetrical (mixed-level) designs satisfying the above requirements in as small a number of runs as possible, by means of a heuristic procedure based on a new optimality criterion. A complete collection of such designs with two- and three-level factors is provided. Finally, a technological application is presented.

**62. Using Taguchi's orthogonal arrays as a 3 level full factorial design**

*Authors:* Zorica Veljkovic (Faculty of Mechanical Engineering) and Slobodan Radojevic, Faculty of Mechanical Engineering, Belgrade *Keywords:* orthogonal array, latin square, full factorial design *Format:* presentation (Design of experiments) *Contact:* zoricave@ptt.yu

The paper discusses the use of Taguchi's orthogonal arrays L(3^m) as full factorial designs for three-level factors, i.e. 3^n. It is shown that the orthogonal arrays L(3^4) and L(3^13) correspond to 3^2 and 3^3 full factorial designs, respectively. For that purpose a comparison is made with full factorial designs with factor level combinations as given by Yates, as well as with a method using orthogonal Latin squares. In this way the columns for main effects and for two- and three-factor interactions are identified for the orthogonal arrays shown. Furthermore, the columns of the L(3^13) orthogonal array are reorganized to follow the conventional effect allocation of 2^n designs.
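The link between a nine-run, four-column, three-level orthogonal array and the 3^2 full factorial can be sketched with the orthogonal Latin squares construction. The code below builds such an array from two factors A and B with levels coded 0, 1, 2. The column order and coding are our own choices and may differ from Taguchi's published table.

```python
from itertools import combinations, product

def l9_array():
    """Nine runs, four three-level columns: A, B, (A+B) mod 3, (A+2B) mod 3.

    The last two columns are two orthogonal Latin squares laid over the
    3^2 full factorial in A and B; they carry the A x B interaction.
    """
    return [(a, b, (a + b) % 3, (a + 2 * b) % 3) for a, b in product(range(3), repeat=2)]

def is_orthogonal(array):
    """True if every pair of columns shows each of the 9 level pairs equally often."""
    n_cols = len(array[0])
    expected = len(array) // 9
    for i, j in combinations(range(n_cols), 2):
        pairs = [(row[i], row[j]) for row in array]
        if any(pairs.count(pair) != expected for pair in product(range(3), repeat=2)):
            return False
    return True

array = l9_array()
for run in array:
    print(run)
print(is_orthogonal(array))
```

Reading only the first two columns gives the 3^2 full factorial, while the remaining two columns together hold the four degrees of freedom of the two-factor interaction.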

**95. Analysis of Split Plot Designs with Mirror Image Pairs as Subplot**

*Authors:* Murat Kulahci (Arizona State University) and John Tyssedal *Keywords:* DOE, Split plot experiments *Format:* poster (Design of experiments) *Contact:* kulahci@asu.edu

Split-plot experiments are common in industry when experiments are used for process and product improvement. Split-plot designs are divided into whole-plots and sub-plots. For a randomly chosen level combination of the whole-plot factors, typically a design in the sub-plot factors is run in random order. Therefore the total number of runs needed is the number of whole-plot level combinations times the number of runs in the design for the sub-plot factors. When there are many factors, a two-level experimental plan will often help to keep the number of runs to an acceptable level. Tyssedal, Kulahci and Bisgaard (2004) investigated two-level split-plot designs where, for each level combination of the whole-plot factors, two runs forming a mirror image pair are used at the sub-plot level. These designs can be constructed both from usual fractional factorials and from non-geometric designs. These designs have very economical run sizes and, although highly fractionated in the sub-plot factors, they exhibit considerable structure. In addition, their projective properties are quite attractive relative to their run sizes, making them very appropriate for screening purposes. In this paper we suggest a way of analyzing these designs in situations where a small subset of the factors investigated is of importance, i.e. a typical screening situation. The proposed method is a two-step procedure where active effects are identified separately in both steps and estimates of the whole-plot and sub-plot variances are subsequently obtained.

### 6B. Statistical modelling

**26. Cluster analysis of random point processes**

*Author:* Petr Volf (Technical University in Liberec) *Keywords:* Random point process, intensity, regression, cluster analysis *Format:* presentation (Statistical modelling) *Contact:* petr.volf@vslib.cz

Statistical reliability analysis often uses random point process models to describe sequences of failures and other events characterizing the state of a device. Besides proper modelling and subsequent prediction of the process development, the problem may consist in finding similarities in the observed data and, consequently, in grouping the examined objects. Essentially, two approaches to searching for homogeneous groups in data or models can be considered. The first searches for similarities between models by direct analysis of their parameters and components, e.g. by statistical testing. A similar possibility is to consider a model with latent factors (heterogeneities) and to analyse these factors. The second approach is based on the direct use of cluster analysis methods. In the present contribution we use model-based clustering, which simultaneously yields a set of models for the distinct clusters and detects outlying data. This methodology, though well developed, still suffers from one problem, namely assessing the number of clusters, although a set of more or less ad hoc criteria has been proposed. An alternative is based on a consistent Bayes approach and treats the number of clusters as a random variable, too. With the aid of intensive MCMC computations, the number of clusters with maximum a posteriori probability is obtained. We compare both approaches and apply them to a real data case, in the framework of a regression model for the intensity of a random point process.

**36. Semi-parametric estimation of shifts**

*Authors:* Elie Maza (LSP & IRINCOM) and F. Gamboa (LSP), J-M. Loubes (CNRS) *Keywords:* semi-parametric estimation, shift estimation, Fourier transform *Format:* presentation (Statistical modelling) *Contact:* Elie.Maza@math.ups-tlse.fr

The purpose of the general study is road traffic forecasting on highway networks. This work is supported by the TrafficFirst company (http://www.traffic-first.com) and its manager Christophe Communay. More precisely, the purpose is short-term forecasting of travel time on the Parisian highway network. Here, road traffic is described by the velocities of the vehicles, so we aim to estimate road traffic at all points of the observation grid. This grid is composed of the measurement stations, located approximately every 500 meters on the road network. These stations measure the evolution of road traffic by calculating, for each fixed period throughout the day, the mean velocity of the vehicles crossing them. The resulting speed curves can be modelled as continuous functions. The methodology is based on a classification method: after classification, the speed curves are gathered into a reduced number of clusters. The aim of this study is then to calculate, for each cluster, the best representative profile. Because of shifts in user behavior, the average curve is not representative enough. We therefore model the speed functions of each cluster by a Functional Shift Model and use the Fourier transform to estimate the shift parameter of each curve. After that, a structural average can be calculated (see for example Functional Data Analysis, Ramsay J.O. and Silverman B.W., Springer Series in Statistics, 1997). This structural average is more representative than the average curve.
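
To make the role of the Fourier transform concrete, here is a minimal sketch under strong simplifying assumptions: the curves are periodic, noise-free and sampled on a regular grid, and only the first harmonic is used. Under the shift model, a curve f(t - θ) has Fourier coefficients equal to those of f multiplied by exp(-2πikθ/T), so the shift can be read from a phase difference. The function names and the synthetic speed profile are hypothetical; this is not the semi-parametric estimator of the abstract.

```python
import cmath
import math


def fourier_coefficient(samples, k):
    """k-th discrete Fourier coefficient of one period sampled on a regular grid."""
    n = len(samples)
    return sum(y * cmath.exp(-2j * math.pi * k * i / n)
               for i, y in enumerate(samples)) / n


def estimate_shift(reference, curve, period=1.0):
    """Estimate theta such that curve(t) = reference(t - theta).

    Uses only the first harmonic, so the estimate is unambiguous
    for |theta| < period / 2 and requires a non-zero first coefficient.
    """
    c_ref = fourier_coefficient(reference, 1)
    c_cur = fourier_coefficient(curve, 1)
    phase = cmath.phase(c_cur / c_ref)  # equals -2*pi*theta/period
    return -phase * period / (2 * math.pi)


def speed_profile(t):
    """Synthetic periodic speed curve in km/h (illustrative only)."""
    return 80 + 20 * math.sin(2 * math.pi * t) + 5 * math.cos(4 * math.pi * t)


n = 64
true_shift = 0.1
reference = [speed_profile(i / n) for i in range(n)]
shifted = [speed_profile(i / n - true_shift) for i in range(n)]

estimated = estimate_shift(reference, shifted)
print(round(estimated, 6))
```

Once every curve in a cluster has an estimated shift, the curves can be aligned by undoing the shifts before averaging, which is the idea behind a structural average.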

**71. Ensemble Methods and Partial Least Squares Regression**

*Authors:* Bjørn-Helge Mevik (Matforsk) and Vegard H. Segtnan (Matforsk), Tormod Næs (Matforsk) *Keywords:* ensemble methods, bootstrap aggregating (bagging), data augmentation, noise addition, partial least squares regression (PLSR) *Format:* presentation (Statistical modelling) *Contact:* bhx2@mevik.net

Recently, the literature has paid increased attention to the use of ensemble methods in multivariate regression and classification. These methods have been shown to have interesting properties for both tasks. In particular, they can improve the accuracy of unstable predictors. So far, ensemble methods have been little studied in situations that are common for calibration and prediction in chemistry, i.e., situations with a large number of collinear $x$-variables and few samples. These situations are often approached by data compression methods such as principal components regression (PCR) or partial least squares regression (PLSR). The presentation gives results from an investigation of the properties of different types of ensemble methods used with PLSR in situations with highly collinear $x$-data. Bagging and data augmentation by simulated noise are studied. The focus is on the robustness of the calibrations. Real and simulated data are used. The results show that ensembles trained on data with added noise can make the PLSR robust against the type of noise added. In particular, the effects of sample temperature variations can be eliminated. Bagging does not seem to give any improvement over PLSR for small and intermediate numbers of components. It is, however, less sensitive to over-fitting.

### 6C. Measurement processes

**73. Adapting Statistical Process Control to Monitor Irradiation Facilities**

*Authors:* John Donovan (School of Engineering, Institute of Technology Sligo, Ireland) and Eamonn Murphy (University of Limerick, Ireland), Peter Sharpe (National Physical Laboratory, UK) *Keywords:* Statistical Process Control, Irradiation, Standardised control charts *Format:* presentation (Six Sigma and quality improvement) *Contact:* jjdonovn@iol.ie

Irradiation is used as a sterilisation method for medical devices, medical disposables and foodstuffs. Introducing Statistical Process Control (SPC) to monitor irradiation facilities raises a number of unique issues that do not exist in more traditional manufacturing processes. Uncertainties inherent in the irradiation process, uncertainties associated with measuring the irradiation dose, and the way in which the product is packaged all add to the difficulty of precisely identifying the irradiation dosage that an individual product receives. Formerly, these uncertainties could be accounted for by irradiating at a sufficiently high level to ensure that each product received the minimum required sterilisation dose. More recently, however, upper limits are being placed on the maximum sterilisation dosage, which makes this approach untenable. This paper looks at the measurement uncertainties associated with the irradiation process. Process guidance tables have been developed that incorporate these measurement uncertainties and facilitate the determination of a target irradiation dose. These tables are based on a predefined level of quality that the irradiation facility wishes to achieve. A standardised Statistical Process Control chart has been developed to monitor this target dose. Simulations have been performed that demonstrate the effectiveness of the strategy. A major benefit of the technique is that it is suitable for both electron beam and gamma irradiation facilities. It also allows a single control chart to monitor the entire process irrespective of the complexity of the product arrangement within the irradiation chambers.
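
The idea of a single standardised chart can be sketched as follows. This is a generic Shewhart-style chart of standardised values, not the chart developed in the paper: the target doses, standard deviations, readings (in kGy) and the ±3 limit are all hypothetical values chosen for illustration.

```python
def standardise(dose, target, sigma):
    """Express a measured dose as a number of standard deviations from its target."""
    return (dose - target) / sigma


def out_of_control(z_values, limit=3.0):
    """Return the indices of standardised values outside the +/- limit."""
    return [i for i, z in enumerate(z_values) if abs(z) > limit]


# Each reading: (measured dose, target dose, standard deviation), all in kGy.
# Different product arrangements may have different targets and spreads;
# after standardisation they can share one chart with the same limits.
readings = [
    (25.4, 25.0, 0.5),
    (31.0, 30.0, 0.8),
    (27.1, 25.0, 0.5),
    (29.2, 30.0, 0.8),
]

z = [standardise(dose, target, sigma) for dose, target, sigma in readings]
flagged = out_of_control(z)
print(flagged)
```

Because every point is plotted on the same scale, products with different target doses do not each need a chart of their own.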

**15. Statistical modelling of measurement uncertainty**

*Author:* Thomas Svensson (Fraunhofer Chalmers Research Centre for Industrial Mathematics) *Keywords:* measurement uncertainty, systematic error *Format:* presentation (Statistical modelling) *Contact:* thomas.svensson@fcc.chalmers.se

The evaluation of measurement uncertainty is commonly based on an analysis of the different components that contribute to the uncertainty. In industrial practice this type of analysis is often a very difficult task, and the analyst is forced to make large approximations on the conservative side. In addition, there is a tendency to avoid the problem of distinguishing between systematic and random contributions. As a consequence, the uncertainty analysis often becomes useless for purposes other than the fulfilment of formal quality standards. In measurement situations such as scientific development of new products, quality control in a production plant, or comparative measurements within a limited population, it is of utmost importance to keep the measurement uncertainty to a minimum. Three methods may be identified for the minimization of uncertainty: 1) improvement of the measurement system, 2) elimination of systematic errors by comparison measurements, and 3) reduction of random errors by repeated measurements. The last two methods require that systematic and random errors are distinguished. A problem at hand is that a systematic error at the operator or instrument level may be regarded as a random error at the laboratory level; in turn, a laboratory systematic error may appear random at the global level, and so forth. Hence, there is a need for a rational model description of measurements and their errors. Here, such a description will be proposed in which systematic errors are identified as the bias between specific measurement situations. With the aid of a simplified mathematical description, the problem is described and related to the practical methods 2 and 3 above for minimization of measurement uncertainty.
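
Why methods 2 and 3 need the systematic and random parts to be separated can be seen in a textbook additive error model, used here as an assumption and not as the description proposed in the talk: a reading equals the true value plus a bias that is fixed within one measurement situation (standard deviation `sigma_bias` across situations) plus an independent random error (standard deviation `sigma_random`). The function names and numbers are hypothetical.

```python
import math


def std_of_mean(sigma_bias, sigma_random, n_repeats):
    """Standard uncertainty of the mean of n repeats made in ONE situation.

    Repeating averages out the random error but leaves the bias untouched.
    """
    return math.sqrt(sigma_bias ** 2 + sigma_random ** 2 / n_repeats)


def std_of_difference(sigma_random, n_repeats):
    """Standard uncertainty of the difference between two items, each measured
    n times in the SAME situation: the shared bias cancels in the comparison."""
    return math.sqrt(2 * sigma_random ** 2 / n_repeats)


sigma_bias, sigma_random = 0.3, 0.4

for n in (1, 4, 16, 64):
    print(n, round(std_of_mean(sigma_bias, sigma_random, n), 4))

print(round(std_of_difference(sigma_random, 8), 4))
```

In this model, repeated measurements can never push the uncertainty of a single result below `sigma_bias`, whereas a comparison made within one situation is free of the bias altogether.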

**40. Calibration of a non-destructive measurement method using mixture designs.**

*Authors:* Frøydis Bjerke (Matforsk AS) and Bjørg Egelandsdal, Torunn Thauland *Keywords:* uncertainty, prediction ability, multi-constrained region, correlated regressors, model input precision *Format:* presentation (Statistical modelling) *Contact:* froydis.bjerke@matforsk.no

Measuring sodium content in refined meat (e.g. salted ham, cured meat) by traditional wet-chemical methods is cumbersome and requires physical meat samples. Computed tomography (CT) is proposed as an alternative, non-destructive method for measuring salt content. However, the CT method seems to be influenced by the presence of other chemical components in the samples. The variation and prediction ability of the CT method are therefore investigated in meat samples of varying chemical composition (protein, fat, salt, water), using a mixture design in a multi-constrained region. Such designs often have correlated regressors, in particular if interactions and other higher-order terms are included in the statistical model. The presentation discusses the practical results as well as the statistical challenges related to the construction of the design and to the calibration modelling and testing. The CT alone does not predict the salt content very well, but prediction improves substantially when information on one more component is added. It is therefore interesting to study how precise the model input on, say, protein level has to be in order to obtain a reasonably good prediction of the salt level for a given sample.

**90. Confidence Intervals on Measures of Precision in an Interlaboratory Study**

*Authors:* Michiel Nijhuis (N.V. Organon) and Edwin van den Heuvel (N.V. Organon) *Keywords:* analysis of variance, variance components, confidence intervals *Format:* presentation (Statistical modelling) *Contact:* michiel.nijhuis@organon.com

Within the pharmaceutical industry it is common practice to transfer analytical methods between laboratories. From these interlaboratory studies the repeatability, the intermediate precision and the reproducibility of the analytical method are calculated. These measures are appropriate sums of variance components from an analysis of variance model describing the structure of the data. Several methods are described in the literature for calculating (approximate) confidence intervals on sums of variance components, namely Satterthwaite, Welch and Modified Large-Sample. Comparison studies between these methods have been performed for one-way or two-way classification analysis of variance models only. Interlaboratory studies often need higher-order classifications. We will discuss one specific three-way classification analysis of variance model that is frequently used for the transfer of analytical methods. In practice the estimates of one or more variance components may be negative. These negative estimates can be handled in several ways. For instance, these values can be used as they are in the calculation of the measures of precision, or they can be set equal to zero before the measures of precision are calculated. By means of a simulation study, the performance of the different approximation methods was compared for both calculation procedures.
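
The following sketch illustrates the ingredients for the simplest case only: a balanced one-way random-effects model (laboratories as the single factor), not the three-way model discussed in the talk. It computes the variance components from mean squares, shows both ways of handling a negative estimate, and evaluates the Satterthwaite effective degrees of freedom for the reproducibility variance. The function names and the mean squares are hypothetical.

```python
def variance_components(ms_lab, ms_error, n_per_lab, truncate=False):
    """ANOVA estimates for a balanced one-way random-effects model.

    Returns (between-laboratory variance, repeatability variance).
    With truncate=True a negative between-laboratory estimate is set to zero.
    """
    var_error = ms_error
    var_lab = (ms_lab - ms_error) / n_per_lab
    if truncate:
        var_lab = max(var_lab, 0.0)
    return var_lab, var_error


def satterthwaite_df(coefs, mean_squares, dfs):
    """Effective degrees of freedom of the linear combination sum(c_i * MS_i)."""
    total = sum(c * ms for c, ms in zip(coefs, mean_squares))
    denom = sum((c * ms) ** 2 / df for c, ms, df in zip(coefs, mean_squares, dfs))
    return total ** 2 / denom


# Hypothetical study: 6 laboratories, 4 replicates each.
n_labs, n_rep = 6, 4
ms_lab, ms_error = 10.0, 2.0
df_lab, df_error = n_labs - 1, n_labs * (n_rep - 1)

var_lab, var_error = variance_components(ms_lab, ms_error, n_rep)
reproducibility = var_lab + var_error

# The same quantity written as a combination of mean squares:
# MS_lab / n + (1 - 1/n) * MS_error.
coefs = (1 / n_rep, 1 - 1 / n_rep)
df_eff = satterthwaite_df(coefs, (ms_lab, ms_error), (df_lab, df_error))
print(reproducibility, round(df_eff, 3))

# A case where the between-laboratory estimate is negative.
raw = variance_components(1.0, 2.0, n_rep)
truncated = variance_components(1.0, 2.0, n_rep, truncate=True)
print(sum(raw), sum(truncated))
```

An approximate confidence interval then follows from chi-square quantiles with `df_eff` degrees of freedom, which requires a statistics library (for example `scipy.stats.chi2.ppf`) and is left out to keep the sketch dependency-free.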