Overview of all Abstracts

The following PDF contains the abstract book as it will be handed out at the conference. It is provided here for browsing and for later reference. All abstracts as PDF

My abstracts

 

The following abstracts have been accepted for this event:

  • A Corrected Likelihood-Based Confidence Area for Weibull Distribution Parameters and Large-Scale Life Time Data

    Authors: Haselgruber, Nikolaus
    Primary area of focus / application:
    Submitted at 7-Sep-2007 06:48 by
    Accepted
    The Weibull distribution is a common lifetime model, in particular for technical applications, and its data are often observed in large-scale experiments. Several methods are available to estimate the distribution parameters, and confidence areas are usually computed by applying large-sample theory for maximum likelihood estimators. Large-scale lifetime experiments are expensive; consequently, samples tend to be small and of short duration, which causes right-censored data. In this situation, large-sample theory loses its applicability.

    This presentation suggests a correction of the likelihood-based confidence area which significantly increases its accuracy for small and moderately censored samples.
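    For reference, the uncorrected baseline can be sketched as follows (a minimal Python sketch assuming NumPy and SciPy; the sample, censoring time and parameter grid are hypothetical). It computes the standard large-sample likelihood-ratio confidence region for the Weibull shape and scale from right-censored data, which is the approximation whose accuracy degrades for small, censored samples.

        import numpy as np
        from scipy.optimize import minimize
        from scipy.stats import chi2

        def loglik(theta, t, d):
            # Weibull log-likelihood with right censoring:
            # d[i] = 1 for a failure observed at t[i], 0 for censoring at t[i]
            shape, scale = theta
            z = (t / scale) ** shape
            return np.sum(d * (np.log(shape) + (shape - 1) * np.log(t)
                               - shape * np.log(scale)) - z)

        # hypothetical small, right-censored sample (shape 2, scale 100, censored at 120)
        rng = np.random.default_rng(0)
        raw = 100.0 * rng.weibull(2.0, size=30)
        t = np.minimum(raw, 120.0)
        d = (raw <= 120.0).astype(float)

        # maximum likelihood estimate
        res = minimize(lambda th: -loglik(th, t, d), x0=[1.0, t.mean()],
                       bounds=[(1e-6, None), (1e-6, None)])
        lmax = -res.fun

        # large-sample joint region: {theta : 2*(lmax - loglik(theta)) <= chi2(0.95, 2)}
        crit = chi2.ppf(0.95, df=2)
        grid = [(a, b) for a in np.linspace(0.5, 4.0, 60)
                       for b in np.linspace(60.0, 180.0, 60)]
        region = [(a, b) for (a, b) in grid
                  if 2.0 * (lmax - loglik((a, b), t, d)) <= crit]
        print(res.x, len(region))   # MLE and size of the accepted grid region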
  • The Use of Intelligent Experimental Designs for Optimal Automotive Engine Calibration Online at the Engine Test Bench

    Authors: Thierry Dalon (Siemens VDO Automotive AG, Regensburg, Germany)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 07:31 by
    Accepted
    Control-unit calibration for modern internal combustion engines currently faces a conflict: the effort needed to calibrate increasingly complex engine data with a growing number of parameters keeps rising, while the time and resources available for calibration are extremely limited, customers expect ever better performance, consumption, and comfort, and emissions regulations become more and more stringent.

    To reduce costs we seek to shorten testing time at the test bench and hence to use a minimal number of measurements. This leads to optimal experimental design approaches. Designing experiments often involves trade-offs between local and global search: local criteria aim at the best calibration, i.e. the optimization of a target (for example, performance) under many constraints (emissions, consumption), whereas global criteria tend to explore the whole domain or improve model quality.

    We present here the context and methods investigated at Siemens VDO Automotive for optimal engine calibration online at the test bench.
    The approach will be illustrated on a practical industrial engine calibration example.

    Keywords: Automotive Engine Calibration, Design of Experiments, Online Optimization, Model-based/surrogate optimization

    Specifics: This presentation is related to Dr. Karsten Roepke's field of expertise.
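    The local/global trade-off can be made concrete with a toy sketch (Python; the one-dimensional quadratic surrogate and the weighting scheme are illustrative assumptions, not the Siemens VDO method): the next test-bench point maximizes a weighted sum of the surrogate's predicted target (local search) and the distance to already-measured points (global search).

        import numpy as np

        def next_point(X, y, w=0.5):
            # fit a quadratic response-surface surrogate by least squares
            A = np.column_stack([np.ones_like(X), X, X ** 2])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            grid = np.linspace(0.0, 1.0, 201)
            pred = beta[0] + beta[1] * grid + beta[2] * grid ** 2      # local: exploit
            dist = np.min(np.abs(grid[:, None] - X[None, :]), axis=1)  # global: explore
            score = w * (pred - pred.min()) / (np.ptp(pred) + 1e-12) \
                  + (1 - w) * dist / (dist.max() + 1e-12)
            return grid[np.argmax(score)]

        # hypothetical measurements of an engine target (e.g. torque) at three settings
        X = np.array([0.1, 0.5, 0.9])
        y = np.array([0.3, 1.0, 0.4])
        print(next_point(X, y, w=0.7))   # candidate setting for the next run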
  • Efficient experimental designs in the presence of more than one hard-to-change variable

    Authors: Heidi Arnouts, Peter Goos (University of Antwerp, Antwerp, Belgium)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 08:04 by Heidi Arnouts
    Accepted
    In real-life experiments, especially in an industrial environment, experimental
    factors are often not independently reset for each run. This is often due to time
    and/or cost restrictions in the production process. A lot of research has been done
    for the situation in which there is only one hard-to-change variable in the
    experiment, the so-called split-plot experimental design. In industrial settings,
    however, there are often several factors that are hard to change, so it is also
    interesting to search for optimal designs that involve several hard-to-change
    variables. Some published research deals with this topic, but under the restriction
    that all the hard-to-change variables are reset at the same time, which reduces the
    problem to a split-plot experiment. In our research, we relax this constraint and
    look for D-optimal designs that allow the various hard-to-change variables to be
    reset at different points in time.
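    As a point of reference, the underlying D-optimality computation can be sketched as follows (Python; a plain point-exchange search on a two-factor main-effects-plus-interaction model, which deliberately ignores the split-plot error structure and resetting costs that the actual work accounts for):

        import numpy as np

        def model_matrix(D):
            # two factors, main effects plus their interaction (illustrative model)
            x1, x2 = D[:, 0], D[:, 1]
            return np.column_stack([np.ones(len(D)), x1, x2, x1 * x2])

        def d_criterion(D):
            X = model_matrix(D)
            return np.linalg.det(X.T @ X)

        rng = np.random.default_rng(1)
        cand = np.array([[a, b] for a in (-1.0, 1.0) for b in (-1.0, 1.0)])
        D = cand[rng.integers(0, len(cand), size=12)]   # random 12-run start

        improved = True
        while improved:                                 # naive point exchange
            improved = False
            for i in range(len(D)):
                for c in cand:
                    trial = D.copy()
                    trial[i] = c
                    if d_criterion(trial) > d_criterion(D) + 1e-9:
                        D, improved = trial, True
        print(d_criterion(D))                           # D-criterion of the final design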
  • Designs for first-order interactions in choice experiments with binary attributes

    Authors: Heiko Grossmann, Rainer Schwabe, Steven G. Gilmour
    Primary area of focus / application:
    Submitted at 7-Sep-2007 11:17 by
    Accepted
    Choice experiments aim at understanding how preferences for goods or services are influenced by the features of competing options, and applications in marketing, health economics and other fields abound. In recent years, the efficient design of choice experiments has attracted considerable interest. Typically, these designs have been derived within the framework of the multinomial logit (MNL) model. When it is assumed that the choice probabilities within each choice set are equal, the design problem for the MNL model is equivalent to the corresponding problem for an approximating linear model. Using this correspondence between the design problems, this talk derives new exact designs for choice experiments involving pairs of options described by a common set of two-level factors, which allow the efficient estimation of main effects and first-order interactions. These designs compare favorably with available alternatives in the literature: for high efficiencies they usually require the same or a considerably smaller number of choice sets, and for the same number of choice sets they possess the same or a higher efficiency.
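    The linear-model correspondence can be sketched in a few lines (Python; the three binary attributes and the use of the full set of pairs are illustrative choices): under equal choice probabilities, the information matrix for a paired design is proportional to X'X of the difference-coded effects vectors, so design quality can be judged through an ordinary determinant.

        import numpy as np
        from itertools import product, combinations

        # all 2^3 profiles on three two-level attributes, effects-coded as +/-1
        profiles = np.array(list(product([-1.0, 1.0], repeat=3)))

        def effects(z):
            # main effects plus all first-order (two-factor) interactions
            inter = [z[i] * z[j] for i, j in combinations(range(len(z)), 2)]
            return np.concatenate([z, inter])

        # difference coding of every possible pair of distinct profiles
        X = np.array([effects(a) - effects(b)
                      for a, b in combinations(profiles, 2)])
        M = X.T @ X / len(X)      # normalized information of the full pair design
        print(np.linalg.det(M))   # D-criterion; subsets of pairs are compared likewise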
  • Local Models in Data Mining

    Authors: Gero Szepannek, Julia Schiffner and Claus Weihs (University of Dortmund, Dortmund, Germany)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 12:40 by
    Accepted
    In classification tasks it may sometimes not be meaningful to build a single rule
    on the whole data set. This may especially be the case if the classes are composed
    of several subclasses.

    This talk gives an overview of several methods proposed to solve this problem.
    These methods can be subdivided into methods that need the subclasses to be
    specified in advance (see e.g. Weihs et al., 2006) and methods that determine the
    locality in the data in an unsupervised manner (see e.g. Hastie et al., 1996, or
    Czogiel et al., 2007). Some new developments are also presented. All methods are
    evaluated and compared on several real-world classification problems.

    References:

    Czogiel, I., Luebke, K., Zentgraf, M., Weihs, C. (2007): Localized Linear
    Discriminant Analysis. In: Decker, R., Lenz, H., Gaul, W. (eds): Advances in Data
    Analysis, Springer-Verlag, Heidelberg, 133-140.

    Hastie, T., Tibshirani, R., Friedman, J. (1996): Discriminant Analysis by Gaussian
    Mixtures, JRSS B 58, 158-176.

    Weihs, C., Szepannek, G., Ligges, U., Luebke, K. and Raabe, N. (2006): Local Models
    in Register Classification by Timbre. In: Batagelj, V., Bock, H., Ferligoj, A. and
    Ziberna, A. (eds): Data Science and Classification, Springer-Verlag, Heidelberg,
    315-322.
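    As an illustration of the unsupervised branch, the following sketch (Python with scikit-learn; the data and the number of mixture components are toy assumptions) fits one Gaussian mixture per class and classifies by the largest prior-weighted class log-density, in the spirit of the mixture discriminant analysis of Hastie et al. (1996).

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.mixture import GaussianMixture

        # toy data in which each class is itself composed of two subclasses
        X, y = make_classification(n_samples=600, n_features=4, n_informative=3,
                                   n_redundant=0, n_classes=2,
                                   n_clusters_per_class=2, random_state=0)

        # one GMM per class: the mixture components play the role of the subclasses
        models = {c: GaussianMixture(n_components=2, random_state=0).fit(X[y == c])
                  for c in np.unique(y)}
        priors = {c: np.mean(y == c) for c in models}

        def predict(Xnew):
            # classify by the largest prior-weighted class log-density
            scores = np.column_stack([models[c].score_samples(Xnew)
                                      + np.log(priors[c]) for c in sorted(models)])
            return np.argmax(scores, axis=1)

        print(np.mean(predict(X) == y))   # resubstitution accuracy of the local model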
  • Load Shedding: a new proposal

    Authors: R. Faranda, A. Pievatolo and E. Tironi
    Primary area of focus / application:
    Submitted at 7-Sep-2007 13:31 by
    Accepted
    During overloads in the mains, load curtailment applied to interruptible loads is
    often the only way to keep the network in operation. Normally, in contingencies,
    the difference between the power absorbed and the power produced is very small,
    often less than 1% of the latter. Therefore, if all the loads participated in the
    load shedding program, the discomfort would be minimal, considering its usually
    short duration. From this point of view, we present a new approach to load
    shedding that guarantees correct electrical system operation by increasing the
    number of participants. This new load control strategy is named Distributed
    Interruptible Load Shedding (DILS). Indeed, it is possible to split every user's
    load into interruptible and uninterruptible parts, and to operate on the
    interruptible part only. The optimal load reduction request is found by minimizing
    the expected value of an appropriate cost function, thus taking into account the
    uncertainty about the power absorbed by each customer.
    Presently, several users such as hospitals, data centres, supermarkets,
    universities and industries might be very interested in typical shedding programs
    as a way to save money on their electricity bills. In the future, however, when
    domotic power plants are likely to be in widespread use, distributors could
    encourage end users to participate in DILS programs for either economic or social
    reasons. By adopting the DILS program, distributors can resort to interruptible
    loads not only under emergency conditions but also during normal and alert
    operations.

    Keywords: Blackout, Demand Side Management, Load Shedding, Interruptible Load,
    Stochastic Approximation, Uncertain System
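    A toy sketch of the optimization step described above (Python; the cost weights, the compliance model and the Kiefer-Wolfowitz-style finite-difference scheme are assumptions for illustration, not the authors' formulation): a common curtailment request level is tuned by stochastic approximation to minimize an expected cost that trades unmet deficit against customer discomfort.

        import numpy as np

        rng = np.random.default_rng(2)
        deficit, n_cust = 5.0, 50      # MW to recover, number of participants

        def sample_shed(level):
            # actual shed power: the request times a random compliance factor
            return np.sum(level * rng.uniform(0.7, 1.0, size=n_cust))

        def cost(level, shed):
            # heavy penalty on unmet deficit, light penalty on total discomfort
            return 10.0 * max(deficit - shed, 0.0) ** 2 + 0.1 * level * n_cust

        level, eps = 0.05, 0.01
        for k in range(1, 400):
            # finite-difference estimate of the expected-cost gradient
            g = (cost(level + eps, sample_shed(level + eps))
                 - cost(level - eps, sample_shed(level - eps))) / (2.0 * eps)
            level = max(level - g / (5.0e4 * k), 0.0)
        print(level, 0.85 * level * n_cust)   # request level, expected recovered MW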
  • On-line diagnostic tools in the Mobile Spatial coordinate Measuring System (MScMS)

    Authors: Franceschini, F., Galetto, M., Maisano, D., Mastrogiacomo, L. (Politecnico di Torino, Torino, Italy)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 15:58 by
    Accepted
    Keywords: mobile measuring system, wireless sensor networks, dimensional measurements, diagnostics, localization algorithms, physical and model redundancy.
    The Mobile Spatial coordinate Measuring System (MScMS) is a wireless-sensor-network-based system developed at the industrial metrology and quality engineering laboratory of DISPEA – Politecnico di Torino. It has been designed to perform simple and rapid indoor dimensional measurements of large-size volumes.
    It is made up of three basic parts: a “constellation” of wireless devices (Crickets), freely distributed around the working area; a mobile probe to register the coordinates of points on the measured object (using the constellation as a reference system); and a PC to store the data sent – via Bluetooth – by the mobile probe and to process them with ad hoc application software written in Matlab. The Crickets and the mobile probe use ultrasound (US) transceivers to communicate and to evaluate mutual distances.
    The system makes it possible to calculate the position – in terms of spatial coordinates – of the object points “touched” by the probe. Acquired data are then available for different types of processing (determination of distances, curves or surfaces of measured objects).
    To protect against error sources such as US signal diffraction and reflection, external uncontrolled US sources (key jingling, neon blinking, etc.), or unacceptable solutions from the localization algorithms, MScMS implements several statistical tests for on-line diagnostics. Three of them are analyzed in this paper: “energy model diagnostics”, based on the “mass-spring system” localization algorithm; “distance model diagnostics”, based on the use of a distance reference standard embedded in the system; and “sensor physical/model diagnostics”, based on the redundancy of the Crickets' US transceivers. For each measurement, if all these tests are passed at once, the measured result may be considered acceptable at a specific confidence level. Otherwise, the measurement is rejected.
    This paper, after a general description of the MScMS, focuses on the description of these three on-line diagnostic tools. Some preliminary results of experimental tests carried out on the system prototype in the industrial metrology and quality engineering laboratory of DISPEA – Politecnico di Torino are also presented and discussed.
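    A sketch of the idea behind the first test (Python with SciPy; the Cricket layout, noise level and acceptance threshold are hypothetical): the probe position is found by minimizing a "mass-spring" energy built from the measured ultrasound distances, and the residual energy at the optimum serves as the diagnostic statistic.

        import numpy as np
        from scipy.optimize import minimize

        rng = np.random.default_rng(3)

        # hypothetical constellation of four Crickets and a true probe position (m)
        anchors = np.array([[0.0, 0.0, 2.5], [4.0, 0.0, 2.5],
                            [4.0, 3.0, 2.5], [0.0, 3.0, 2.5]])
        p_true = np.array([1.5, 1.0, 0.8])
        dists = np.linalg.norm(anchors - p_true, axis=1) + rng.normal(0.0, 0.01, 4)

        def energy(p):
            # total squared "spring" energy between measured and implied distances
            return np.sum((np.linalg.norm(anchors - p, axis=1) - dists) ** 2)

        # initial guess off the constellation plane, to avoid the mirror solution
        res = minimize(energy, x0=np.array([2.0, 1.5, 1.0]))

        # energy-model diagnostic: reject the point if the residual energy exceeds
        # a limit calibrated to the US ranging noise (threshold is illustrative)
        threshold = 4 * (3.0 * 0.01) ** 2
        print(res.x, res.fun, res.fun <= threshold)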
  • Robust estimation of the variogram in computer experiments

    Authors: O. Roustant, D. Dupuy, C. Helbert (Ecole des Mines, Saint-Etienne, France)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 16:04 by Olivier Roustant
    Accepted
    This article deals with the estimation of the spatial correlation of kriging models in
    computer experiments. Coming from geostatistics, the kriging model is a Gaussian
    stochastic process
    $$Y(x) = m(x) + Z(x)$$
    where $x$ is a $d$-dimensional vector, $m(x)$ is a deterministic trend, and $Z(x)$ is a
    stationary centered Gaussian stochastic process with spatial correlation function
    $R(h)$. Both trend and spatial correlation should be estimated from data. In computer
    experiments, however, the correlation is usually not estimated nonparametrically:
    a specific parametric form for $R$ is assumed. The most common choice is the
    anisotropic power-exponential function:
    $$R(h) = \exp\left(-\sum_{k=1}^d \theta_k |h_k|^{p_k}\right), \quad 0 < p_k \leq 2, \; k=1,\ldots,d$$

    This contrasts with geostatistics, where the spatial correlation is estimated through
    the variogram:
    $$2\gamma(h) = \mathrm{var}\left(Z(x+h)-Z(x)\right)$$
    Defined for intrinsic processes, the variogram is equivalent to $R(h)$ for stationary
    processes. Using the variogram instead of the correlation function is recommended even
    if the process is stationary, because of possible contamination by trend-estimate
    residuals.
    The estimation of $\gamma(h)$ from a given design $x^{(1)},\ldots,x^{(n)}$ is not an
    easy task, since the random variables $(Z(x + h) - Z(x))^2$ are not independent and
    strongly skewed. In particular, large values may affect the estimation. For this
    reason, robust estimation is encouraged. Two estimators were proposed by Cressie and
    Hawkins (1980) and by Genton (1998). In this paper, we compare the properties of these
    estimators with a trimmed mean. Simulations with various amounts of outliers are
    performed, following Genton's setup. We observe that both estimators give similar
    results, and both are outperformed by the trimmed mean. In addition, we extend the
    study by analyzing the robustness of these estimators to deviations from normality.
    To this end, a 3-dimensional industrial problem is considered.

    References:
    Chilès, J-P., Delfiner, P. (1999), Geostatistics: Modeling Spatial Uncertainty, Wiley & Sons.

    Cressie, N. (1993), Statistics for Spatial Data, Wiley & Sons.

    Cressie, N., Hawkins, D.M. (1980), ''Robust estimation of the variogram: I'', Mathematical Geology, 12 (2), 115-125.

    Genton, M. (1998), ''Highly Robust Variogram Estimation'', Mathematical Geology, 30 (2), 213-221.

    Huber, P.J. (1977), Robust Statistical Procedures, SIAM.

    Rousseeuw, P.J., Croux, C. (1993), ''Alternatives to the Median Absolute Deviation'', JASA, 88 (424), 1273-1283.

    Santner, T.J., Williams, B.J., Notz, W.I. (2003), The Design and Analysis of Computer Experiments, Springer.

    Keywords: Computer experiments, Variogram, Kriging model, Anisotropy, Robustness.
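    The estimators under comparison can be sketched as follows (Python with SciPy; the 1-D field, lag, contamination and trim fraction are illustrative, and the trimmed mean is shown without the bias correction a careful implementation would need):

        import numpy as np
        from scipy.stats import trim_mean

        rng = np.random.default_rng(4)

        # 1-D stationary Gaussian field on a regular grid, plus a few outliers
        n, lag = 200, 5
        idx = np.arange(n)
        C = np.exp(-0.1 * np.abs(idx[:, None] - idx[None, :]))  # exponential covariance
        z = rng.multivariate_normal(np.zeros(n), C)
        z[rng.choice(n, size=6, replace=False)] += 5.0          # contamination

        d = z[lag:] - z[:-lag]                                  # increments at lag h
        N = len(d)

        matheron = np.mean(d ** 2)                              # classical 2*gamma(h)
        cressie_hawkins = np.mean(np.abs(d) ** 0.5) ** 4 / (0.457 + 0.494 / N)
        trimmed = trim_mean(d ** 2, 0.1)                        # trimmed-mean variant

        print(matheron, cressie_hawkins, trimmed)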
  • Model-robust designs for assessing the uncertainty of simulator outputs with linear metamodels

    Authors: B. Gauthier, L. Carraro, O. Roustant (Ecole des Mines, Saint-Etienne, France)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 16:37 by
    Accepted
    This article addresses the industrial problem of quantifying the distribution of the
    output $Y_{sim}(x)$ of a costly simulator when the inputs $x$ are random variables
    with known distribution $\mu$. Due to the computing time, a Monte Carlo method cannot
    be applied directly to the simulator but only to an approximate model $Y_{app}(x)$.
    This metamodel is built with few experiments $X =(x^{(1)},\ldots, x^{(n)})$. The
    question is: how should the design of experiments $X$ be chosen so that the
    distributions of $Y_{app}(x)$ and $Y_{sim}(x)$ are close?

    Consider a deterministic simulator. In many situations, it is approximated by a linear
    combination of known basis functions $g_0,\ldots,g_p$:

    $$Y_{sim}(x) = \sum_{i=0}^{p}\beta_i g_i(x) + h(x)$$

    with $\beta_0,\ldots,\beta_p$ unknown real coefficients, and $h$ an unknown function
    standing for a model deviation. The corresponding metamodel is:

    $$Y_{app}(x) = \sum_{i=0}^{p}\hat{\beta}_i g_i(x) + \eta(x)$$

    where $(\eta(x))$ is a centered spatial Gaussian process representing the estimation
    error. The parameters $\hat{\beta}_0,\ldots,\hat{\beta}_p,\hat{\sigma}^2$ have to be
    estimated from the $n$ simulator values calculated for $x \in X$, for instance by
    ordinary least squares.
    In this framework, one can compute the two spreads
    $|E(Y_{app}(x))-E(Y_{sim}(x))|$ and
    $|\mathrm{var}(Y_{app}(x))-\mathrm{var}(Y_{sim}(x))|$. We show that, under weak
    conditions on the model deviation $h$, it is possible to choose $X$ to minimize these
    quantities. We assume that $h$ belongs to a reproducing kernel Hilbert space $H$; in
    usual cases, this only imposes regularity conditions on $h$. Following Yue and
    Hickernell (1998), both criteria can be bounded by expressions depending only on
    $||h||_H$. Optimal designs are then obtained by minimizing the largest eigenvalue of
    positive definite matrices. Finally, this methodology is extended to stochastic
    simulators of the form

    $$Y_{sim}(x) = \sum_{i=0}^p \beta_i g_i(x) + h(x) + \varepsilon(x)$$

    where $(\varepsilon(x))$ is a Gaussian process modelling the numerical error.

    References:

    Carraro L., Corre B., Helbert C., Roustant O., Josserand S. (2007). Optimal designs for the
    propagation of uncertainty in computer experiments, Chemometrics and Intelligent
    Laboratory Systems, to appear.

    Carraro L., Corre B., Helbert C., Roustant O. (2005). Construction d'un critère d'optimalité
    pour plans d'expériences numériques dans le cadre de la quantification d'incertitudes,
    Revue de Statistique Appliquée.

    Santner T.J., Williams B.J., Notz W.I. (2003). The Design and Analysis of Computer
    Experiments, Springer.

    Wahba G. (1990). Spline Models for Observational Data, SIAM, Philadelphia.

    Yue R.-X., Hickernell F.J. (1998). Robust designs for fitting linear models with
    misspecification, Statistica Sinica 9, p. 1053-1069.

    Keywords: Computer experiments, uncertainty propagation, metamodeling, model-robust
    designs, reproducing kernel Hilbert space.
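    The two spreads are easy to evaluate on a toy problem (Python; the simulator, basis, design and input distribution are all hypothetical, and the Gaussian error term $\eta$ is omitted for brevity): fit the basis part by ordinary least squares on a small design, then propagate $\mu$ through both the simulator and the metamodel by Monte Carlo.

        import numpy as np

        rng = np.random.default_rng(5)

        def y_sim(x):
            # hypothetical "costly" simulator: basis part 1 + 2x plus deviation h(x)
            return 1.0 + 2.0 * x + 0.3 * np.sin(5.0 * x)

        # small design X and ordinary least-squares fit of the basis (g0, g1) = (1, x)
        X = np.linspace(0.0, 1.0, 6)
        A = np.column_stack([np.ones_like(X), X])
        beta, *_ = np.linalg.lstsq(A, y_sim(X), rcond=None)

        # propagate the input distribution mu (uniform on [0, 1]) through both models
        u = rng.uniform(0.0, 1.0, size=100_000)
        y_app, y_true = beta[0] + beta[1] * u, y_sim(u)

        print(abs(y_app.mean() - y_true.mean()))   # spread in expectation
        print(abs(y_app.var() - y_true.var()))     # spread in variance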
  • Genetic algorithms and grid technologies in clustering

    Authors: Cs. Hajas, Zs. Robotka, Cs. Seres and A. Zempléni (Loránd Eötvös University, Budapest, Hungary)
    Primary area of focus / application:
    Submitted at 7-Sep-2007 20:38 by
    Accepted

    Nowadays, very large data sets often have to be processed. Data mining is
    definitely an important and rapidly developing area for such problems. In this
    presentation we focus on an important part of such work, namely the clustering of
    several thousand objects of high dimensionality.

    For the clustering, we used a version of the genetic algorithm. Such algorithms
    imitate the natural selection process by randomly coupling pairs of candidates for
    the best (fittest) clustering, and they avoid convergence to a local maximum through
    rare, random mutations. In clustering applications the objective function is based
    on the sum of the squared distances between all pairs within the clusters, with a
    suitable compensation term that favours a small number of clusters.

    For large data sets and algorithms that can easily be parallelised, the use of a
    grid of computers is a natural, widely used idea. We compared the performance of the
    grid-based version of our algorithm to the traditional, single-processor version.
    Our database consisted of 10,000 images of medium resolution, so the total size was
    around 0.5 GB. Such problems may arise in industrial settings as well, such as in
    welding processes or in character recognition for applications such as car
    manufacturing (see [1]).

    The preprocessing constructs a Gaussian Mixture Model (GMM) representation of the
    images. The GMMs are estimated with an improved Expectation Maximization (EM)
    algorithm that avoids convergence to the boundary of the parameter space; see [2].
    Image clustering is done by matching the representations with a distance measure
    based on an approximation of the Kullback-Leibler divergence.

    References:

    [1] Aiteanu, D., Ristic, D., Graser, A. (2005): Content based threshold adaptation
    for image processing in industrial application. International Conference on Control
    and Automation (ICCA '05), Volume 2, 26-29 June 2005, pp. 1022-1027.

    [2] Zs. Robotka and A. Zempléni: Image Retrieval using Gaussian Mixture Models.
    SPLST Symposium, Budapest, 2007.
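    A compact sketch of the genetic clustering step described above (Python; the toy data, penalty weight, population size and mutation rate are illustrative assumptions): each chromosome assigns a cluster label to every object, fitness is the negative within-cluster sum of squared pairwise distances plus a compensation term favouring few clusters, and random coupling with rare mutation drives the search.

        import numpy as np

        rng = np.random.default_rng(6)

        # toy data: three well-separated 2-D blobs (stand-ins for image features)
        X = np.vstack([rng.normal(m, 0.3, size=(40, 2))
                       for m in ([0, 0], [3, 0], [0, 3])])
        n, k_max = len(X), 6

        def fitness(labels):
            # within-cluster sum of squared pairwise distances, plus a compensation
            # term that favours a small number of clusters (as in the abstract)
            total = 0.0
            for c in np.unique(labels):
                P = X[labels == c]
                total += np.sum((P[:, None, :] - P[None, :, :]) ** 2) / 2.0
            return -(total + 200.0 * len(np.unique(labels)))

        pop = [rng.integers(0, k_max, size=n) for _ in range(30)]
        for gen in range(100):
            pop.sort(key=fitness, reverse=True)      # keep the fittest clusterings
            parents, children = pop[:10], []
            for _ in range(20):
                a, b = rng.choice(10, size=2, replace=False)
                mask = rng.random(n) < 0.5           # random coupling of two parents
                child = np.where(mask, parents[a], parents[b])
                mut = rng.random(n) < 0.01           # rare random mutation
                child[mut] = rng.integers(0, k_max, size=mut.sum())
                children.append(child)
            pop = parents + children

        best = max(pop, key=fitness)
        print(len(np.unique(best)))                  # number of clusters found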