ENBIS-8 in Athens

21 – 25 September 2008 Abstract submission: 14 March – 11 August 2008


The following abstracts have been accepted for this event:

  • On Applications of the Relative Linkage Disequilibrium

    Authors: Ron S. Kenett and Silvia Salini
    Affiliation: KPA Ltd., Raanana, Israel and University of Torino, Torino, Italy, email: ron@kpa.co.il, Department of Economics, Business and Statistics. University
    Primary area of focus / application:
    Submitted at 21-Apr-2008 14:15 by SALINI SILVIA
    Accepted (view paper)
    23-Sep-2008 15:00 On Applications of the Relative Linkage Disequilibrium
    Relative Linkage Disequilibrium (RLD) was originally proposed as an approach for analysing general two-way contingency tables both quantitatively and graphically (Kenett 1983). It was later extended to the Data Mining context to evaluate Association Rules (Kenett and Salini, 2008). RLD can be interpreted graphically using a simplex representation, leading to a powerful graphical display of association relationships. Moreover, the statistical properties of RLD are known, so that confirmatory statistical tests of significance or basic confidence intervals can be applied. In this work we present several applications of RLD in areas such as Risk Management, Kansei Engineering, text mining, and web clickstream analysis.

    (Some) References

    Hahsler, M., Grün, B., and Hornik, K. (2005). arules – A computational environment for mining association rules and frequent item sets. Journal of Statistical Software, 14(15):1–25. ISSN 1548-7660. URL http://www.jstatsoft.org/v14/i15/.

    Kenett, R. (1983). On an Exploratory Analysis of Contingency Tables. The Statistician, 32, pp. 395-403.

    Kenett, R. and Salini, S. (March 2008). "Relative Linkage Disequilibrium: A New measure for association rules". UNIMI - Research Papers in Economics, Business, and Statistics. Statistics and Mathematics. Working Paper 32.

    Multi-Industry Semantic-Based Business Intelligence Solutions (2008), http://www.musing.eu/download-area/musing-public-documentation/it-operational-risk-bi-system-executive-summary-mar-2007-v1-0-kpa/at_download/file

    Nagamachi, M. (1995), Kansei Engineering: a new ergonomic consumer-oriented technology for product development. International Journal of Industrial Ergonomics, 15: 3-11

    Omiecinski, E. (2003). Alternative interest measures for mining associations in databases. IEEE Transactions on Knowledge and Data Engineering, 15(1):57–69.

    Shimada, K., Hirasawa, K., and Hu, J. (2006). Association Rule Mining with Chi-Squared Test Using Alternate Genetic Network Programming, ICDM2006.

    Van Lottum, C., Pearce, K., Coleman, S. (2006), Features of Kansei Engineering Characterizing its Use in Two Studies: Men’s Everyday Footwear and Historic Footwear. Quality and Reliability Engineering International, 22: 629-650.
  • Advances in Operational Risk Management: Statistical aspects of the MUSING project on combining semantic and quantitative data in risk assessment

    Authors: Ron Kenett and Yossi Raanan
    Affiliation: KPA Ltd., Univ of Torino, Italy and College of Management, Business School, Rishon LeZion, Israel
    Primary area of focus / application:
    Submitted at 21-Apr-2008 15:01 by Ron Kenett
    Accepted (view paper)
    22-Sep-2008 14:40 Advances in Operational Risk Management: Statistical aspects of the MUSING project on combining semantic and quantitative data in risk assessment
    Operational risks are arising from the ever-growing Information and Communications Technology systems. Operational Risk (OpR) is everywhere in the business world and, indeed, even in our households. When computing technologies encompass so much of our daily work life, the risks associated with them frequently lead to unwanted and hazardous results. In extreme cases, these risks may become catastrophic and cause bankruptcy or other grave consequences. Thus, it is very important to address OpR with a systematic, scientific, experience-based and results-driven approach.

    MUSING (MUlti-industry, Semantic-based next generation business INtelliGence) is an R&D initiative co-funded by the European Commission in the context of the VI Framework Programme (http://www.musing.eu). The main objective of the four-year project, started in 2006, is to design, develop, and test Business Intelligence services combining unstructured data, such as text, with structured data, such as balance sheets. The innovation in MUSING lies in combining advanced statistical and data mining methods, such as Bayesian Networks, with new technologies based on semantic analysis, such as ontologies and XBRL. MUSING tools have been applied to data and information gathered from internal databases, as well as external data sources (Internet, financial newspapers, etc.), thus supporting several time- and money-consuming activities, such as
    - Automatic extraction of information from several digital information sources
    - Identification and mapping of Operational Risks (OpR)
    - Development of practical solutions for SME for OpR mitigation
    - Web based services in support of SME internationalisation

    The talk will cover the main innovation in MUSING, with an emphasis on the role of statistical methods in integrating qualitative, semantic-based data with quantitative information to produce a comprehensive Operational Risk Management System. Specifically, we will cover a case study based on an application of the MUSING methodology to a telecom operator.
  • Kernel based confidence intervals for survival function estimation

    Authors: Bagkavos Dimitris, Ioannides Dimitris, Kalamatianou Aglaia.
    Affiliation: Accenture Marketing in Athens, University of Macedonia and Panteion University.
    Primary area of focus / application:
    Submitted at 22-Apr-2008 07:49 by Dimitris Ioannides
    23-Sep-2008 11:25
    The survival function is one of the most important functions in a large variety of industrial problems, such as reliability analysis and industrial strength testing, as it addresses concepts such as scheduling, maintenance, improved system design, and cost analysis. Kernel based estimation of the survival function imposes minimal assumptions on the data and thus offers great flexibility. Based on an existing kernel survival function estimate that admits censored data, we develop confidence intervals to help assess the validity of the estimate. Practical issues of estimation are discussed and then the developments are applied to a real data set. The results are analyzed and discussed further.
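
    As a rough illustration of the idea in the abstract above, the following sketch computes a Gaussian-kernel estimate of the survival function together with a naive pointwise interval. It is not the authors' method: it assumes fully observed (uncensored) lifetimes, a hand-picked bandwidth h, and a simple normal approximation with variance S(1-S)/n. The function names and the data are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, used here as the integrated Gaussian kernel."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_survival(t, data, h):
    """Kernel estimate of S(t) = P(X > t) for uncensored data.

    Each observation x contributes Phi((x - t) / h), a smoothed version
    of the indicator 1{x > t}. Censoring is NOT handled in this sketch.
    """
    return sum(norm_cdf((x - t) / h) for x in data) / len(data)

def pointwise_ci(t, data, h, z=1.96):
    """Naive normal-approximation interval (an assumption of this sketch)."""
    n = len(data)
    s = kernel_survival(t, data, h)
    half_width = z * math.sqrt(s * (1.0 - s) / n)
    return max(0.0, s - half_width), min(1.0, s + half_width)

# Made-up lifetimes, for illustration only.
lifetimes = [2.1, 3.4, 3.9, 5.0, 6.2, 7.7, 8.1, 9.5]
s_hat = kernel_survival(5.0, lifetimes, h=1.0)
lower, upper = pointwise_ci(5.0, lifetimes, h=1.0)
```

    The bandwidth controls how much each observation is smoothed; in practice it would be chosen by a data-driven rule rather than fixed by hand.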
  • The r-out-of-m S Control Chart

    Authors: Antzoulakos D.L. and Rakitzis C.A.
    Affiliation: Department of Statistics & Insurance Science, University of Piraeus, Greece
    Primary area of focus / application:
    Submitted at 22-Apr-2008 17:58 by Athanasios Rakitzis
    Accepted (view paper)
    23-Sep-2008 12:00 The r-out-of-m S Control Chart
    Control charts with supplementary runs rules for detecting shifts in the process variance have not received as much attention as their counterparts for detecting shifts in process average. Monitoring the variance of a manufacturing process is as important as monitoring its average. In this paper we examine the performance of S control charts accompanied with runs rules for monitoring increases or decreases in process variance. The proposed one-sided charts overcome the weakness of high false alarm rates when runs rules are used to supplement a control chart. The average run length performance and the design of the proposed schemes are studied thoroughly. Finally, the performance of appropriately designed two-sided schemes is investigated as well.
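
    A generic r-out-of-m runs rule can be sketched as follows: the chart signals as soon as at least r of the last m plotted sample standard deviations fall beyond a one-sided control limit. This is only an illustration of the rule's mechanics; the limit value, the data, and the function names are hypothetical, and the design of the limits studied in the paper is not reproduced here.

```python
from collections import deque
import statistics

def sample_sds(subgroups):
    """Sample standard deviation S of each rational subgroup."""
    return [statistics.stdev(g) for g in subgroups]

def first_signal(s_values, limit, r, m, above=True):
    """Index of the first point at which r of the last m values are
    beyond `limit` (above it if `above`, otherwise below it).

    Returns None if the rule never signals.
    """
    window = deque(maxlen=m)  # keeps only the m most recent indicators
    for i, s in enumerate(s_values):
        window.append(s > limit if above else s < limit)
        if sum(window) >= r:
            return i
    return None

# Made-up S values and a hypothetical upper limit of 1.5.
s_values = [0.9, 1.1, 1.6, 1.0, 1.7, 1.8, 1.2]
signal_2_of_3 = first_signal(s_values, limit=1.5, r=2, m=3)
signal_1_of_1 = first_signal(s_values, limit=1.5, r=1, m=1)
```

    With r = m = 1 the rule reduces to the usual single-point signal; larger r and m trade detection delay against false alarms.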
  • Quality improvement of land cover data bases: a sequential approach via agreement measures

    Authors: E. Carfagna and J. Marzialetti
    Affiliation: University of Bologna
    Primary area of focus / application:
    Submitted at 23-Apr-2008 14:13 by elisabetta carfagna
    Accepted (view paper)
    22-Sep-2008 14:00 Quality improvement of land cover data bases: a sequential approach via agreement measures
    Land cover data bases are frequently produced through photo-interpretation of remote sensing data according to a legend of land cover types. During the photo-interpretation process, the photo-interpreter outlines polygons and can make mistakes concerning the borders of polygons as well as the land cover type. Another much more experienced photo-interpreter (the controller) performs a quality control on a sample of polygons in order to test if some mistakes have been made by the previous photo-interpreter. The result of the two photo-interpretations is a confusion matrix with the classes used in the photo-interpretation by the photo-interpreter (rows - i) and by the controller (columns - j). We assume that the same classes are used. The matrix is based on the number of polygons (or the area of polygons divided by the total area) classified to class i by the photo-interpreter and to class j by the controller.
    In this paper, we use quality control for continuously improving the data base production process and to this purpose we propose an adaptive sequential sample design which allows reaching high precision of estimates with the smallest sample size and in the shortest time. However, adaptive sequential procedures do not allow unbiased estimates of the quality parameters because their efficiency is due to sample selection dependent on previously selected units and stopping rules based on the quality parameter.
    In Carfagna and Marzialetti (2007) we have proposed an adaptive sequential procedure with permanent random numbers which allows unbiased and efficient estimates of the percentage of area correctly photo-interpreted. In this procedure the sample size per stratum is dependent on the previously selected units but the sample selection is not, and the stopping rule is not based on the estimates of the quality parameter.
    In this paper we apply the same adaptive sequential procedure to some measures of agreement, such as Cohen’s Kappa and its weighted version, where the weights are an inverse function of the distance among the classes of the photo-interpretation legend. Both agreement measures can be applied to confusion matrices based on the area of polygons as well as on the number of polygons.
    We have performed several simulations corresponding to different selections of polygons from a land cover data base of photo-interpreted satellite data. The behaviour of the standard deviations of Cohen’s Kappa and the Weighted Kappa has shown that the sequential procedure allows us to reach a specified precision of the estimates with a smaller sample size than the classical two-step procedure, where the sample size necessary to obtain that precision is computed on the basis of an initial sample. However, for some simulations, the trend of the standard deviations is not strictly monotone. Some simulations show a small increase of the standard deviation, while others show jumps of the standard deviation to higher values.
    With reference to confusion matrices based on the number of polygons, we have also used a log-linear model to describe the “structure” of the agreement (Tanner and Young 1985), where we consider two terms expressing respectively the agreement obtained by chance alone and the effective agreement existing between the photo-interpreter and the controller. If we hypothesize that the level of agreement does not change among the classes of the legend, we can use the homogeneous model; otherwise a non-homogeneous model is more appropriate. We propose a sequential approach for the log-linear model too.
    Carfagna E. and Marzialetti J. (2007) Sequential Design in Quality Control and Validation of Land Cover Data Bases, Proc. of the ENBIS-DEINDE Conference “Computer Experiments versus Physical Experiments”, Torino, G. Vicario and E. D. Isaia (eds.).
    Tanner M. A. and Young M. A. (1985) Modeling agreement among raters, Journal of the American Statistical Association, 80, 175-180.
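
    For readers unfamiliar with the agreement measures named in this abstract, the sketch below computes Cohen's Kappa and a weighted Kappa from a confusion matrix (rows: photo-interpreter, columns: controller). The linear weights, which decrease with the difference between class indices, are an assumption made for illustration; the abstract does not specify the distance used among legend classes. The function names and the matrix are hypothetical.

```python
def linear_weights(k):
    """Agreement weights 1 - |i - j| / (k - 1): 1 on the diagonal,
    decreasing with the index distance between classes (assumed here)."""
    return [[1.0 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]

def kappa(matrix, weights=None):
    """Cohen's Kappa, or weighted Kappa if agreement weights are given.

    `matrix` may hold polygon counts or areas; it is normalised to
    proportions before the observed and chance agreement are computed.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[cell / total for cell in row] for row in matrix]
    row_marg = [sum(p[i]) for i in range(k)]
    col_marg = [sum(p[i][j] for i in range(k)) for j in range(k)]
    if weights is None:
        # Unweighted Kappa: only exact agreement counts.
        weights = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    observed = sum(weights[i][j] * p[i][j] for i in range(k) for j in range(k))
    expected = sum(weights[i][j] * row_marg[i] * col_marg[j]
                   for i in range(k) for j in range(k))
    return (observed - expected) / (1.0 - expected)

# Made-up 2x2 confusion matrix of polygon counts.
counts = [[20, 5], [10, 15]]
k_plain = kappa(counts)
```

    For this made-up matrix the observed agreement is 0.7 and the chance agreement is 0.5, so the unweighted Kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4.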
  • INFORMS: Quality Statistics and Reliability Section Panel Discussion

    Authors: Russell Barton, Irad Ben-Gal, Yu Ding, and Jan Shi
    Affiliation: Institute for Operations Research and the Management Sciences (INFORMS)
    Primary area of focus / application:
    Keywords: Quality, Reliability, Teaching
    Submitted at 23-Apr-2008 16:07 by Irad Ben-Gal
    23-Sep-2008 16:10 INFORMS: Quality Statistics and Reliability Section Panel Discussion
    The Quality Statistics and Reliability (QSR) Section is an interdisciplinary section of the Institute for Operations Research and the Management Sciences (INFORMS), comprised of members from industrial engineering, statistics, business communities, and quality/reliability practitioners from different regions. The INFORMS QSR Section will sponsor a panel discussion session during the ENBIS annual conference in September 2008. A panel of QSR affiliated researchers will discuss the organization of the QSR section as well as QSR’s research and education activities. The panel includes Professor Russell Barton (QSR advisory board member), Professor Irad Ben-Gal (QSR liaison officer), Professor Yu Ding (2008 QSR chair), and Professor Jan Shi (QSR founding chair).