ENBIS-12 in Ljubljana

9 – 13 September 2012 Abstract submission: 15 January – 10 May 2012



The following abstracts have been accepted for this event:

  • Cyborg Spring: Social/Media/Revolution

    Authors: Robert Kozinets (York University)
    Primary area of focus / application: Mining
    Keywords: social media, netnography
    Submitted at 11-Jun-2012 10:22 by Robert Kozinets
    10-Sep-2012 09:20 Netnography in a Nutshell
    Social media are here to stay, a corrosive and constructive dynamic influence that is destabilizing the worlds of business, consumption, and even nationhood. Through social media, communities transform organizations, and organizations transform them. In this wide-ranging presentation, anthropologist, marketing professor, and social media pioneer Robert Kozinets conceptualizes the new social media landscape, revealing the five major integrative trends in personal media deployment. By showing how sources of meaning making and loci of border spanning are changing, the presentation offers rich examples and many ideas about how social media and marketing are changing and will change in the next few years. The presentation illuminates paths of development for a future where media realities integrate in new and unexpected ways. In the process, Robert Kozinets also introduces the audience to netnography, a social media research method based upon the anthropological approach of ethnography.
  • On the Many Faces of Text Processing

    Authors: Marko Grobelnik (Inštitut Jožef Stefan)
    Primary area of focus / application: Mining
    Keywords: textual data, text processing
    Submitted at 11-Jun-2012 10:25 by Marko Grobelnik
    11-Sep-2012 09:00 On the Many Faces of Text Processing
    Why do people process textual data with computers? It all started many years ago with one main goal in researchers' minds: to understand text. In the meantime, the area of text processing has developed in many different directions, and the original goals were often forgotten. Funnily enough, after several decades of computerized processing of textual data, the solution to the 'text understanding' problem has not evolved much compared with other, easier and often more profitable problems (such as information retrieval/search, machine translation, or information extraction). We will touch on various aspects of text processing along several dimensions: (a) how we represent textual data, (b) what kinds of algorithms and techniques we use, and (c) what kinds of problems we solve on top of text. Finally, it is interesting to observe how various research communities deal with textual data in different ways. Most of them are still rather fragmented and do not learn enough from each other; many ideas developed within one community take too long to cross its borders.
  • Bayesian Analysis of Short-term Directional Data for Wind Potential Assessment

    Authors: Pasquale Erto (University of Naples Federico II), Antonio Lanzotti (University of Naples Federico II), Antonio Lepore (University of Naples Federico II)
    Primary area of focus / application: Modelling
    Keywords: Bayesian analysis, directional analysis, wind-farm layout, wind speed distribution
    Submitted at 12-Jun-2012 10:51 by Antonio Lepore
    11-Sep-2012 17:10 Bayesian Analysis of Short-term Directional Data for Wind Potential Assessment
    Proper and timely assessment of wind potential is required to evaluate wind-farm project viability. In fact, economic profitability is strongly influenced not only by the wind speed, which is needed to define the turbine type to be installed, but also by its direction, which is the dominant parameter in wind-farm layout design. Timely decisions have to be based on short-run anemometric data from the site under investigation. Unfortunately, such data may be poor if collected when the wind is not blowing from the prevailing direction(s). This paper proposes a Bayesian analysis that exploits prior information about the wind distribution, elicited from the consolidated data usually available at a neighboring survey station. In particular, wind direction is grouped into sectors and modeled with the multinomial distribution, and the Dirichlet distribution is chosen as the prior. The latter is set on the basis of Fisher's angular-angular association between the candidate site and the neighboring survey station. Then, the Bayesian approach proposed by Erto et al. (2010), which involves a Markov chain Monte Carlo (MCMC) method, is suitably adjusted to supply estimates of the wind speed distribution for each sector. The analysis is intended to address real problems faced by renewable energy companies, as encouragingly shown by an application to real anemometric data from a Southern Italian site.
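    The multinomial model with a Dirichlet prior on sector probabilities is conjugate, so the posterior is again Dirichlet with parameters equal to the prior parameters plus the observed sector counts. The sketch below illustrates only this update. The four sectors, the prior pseudo-counts, and the observed counts are hypothetical, and the prior is not derived from Fisher's angular-angular association as in the paper.

```python
from typing import List


def dirichlet_posterior(prior_alpha: List[float], sector_counts: List[int]) -> List[float]:
    """Conjugate update: Dirichlet(alpha) prior + multinomial counts -> Dirichlet(alpha + n)."""
    if len(prior_alpha) != len(sector_counts):
        raise ValueError("prior and counts must cover the same sectors")
    return [a + n for a, n in zip(prior_alpha, sector_counts)]


def dirichlet_mean(alpha: List[float]) -> List[float]:
    """Expected sector probabilities under a Dirichlet(alpha) distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical example with four direction sectors (N, E, S, W).
# prior_alpha stands in for information from a neighboring station (assumed values).
prior_alpha = [4.0, 2.0, 1.0, 1.0]
# Short-run counts of observations per sector at the candidate site (assumed values).
sector_counts = [3, 10, 5, 2]

post_alpha = dirichlet_posterior(prior_alpha, sector_counts)
post_mean = dirichlet_mean(post_alpha)
print(post_alpha)
print([round(p, 3) for p in post_mean])
```

    With few on-site observations the posterior mean stays close to the prior; as counts accumulate, the site data dominate.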
  • Nonparametric Control Charts: The Data Depth Approach

    Authors: Giovanni Porzio (University of Cassino), Giancarlo Ragozini (University of Naples Federico II)
    Primary area of focus / application: Process
    Keywords: control charts, processes, nonparametric statistics, simplicial depth, convex hull probability depth
    Submitted at 12-Jun-2012 11:04 by Giovanni Porzio
    11-Sep-2012 17:30 Nonparametric Control Charts: The Data Depth Approach
    A basic assumption commonly underlying the development of control charts is that the process is well described by a normal distribution. However, in practice this assumption rarely holds. In addition, when many variables must be jointly monitored, knowledge of the process distribution may be hard to obtain. Hence, any specific parametric model proves inappropriate and some alternative solutions should be adopted. This motivates the need for nonparametric control charts.
    The aim of this work is thus to provide an overview of a special type of nonparametric control chart, the data depth control chart. Data depth is a function that measures the centrality of a point with respect to a given multivariate distribution. The deepest points lie at the core of the distribution, while points with lower depths are located in the distribution tails. In the multivariate statistical process control setting, the deepest points will correspond to items of higher quality, under the assumption that the center of the process distribution is the quality target to be achieved.
    Although very attractive, the use of data depth charts has been somewhat limited by the computational effort required to implement them in practice. This appears to be a drawback at least for charts based on the simplicial depth. Alternatively, the adoption of charts based on the convex hull probability depth is encouraged.
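    To make the computational burden concrete, here is a minimal brute-force sketch of the sample simplicial depth in two dimensions: the fraction of triangles formed by triples of sample points that contain the query point. The function names and the toy sample are illustrative assumptions, not the authors' implementation. Enumerating all triples costs on the order of n^3 operations, which is the kind of effort the abstract alludes to.

```python
from itertools import combinations
from typing import Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies inside or on the boundary of triangle abc."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def simplicial_depth(p: Point, sample: Sequence[Point]) -> float:
    """Fraction of sample triangles (closed) that contain p; brute force over all triples."""
    triangles = list(combinations(sample, 3))
    if not triangles:
        raise ValueError("need at least three sample points")
    inside = sum(in_triangle(p, a, b, c) for a, b, c in triangles)
    return inside / len(triangles)


# Toy sample: the four corners of a square (assumed data).
sample = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
depth_inner = simplicial_depth((1.0, 2.0), sample)    # point inside the square
depth_outer = simplicial_depth((10.0, 10.0), sample)  # point far outside
print(depth_inner, depth_outer)
```

    Points near the core of the sample receive higher depth than points in the tails, which is the ordering a depth-based chart monitors.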
  • A Nonparametric Multivariate Location Control Chart for Angular Symmetric Distributions

    Authors: Amor Messaoud (University of Tunis), Giovanni Porzio (University of Cassino), Giancarlo Ragozini (University of Naples Federico II)
    Primary area of focus / application: Process
    Keywords: control charts, processes, nonparametric statistics, directional symmetric distribution
    Submitted at 12-Jun-2012 11:13 by Amor Messaoud
    11-Sep-2012 17:50 A Nonparametric Multivariate Location Control Chart for Angular Symmetric Distributions
    When a multivariate process distribution is unknown, one may rely on nonparametric control charts to monitor the process behavior. In such a case, a possible approach is to assume only weak features for the unknown distribution. For instance, Zou and Tsung (2011) recently assumed that the distribution is directional symmetric. Based on this hypothesis, the authors developed a nonparametric location EWMA chart exploiting a multivariate sign test statistic based on a proper corresponding median estimator.
    We highlight that the multivariate normal is a special case of a directional symmetric distribution. Hence, it seems worthwhile to develop nonparametric charts that merely assume the process distribution is directional symmetric. Furthermore, we note that directional symmetry is equivalent to half-space symmetry in the case of continuous distributions (Zuo and Serfling, 2000). Consequently, a whole class of alternative median estimators may be considered. Moreover, one may exploit alternative geometrical properties of such distributions to design a proper control chart.
    In this work, we observe that in the case of directional symmetry the widths of the angles yielded by consecutive observations in a sequence of in-control data are uniformly distributed. This allows us to design a simple angle chart that seems to be very effective in detecting shifts in the process location. In addition, we note that this chart is insensitive to changes in the process covariance structure. Consequently, the proposed chart may also be used after a signal in a nonparametric scale chart in order to discriminate between possible out-of-control causes.

    Zou, C. and Tsung, F. (2011). A Multivariate Sign EWMA Control Chart. Technometrics, 53, pp. 84-97.

    Zuo, Y. and Serfling, R. (2000). On the performance of some robust nonparametric location measures relative to a general notion of multivariate symmetry. Journal of Statistical Planning and Inference, 84, pp. 55-79.
  • Use of Targeted Bayesian Network Learning for Suspects Identification

    Authors: A. Gruber (Tel Aviv University, Department of Industrial Engineering and Management), S. Yanovski (Tel Aviv University, Department of Industrial Engineering and Management), I. Ben-Gal (Tel Aviv University, Department of Industrial Engineering and Management)
    Primary area of focus / application: Mining
    Keywords: Bayesian classifier, Targeted Bayesian Network, suspect activities, communication network
    Submitted at 18-Jun-2012 09:14 by Irad Ben-Gal
    We present a data mining application for detecting suspect activities in a communication network. The identification relies on learning and characterizing behavioral patterns from tagged data on telecommunication objects. One of the main endeavors in this arena is to extract information from meta-data only, that is, without accessing the contents of the calls or the messages. In addition to privacy issues, the motivation is driven by the technical, legal and financial procedures involved in content capture, which should be minimized.

    The application uses a model learned with the targeted Bayesian network learning (TBNL) method. The underlying principle of this method is that it attempts to best approximate the marginal probability distribution of a predetermined target variable, depending on other attribute variables within the domain. The TBNL algorithm enables efficient management of the tradeoff between the model's complexity and its classification accuracy by using information-theoretic metrics.
    Our results show that the TBNL meets the requirement of a 50% sensitivity rate (recall) with a false positive rate (FPR) of at most 1%. These results, which reflect behavioral patterns, show that the method reduces the FPR by 50% for a required sensitivity level compared with other Bayesian classifiers.
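    For readers less familiar with the two metrics quoted above, the sketch below computes sensitivity (recall) and FPR from confusion-matrix counts, along with a relative FPR reduction. All counts and the baseline FPR are hypothetical values chosen to reproduce the quoted percentages; they are not the authors' data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Recall: share of truly suspect objects that the classifier flags."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of non-suspect objects that the classifier wrongly flags."""
    return fp / (fp + tn)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative decrease of a rate with respect to a baseline."""
    return (baseline - improved) / baseline


# Hypothetical confusion-matrix counts (assumed, for illustration only).
tp, fn = 50, 50   # 100 suspect objects, half of them detected
fp, tn = 10, 990  # 1000 non-suspect objects, 10 flagged by mistake

recall = sensitivity(tp, fn)
fpr = false_positive_rate(fp, tn)
# Hypothetical baseline classifier with a 2% FPR at the same sensitivity.
reduction = relative_reduction(0.02, fpr)
print(recall, fpr, reduction)
```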