ENBIS Spring Meeting 2015

4 – 5 June 2015
Abstract submission: 1 February – 30 April 2015


The following abstracts have been accepted for this event:


    Authors: Ron Kenett (KPA Ltd.), Uri Shafrir (University of Toronto)
    Primary area of focus / application: Education & Thinking
    Keywords: Statistical Education, Massive Online Open Courses, Formative Assessment, Conceptual Understanding, Meaning Equivalent Reusable Learning Objects (MERLO)
    Submitted at 21-Feb-2015 06:37 by Ron Kenett
    Accepted (view paper)
    The talk will discuss general aspects of MOOCs, including the concept of “on air” and “on ground” activities. It will then expand on the need for a structured formative assessment activity that gives participants and instructors feedback on the level of understanding reached by the participants with respect to the key domain knowledge elements covered by the specific MOOC.
    We suggest a MOOC weekly quiz based on MERLO items in an “on air” or “on ground” format. If conducted “on ground”, it provides learners with opportunities to cooperate with their peers in small groups; to compare representations of the conceptual situation under consideration; and to present to the group alternative responses regarding equivalence-of-meaning among different representations.
    Experience supported by systematic video recordings of these small group sessions in MERLO weekly quizzes reveals enhanced student engagement and peer cooperation. Such videos often show members of a group listening intently to, and sometimes being overwhelmed by, one of their colleagues presenting a convincing point of view. This is sometimes accompanied by lively arguments among students who call attention to different aspects of meaning-equivalent representations in the MERLO assessment item being considered. These arguments resemble scenes often observed when academics are ‘talking shop’, namely, engaging in arguments about a particular representation of a conceptual situation.
    Weekly MERLO formative quizzes provide students with regular opportunities to develop habits of conceptual thinking, and demonstrate the effectiveness of pedagogy for conceptual thinking and peer cooperation in motivating and engaging students. This is particularly important in MOOCs, where weekly MERLO quizzes can provide individual students with regular classroom opportunities to cooperate in small groups and learn with and from their peers.
    Feedback of individual scores from these regular weekly formative MERLO assessments identifies ‘soft conceptual spots’ in students’ current understanding of important conceptual issues, and guides future class activities to remedy conceptual misunderstandings.
  • Bayesian estimation of complex networks and dynamic choice in the music industry

    Authors: Stefano Nasini (IESE-Business-School), Víctor Martínez-de-Albéniz (IESE-Business-School)
    Primary area of focus / application: Modelling
    Secondary area of focus / application: Business
    Keywords: Music industry, Complex networks, Panel data, Dynamic choice, Bayesian inference, MCMC
    Submitted at 3-Mar-2015 16:29 by Stefano Nasini
    Accepted (view paper)
    Panel data from the music industry are often associated with complex patterns of internal dependency between the dynamic choices of broadcasting companies (TV channels and radio stations). Dynamic complex networks are high-dimensional, combinatorial objects capable of capturing such correlation patterns. However, even when few broadcasting companies are taken into account, high dimensionality arises from the quadratic growth of their connections within the network. We present an exponential random model to jointly deal with the dynamic and structural aspects of such a complex statistical setting, along with a Bayesian estimation framework. We argue that the intractability of the normalizing constants of such probabilistic models entails a ‘double intractability’ of the posterior distribution. This drawback can be overcome by embedding the defined model into a Bayesian estimation framework and applying a specialized MCMC procedure, based on joint simulation from both the parameter and the sample spaces. After a detailed analysis of the proposed statistical methodology, we present an empirical application to a large data set of song diffusion on the radio, where stations may have pairwise spillover effects. Our dynamic model has substantial predictive capability for the music industry, and allows the pairwise dependency of the choices made by radio stations to be estimated.
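    As a generic illustration of the ‘double intractability’ mentioned in this abstract, the sketch below writes out a standard exponential-family network model and its posterior. The notation (sufficient statistics $s(y)$, prior $\pi(\theta)$, proposal $h$) is assumed for illustration and is not taken from the paper. The acceptance ratio shown is that of the exchange algorithm, one well-known scheme that simulates jointly from the parameter and sample spaces; whether the authors use this exact scheme is an assumption.

```latex
% Likelihood with an intractable normalizing constant Z(theta):
\[
  p(y \mid \theta) = \frac{q(y \mid \theta)}{Z(\theta)}, \qquad
  q(y \mid \theta) = \exp\{\theta^{\top} s(y)\}, \qquad
  Z(\theta) = \sum_{y' \in \mathcal{Y}} \exp\{\theta^{\top} s(y')\}.
\]
% Posterior: both Z(theta) and the evidence in the denominator are intractable.
\[
  \pi(\theta \mid y) =
  \frac{\pi(\theta)\, q(y \mid \theta) / Z(\theta)}
       {\int \pi(\vartheta)\, q(y \mid \vartheta) / Z(\vartheta)\, d\vartheta}.
\]
% Exchange-type move: propose theta' ~ h(. | theta), simulate auxiliary data
% y' ~ p(. | theta'), and accept with probability
\[
  \alpha = \min\left\{ 1,\;
  \frac{\pi(\theta')\, q(y \mid \theta')\, q(y' \mid \theta)\, h(\theta \mid \theta')}
       {\pi(\theta)\, q(y \mid \theta)\, q(y' \mid \theta')\, h(\theta' \mid \theta)}
  \right\},
\]
% in which the ratio Z(theta) / Z(theta') cancels.
```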
  • Big Data - My view

    Authors: Jonathan Smyth-Renshaw (Jonathan Smyth-Renshaw & Associates Ltd)
    Primary area of focus / application: Business
    Keywords: Bayes, Models, Population v sample, Case studies
    Submitted at 5-Mar-2015 17:55 by Jonathan Smyth-Renshaw
    Accepted (view paper)
    'For big data to be useful it must be seen to serve society as a whole and we should help (ENBIS)'

    The concept of big data is a 'new area' for all aspects of business and society in general. The use and analysis of data is not new; remember the last try: "There are three kinds of lies: lies, damned lies and statistics." (Mark Twain). Therefore, does big data present a new beginning for statisticians and data analysts? I wish to share my view on big data and propose some ideas and models presenting positive and negative aspects of big data. I wish to examine the following areas:

    A model for big data

    Are models tailored to the individual as a result of big data?

    Can graphical methods engage lay users to make sense of big data?

    Can Bayesian Thinking play a bigger role?

    Use of probability trees to understand the use of big data?

    Population v samples? (Much statistical thinking is based on sampling theory)

    Understanding target/variation from big data

    If there is time I also wish to examine a couple of case studies/examples where 'expected models' seem to have failed to make the available data more understandable and failed the individuals involved.
  • Monetising company big data

    Authors: Shirley Coleman (ISRU, Newcastle University), Sophie Whitfield (ISRU, Newcastle University), Joanna Berry (NUBS, Newcastle University), Garth Johnson (ADL Smartcare Ltd), Peter Gore (ADL Smartcare Ltd)
    Primary area of focus / application: Business
    Secondary area of focus / application: Mining
    Keywords: new-product-development, SME, assisted living, scoring
    Submitted at 9-Mar-2015 21:52 by Shirley Coleman
    It is common knowledge that most companies have enormous amounts of data in their archives and at their disposal. Increasingly, companies are aware that the data contain vivid pointers for business advantage. In short, their company data can be monetised; gaps in the market can be uncovered for mutual benefit of customers, companies and suppliers.
    As people live longer, businesses and research communities are looking for ways to extend our period of self-sufficiency and improve quality of life using equipment for assisted living. Newcastle University are working in a Knowledge Transfer Partnership with ADL Smartcare Ltd, a small to medium enterprise (SME) concerned with activities of daily living. There is a wide choice of assistive living equipment. The characteristics of different people's needs have been collected through an expert system that elicits a person's characteristics and requirements through a self-completed online questionnaire covering both their physical environment and personal functional needs. Items which satisfy all the environmental, personal and functional criteria are identified; a massive dataset of needs has been collected over a ten-year period. Monetisation of such big data presents exciting business opportunities and also a way to address the challenges of changing income streams and societal focus.
    This case study describes the monetisation of many millions of searches conducted by many thousands of people looking for help with activities of daily living. The data reveal the prevalence of different types of problems, which problems are poorly accommodated, which products are most often recommended and what opportunities there are for new product development. We will review some of the problematic aspects of the data and the various difficulties that we have encountered, and present some successful business examples and our evolving guidelines of good practice for monetising company big data.
  • Intelligent energy tips applied to a massive data set of household electricity consumption using clustering techniques

    Authors: Gerard Mor (CIMNE-International Center of Numerical Methods in Engineering), Xavier Cipriano (CIMNE-International Center of Numerical Methods in Engineering), Jordi Cipriano (CIMNE-International Center of Numerical Methods in Engineering)
    Primary area of focus / application: Business
    Secondary area of focus / application: Mining
    Keywords: Clustering, electricity markets, customer classes, power electricity
    Submitted at 11-Mar-2015 10:27 by Xavier Cipriano
    Accepted (view paper)
    This paper summarizes a method to generate intelligent tips about potential electricity savings for a large number of household customers on the basis of their electricity consumption behaviour. Starting from extensive field-measured electricity consumption time series for 7909 customers, covering 12 months at hourly frequency, we searched for groups of similar users according to: the mean relative energy consumption in 7 periods of the day, the weather dependency of the electricity consumption, and the variance of the hourly power load factor of each household customer. The process yielded 13 user clusters, obtained with a combination of two clustering methodologies, self-organizing maps (SOM) and K-means. In parallel to the clustering process, a set of 100 electricity-saving tips is defined and weighted based on 10 variables. Tips are assigned to each customer by considering the user's score, calculated over the same 10 variables within its cluster, and the corresponding weight vector of each tip. In order to avoid repetition and to ensure a minimum of user motivation, the tips are recalculated every month and only the 5 highest-scoring tips are delivered each time. Application of the method to users in their real operating environment will start within the next months, and the energy savings achieved will be measured. Some initial validation of the coherence of the generated tips has already been performed on the available time series data.
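    The tip-assignment step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a user's scores and a tip's weight vector are combined, so a plain dot product is assumed here, and the function name `rank_tips` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_tips(user_scores: Sequence[float],
              tip_weights: Sequence[Sequence[float]],
              k: int = 5) -> List[Tuple[int, float]]:
    """Return the k best (tip_index, score) pairs for one customer.

    user_scores: the customer's score on each variable (10 in the abstract),
                 assumed to be already computed relative to its cluster.
    tip_weights: one weight vector per tip, same length as user_scores.
    ASSUMPTION: a tip's score is the dot product of the two vectors.
    """
    scored = []
    for tip_id, weights in enumerate(tip_weights):
        if len(weights) != len(user_scores):
            raise ValueError("weight vector and user score vector differ in length")
        score = sum(w * s for w, s in zip(weights, user_scores))
        scored.append((tip_id, score))
    # Highest score first; ties go to the lower tip index.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy usage with 3 variables and 4 tips (made-up numbers, not from the study).
example_user = [1.0, 0.0, 2.0]
example_tips = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
best_two = rank_tips(example_user, example_tips, k=2)
```

    In the setting of the abstract, `k` would be 5 and the ranking would be recomputed each month; how repetition across months is avoided is not specified in the source and is not modelled here.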
  • How to produce predictive models on an assembly line

    Authors: Andrea Ahlemeyer-Stubbe (ahlemeyer-stubbe)
    Primary area of focus / application: Mining
    Secondary area of focus / application: Business
    Keywords: predictive modeling, automation, Big Data, development speed
    Submitted at 11-Mar-2015 19:36 by Andrea Ahlemeyer-Stubbe
    To detect fast changes in customer behavior, or to react in as focused a manner as possible, predictive modeling must be of good quality to give effective predictions of customer behavior, and it must be done fast to be relevant from a business perspective. Modeling speed is of great importance in industry, as time is a crucial factor. This requires a different technical setup for model development, one that meets both needs: quality and development speed. Today most companies like to develop their models individually with the help of specialists. But for many companies this approach takes too long; even though the models are excellent, the time needed to develop them sometimes cancels out the advantages of a better prediction. This article describes the general structure of, and ideas on how to implement, industry-focused model production that will help companies react quickly to changing behavior. We will discuss the key success factors and the pitfalls of this assembly-line model production.