ENBIS Spring Meeting 2015

4 – 5 June 2015
Abstract submission: 1 February – 30 April 2015

The programme of the meeting is as follows:

  • Thursday, June 4th in the morning: keynotes followed by an open discussion
  • Thursday, June 4th in the afternoon: oral presentations
  • Friday, June 5th in the morning: keynotes followed by an open discussion
  • Friday, June 5th in the afternoon: optional workshop on predictive analytics with JMP

Thursday and Friday mornings will be devoted to keynotes followed by an open discussion. An initial list of keynote speakers includes Chris Gotwalt (JMP), Frode Huse Gjendem (Accenture), Geoff Vining (VirginiaTech), Tony Greenfield (Greenfield Research) and Petra Perner (ibai).

Robert Gimeno, Chris Gotwalt, Petra Perner, Kristof Mertens, Geoff Vining, Tony Greenfield

Thursday afternoon will accommodate oral presentations, so we cordially invite you to prepare and submit an abstract for the meeting.

On Friday afternoon, a workshop titled "Effective Statistical Modelling with Messy Data", led by Volker Kraft, PhD, JMP Academic Ambassador, will take place. You can register for this workshop here.

Keynote presentations

 

Robert Gimeno, Accenture
"Journeys in Analytics": how companies are approaching analytics and what makes their journey successful

Most companies' leadership teams are developing Analytics roadmaps, or at least claim to have a vision for Analytics. They understand that analytics can provide a competitive advantage by increasing market share or reducing costs… But how are the successful companies addressing Analytics? In this presentation we will explore how successful companies approach Big Data & Analytics in the Supply Chain/Operations Function. We will look at how they address the complete chain from Issue to Outcome and the success factors that accompany their investments. These include, for example: Choosing the right level of Analytics, Organizational Adaptation, Technical decisions, Information System approach, Value of Visualization…
We will also illustrate some of the success factors through practical case studies, and we will round off the presentation with a vision of where the usage of analytics is going.


Chris Gotwalt, JMP Division, SAS Institute
Penalized Regression – A General Regression Framework for Variable Selection

In the last decade there has been a tremendous amount of statistical research into regression methods that incorporate variations of a sum of absolute deviations (l1) penalty into the estimation procedure. An important consequence of the geometry of the l1 penalty is that the model fitting algorithm simultaneously does parameter estimation and model selection. In this presentation, we will introduce and motivate a variety of l1 penalized regression methods, including the Lasso, Adaptive Lasso, and Elastic Net, and show their connection to more established procedures such as Forward Selection and Ridge Regression. We will provide a brief overview of their statistical properties and related inferential procedures. We will then show how these methods can be used to analyze a diverse array of applications, including both designed experiments and observational data. We will also show how these procedures can be combined with some simple design matrix augmentation tricks to yield effective procedures for outlier and change point detection. A portion of the presentation will be a live software demonstration of penalized regression methods using JMP Pro, which features a highly interactive and easy-to-use interface, making these modern regression methods easily accessible to a wide audience.
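
As a rough illustration of how an l1 penalty can estimate and select at the same time, here is a minimal sketch. It is not taken from the talk and does not use JMP Pro. It assumes the special case of an orthonormal design, where the Lasso estimate of each coefficient is the least-squares estimate pulled towards zero by a fixed amount and set exactly to zero if it is small, while Ridge only rescales it. The least-squares values, the penalty weight `lam` and the function names are all hypothetical.

```python
def soft_threshold(z, lam):
    """Lasso estimate of one coefficient, assuming an orthonormal design.

    Minimises 0.5 * (z - b)**2 + lam * abs(b) over b.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small effects are removed from the model entirely


def ridge_shrink(z, lam):
    """Ridge estimate of one coefficient under the same assumption.

    Minimises 0.5 * (z - b)**2 + 0.5 * lam * b**2 over b.
    """
    return z / (1.0 + lam)


# Hypothetical least-squares estimates for four predictors.
ols = [3.0, -0.4, 0.2, -1.5]
lam = 0.5  # hypothetical penalty weight

lasso = [soft_threshold(z, lam) for z in ols]
ridge = [ridge_shrink(z, lam) for z in ols]

# Predictors the Lasso keeps in the model (non-zero coefficients).
selected = [j for j, b in enumerate(lasso) if b != 0.0]

print("lasso:", lasso)
print("ridge:", ridge)
print("selected predictors:", selected)
```

In this sketch the Lasso drops the two predictors whose least-squares estimates are smaller in magnitude than `lam`, whereas Ridge keeps all four with reduced size. For a general, non-orthonormal design the Lasso has no such closed form and is fitted iteratively.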


Petra Perner, Institute of Computer Vision and applied Computer Sciences IBaI
Complementarities and Differences between Machine Learning and Data Mining and Statistics in Analytics and Big Data

The analysis of complex data has been studied for decades by the Machine Learning and Data Mining Community and the Statistics Community. Substantial effort has been spent on developing new and effective analysis methods for different data types, ranging from numerical and symbolic data to multimedia data. These methods have been applied in fields such as medicine, marketing, process control and many others, where the results often lead to a large return on investment for the application experts.

Analytics and Big Data now pose major challenges for further research, because distributed mass data of different types have to be processed, often incrementally. This raises issues such as data handling, cloud computing, incremental data analysis, outlier detection and reporting of the results in an understandable manner.

In the talk, we will discuss the complementarities and the differences between Machine Learning and Data Mining and Statistics in Analytics and Big Data.


Kristof Mertens, Porphyrio
Industrial statistics in modern commercial livestock production – opportunities and challenges

Modern livestock production operations of poultry, pigs and cows are typically increasing in size and are highly automated. With the automation, modern livestock farms are equipped with an increasing number of sensors that register all important technical and production parameters (climate, water consumption, feed consumption, production, growth, etc.) on a regular basis. These sensor data contain important (in)direct information about the physiological state and welfare of the animals. Hence they are a source for a farm manager to understand how the animals are doing.

Turning these sensor data into actionable management information is very challenging. Livestock production processes are complex: they are typically non-stationary with a high degree of interaction and variability, within and between the animals.

This paper presents the application of industrial statistics techniques to these biological production processes in a commercial context.


Geoff Vining, Department of Statistics, VirginiaTech & Tony Greenfield, Greenfield Research
Predictive Analytics with Big and Complex Data:  Contrarian Views

Predictive analytics is a bold and exciting field within statistical analysis.  Many extremely enthusiastic people embrace this area as the future for data analysis.  This enthusiasm does lead to some interesting, in some cases unrealistic, claims about the power of predictive analytics to solve real problems of interest.

Developing techniques to deal with large, even massive, data sets is not new.  Computational ability simply expands the size of the data sets that people may study.  Clearly, people are developing many interesting computational approaches for dealing with the very serious issues that massive data sets provide.  However, the fundamental issues are much the same as they were twenty-five years ago.  In fact, one can argue that Fisher faced similar issues when he developed factorial experiments.

This talk articulates very serious statistical issues that predictive analytics must face. In reality, predictive analytics deals almost exclusively with observational studies, many of which involve historical data. It is vital to understand the limitations observational data place on the analysis and, more importantly, on the interpretation of the results. This talk raises the issue of “information rich” versus “information poor” data. This talk then suggests some approaches for tackling information rich observational data sets.