Advanced Data Mining with Weka

This course follows on from Data Mining with Weka and More Data Mining with Weka. It provides a deeper account of specialized data mining tools and techniques. Again, the emphasis is on principles and practical data mining using Weka, rather than mathematical theory or advanced details of particular algorithms. Students will analyse time series data, mine data streams, use Weka to access other data mining packages including the popular R statistical computing language, script Weka in Python, and deploy it within a cluster computing framework. The course also includes case studies of applications such as classifying tweets, analysing functional MRI data, classifying images, and predicting signal peptides.

A new session starts on 10 July 2017, and the course will run in self-paced, unsupported mode until 2 October, whereupon Statements of Completion will be produced and mailed out. Students should have completed Data Mining with Weka and More Data Mining with Weka, or have equivalent knowledge of the subject.

The course features:

Subscribe to the Announcements forum for updates and reminders.

Please read the Terms of Service and Participant Information Sheet before registering.

The course is given by the data mining group in the Department of Computer Science, University of Waikato:

  • Ian Witten, Mark Hall, Peter Reutemann, Eibe Frank, Albert Bifet, Pamela Douglas, Geoff Holmes, Mike Mayo, Bernhard Pfahringer, Tony Smith.


  • Pre-course survey

  • Class 1 - Time series forecasting

  • Class 2 - Data stream mining with Weka and MOA

  • Mid-course assessment

  • Class 3 - Interfacing to R and other data mining packages

  • Class 4 - Distributed processing with Apache Spark

  • Class 5 - Scripting Weka

  • Post-course assessment

  • Post-course survey