Developed by the Data Mining Group (DMG), an independent, vendor-led committee, PMML provides an open standard for representing data mining models. Models can therefore be shared easily between different applications, avoiding proprietary issues and incompatibilities. All major commercial and open source data mining tools already support PMML.

PMML is an XML-based language that follows an intuitive structure to describe data pre- and post-processing as well as predictive algorithms. Not only does PMML represent a wide range of statistical techniques, but it can also represent the input data and the transformations needed to turn raw data into meaningful features.
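To make this structure concrete, here is a minimal sketch in Python that reads a small PMML document with the standard library. The PMML fragment is hand-written for illustration and is not one of the example files; the field names, the coefficients, and the helper functions `summarize` and `predict` are assumptions. It handles only a simple regression model.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML 3.2 fragment (illustration only).
PMML_TEXT = """<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2">
  <Header description="Illustrative linear regression"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel modelName="example" functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# Namespace assumed for PMML 3.2 documents.
NS = {"pmml": "http://www.dmg.org/PMML-3_2"}


def summarize(pmml_text):
    """Extract the version, input fields, and regression terms."""
    root = ET.fromstring(pmml_text)
    fields = [
        f.get("name")
        for f in root.findall("pmml:DataDictionary/pmml:DataField", NS)
    ]
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return {
        "version": root.get("version"),
        "fields": fields,
        "intercept": float(table.get("intercept")),
        "coefficients": coefficients,
    }


def predict(summary, row):
    """Evaluate intercept + sum(coefficient * value) for one input row."""
    return summary["intercept"] + sum(
        coef * row[name] for name, coef in summary["coefficients"].items()
    )
```

Calling `predict(summarize(PMML_TEXT), {"x": 3.0})` evaluates 1.5 + 2.0 * 3.0. A real scoring engine does considerably more, such as handling missing values and transformations, so this only shows how the data dictionary and the model are laid out in the file.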

As part of the Data Mining Group, Zementis is committed to the continual development of PMML. It is our vision for the community that users will be free to share models among many solutions, benefiting from an environment in which interoperability is truly attainable.

You can use the PMML converter to validate your PMML file against the specification for versions 2.1, 3.0, 3.1, and 3.2. If validation fails, the converter returns a file explaining why (click on the "details" link).

Before conversion takes place, the validation phase must succeed; that is, your file must conform to the PMML specification as published by the DMG for one of the PMML versions listed above.
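Before uploading a file, a quick local check of the declared version can save a round trip. The sketch below is not the converter's validation, which checks the file against the full specification. It only reads the `version` attribute on the root element and compares it with the versions listed above. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Source versions the converter accepts, as listed above.
SUPPORTED_SOURCE_VERSIONS = {"2.1", "3.0", "3.1", "3.2"}


def pmml_version(pmml_text):
    """Return the version attribute of the root <PMML> element."""
    root = ET.fromstring(pmml_text)
    # Drop a namespace prefix such as "{http://www.dmg.org/PMML-3_1}PMML".
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "PMML":
        raise ValueError(f"root element is <{local_name}>, expected <PMML>")
    return root.get("version")


def is_convertible_version(pmml_text):
    """True if the declared version is one the converter accepts."""
    return pmml_version(pmml_text) in SUPPORTED_SOURCE_VERSIONS
```

A file that passes this check can still fail the converter's validation, for example because of missing required elements or vendor-specific extensions.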

The PMML converter currently converts the following model elements to PMML 3.2:

Association Rules
Clustering Models
Decision Trees
General Regression Models
Naive Bayes Classifiers
Neural Networks
Regression Models
Support Vector Machines

It will also convert pre- and post-processing PMML elements.

For more information on how to use the converter, please watch our VIDEO TUTORIAL.

Please note that our converters were built to the best of our knowledge to make the transition from previous PMML versions to version 3.2 as transparent as possible. However, please make sure to validate the resulting file in ADAPA after conversion by comparing expected and computed results.

If problems arise, e.g., due to various custom PMML extensions used by other vendors, please do not hesitate to contact us. We are happy to work with you to extend our converters and are planning to release the code under an open source license in the future.

PMML Examples

To experiment with the PMML example files, please follow these steps:

1.  Save the model (.XML) and data file (.CSV) to your local computer.
2.  Upload a model file in ADAPA.
3.  Validate the model by executing the respective data file.
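The comparison of expected and computed results in step 3 can also be reproduced offline. The following is a minimal sketch, assuming the scored results are available as CSV text with one column of expected values and one of computed values. The column names, the sample data, and the tolerance are hypothetical and do not describe ADAPA's actual output format.

```python
import csv
import io
import math


def find_mismatches(csv_text, expected_col, computed_col, tol=1e-6):
    """Return (line_number, expected, computed) for rows that disagree."""
    mismatches = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        expected = float(row[expected_col])
        computed = float(row[computed_col])
        if not math.isclose(expected, computed, rel_tol=0.0, abs_tol=tol):
            mismatches.append((line_no, expected, computed))
    return mismatches


# Hypothetical sample: the last row differs by more than the tolerance.
SAMPLE = """x,expected_y,computed_y
1.0,3.5,3.5
2.0,5.5,5.5000001
3.0,7.5,7.9
"""

if __name__ == "__main__":
    for line_no, expected, computed in find_mismatches(
        SAMPLE, "expected_y", "computed_y"
    ):
        print(f"line {line_no}: expected {expected}, computed {computed}")
```

An absolute tolerance is used here because small floating-point differences between scoring engines are common. For categorical predictions, an exact string comparison would be the more natural choice.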

The examples we provide are based on publicly available datasets. The DMG publishes a list of PMML sample models which inspired our collection of PMML 3.2 examples.

For more information on the Iris and El Nino datasets, please refer to: Asuncion, A. & Newman, D.J. (2007). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

The audit dataset is available through the R rattle package. For more information on rattle, please refer to http://rattle.togaware.com.

The shuttle O-ring data is based on the number of O-ring failures for each shuttle flight preceding the Challenger disaster. The cause of the explosion was determined to be an O-ring failure in the right solid rocket booster. The Challenger disaster has become a case study in the possible consequences of poor data analysis.

PMML Community Forum

To follow the ongoing discussion and read the latest PMML news, we invite you to join the PMML group on LinkedIn or the discussion forum in the PMML group on Analytic Bridge, a social network community for analytics professionals.

PMML Links

We have compiled a list of useful PMML links below. Please make sure to check them out if you would like to become a PMML pro.