Packages

package org.pmml4s

Core PMML4S functionality: a PMML (Predictive Model Markup Language) scoring library in Scala. PMML is the leading standard for statistical and data mining models and is supported by over 20 vendors and organizations. With PMML, a model can be developed on one system using one application and deployed on another system using a different application, simply by transmitting an XML configuration file.

See also:

http://dmg.org/ for details about PMML

Place for all extensions from well-known vendors.

PMML is a standard for XML documents which express trained instances of analytic models. The following classes of model are addressed:

At various places the mining models use simple functions in order to map user data to values that are easier to use in the specific model. For example, neural networks internally work with numbers, usually in the range from 0 to 1. Numeric input data are mapped to the range [0..1], and categorical fields are mapped to series of 0/1 indicators.
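
As a rough illustration of the mapping described above, the following sketch rescales a numeric field to [0, 1] and expands a categorical field into 0/1 indicators. The helpers `minMax` and `indicators` are hypothetical names introduced here, not part of the PMML4S API, and the min-max formula with clamping is only one assumed way of doing the rescaling.

```scala
object InputMappingSketch {
  // Hypothetical helper: linearly rescale x from [lo, hi] to [0, 1].
  // Values outside [lo, hi] are clamped to the nearest bound (an assumption of this sketch).
  def minMax(x: Double, lo: Double, hi: Double): Double = {
    require(hi > lo, "hi must be greater than lo")
    math.min(1.0, math.max(0.0, (x - lo) / (hi - lo)))
  }

  // Hypothetical helper: one 0/1 indicator per known category, in the order given.
  def indicators(value: String, categories: Seq[String]): Seq[Double] =
    categories.map(c => if (c == value) 1.0 else 0.0)

  def main(args: Array[String]): Unit = {
    // A numeric field "age" assumed to range from 20 to 60.
    println(minMax(30.0, 20.0, 60.0)) // 0.25
    // A categorical field "color" with three assumed categories.
    println(indicators("red", Seq("blue", "green", "red")))
  }
}
```

An unknown category yields all zeros in this sketch; a real model may instead treat it as a missing or invalid value.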

PMML defines various kinds of simple data transformations:

  • Normalization: map values to numbers; the input can be continuous or discrete.
  • Discretization: map continuous values to discrete values.
  • Value mapping: map discrete values to discrete values.
  • Text indexing: derive a frequency-based value for a given term.
  • Functions: derive a value by applying a function to one or more parameters.
  • Aggregation: summarize or collect groups of values, e.g., compute average.
  • Lag: use a previous value of the given input field.