root
Packages
Core PMML4S functionality. PMML4S is a PMML (Predictive Model Markup Language) scoring library written in Scala. PMML is the leading standard for statistical and data mining models and is supported by over 20 vendors and organizations. With PMML, a model can be developed on one system using one application and deployed on another system using a different application, simply by transmitting an XML configuration file.
- See also: http://dmg.org/ for details about PMML
Extensions from well-known vendors.
PMML is a standard for XML documents which express trained instances of analytic models. The following classes of model are addressed:
- Association Rules, implemented by org.pmml4s.model.AssociationModel
- Baseline Models, NOT IMPLEMENTED
- Bayesian Network, NOT IMPLEMENTED
- Center-Based & Distribution-Based Clustering, implemented by org.pmml4s.model.ClusteringModel
- Gaussian Process, NOT IMPLEMENTED
- General Regression, implemented by org.pmml4s.model.GeneralRegressionModel
- k-Nearest Neighbors, implemented by org.pmml4s.model.NearestNeighborModel
- Naive Bayes, implemented by org.pmml4s.model.NaiveBayesModel
- Neural Networks, implemented by org.pmml4s.model.NeuralNetwork
- Regression, implemented by org.pmml4s.model.RegressionModel
- Ruleset, implemented by org.pmml4s.model.RuleSetModel
- Scorecard, implemented by org.pmml4s.model.Scorecard
- Sequences, NOT IMPLEMENTED
- Text, NOT IMPLEMENTED
- Time Series, NOT IMPLEMENTED
- Decision Trees, implemented by org.pmml4s.model.TreeModel
- Support Vector Machine, implemented by org.pmml4s.model.SupportVectorMachineModel
At various places the mining models use simple functions in order to map user data to values that are easier to use in the specific model. For example, neural networks internally work with numbers, usually in the range from 0 to 1. Numeric input data are mapped to the range [0..1], and categorical fields are mapped to series of 0/1 indicators.
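As a rough illustration of the neural network case described above, the following sketch rescales a numeric input to the range 0 to 1 and expands a categorical input into 0/1 indicators. The object and method names (`InputEncoding`, `minMax`, `indicators`) are hypothetical and are not part of the PMML4S API; the treatment of unseen categories (all zeros) is likewise an assumption.

```scala
// Hypothetical helpers for illustration only; not PMML4S classes.
object InputEncoding {

  /** Linearly rescales `x` from [min, max] to [0, 1]. Assumes min < max. */
  def minMax(x: Double, min: Double, max: Double): Double = {
    require(min < max, "min must be strictly less than max")
    (x - min) / (max - min)
  }

  /** Produces one 0/1 indicator per known category.
    * A value that matches no category yields all zeros (an assumption).
    */
  def indicators(value: String, categories: Seq[String]): Seq[Double] =
    categories.map(c => if (c == value) 1.0 else 0.0)
}
```

For example, `InputEncoding.minMax(5.0, 0.0, 10.0)` gives `0.5`, and `InputEncoding.indicators("b", Seq("a", "b", "c"))` gives the indicators `0.0, 1.0, 0.0`.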
PMML defines various kinds of simple data transformations:
- Normalization: map values to numbers; the input can be continuous or discrete.
- Discretization: map continuous values to discrete values.
- Value mapping: map discrete values to discrete values.
- Text Indexing: derive a frequency-based value for a given term.
- Functions: derive a value by applying a function to one or more parameters.
- Aggregation: summarize or collect groups of values, e.g., compute average.
- Lag: use a previous value of the given input field.
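Several of the transformations listed above can be sketched in plain Scala as follows. This is a minimal illustration under stated assumptions, not the way PMML4S implements them: the object `SimpleTransformations` and all of its method names are hypothetical, normalization is shown as piecewise-linear interpolation with out-of-range inputs clamped to the nearest end point, and discretization bins are assumed to be half-open intervals.

```scala
// Hypothetical helpers for illustration only; not PMML4S classes.
object SimpleTransformations {

  /** Normalization: piecewise-linear interpolation through (orig, norm) points.
    * Assumes `points` is sorted by strictly increasing `orig`.
    * Inputs outside the covered range are clamped to the nearest end point.
    */
  def normalize(x: Double, points: Seq[(Double, Double)]): Double = {
    require(points.size >= 2, "at least two points are needed")
    if (x <= points.head._1) points.head._2
    else if (x >= points.last._1) points.last._2
    else
      points.zip(points.tail).collectFirst {
        case ((x0, y0), (x1, y1)) if x <= x1 =>
          y0 + (x - x0) / (x1 - x0) * (y1 - y0)
      }.getOrElse(points.last._2)
  }

  /** Discretization: returns the label of the first bin [lo, hi) containing `x`. */
  def discretize(x: Double, bins: Seq[(Double, Double, String)]): Option[String] =
    bins.collectFirst { case (lo, hi, label) if x >= lo && x < hi => label }

  /** Value mapping: looks up a discrete value, falling back to `default`. */
  def mapValue(v: String, table: Map[String, String], default: String): String =
    table.getOrElse(v, default)

  /** Aggregation: average of a group of values; None for an empty group. */
  def average(xs: Seq[Double]): Option[Double] =
    if (xs.isEmpty) None else Some(xs.sum / xs.size)

  /** Lag: the value `n` steps before the most recent one in `history`. */
  def lag(history: Seq[Double], n: Int = 1): Option[Double] = {
    require(n >= 1, "n must be at least 1")
    history.lift(history.size - 1 - n)
  }
}
```

For example, `SimpleTransformations.discretize(15.0, Seq((0.0, 10.0, "low"), (10.0, 20.0, "high")))` returns `Some("high")`, and `SimpleTransformations.lag(Seq(1.0, 2.0, 3.0))` returns `Some(2.0)`.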