Apply defines the application of a function. The function itself is identified by name through the function attribute. The actual parameters of the function application are given in the content of the element. Each actual argument is given by an EXPRESSION and is mapped by position to the formal parameters in the corresponding function definition.
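The positional binding described above can be sketched as follows. This is a minimal illustration, not pmml4s code: the FUNCTIONS registry, the formal parameter names, and the apply helper are hypothetical names introduced only for this example.

```python
# Hypothetical registry: function name -> (formal parameter names, implementation).
FUNCTIONS = {
    "pow": (("base", "exponent"), lambda base, exponent: base ** exponent),
    "-": (("left", "right"), lambda left, right: left - right),
}


def apply(function, *arguments):
    """Bind actual arguments to formal parameters by position, then evaluate."""
    formals, implementation = FUNCTIONS[function]
    if len(arguments) != len(formals):
        raise ValueError(
            f"{function} expects {len(formals)} arguments, got {len(arguments)}"
        )
    # The i-th actual argument is bound to the i-th formal parameter.
    bound = dict(zip(formals, arguments))
    return implementation(**bound)


# Nested applications: each argument is itself an expression.
result = apply("-", apply("pow", 2, 3), 1)
```

Because binding is positional, swapping the arguments of "-" changes the result, which is why the order of child elements inside Apply matters.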
Constant values can be used in expressions which have multiple arguments. The actual value of a constant is given by the content of the element. For example, <Constant>1.05</Constant> represents the number 1.05. The dataType of Constant can be optionally specified.
Defines new (user-defined) functions as variations or compositions of existing functions or transformations. The function's name must be unique and must not conflict with other function names, either defined by PMML or other user-defined functions. The EXPRESSION in the content of DefineFunction is the function body that actually defines the meaning of the new function. The function body must not refer to fields other than the parameter fields.
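The two constraints above (a unique name, and a body that sees only its parameter fields) might be modelled roughly like this. The sketch assumes a tiny stand-in for the built-in functions; BUILT_INS, USER_FUNCTIONS, and define_function are hypothetical names, not part of PMML or pmml4s.

```python
# Hypothetical stand-ins for two built-in string functions.
BUILT_INS = {"uppercase": str.upper, "trimBlanks": str.strip}
USER_FUNCTIONS = {}


def define_function(name, parameters, body):
    """Register a user-defined function, rejecting any name clash."""
    if name in BUILT_INS or name in USER_FUNCTIONS:
        raise ValueError(f"function name already in use: {name}")

    def function(*arguments):
        if len(arguments) != len(parameters):
            raise ValueError(f"{name} expects {len(parameters)} arguments")
        # The body receives only the parameter fields, never other fields.
        return body(dict(zip(parameters, arguments)))

    USER_FUNCTIONS[name] = function
    return function


# A composition of existing functions: trim blanks, then uppercase.
normalize = define_function(
    "normalize",
    ["text"],
    lambda params: BUILT_INS["uppercase"](BUILT_INS["trimBlanks"](params["text"])),
)
```

Passing the body a dictionary of parameter values is one simple way to guarantee it cannot read any field outside its own parameter list.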
Provides a common element for the various mappings. Derived fields can also appear at several places in the definition of specific models such as neural network or Naive Bayes models. Transformed fields have a name so that statistics and the model can refer to these fields.
Discretization of numerical input fields is a mapping from continuous to discrete values using intervals.
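As an illustration of interval-based discretization, here is a minimal sketch. It assumes half-open intervals [left, right) and passes missing input through as missing; the bin labels, the BINS table, and the discretize helper are hypothetical and chosen only for the example.

```python
import math

# Hypothetical bins: (left margin, right margin, discrete value), half-open [left, right).
BINS = [
    (-math.inf, 0.0, "negative"),
    (0.0, 100.0, "low"),
    (100.0, math.inf, "high"),
]


def discretize(value, bins=BINS, default=None):
    """Map a continuous value to the label of the first interval containing it."""
    if value is None:
        return None  # assumption: a missing input stays missing
    for left, right, label in bins:
        if left <= value < right:
            return label
    return default  # no interval matched


labels = [discretize(v) for v in (-5.0, 0.0, 99.9, 100.0)]
```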
Trait of Expression that defines how the values of the new field are computed.
Field references are simply pass-throughs to fields previously defined in the DataDictionary, a DerivedField, or a result field. For example, they are used in clustering models in order to define center coordinates for fields that don't need further normalization.
A missing input will produce a missing result. The optional attribute mapMissingTo may be used to map a missing result to the value specified by the attribute. If the attribute is not present, the result remains missing.
LocalTransformations holds derived fields that are local to the model.
Any discrete value can be mapped to any possibly different discrete value by listing the pairs of values. This list is implemented by a table, so it can be given inline by a sequence of XML markups or by a reference to an external table.
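A value mapping of this kind amounts to a table lookup. The sketch below assumes the table is given inline as a list of (input, output) pairs; the default and map_missing_to parameters are assumptions added for the example and are not described in the text above.

```python
def map_values(value, pairs, default=None, map_missing_to=None):
    """Look up value in a table given as (input, output) pairs."""
    if value is None:
        return map_missing_to  # assumed handling of a missing input
    table = dict(pairs)
    return table.get(value, default)  # assumed fallback when no pair matches


# Inline table: each row pairs an input value with its replacement.
GENDER = [("m", "male"), ("f", "female")]
mapped = [map_values(v, GENDER) for v in ("m", "f", "x")]
```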
Normalization provides a basic framework for mapping input values to specific value ranges, usually the numeric range [0 .. 1]. Normalization is used, e.g., in neural networks and clustering models.
Defines how to normalize an input field by piecewise linear interpolation. The mapMissingTo attribute defines the value the output is to take if the input is missing. If the mapMissingTo attribute is not specified, then missing input values produce a missing result.
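Piecewise linear interpolation with the missing-value rule above can be sketched as follows. The sketch assumes the control points are (original, normalized) pairs sorted by strictly increasing original value, and that inputs outside the covered range are extrapolated along the nearest segment; that outlier rule and the norm_continuous name are assumptions for this example.

```python
from bisect import bisect_right


def norm_continuous(value, points, map_missing_to=None):
    """Interpolate linearly between (orig, norm) control points.

    points must hold at least two pairs sorted by strictly increasing orig.
    """
    if value is None:
        return map_missing_to  # stays missing (None) when not specified
    origs = [orig for orig, _ in points]
    # Left end of the segment; clamped so outliers reuse the first or last segment.
    i = min(max(bisect_right(origs, value) - 1, 0), len(points) - 2)
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (value - x0) * (y1 - y0) / (x1 - x0)


POINTS = [(0.0, 0.0), (10.0, 0.5), (100.0, 1.0)]
normalized = [norm_continuous(v, POINTS) for v in (5.0, 10.0, 55.0, 100.0)]
```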
Encode string values into numeric values in order to perform mathematical computations. For example, regression and neural network models often split categorical and ordinal fields into multiple dummy fields. This kind of normalization is supported in PMML by the element NormDiscrete.
An element (f, v) defines that the unit has value 1.0 if the value of input field f is v, otherwise it is 0.
The set of NormDiscrete instances that refer to a given input field defines a fan-out function that maps a single input field to a set of normalized fields.
If the input value is missing and the attribute mapMissingTo is not specified then the result is a missing value as well. If the input value is missing and the attribute mapMissingTo is specified then the result is the value of the attribute mapMissingTo.
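The indicator rule and the fan-out it produces can be sketched together. This is an illustration only; norm_discrete and fan_out are hypothetical helper names, and the category list is made up for the example.

```python
def norm_discrete(value, target, map_missing_to=None):
    """Return 1.0 if value equals target, 0.0 otherwise; handle missing input."""
    if value is None:
        return map_missing_to  # missing (None) unless mapMissingTo is given
    return 1.0 if value == target else 0.0


def fan_out(value, categories, map_missing_to=None):
    """Map one input field to one indicator field per category."""
    return {c: norm_discrete(value, c, map_missing_to) for c in categories}


COLORS = ["red", "green", "blue"]
indicators = fan_out("green", COLORS)
```

Each category yields exactly one indicator field, so an input that matches no listed category produces all zeros.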
The TextIndex element fully configures how the text in textField should be processed and translated into a frequency metric for a particular term of interest. The actual frequency metric to be returned is defined through the localTermWeights attribute.
A TextIndexNormalization element offers more advanced ways of normalizing text input into a more controlled vocabulary that corresponds to the terms being used in invocations of this indexing function. The normalization operation is defined through a translation table, specified through a TableLocator or InlineTable element.
The TransformationDictionary allows for transformations to be defined once and used by any model element in the PMML document.
- allHits: count all hits
- bestHits: count all hits with the lowest Levenshtein distance
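The difference between the two counting modes can be shown with a small sketch. It assumes the Levenshtein distance of every accepted hit has already been computed; count_hits and its inputs are hypothetical names for this example.

```python
def count_hits(distances, mode="allHits"):
    """Count hits given the Levenshtein distance of each accepted hit."""
    if not distances:
        return 0
    if mode == "allHits":
        return len(distances)
    if mode == "bestHits":
        best = min(distances)
        # Only hits that tie for the lowest distance are counted.
        return sum(1 for d in distances if d == best)
    raise ValueError(f"unknown mode: {mode}")


# Four hits: two exact matches (distance 0) and two fuzzy ones.
HIT_DISTANCES = [0, 1, 0, 2]
all_hits = count_hits(HIT_DISTANCES, "allHits")
best_hits = count_hits(HIT_DISTANCES, "bestHits")
```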
- termFrequency: use the number of times the term occurs in the document (x = freq_i).
- binary: use 1 if the term occurs in the document or 0 if it doesn't (x = χ(freq_i)).
- logarithmic: take the logarithm (base 10) of 1 + the number of times the term occurs in the document (x = log(1 + freq_i)).
- augmentedNormalizedTermFrequency: add to the binary frequency a "normalized" component expressing the frequency of a term relative to the highest frequency of terms observed in that document (x = 0.5 * (χ(freq_i) + (freq_i / max_k(freq_k)))).
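The four weighting formulas can be written out directly. In this sketch, local_term_weight is a hypothetical helper, freq is the count of the term in the document, and max_freq is the highest term count observed in that document; treating the ratio as 0 when max_freq is 0 is an assumption made only to avoid a division by zero.

```python
import math


def local_term_weight(freq, max_freq, scheme="termFrequency"):
    """Compute the local term weight x for one term in one document."""
    present = 1.0 if freq > 0 else 0.0  # the indicator chi(freq_i)
    if scheme == "termFrequency":
        return float(freq)
    if scheme == "binary":
        return present
    if scheme == "logarithmic":
        return math.log10(1 + freq)
    if scheme == "augmentedNormalizedTermFrequency":
        # Assumption: an empty document (max_freq == 0) contributes a ratio of 0.
        ratio = freq / max_freq if max_freq else 0.0
        return 0.5 * (present + ratio)
    raise ValueError(f"unknown scheme: {scheme}")


# A term seen 3 times in a document whose most frequent term occurs 6 times.
augmented = local_term_weight(3, 6, "augmentedNormalizedTermFrequency")
```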
<DefineFunction name="SAS-EM-String-Normalize" optype="categorical" dataType="string">
  <ParameterField name="FMTWIDTH" optype="continuous"/>
  <ParameterField name="AnyCInput" optype="categorical"/>
  <Apply function="trimBlanks">
    <Apply function="uppercase">
      <Apply function="substring">
        <FieldRef field="AnyCInput"/>
        <Constant>1</Constant>
        <Constant>FMTWIDTH</Constant>
      </Apply>
    </Apply>
  </Apply>
</DefineFunction>
<DefineFunction name="SAS-FORMAT-$CHARw" optype="categorical" dataType="string">
  <ParameterField name="FMTWIDTH" optype="continuous"/>
  <ParameterField name="AnyCInput" optype="continuous"/>
  <Apply function="substring">
    <FieldRef field="AnyCInput"/>
    <Constant>1</Constant>
    <Constant>FMTWIDTH</Constant>
  </Apply>
</DefineFunction>
<DefineFunction name="SAS-FORMAT-BESTw" optype="categorical" dataType="string">
  <ParameterField name="FMTWIDTH" optype="continuous"/>
  <ParameterField name="AnyNInput" optype="continuous"/>
  <Apply function="formatNumber">
    <FieldRef field="AnyNInput"/>
    <Constant>FMTWIDTH</Constant>
  </Apply>
</DefineFunction>
Defines several user-defined functions produced by various vendors. A well-defined DefineFunction is fully supported by pmml4s, but some vendor-produced definitions are not well defined. This is the place for those user-defined functions that are not well defined.
At various places the mining models use simple functions in order to map user data to values that are easier to use in the specific model. For example, neural networks internally work with numbers, usually in the range from 0 to 1. Numeric input data are mapped to the range [0..1], and categorical fields are mapped to series of 0/1 indicators.
PMML defines various kinds of simple data transformations: