Package

org.pmml4s

transformations

package transformations

At various places the mining models use simple functions in order to map user data to values that are easier to use in the specific model. For example, neural networks internally work with numbers, usually in the range from 0 to 1. Numeric input data are mapped to the range [0..1], and categorical fields are mapped to series of 0/1 indicators.

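PMML defines various kinds of simple data transformations. Those documented in this package include normalization (NormContinuous, NormDiscrete), discretization (Discretize), value mapping (MapValues), text indexing (TextIndex), and functions (Apply, DefineFunction).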

Type Members

  1. class Apply extends Expression

    Apply defines the application of a function. The function itself is identified by name with the function attribute. The actual parameters of the function application are given in the content of the element. Each actual argument value is given by an EXPRESSION, and the arguments are mapped by position to the formal parameters in the corresponding function definition.
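
The positional mapping described above can be illustrated with a minimal sketch. The `Expr`, `Const`, `Ref`, and `App` types and the `functions` table are hypothetical stand-ins written for this example; they are not the pmml4s classes.

```scala
object ApplySketch {
  // Hypothetical mini expression tree (not the pmml4s classes).
  sealed trait Expr
  final case class Const(value: Double) extends Expr
  final case class Ref(field: String) extends Expr
  final case class App(function: String, args: List[Expr]) extends Expr

  // Functions are looked up by name; arguments arrive in positional order.
  val functions: Map[String, List[Double] => Double] = Map(
    "+" -> ((xs: List[Double]) => xs(0) + xs(1)),
    "-" -> ((xs: List[Double]) => xs(0) - xs(1)),
    "*" -> ((xs: List[Double]) => xs(0) * xs(1))
  )

  // Evaluate an expression against a row of field values.
  def eval(e: Expr, row: Map[String, Double]): Double = e match {
    case Const(v)      => v
    case Ref(f)        => row(f)
    case App(fn, args) => functions(fn)(args.map(a => eval(a, row)))
  }
}
```

Because arguments are bound by position, `App("-", List(Ref("x"), Const(4.0)))` and `App("-", List(Const(4.0), Ref("x")))` give different results for the same row.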

  2. trait BinaryArithmetic extends BinaryFunction
  3. trait BinaryBoolean extends BinaryFunction
  4. trait BinaryCompare extends BinaryFunction
  5. trait BinaryFunction extends Function
  6. trait BinaryString extends BinaryFunction
  7. class Constant extends LeafExpression

    Constant values can be used in expressions that have multiple arguments. The actual value of a constant is given by the content of the element. For example, <Constant>1.05</Constant> represents the number 1.05. The dataType of Constant can be optionally specified.

  8. class DefineFunction extends Function with HasOpType with HasDataType with PmmlElement

    Defines new (user-defined) functions as variations or compositions of existing functions or transformations. The function's name must be unique and must not conflict with other function names, whether defined by PMML or by other user-defined functions. The EXPRESSION in the content of DefineFunction is the function body that actually defines the meaning of the new function. The function body must not refer to fields other than the parameter fields.
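
The two rules above (unique names, and a body that sees only its parameters) can be sketched as follows. `UserFunction` and `Registry` are hypothetical names invented for this illustration and do not mirror the pmml4s implementation.

```scala
object DefineFunctionSketch {
  // Hypothetical user-defined function: formal parameter names plus a body.
  // The body receives only the parameter bindings, so it cannot read other fields.
  final case class UserFunction(
      name: String,
      params: List[String],
      body: Map[String, Double] => Double) {
    def apply(args: Double*): Double = {
      require(args.length == params.length, s"$name expects ${params.length} arguments")
      // Bind actual arguments to formal parameters by position.
      body(params.zip(args).toMap)
    }
  }

  // Hypothetical registry that rejects names already taken by built-in
  // or previously registered functions.
  final class Registry(builtIn: Set[String]) {
    private var defined = Map.empty[String, UserFunction]

    def register(f: UserFunction): Unit = {
      require(!builtIn(f.name) && !defined.contains(f.name),
        s"function name '${f.name}' is already taken")
      defined = defined + (f.name -> f)
    }

    def lookup(name: String): Option[UserFunction] = defined.get(name)
  }
}
```

For example, a `ratio` function with parameters `a` and `b` and body `env => env("a") / env("b")` can be registered once; registering the same name a second time fails the `require` check.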

  9. class DerivedField extends DataField with Expression with Clearable

    Provides a common element for the various mappings. Derived fields can also appear at several places in the definition of specific models, such as neural network or Naive Bayes models. Transformed fields have a name so that statistics and the model can refer to them.

  10. class Discretize extends FieldExpression

    Discretization of numerical input fields is a mapping from continuous to discrete values using intervals.
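
As a rough illustration of interval-based discretization, the sketch below maps a number to the label of the first interval that contains it. The `Bin` type is hypothetical, and the half-open interval convention `[low, high)` is an assumption of this example; PMML intervals can declare other closures.

```scala
object DiscretizeSketch {
  // Hypothetical half-open interval [low, high); a missing bound means unbounded.
  final case class Bin(low: Option[Double], high: Option[Double], label: String) {
    def contains(x: Double): Boolean =
      low.forall(l => x >= l) && high.forall(h => x < h)
  }

  // Return the label of the first bin whose interval contains x, if any.
  def discretize(bins: Seq[Bin], x: Double): Option[String] =
    bins.find(_.contains(x)).map(_.label)
}
```

With bins for `(-inf, 0)`, `[0, 100)`, and `[100, +inf)`, an input of 42.0 falls into the second bin; an input that no bin covers yields `None`.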

  11. class DiscretizeBin extends PmmlElement
  12. trait Expression extends Evaluator with PmmlElement

    Base trait for expressions, which define how the values of a new field are computed.

  13. class FieldColumnPair extends PmmlElement
  14. trait FieldExpression extends UnaryExpression
  15. class FieldRef extends FieldExpression with MixedEvaluator

    Field references are simply pass-throughs to fields previously defined in the DataDictionary, a DerivedField, or a result field. For example, they are used in clustering models in order to define center coordinates for fields that don't need further normalization.

    A missing input will produce a missing result. The optional attribute mapMissingTo may be used to map a missing result to the value specified by the attribute. If the attribute is not present, the result remains missing.
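
The missing-value rule above fits in a few lines. In this sketch a row is a plain `Map`, and an absent key stands for a missing value; both choices are assumptions made for the example rather than the pmml4s representation.

```scala
object FieldRefSketch {
  // Pass the field value through; fall back to mapMissingTo only when the input is missing.
  def fieldRef(
      row: Map[String, Double],
      field: String,
      mapMissingTo: Option[Double] = None): Option[Double] =
    row.get(field).orElse(mapMissingTo)
}
```

A present value is returned unchanged, a missing value without `mapMissingTo` stays missing (`None`), and a missing value with `mapMissingTo` set yields the replacement.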

  16. trait Function extends PmmlElement
  17. trait FunctionProvider extends AnyRef
  18. trait HasFunctionProvider extends AnyRef
  19. trait HasLocalTransformations extends AnyRef
  20. trait LeafExpression extends Expression
  21. class LinearNorm extends PmmlElement
  22. class LocalTransformations extends TransformationDictionary

    LocalTransformations holds derived fields that are local to the model.

  23. class MapValues extends Expression

    Any discrete value can be mapped to any possibly different discrete value by listing the pairs of values. This list is implemented by a table, so it can be given inline by a sequence of XML markups or by a reference to an external table.
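
A value mapping of this kind amounts to a table lookup. The sketch below uses an inline list of (input, output) pairs; the sample table and the choice to return `None` for an unlisted input are assumptions of this example.

```scala
object MapValuesSketch {
  // Hypothetical inline table given as (input, output) pairs.
  val table: Seq[(String, String)] = Seq("m" -> "male", "f" -> "female")

  // Return the output paired with the first matching input, if any.
  def mapValue(pairs: Seq[(String, String)], in: String): Option[String] =
    pairs.collectFirst { case (k, v) if k == in => v }
}
```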

  24. trait MultipleArithmetic extends Function
  25. trait MultipleBoolean extends Function
  26. class MutableFunctionProvider extends FunctionProvider
  27. class NormContinuous extends NumericFieldExpression

    Normalization provides a basic framework for mapping input values to specific value ranges, usually the numeric range [0 .. 1]. Normalization is used, e.g., in neural networks and clustering models.

    Defines how to normalize an input field by piecewise linear interpolation. The mapMissingTo attribute defines the value the output is to take if the input is missing. If the mapMissingTo attribute is not specified, then missing input values produce a missing result.
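
Piecewise linear interpolation with the missing-value rule can be sketched as below. `Point` is a hypothetical stand-in for a LinearNorm entry with an original value and its normalized value. Extrapolating inputs outside the covered range from the nearest segment is an assumption of this sketch, not a statement about how pmml4s treats outliers.

```scala
object NormContinuousSketch {
  // Hypothetical stand-in for a LinearNorm point: maps orig to norm.
  final case class Point(orig: Double, norm: Double)

  // Points must be sorted by orig, with at least two points and distinct orig values.
  def normalize(
      points: Vector[Point],
      input: Option[Double],
      mapMissingTo: Option[Double] = None): Option[Double] =
    input match {
      case None => mapMissingTo // missing input: use mapMissingTo, or stay missing
      case Some(x) =>
        require(points.length >= 2, "need at least two points")
        val segments = points.zip(points.tail)
        // First segment whose upper end covers x; otherwise the last segment.
        val (a, b) = segments.find { case (_, hi) => x <= hi.orig }.getOrElse(segments.last)
        Some(a.norm + (x - a.orig) / (b.orig - a.orig) * (b.norm - a.norm))
    }
}
```

With points (0, 0), (10, 0.5), and (100, 1), an input of 5 lies halfway along the first segment and normalizes to 0.25, while 55 lies halfway along the second and normalizes to 0.75.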

  28. class NormDiscrete extends FieldExpression

    Encode string values into numeric values in order to perform mathematical computations. For example, regression and neural network models often split categorical and ordinal fields into multiple dummy fields. This kind of normalization is supported in PMML by the element NormDiscrete.

    An element (f, v) defines that the unit has value 1.0 if the value of input field f is v, otherwise it is 0.

    The set of NormDiscrete instances which refer to a certain input field define a fan-out function which maps a single input field to a set of normalized fields.

    If the input value is missing and the attribute mapMissingTo is not specified then the result is a missing value as well. If the input value is missing and the attribute mapMissingTo is specified then the result is the value of the attribute mapMissingTo.
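
The indicator and fan-out behaviour described above can be sketched as follows. Representing a missing input as `None` is an assumption of this example; the function names are hypothetical.

```scala
object NormDiscreteSketch {
  // One indicator (f, v): 1.0 when the input equals v, otherwise 0.0.
  // A missing input yields mapMissingTo, or stays missing when it is not set.
  def indicator(
      input: Option[String],
      value: String,
      mapMissingTo: Option[Double] = None): Option[Double] =
    input match {
      case Some(x) => Some(if (x == value) 1.0 else 0.0)
      case None    => mapMissingTo
    }

  // Fan-out: one indicator per category for the same input field.
  def fanOut(input: Option[String], categories: Seq[String]): Seq[Option[Double]] =
    categories.map(c => indicator(input, c))
}
```

For the categories red, green, and blue, the input "red" fans out to the indicators 1.0, 0.0, 0.0.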

  29. trait NumericFieldExpression extends FieldExpression
  30. class ParameterField extends AbstractField
  31. trait TernaryArithmetic extends TernaryFunction
  32. trait TernaryFunction extends Function
  33. class TextIndex extends NumericFieldExpression

    The TextIndex element fully configures how the text in textField should be processed and translated into a frequency metric for a particular term of interest. The actual frequency metric to be returned is defined through the localTermWeights attribute.

  34. class TextIndexNormalization extends PmmlElement

    A TextIndexNormalization element offers more advanced ways of normalizing text input into a more controlled vocabulary that corresponds to the terms being used in invocations of this indexing function. The normalization operation is defined through a translation table, specified through a TableLocator or InlineTable element.

  35. class TransformationDictionary extends Dictionary[DerivedField] with Transformer with FunctionProvider with PmmlElement

    The TransformationDictionary allows for transformations to be defined once and used by any model element in the PMML document.

  36. trait UnaryArithmetic extends UnaryFunction
  37. trait UnaryBoolean extends UnaryFunction
  38. trait UnaryExpression extends Expression
  39. trait UnaryFunction extends Function
  40. trait UnaryString extends UnaryFunction

Value Members

  1. object ACos extends UnaryArithmetic
  2. object ASin extends UnaryArithmetic
  3. object ATan extends UnaryArithmetic
  4. object Abs extends UnaryArithmetic
  5. object Add extends BinaryArithmetic
  6. object And extends MultipleBoolean
  7. object Avg extends MultipleArithmetic
  8. object BuiltInFunctions extends FunctionProvider
  9. object Ceil extends UnaryArithmetic
  10. object Concat extends Function
  11. object Cos extends UnaryArithmetic
  12. object CosH extends UnaryArithmetic
  13. object CountHits extends Enumeration

    - allHits: count all hits
    - bestHits: count all hits with the lowest Levenshtein distance
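
The difference between the two counting modes can be sketched as follows. The tokenized input and the `maxDistance` cut-off that decides what counts as a hit are assumptions of this example; the function names are hypothetical.

```scala
object CountHitsSketch {
  // Classic dynamic-programming Levenshtein distance, one row at a time.
  def levenshtein(a: String, b: String): Int = {
    var prev = Array.range(0, b.length + 1)
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(b.length)
  }

  // A token is a hit when its distance to the term is at most maxDistance.
  // allHits counts every hit; bestHits counts only hits at the lowest distance found.
  def countHits(tokens: Seq[String], term: String, maxDistance: Int, bestHits: Boolean): Int = {
    val distances = tokens.map(t => levenshtein(t, term)).filter(_ <= maxDistance)
    if (distances.isEmpty) 0
    else if (bestHits) distances.count(_ == distances.min)
    else distances.size
  }
}
```

For the tokens cat, cot, dog, cat with the term "cat" and a cut-off of 1, allHits counts three tokens (two exact matches and "cot"), while bestHits counts only the two exact matches.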

  14. object DateDaysSinceYear extends BinaryFunction
  15. object DateSecondsSinceMidnight extends UnaryFunction
  16. object DateSecondsSinceYear extends BinaryFunction
  17. object Divide extends BinaryArithmetic
  18. object Equal extends BinaryBoolean
  19. object Erf extends UnaryArithmetic
  20. object Exp extends UnaryArithmetic
  21. object Expm1 extends UnaryArithmetic
  22. object Expression extends Serializable
  23. object Floor extends UnaryArithmetic
  24. object FormatDatetime extends BinaryFunction
  25. object FormatNumber extends BinaryFunction
  26. object GreaterOrEqual extends BinaryCompare
  27. object GreaterThan extends BinaryCompare
  28. object Hypot extends BinaryArithmetic
  29. object If extends Function
  30. object IsIn extends Function
  31. object IsMissing extends UnaryBoolean
  32. object IsNotIn extends Function
  33. object IsNotMissing extends UnaryBoolean
  34. object IsNotValid extends UnaryBoolean
  35. object IsValid extends UnaryBoolean
  36. object LessOrEqual extends BinaryCompare
  37. object LessThan extends BinaryCompare
  38. object Ln extends UnaryArithmetic
  39. object Ln1p extends UnaryArithmetic
  40. object LocalTermWeights extends Enumeration

    - termFrequency: use the number of times the term occurs in the document (x = freq_i).
    - binary: use 1 if the term occurs in the document or 0 if it doesn't (x = χ(freq_i)).
    - logarithmic: take the logarithm (base 10) of 1 + the number of times the term occurs in the document (x = log10(1 + freq_i)).
    - augmentedNormalizedTermFrequency: this formula adds to the binary frequency a "normalized" component expressing the frequency of a term relative to the highest frequency of terms observed in that document (x = 0.5 * (χ(freq_i) + (freq_i / max_k(freq_k)))).
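
The four weighting formulas above translate directly into code. In this sketch `freq` is the number of occurrences of the term in the document and `maxFreq` is the highest frequency of any term in that document; the function and parameter names are chosen for this example only.

```scala
object LocalTermWeightsSketch {
  // chi(f) is 1 when the term occurs at least once, otherwise 0.
  private def chi(freq: Int): Double = if (freq > 0) 1.0 else 0.0

  def termFrequency(freq: Int): Double = freq.toDouble

  def binary(freq: Int): Double = chi(freq)

  // Base-10 logarithm of 1 + frequency.
  def logarithmic(freq: Int): Double = math.log10(1.0 + freq)

  // maxFreq must be greater than 0.
  def augmentedNormalizedTermFrequency(freq: Int, maxFreq: Int): Double =
    0.5 * (chi(freq) + freq.toDouble / maxFreq)
}
```

For a term that occurs twice in a document whose most frequent term occurs four times, the augmented normalized weight is 0.5 * (1 + 2/4) = 0.75.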

  41. object Log10 extends UnaryArithmetic
  42. object Lowercase extends UnaryString
  43. object Matches extends BinaryBoolean
  44. object Max extends MultipleArithmetic
  45. object Median extends MultipleArithmetic
  46. object Min extends MultipleArithmetic
  47. object Modulo extends BinaryArithmetic
  48. object Multiply extends BinaryArithmetic
  49. object NormalCDF extends TernaryArithmetic
  50. object NormalIDF extends TernaryArithmetic
  51. object NormalPDF extends TernaryArithmetic
  52. object Not extends UnaryFunction
  53. object NotEqual extends BinaryBoolean
  54. object Or extends MultipleBoolean
  55. object Pow extends BinaryArithmetic
  56. object Product extends MultipleArithmetic
  57. object RInt extends UnaryArithmetic
  58. object Replace extends TernaryFunction
  59. object Round extends UnaryArithmetic
  60. object SAS-EM-String-Normalize extends BinaryFunction

    <DefineFunction name="SAS-EM-String-Normalize" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyCInput" optype="categorical"/>
     <Apply function="trimBlanks">
       <Apply function="uppercase">
         <Apply function="substring">
         <FieldRef field="AnyCInput"/>
         <Constant>1</Constant>
         <Constant>FMTWIDTH</Constant>
         </Apply>
       </Apply>
     </Apply>
    </DefineFunction>
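
Read as a composition, the definition above takes the first FMTWIDTH characters of the input, uppercases them, and trims blanks. A minimal sketch of that reading follows. It assumes that `substring` takes a 1-based start position and a length, and it uses `String.trim` as a stand-in for trimBlanks; neither is the pmml4s implementation.

```scala
object SasStringNormalizeSketch {
  // Substring with a 1-based start position and a length, clamped to the input.
  // Assumes length is not negative.
  def substring(s: String, startPos: Int, length: Int): String = {
    val from = math.min(math.max(startPos - 1, 0), s.length)
    s.substring(from, math.min(from + length, s.length))
  }

  // trimBlanks(uppercase(substring(input, 1, fmtWidth)))
  def normalize(fmtWidth: Int, input: String): String =
    substring(input, 1, fmtWidth).toUpperCase.trim
}
```

With a width of 6, the input "  abc def" is cut to "  abc ", uppercased, and trimmed to "ABC".
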
  61. object SAS-FORMAT-$CHARw extends BinaryFunction

    <DefineFunction name="SAS-FORMAT-$CHARw" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyCInput" optype="continuous"/>
     <Apply function="substring">
       <FieldRef field="AnyCInput"/>
       <Constant>1</Constant>
       <Constant>FMTWIDTH</Constant>
     </Apply>
    </DefineFunction>
  62. object SAS-FORMAT-BESTw extends BinaryFunction

    <DefineFunction name="SAS-FORMAT-BESTw" optype="categorical" dataType="string">
     <ParameterField name="FMTWIDTH" optype="continuous"/>
     <ParameterField name="AnyNInput" optype="continuous"/>
     <Apply function="formatNumber">
       <FieldRef field="AnyNInput"/>
       <Constant>FMTWIDTH</Constant>
     </Apply>
    </DefineFunction>
  63. object Sin extends UnaryArithmetic
  64. object SinH extends UnaryArithmetic
  65. object Sqrt extends UnaryArithmetic
  66. object StdNormalCDF extends UnaryArithmetic
  67. object StdNormalIDF extends UnaryArithmetic
  68. object StdNormalPDF extends UnaryArithmetic
  69. object Substring extends TernaryFunction
  70. object Subtract extends BinaryArithmetic
  71. object Sum extends MultipleArithmetic
  72. object Tan extends UnaryArithmetic
  73. object TanH extends UnaryArithmetic
  74. object TextIndex extends Serializable
  75. object Threshold extends BinaryArithmetic
  76. object TrimBlanks extends UnaryString
  77. object Uppercase extends UnaryString
  78. object UserDefinedFunctions extends FunctionProvider

    Defines several user-defined functions produced by various vendors. A well-defined DefineFunction is fully supported by pmml4s, but some vendor-produced functions are not well defined; this object is the place for those.
