org.pmml4s.common
Type members
Classlikes
Describes the software application that generated the model.
- Value parameters:
- name
The name of the application that generated the model.
- version
The version of the application that generated this model.
The data type representing Boolean values.
- Companion:
- object
Defines a categorical independent variable. The list of attributes comprises the name of the variable, the value attribute, and the coefficient by which the values of this variable must be multiplied.
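As a rough illustration of the description above, a categorical term can be read as an indicator multiplied by the coefficient. The class `CategoricalTerm` and its method are hypothetical stand-ins, not the pmml4s API, and treating a missing input as contributing 0 is an assumption of this sketch.

```scala
// Hypothetical stand-in for illustration only; not the pmml4s class.
final case class CategoricalTerm(name: String, value: String, coefficient: Double) {
  // Contributes the coefficient when the input equals `value`, otherwise 0.
  // Assumption: a missing input (None) matches no category and contributes 0.
  def contribution(input: Option[String]): Double =
    if (input.contains(value)) coefficient else 0.0
}
```

For example, `CategoricalTerm("color", "red", 2.5).contribution(Some("red"))` yields the coefficient, while any other category yields 0.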
- absDiff: absolute difference, c(x,y) = |x-y|
- gaussSim: Gaussian similarity, c(x,y) = exp(-ln(2)*z*z/(s*s)), where z = x-y and s is the value of the attribute similarityScale (required in this case) in the ClusteringField
- delta: c(x,y) = 0 if x = y, 1 otherwise
- equal: c(x,y) = 1 if x = y, 0 otherwise
- table: c(x,y) = lookup in the similarity matrix
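The five comparison functions listed above can be sketched as plain functions. The object `CompareSketch` and its signatures are hypothetical and are not taken from pmml4s; in particular, representing the similarity matrix as a `Vector[Vector[Double]]` indexed by category position is an assumption.

```scala
// Illustrative sketch only; names and signatures are not the pmml4s API.
object CompareSketch {
  // absDiff: c(x,y) = |x-y|
  def absDiff(x: Double, y: Double): Double = math.abs(x - y)

  // gaussSim: c(x,y) = exp(-ln(2)*z*z/(s*s)) with z = x-y and s = similarityScale
  def gaussSim(x: Double, y: Double, s: Double): Double = {
    val z = x - y
    math.exp(-math.log(2.0) * z * z / (s * s))
  }

  // delta: 0 if x = y, 1 otherwise
  def delta(x: Double, y: Double): Double = if (x == y) 0.0 else 1.0

  // equal: 1 if x = y, 0 otherwise
  def equal(x: Double, y: Double): Double = if (x == y) 1.0 else 0.0

  // table: lookup in a similarity matrix (assumed to be indexed by category position)
  def table(matrix: Vector[Vector[Double]])(i: Int, j: Int): Double = matrix(i)(j)
}
```

Note that gaussSim returns 1 when x = y and 0.5 when |x-y| equals the similarity scale, which is what the ln(2) factor is for.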
CompoundPredicate: an encapsulating element for combining two or more elements as defined at the entity PREDICATE. The attribute associated with this element, booleanOperator, can take one of the following logical (boolean) operators: and, or, xor or surrogate.
- Companion:
- object
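The four boolean operators of a CompoundPredicate can be sketched over three-valued results, where `None` stands for "unknown" (for example, a predicate on a missing field). The handling of unknown values below follows one common reading of the PMML specification and is an assumption here; the object `CompoundSketch` is hypothetical and not the pmml4s implementation.

```scala
// Illustrative sketch only; None models an "unknown" predicate result.
object CompoundSketch {
  type Tri = Option[Boolean]

  // and: false if any operand is false, else unknown if any is unknown, else true
  def and(ps: Seq[Tri]): Tri =
    if (ps.contains(Some(false))) Some(false)
    else if (ps.contains(None)) None
    else Some(true)

  // or: true if any operand is true, else unknown if any is unknown, else false
  def or(ps: Seq[Tri]): Tri =
    if (ps.contains(Some(true))) Some(true)
    else if (ps.contains(None)) None
    else Some(false)

  // xor: unknown if any operand is unknown, else true when an odd number are true
  def xor(ps: Seq[Tri]): Tri =
    if (ps.contains(None)) None
    else Some(ps.flatten.foldLeft(false)(_ ^ _))

  // surrogate: the first operand that is not unknown, or unknown if none is
  def surrogate(ps: Seq[Tri]): Tri = ps.collectFirst { case Some(b) => b }
}
```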
Carries counters for frequency of values with respect to their state of being missing, invalid, or valid. The counts can be non-integer if they are weighted.
- Value parameters:
- cardinality
The number of unique, or distinct, values that the variable has.
- invalidFreq
Counts the number of records with values other than valid. The total frequency includes the missing values and invalid values.
- missingFreq
Counts the number of records where value is missing.
- totalFreq
Counts all records, same as for statistics of all MiningFields.
A template trait for a data type.
The type dateDaysSince[aYear] is a variant of the type date where the values are represented as the number of days since aYear-01-01. The date aYear-01-01 is represented by the number 0. aYear-01-02 is represented by 1, aYear-02-01 is represented by 31, etc. Dates before aYear-01-01 are represented as negative numbers. For example, values of type dateDaysSince[1960] are the number of days since 1960-01-01. The date 1960-01-01 is represented by the number 0.
- Companion:
- object
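The dateDaysSince[aYear] encoding can be reproduced with `java.time`. This is a minimal sketch; the object `DateDaysSinceSketch` and the helper `daysSince` are hypothetical names and not part of pmml4s.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Illustrative sketch only; not the pmml4s implementation.
object DateDaysSinceSketch {
  // Number of days from baseYear-01-01 to `date`; negative for earlier dates.
  def daysSince(baseYear: Int, date: LocalDate): Long =
    ChronoUnit.DAYS.between(LocalDate.of(baseYear, 1, 1), date)
}
```

With a base year of 1960, the date 1960-02-01 maps to 31 and 1959-12-31 maps to -1, matching the description above.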
The type dateTimeSecondsSince[aYear] is a variant of the type date where the values are represented as the number of seconds since 00:00 on aYear-01-01. The datetime 00:00:00 on aYear-01-01 is represented by the number 0. The datetime 00:00:01 on aYear-01-01 is represented by 1, etc. Datetimes before aYear-01-01 are represented as negative numbers. For example, values of type dateTimeSecondsSince[1960] are the number of seconds since 00:00 on 1960-01-01. The datetime 00:00:00 on 1960-01-01 is represented by the number 0. The datetime 00:01:00 on 1960-01-01 is represented by 60.
- Companion:
- object
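The dateTimeSecondsSince[aYear] encoding can be sketched the same way with `java.time.LocalDateTime`. The object and method names below are hypothetical, and time zones are deliberately ignored in this sketch.

```scala
import java.time.LocalDateTime
import java.time.temporal.ChronoUnit

// Illustrative sketch only; not the pmml4s implementation. Time zones are ignored.
object DateTimeSecondsSinceSketch {
  // Number of seconds from 00:00:00 on baseYear-01-01 to `dateTime`;
  // negative for earlier datetimes.
  def secondsSince(baseYear: Int, dateTime: LocalDateTime): Long =
    ChronoUnit.SECONDS.between(LocalDateTime.of(baseYear, 1, 1, 0, 0, 0), dateTime)
}
```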
The content is just one array of numbers representing the diagonal values.
The data type representing Double values.
- Companion:
- object
A common super-trait that accepts a series, then evaluates a single value.
Identifies the boolean constant FALSE.
The PMML schema contains a mechanism for extending the content of a model. Extension elements should be present as the first child in all elements and groups defined in PMML. This way it is possible to place information in the Extension elements which affects how the remaining entries are treated. The main element in each model should have Extension elements as the first and the last child for maximum flexibility.
Holds common attributes of a PMML model.
The data type representing Int or Long values.
- Companion:
- object
Class represents common attributes of a PMML model.
Provides a basic framework for representing variable statistics.
Provides a dataset of model inputs and known results that can be used to verify accurate results are generated, regardless of the environment.
The values for mean, minimum, maximum and standardDeviation are defined as usual. median is calculated as the 50% quantile; interQuartileRange is calculated as (75% quantile - 25% quantile).
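The quantile-based definitions of median and interQuartileRange can be sketched as follows. The description does not say how quantiles are estimated, so the linear interpolation between order statistics used here is an assumption, and `QuantileSketch` is a hypothetical name.

```scala
// Illustrative sketch only; the interpolation rule is an assumption.
object QuantileSketch {
  // Quantile for p in [0, 1], linearly interpolating between neighbouring sorted values.
  def quantile(values: Seq[Double], p: Double): Double = {
    require(values.nonEmpty && p >= 0.0 && p <= 1.0)
    val sorted = values.toArray
    java.util.Arrays.sort(sorted)
    val pos = p * (sorted.length - 1)
    val lo = math.floor(pos).toInt
    val hi = math.ceil(pos).toInt
    sorted(lo) + (pos - lo) * (sorted(hi) - sorted(lo))
  }

  // median is the 50% quantile
  def median(values: Seq[Double]): Double = quantile(values, 0.5)

  // interQuartileRange is (75% quantile - 25% quantile)
  def interQuartileRange(values: Seq[Double]): Double =
    quantile(values, 0.75) - quantile(values, 0.25)
}
```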
Defines a numeric independent variable. The list of valid attributes comprises the name of the variable, the exponent to be used, and the coefficient by which the values of this variable must be multiplied. Note that the exponent defaults to 1, hence it is not always necessary to specify. Also, if the input value is missing, the result evaluates to a missing value.
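A numeric term as described above amounts to coefficient * x^exponent, with a missing input propagating to a missing result. The class `NumericTerm` below is a hypothetical stand-in, not the pmml4s class, and modelling "missing" as `None` is an assumption of this sketch.

```scala
// Hypothetical stand-in for illustration only; not the pmml4s class.
final case class NumericTerm(name: String, coefficient: Double, exponent: Int = 1) {
  // coefficient * input^exponent; a missing input (None) yields a missing result.
  def contribution(input: Option[Double]): Option[Double] =
    input.map(x => coefficient * math.pow(x, exponent.toDouble))
}
```

Because the exponent defaults to 1, `NumericTerm("x", 2.0)` is a plain linear term.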
Indicates which operations are defined on the values.
- Companion:
- object
Pre-defined comparison operators.
A Partition contains statistics for a subset of records, for example it can describe the population in a cluster. The content of a Partition mirrors the definition of the general univariate statistics. That is, each Partition describes the distribution per field. For each field there can be information about frequencies, numeric moments, etc.
The attribute name identifies the Partition. The attribute size is the number of records. All aggregates in PartitionFieldStats must have size = totalFrequency in Counts if specified.
The attribute field refers to (the name of) a MiningField for background statistics. The sequence of NUM-ARRAYs is the same as for ContStats. For categorical fields there is only one array, containing the frequencies; for numeric fields, the second and third arrays contain the sums of values and the sums of squared values, respectively. The number of values in each array must match the number of categories or intervals in UnivariateStats of the field.
The base trait for all elements of PMML
Contains one or more fields that are combined by multiplication. That is, this element supports interaction terms. The type of all fields referenced within PredictorTerm must be continuous. Note that if the input value is missing, the result evaluates to a missing value.
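An interaction term of this kind can be sketched as a product over the referenced field values. The object `PredictorTermSketch` is hypothetical, the `coefficient` parameter is an assumption not stated in the description above, and "missing" is modelled as `None`.

```scala
// Illustrative sketch only; not the pmml4s implementation.
object PredictorTermSketch {
  // Multiplies all field values together and scales by an assumed coefficient.
  // If any input is missing (None), the whole term evaluates to missing.
  def product(coefficient: Double, inputs: Seq[Option[Double]]): Option[Double] =
    if (inputs.exists(_.isEmpty)) None
    else Some(coefficient * inputs.flatten.product)
}
```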
- Valid value: A value which is neither missing nor invalid.
- Invalid value: The input value is not missing but it does not belong to a certain value range. The range of valid values can be defined for each field.
- Missing value: Input value is missing, for example, if a database column contains a null value. It is possible to explicitly define values which are interpreted as missing values.
- Value parameters:
- quantileLimit
A percentage number between 0 and 100
- quantileValue
The corresponding value in the domain of field values.
The data type representing Float or Double values. Real is an extended type beyond PMML.
- Companion:
- object
Lists the values of all predictors or independent variables. If the model is used to predict a numerical field, then there is only one RegressionTable and the attribute targetCategory may be missing. If the model is used to predict a categorical field, then there are two or more RegressionTables and each one must have the attribute targetCategory defined with a unique value.
Comprises a method to list predicted values in a classification tree structure.
Defines a rule in the form of a simple boolean expression. The rule consists of field, operator (booleanOperator) for binary comparison, and value.
Checks whether a field value is an element of a set. The set of values is specified by the array.
- Companion:
- object
Column-major sparse matrix.
- Companion:
- object
The data type representing String values.
- Companion:
- object
A field inside a StructType.
- Value parameters:
- dataType
The data type of this field.
- name
The name of this field.
The content must be represented by Arrays. The first array contains the matrix element M(0,0), the second array contains M(1,0), M(1,1), and so on (that is the lower left triangle). Other elements are defined by symmetry.
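Reading an element from the lower-triangle layout can be sketched as below: row i holds M(i,0) through M(i,i), and elements above the diagonal are obtained by swapping the indices. The object `SymmetricMatrixSketch` and the `Vector[Vector[Double]]` representation are assumptions for illustration, not the pmml4s data structure.

```scala
// Illustrative sketch only; not the pmml4s data structure.
object SymmetricMatrixSketch {
  // rows(i) holds M(i,0) .. M(i,i); elements with j > i follow from symmetry.
  def at(rows: Vector[Vector[Double]])(i: Int, j: Int): Double =
    if (i >= j) rows(i)(j) else rows(j)(i)
}
```

For the arrays `[1]`, `[2, 3]`, `[4, 5, 6]`, the element M(0,2) is read from the third array as 4.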
The type timeSeconds is a variant of the type time where the values are represented as the number of seconds since 00:00, that is, since midnight. The time 00:00 is represented by the number 0. No negative values are allowed.
- Companion:
- object
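The timeSeconds encoding corresponds to the second-of-day value in `java.time.LocalTime`. The object and helper names below are hypothetical; this sketch drops fractional seconds and rejects negative inputs, in line with the rule that no negative values are allowed.

```scala
import java.time.LocalTime

// Illustrative sketch only; not the pmml4s implementation.
object TimeSecondsSketch {
  // Seconds since midnight for a time of day (fractional seconds are dropped).
  def toTimeSeconds(t: LocalTime): Int = t.toSecondOfDay

  // Inverse mapping; negative values are rejected.
  def fromTimeSeconds(seconds: Int): LocalTime = {
    require(seconds >= 0, "timeSeconds must not be negative")
    LocalTime.ofSecondOfDay(seconds.toLong)
  }
}
```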
Abstract class for transformers that transform one series into another.
Identifies the boolean constant TRUE.
- Companion:
- object