com.databricks.labs.automl.feature
Method for calculating the percentage change in the score metric between a parent feature and an interaction feature, so that scores can be compared on a normalized basis.
Score of a parent feature
Score of an interaction feature
the percentage change
0.6.2
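The percentage-change evaluation described above might be sketched as follows. This is a minimal illustration in plain Scala, not the library's implementation: the object and method names are hypothetical, and dividing by the absolute value of the parent score is an assumption made so that the sign of the result follows the direction of the change.

```scala
// Hypothetical sketch; names and the exact formula are assumptions.
object PercentageChangeSketch {

  /** Percentage change from the parent feature's score to the
    * interaction feature's score.
    *
    * Assumption: the denominator is the absolute parent score, so a
    * higher interaction score always yields a positive result.
    * A parent score of 0.0 is not handled here and yields NaN or Infinity.
    */
  def percentageChange(parentScore: Double, interactionScore: Double): Double =
    (interactionScore - parentScore) / math.abs(parentScore) * 100.0
}
```

A caller that expects parent scores of zero would need to guard against that case before dividing.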
Method for determining feature interaction candidates, applying those candidates as new fields to the DataFrame, and returning a configuration payload that describes the interactions so that they can be used in a Pipeline.
DataFrame on which feature interactions are calculated and to which they may be added
Fields from the DataFrame that were originally non-numeric (Character, String, etc.)
Fields from the DataFrame that were originally numeric, continuous types.
FeatureInteractionCollection: the DataFrame with the candidate feature interactions added, plus the payload of interaction features and their constituent parents needed to recreate them in a Pipeline.
0.6.2
Method for generating interaction candidates and re-building a feature vector
DataFrame to interact features with (that has a feature vector already built)
Array of column names for nominal (string indexed) values
Array of column names for continuous numeric values
Name of the feature vector column
DataFrame with a re-built feature vector that includes the interacted feature columns as part of it.
0.6.2
Method for generating a pipeline-friendly feature interaction so that the automl pipeline can be serialized properly. Uses the InteractionTransformer to generate the fields required for inference.
DataFrame to be used for generating the interaction candidates and pipeline
Nominal type numeric fields that are part of the vector
Continuous type numeric fields that are part of the vector
Name of the current feature vector column
PipelineInteractionOutput which contains the pipeline to be applied to the automl pipeline flow.
0.6.2
Main method for generating a list of interaction candidates based on the settings specified in the class configuration.
The DataFrame to process interactions for
The nominal fields (String Indexed) to be used for interaction
The continuous fields (Original Numeric Types) to be used for interaction
Array[InteractionPayload] of candidate field interactions that meet the acceptance criteria set by the configuration.
0.6.2
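One way to picture the acceptance step is a filter over scored candidates. The sketch below is an assumption-heavy illustration: `ScoredCandidate`, `AcceptanceSketch`, and the `minPercentageChange` threshold are hypothetical names, and the rule shown (keep a candidate when its percentage change over the parent meets a minimum) is only one plausible acceptance criterion, not necessarily the one the class configuration applies.

```scala
// Hypothetical sketch; the case class, threshold, and rule are assumptions.
final case class ScoredCandidate(
    name: String,
    parentScore: Double,
    interactionScore: Double
)

object AcceptanceSketch {

  // Percentage change relative to the absolute parent score (assumed formula).
  def percentageChange(before: Double, after: Double): Double =
    (after - before) / math.abs(before) * 100.0

  /** Keeps only the candidates whose score changed by at least
    * `minPercentageChange` percent relative to the parent score.
    */
  def accept(
      candidates: Seq[ScoredCandidate],
      minPercentageChange: Double
  ): Seq[ScoredCandidate] =
    candidates.filter { c =>
      percentageChange(c.parentScore, c.interactionScore) >= minPercentageChange
    }
}
```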
Method for generating a collection of Interaction Candidates to be tested and applied to the feature set if the tests for inclusion pass.
List of the columns that make up the feature vector
Array of InteractionPayload values.
0.6.2
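Candidate generation from a list of feature columns can be illustrated as pairing every column with every other column. This is a sketch under stated assumptions: `CandidatePair` is a hypothetical stand-in for the library's InteractionPayload (whose actual fields are not shown in this documentation), and the `i_<left>_<right>` naming scheme is invented for the example.

```scala
// Hypothetical sketch; CandidatePair and the naming scheme are assumptions.
final case class CandidatePair(left: String, right: String, outputName: String)

object CandidateSketch {

  /** Builds one candidate for each unordered pair of distinct columns. */
  def generateCandidates(featureColumns: Seq[String]): Seq[CandidatePair] =
    featureColumns
      .combinations(2) // every 2-element combination, in input order
      .collect { case Seq(left, right) =>
        CandidatePair(left, right, s"i_${left}_${right}")
      }
      .toSeq
}
```

For n distinct columns this produces n * (n - 1) / 2 candidates, which is why testing candidates before adding them to the feature set matters as the column count grows.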
Method for converting nominal interaction fields to a new StringIndexed value to preserve information type and eliminate the possibility of data distribution skew
FeatureInteractionCollection of the source parents and their interacted children fields
NominalDataCollection payload containing a DataFrame that has new StringIndexed fields for the nominal interactions, along with the fields that must be included in the final feature vector
0.6.2
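To show why re-indexing helps, the sketch below maps the raw values of an interacted nominal column back onto a compact index ordered by frequency, which is the default ordering used by Spark's StringIndexer. It is a plain-Scala approximation rather than the library's code: the object and method names are hypothetical, and breaking ties by first appearance is an assumption of this example.

```scala
// Hypothetical sketch of frequency-ordered re-indexing; not the library's code.
object NominalReindexSketch {

  /** Replaces each raw interaction value with a dense index, where the
    * most frequent value receives 0.0, the next most frequent 1.0, and so on.
    * Ties keep their order of first appearance (the sort is stable).
    */
  def reindex(values: Seq[Double]): Seq[Double] = {
    val counts: Map[Double, Int] =
      values.groupBy(identity).map { case (value, hits) => value -> hits.size }
    val ordered: Seq[Double] = values.distinct.sortBy(value => -counts(value))
    val index: Map[Double, Double] =
      ordered.zipWithIndex.map { case (value, i) => value -> i.toDouble }.toMap
    values.map(index)
  }
}
```

Multiplying two indexed nominal columns yields products such as 6.0 or 12.0 whose magnitudes carry no ordinal meaning; re-indexing turns them back into category labels.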
Method for generating a product interaction between feature columns
A DataFrame to add a field for an interaction between two columns
InteractionPayload information about the two parent columns and the name of the new interaction column to be created.
A modified DataFrame with the new column.
0.6.2
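A product interaction can be illustrated without Spark by treating each row as a map from column name to value. The names below (`ProductInteractionSketch`, `interactProduct`, and the `Row` alias) are hypothetical and exist only for this example; the library itself operates on a Spark DataFrame.

```scala
// Hypothetical sketch using plain Scala maps in place of a DataFrame.
object ProductInteractionSketch {

  type Row = Map[String, Double]

  /** Adds a column named `outputName` holding left * right for every row.
    * Throws NoSuchElementException if a parent column is missing from a row.
    */
  def interactProduct(
      rows: Seq[Row],
      left: String,
      right: String,
      outputName: String
  ): Seq[Row] =
    rows.map(row => row + (outputName -> row(left) * row(right)))
}
```

On a real DataFrame the same arithmetic would typically be written as `df.withColumn(outputName, col(left) * col(right))`.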
Helper method for recreating the feature vector after interactions have been completed on individual columns
DataFrame containing the interacted fields with the original feature vector dropped
Fields making up the original vector before interaction
Interaction candidate fields that have been selected to be included in the final feature vector
Name of the feature vector field
DataFrame with a new feature vector.
0.6.2
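Rebuilding the feature vector amounts to concatenating the original fields with the selected interaction fields and reading the values in that order. The sketch below shows this for a single row in plain Scala; the object and method names are hypothetical, and in Spark this step would normally be handled by a VectorAssembler rather than hand-written code.

```scala
// Hypothetical sketch for a single row; not the library's implementation.
object FeatureVectorSketch {

  /** Returns the row's values ordered as: original fields first,
    * then the selected interaction fields.
    */
  def rebuildFeatureVector(
      row: Map[String, Double],
      originalFields: Seq[String],
      interactionFields: Seq[String]
  ): Array[Double] = {
    val orderedFields = originalFields ++ interactionFields
    orderedFields.map(field => row(field)).toArray
  }
}
```

Keeping the original fields first means the positions of existing features in the vector do not shift when interactions are appended.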