Class DataFrameJoiner


  • public class DataFrameJoiner
    extends AbstractJoiner
    Implements joins between two or more Tables
    • Constructor Detail

      • DataFrameJoiner

        public DataFrameJoiner(Table table,
                               String... leftJoinColumnNames)
        Creates a joiner that uses the given table as the first (left) table of the join.
        Parameters:
        table - The first (left) table in the join.
        leftJoinColumnNames - The join column names in that table to be used. These names also serve as the default for the second table, unless other names are explicitly provided.
    • Method Detail

      • type

        public DataFrameJoiner type(JoinType joinType)
        Sets the type of join, which defaults to INNER if not provided.
        Specified by:
        type in class AbstractJoiner
        Parameters:
        joinType - The type of join to perform (INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER)
        Returns:
        This joiner object.
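
        The effect of the four join types can be illustrated with a minimal plain-Java sketch that computes which join-key values appear in the result. It does not call the library: JoinKind and survivingKeys are hypothetical stand-ins, and only the four type names come from the documentation above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class JoinTypeSketch {

    // Local stand-in for the library's JoinType; only the four names are taken from the docs.
    public enum JoinKind { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    // Returns the join-key values that would appear in the result for the given kind of join.
    public static Set<String> survivingKeys(List<String> leftKeys,
                                            List<String> rightKeys,
                                            JoinKind kind) {
        Set<String> result = new LinkedHashSet<>();
        switch (kind) {
            case INNER:
                // keys present on both sides
                result.addAll(leftKeys);
                result.retainAll(rightKeys);
                break;
            case LEFT_OUTER:
                // every left key, matched or not
                result.addAll(leftKeys);
                break;
            case RIGHT_OUTER:
                // every right key, matched or not
                result.addAll(rightKeys);
                break;
            case FULL_OUTER:
                // every key from either side
                result.addAll(leftKeys);
                result.addAll(rightKeys);
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = List.of("a", "b", "c");
        List<String> right = List.of("b", "c", "d");
        for (JoinKind kind : JoinKind.values()) {
            System.out.println(kind + " -> " + survivingKeys(left, right, kind));
        }
        // prints:
        // INNER -> [b, c]
        // LEFT_OUTER -> [a, b, c]
        // RIGHT_OUTER -> [b, c, d]
        // FULL_OUTER -> [a, b, c, d]
    }
}
```

        In an actual outer join, the unmatched rows would also carry missing values for the columns of the other table; the sketch tracks keys only.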
      • keepAllJoinKeyColumns

        public DataFrameJoiner keepAllJoinKeyColumns(boolean keep)
        When the argument is true, the join columns of the second (and subsequent) tables are included in the results, even when they're identical in name and data with the first join table. When false, only one set of join columns is retained in the result.

        If a retained join column in the second (or any subsequent) table has the same name as a join column in the first (or any prior) table, it is renamed with the same scheme used for non-join columns: each column with a duplicate name gets the prefix "Tn.", where n is the number of the table in the join.

        If this method is not called, the default is false.

        Specified by:
        keepAllJoinKeyColumns in class AbstractJoiner
        Parameters:
        keep - true to retain the join columns of every table; false to retain only one set of join columns
        Returns:
        this DataFrameJoiner instance
      • allowDuplicateColumnNames

        public DataFrameJoiner allowDuplicateColumnNames(boolean allow)
        If false, the join fails if any columns other than the join columns have the same name. If true, the join succeeds and the duplicate columns are renamed and included in the results. The renamed columns are given the prefix "Tn.", where n is the number of the table in the join. For example, a duplicate column from the second table becomes T2.column_name.

        See also keepAllJoinKeyColumns(boolean), which determines whether to retain the join columns from the second table.

        Specified by:
        allowDuplicateColumnNames in class AbstractJoiner
        Parameters:
        allow - true if columns with duplicate names are to be retained; false otherwise. The default is false.
        Returns:
        this DataFrameJoiner instance
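
        The naming rules described for keepAllJoinKeyColumns and allowDuplicateColumnNames can be sketched in plain Java. This is an illustration of the documented "Tn." scheme, not the library's implementation: resultColumnNames is a hypothetical helper, it assumes the join columns have the same names in both tables, and it compares names case-sensitively.

```java
import java.util.ArrayList;
import java.util.List;

public class ColumnNamingSketch {

    // Hypothetical helper: computes the column names of a two-table join result.
    // tableNumber is the position of the right table in the join (2 for the second table).
    public static List<String> resultColumnNames(List<String> leftColumns,
                                                 List<String> rightColumns,
                                                 List<String> joinColumns,
                                                 int tableNumber,
                                                 boolean keepAllJoinKeyColumns,
                                                 boolean allowDuplicateColumnNames) {
        List<String> result = new ArrayList<>(leftColumns);
        for (String name : rightColumns) {
            boolean isJoinColumn = joinColumns.contains(name);
            if (isJoinColumn && !keepAllJoinKeyColumns) {
                continue; // only one set of join columns is retained
            }
            if (!result.contains(name)) {
                result.add(name); // no clash, keep the name unchanged
            } else if (isJoinColumn || allowDuplicateColumnNames) {
                result.add("T" + tableNumber + "." + name); // clash: apply the "Tn." prefix
            } else {
                throw new IllegalArgumentException("Duplicate non-join column: " + name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = List.of("id", "name", "score");
        List<String> right = List.of("id", "score", "city");
        List<String> keys = List.of("id");

        System.out.println(resultColumnNames(left, right, keys, 2, false, true));
        // prints [id, name, score, T2.score, city]
        System.out.println(resultColumnNames(left, right, keys, 2, true, true));
        // prints [id, name, score, T2.id, T2.score, city]
    }
}
```

        With both flags left at their default of false, the same inputs make the sketch throw, mirroring the documented failure on duplicate non-join column names.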
      • rightJoinColumns

        public DataFrameJoiner rightJoinColumns(String... rightJoinColumnNames)
        The names of the columns to be joined on in the second (right) table. If this method is not called, they default to the names used for the left table.
        Specified by:
        rightJoinColumns in class AbstractJoiner
        Parameters:
        rightJoinColumnNames - The names to be used
        Returns:
        This DataFrameJoiner instance
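
        The defaulting rule for the right-hand join columns can be shown with a small plain-Java sketch. JoinColumnsSketch and effectiveRightColumns are hypothetical names used only for illustration; the sketch assumes nothing about how the library stores these values internally.

```java
import java.util.Arrays;

public class JoinColumnsSketch {

    private final String[] leftNames;
    private String[] rightNames; // stays null until rightJoinColumns is called

    public JoinColumnsSketch(String... leftJoinColumnNames) {
        this.leftNames = leftJoinColumnNames.clone();
    }

    // Fluent setter: returns this instance so calls can be chained.
    public JoinColumnsSketch rightJoinColumns(String... rightJoinColumnNames) {
        this.rightNames = rightJoinColumnNames.clone();
        return this;
    }

    // The right-hand names fall back to the left-hand names when none were set.
    public String[] effectiveRightColumns() {
        return rightNames != null ? rightNames.clone() : leftNames.clone();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                new JoinColumnsSketch("id").effectiveRightColumns()));
        // prints [id]
        System.out.println(Arrays.toString(
                new JoinColumnsSketch("id").rightJoinColumns("customer_id").effectiveRightColumns()));
        // prints [customer_id]
    }
}
```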
      • with

        public DataFrameJoiner with(Table... tables)
        The table or tables to be used on the right side of the join. If more than one table is provided, the join is executed repeatedly, merging each subsequent right table with the prior result.
        Specified by:
        with in class AbstractJoiner
        Parameters:
        tables - The table or tables to be used on the right side
        Returns:
        This DataFrameJoiner instance
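
        The repeated merging described for multiple right tables amounts to a left fold over the tables. The following plain-Java sketch illustrates the idea under simplifying assumptions: each "table" is modeled as a map from a unique join key to its row values, the join is an inner join, and innerJoin and joinAll are hypothetical helpers rather than library methods.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChainedJoinSketch {

    // Inner-joins two "tables" modeled as key -> row values.
    // Rows whose key is missing from the right side are dropped.
    public static Map<String, List<String>> innerJoin(Map<String, List<String>> left,
                                                      Map<String, List<String>> right) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : left.entrySet()) {
            List<String> rightRow = right.get(entry.getKey());
            if (rightRow != null) {
                List<String> merged = new ArrayList<>(entry.getValue());
                merged.addAll(rightRow);
                result.put(entry.getKey(), merged);
            }
        }
        return result;
    }

    // Joins the first table with each of the others in turn,
    // feeding the prior result in as the next left side.
    @SafeVarargs
    public static Map<String, List<String>> joinAll(Map<String, List<String>> first,
                                                    Map<String, List<String>>... others) {
        Map<String, List<String>> result = first;
        for (Map<String, List<String>> next : others) {
            result = innerJoin(result, next);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> names = Map.of("1", List.of("Ann"), "2", List.of("Bob"), "3", List.of("Cy"));
        Map<String, List<String>> ages = Map.of("1", List.of("34"), "2", List.of("41"));
        Map<String, List<String>> cities = Map.of("2", List.of("Oslo"), "3", List.of("Rome"));

        System.out.println(joinAll(names, ages, cities));
        // prints {2=[Bob, 41, Oslo]}
    }
}
```

        Only key 2 appears in all three inputs, so it is the only row left after both joins have been applied in sequence.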
      • join

        public Table join()
        Performs the actual join and returns the results.
        Specified by:
        join in class AbstractJoiner
        Returns:
        The combined table