org.apache.spark.sql.catalyst.plans.logical

ColumnStat

case class ColumnStat(distinctCount: BigInt, min: Option[Any], max: Option[Any], nullCount: BigInt, avgLen: Long, maxLen: Long) extends Product with Serializable

Statistics collected for a column.

1. Supported data types are defined in ColumnStat.supportsType.
2. The JVM data type stored in min/max is the internal data type for the corresponding Catalyst data type. For example, the internal type of DateType is Int, and the internal type of TimestampType is Long.
3. There is no guarantee that the collected statistics are accurate. Approximation algorithms (sketches) might have been used, and the collected data can also be stale.
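To make note 2 concrete, the following is a minimal sketch of what "internal data type" means for min/max. It assumes the usual Catalyst conventions that a DateType value is held as the number of days since 1970-01-01 and a TimestampType value as microseconds since the epoch. `ColumnStatSketch`, `dateToInternal` and `timestampToInternal` are hypothetical names for illustration only; the case class is a local stand-in that mirrors the constructor signature above so that the example runs without Spark on the classpath, and the statistics values are illustrative.

```scala
import java.sql.Timestamp
import java.time.LocalDate

object InternalValueSketch {
  // Local stand-in mirroring the ColumnStat signature; NOT the Spark class itself.
  final case class ColumnStatSketch(
      distinctCount: BigInt,
      min: Option[Any],
      max: Option[Any],
      nullCount: BigInt,
      avgLen: Long,
      maxLen: Long)

  // Assumed internal form of DateType: days since 1970-01-01, as an Int.
  def dateToInternal(d: LocalDate): Int = d.toEpochDay.toInt

  // Assumed internal form of TimestampType: microseconds since the epoch, as a Long.
  // getTime is in milliseconds; getNanos holds the fractional second in nanoseconds.
  def timestampToInternal(t: Timestamp): Long =
    Math.floorDiv(t.getTime, 1000L) * 1000000L + t.getNanos / 1000L

  // Illustrative statistics for a date column covering the year 2016.
  val dateColumnStat: ColumnStatSketch = ColumnStatSketch(
    distinctCount = BigInt(366),
    min = Some(dateToInternal(LocalDate.of(2016, 1, 1))),
    max = Some(dateToInternal(LocalDate.of(2016, 12, 31))),
    nullCount = BigInt(0),
    avgLen = 4L, // a date is stored as a 4-byte Int
    maxLen = 4L)

  def main(args: Array[String]): Unit = {
    println(dateColumnStat)
    println(timestampToInternal(new Timestamp(1500L)))
  }
}
```

Because min and max are typed as Option[Any], a consumer has to know the column's Catalyst data type to interpret the stored value; nothing in the statistics object itself records it.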

distinctCount

number of distinct values

min

minimum value

max

maximum value

nullCount

number of nulls

avgLen

average length of the values. For fixed-length types, this should be a constant.

maxLen

maximum length of the values. For fixed-length types, this should be a constant.
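The remark about fixed-length types can be sketched as follows. This is an illustration rather than Spark's implementation: `fixedWidth` and `lengths` are hypothetical names, the byte widths are the commonly documented default sizes of the listed Catalyst types, and rounding the average up for variable-length values is an assumption of this sketch.

```scala
object FixedLengthSketch {
  // Byte widths of some fixed-length Catalyst types, keyed by type name.
  val fixedWidth: Map[String, Long] = Map(
    "BooleanType" -> 1L,
    "ByteType" -> 1L,
    "ShortType" -> 2L,
    "IntegerType" -> 4L,
    "LongType" -> 8L,
    "FloatType" -> 4L,
    "DoubleType" -> 8L,
    "DateType" -> 4L,
    "TimestampType" -> 8L)

  // Returns (avgLen, maxLen). For a fixed-length type both are the constant
  // width; otherwise they are derived from the observed value lengths.
  def lengths(typeName: String, observedLengths: Seq[Long]): Option[(Long, Long)] =
    fixedWidth.get(typeName) match {
      case Some(width) => Some((width, width))
      case None if observedLengths.nonEmpty =>
        // Assumption of this sketch: the average is rounded up to a whole number.
        val avg = math.ceil(observedLengths.sum.toDouble / observedLengths.size).toLong
        Some((avg, observedLengths.max))
      case None => None
    }

  def main(args: Array[String]): Unit = {
    println(lengths("IntegerType", Seq.empty))
    println(lengths("StringType", Seq(1L, 2L, 4L)))
  }
}
```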

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new ColumnStat(distinctCount: BigInt, min: Option[Any], max: Option[Any], nullCount: BigInt, avgLen: Long, maxLen: Long)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. val avgLen: Long

    average length of the values. For fixed-length types, this should be a constant.

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. val distinctCount: BigInt

    number of distinct values

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. val max: Option[Any]

    maximum value

  13. val maxLen: Long

    maximum length of the values. For fixed-length types, this should be a constant.

  14. val min: Option[Any]

    minimum value

  15. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. final def notify(): Unit

    Definition Classes
    AnyRef
  17. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  18. val nullCount: BigInt

    number of nulls

  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  20. def toMap(colName: String, dataType: DataType): Map[String, String]

    Returns a map from string to string that can be used to serialize the column stats. The key is the name of the field (e.g. "distinctCount" or "min"), and the value is the string representation for the value. min/max values are converted to the external data type. For example, for DateType we store java.sql.Date, and for TimestampType we store java.sql.Timestamp. The deserialization side is defined in ColumnStat.fromMap.

    As part of the protocol, the returned map always contains a key called "version". If the min or max value is absent (None), the corresponding key does not appear in the map.

  21. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
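
The serialization protocol described for toMap can be sketched roughly as below. This is not Spark's code: `ColumnStatSketch`, `toMapSketch`, `render` and `renderDate` are hypothetical names, the case class is a local stand-in for ColumnStat, and the version value "1" is an assumption. The sketch only illustrates the documented shape of the result: one string entry per field, a "version" key that is always present, and "min"/"max" entries that are omitted when the value is None and otherwise rendered in an external, human-readable form.

```scala
import java.time.LocalDate

object ToMapSketch {
  // Local stand-in mirroring the ColumnStat signature; NOT the Spark class itself.
  final case class ColumnStatSketch(
      distinctCount: BigInt,
      min: Option[Any],
      max: Option[Any],
      nullCount: BigInt,
      avgLen: Long,
      maxLen: Long)

  // `render` stands in for the conversion from the internal value to the
  // string form of the external type (for example, a date string).
  def toMapSketch(stat: ColumnStatSketch, render: Any => String): Map[String, String] = {
    val always = Map(
      "version" -> "1", // assumed value; the docs only promise that the key exists
      "distinctCount" -> stat.distinctCount.toString,
      "nullCount" -> stat.nullCount.toString,
      "avgLen" -> stat.avgLen.toString,
      "maxLen" -> stat.maxLen.toString)
    // A None contributes no entry, so the key is simply missing from the map.
    always ++
      stat.min.map(v => "min" -> render(v)).toList ++
      stat.max.map(v => "max" -> render(v)).toList
  }

  // Example renderer for a date column whose internal values are days since epoch.
  val renderDate: Any => String = {
    case days: Int => LocalDate.ofEpochDay(days.toLong).toString
    case other => other.toString
  }

  def main(args: Array[String]): Unit = {
    val stat = ColumnStatSketch(BigInt(2), Some(0), None, BigInt(1), 4L, 4L)
    toMapSketch(stat, renderDate).toSeq.sortBy(_._1).foreach(println)
  }
}
```

A reader of such a map should therefore treat "min" and "max" as optional keys and check "version" before parsing the remaining entries.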
