Object

com.astrolabsoftware.spark3d.spatial3DRDD

Loader

object Loader

Routines to load data stored in a specific data format. Currently available: all Spark DataSource V2 compatible formats, i.e. CSV, JSON, TXT, Avro, Parquet, FITS, HDF5, ROOT (<= 6.11), ...
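
A minimal usage sketch, assuming an existing SparkSession named spark; the file path and column names below are illustrative only (see the individual methods for details):

    import com.astrolabsoftware.spark3d.spatial3DRDD.Loader

    // Load a Parquet file whose Cartesian coordinates are stored in columns
    // x, y, z into an RDD[Point3D]. No reader options are needed here, so the
    // default (empty) options map is used.
    val points = Loader.Point3DRDDFromV2(spark, "path/to/points.parquet", "x,y,z", false, "parquet")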

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def Point3DRDDFromV2(spark: SparkSession, filename: String, colnames: String, isSpherical: Boolean, format: String, options: Map[String, String] = Map("" -> "")): RDD[Point3D]


    Construct an RDD[Point3D] from any data source registered in Spark. For more information about available official connectors, see https://spark-packages.org/?q=tags%3A%22Data%20Sources%22

    That currently includes CSV, JSON, TXT, FITS, ROOT, HDF5, Avro, Parquet...

    // Here is an example with a CSV file containing
    // 3 spherical coordinate columns labeled Z_COSMO,RA,Dec.

    // Filename
    val fn = "path/to/file.csv"
    // Spark data source
    val format = "csv"
    // Options to pass to the DataFrameReader - optional
    val options = Map("header" -> "true")

    // Load the data as RDD[Point3D]
    val rdd = Loader.Point3DRDDFromV2(spark, fn, "Z_COSMO,RA,Dec", true, format, options)
    spark: (SparkSession) The Spark session.

    filename: (String) File name where the data is stored.

    colnames: (String) Comma-separated names of the (x, y, z) columns. Example: "Z_COSMO,RA,Dec".

    isSpherical: (Boolean) If true, it assumes that the coordinates of the Point3D are (r, theta, phi). Otherwise, it assumes Cartesian coordinates (x, y, z).

    format: (String) The name of the data source as registered in Spark. For example:

    • text
    • csv
    • json
    • com.astrolabsoftware.sparkfits
    • org.dianahep.sparkroot
    • gov.llnl.spark.hdf or hdf5

    options: (Map[String, String]) Options to pass to the DataFrameReader. Default is no options.

    returns: (RDD[Point3D])
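
    As a complementary, hedged sketch, the same method can read a FITS catalogue through the com.astrolabsoftware.sparkfits connector listed above; the file name and the "hdu" reader option are assumptions made for illustration, not values prescribed by this API.

    // Hypothetical FITS example (sketch): assumes the spark-fits connector is
    // on the classpath and that the catalogue sits in HDU 1 of the file.
    val fnFits = "path/to/catalogue.fits"
    val optionsFits = Map("hdu" -> "1")

    // Spherical coordinates (r, theta, phi) stored in columns Z_COSMO, RA, Dec.
    val rddFits = Loader.Point3DRDDFromV2(
      spark, fnFits, "Z_COSMO,RA,Dec", true, "com.astrolabsoftware.sparkfits", optionsFits)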

  5. def Point3DRDDFromV2PythonHelper(spark: SparkSession, filename: String, colnames: String, isSpherical: Boolean, format: String, options: HashMap[String, String]): RDD[Point3D]

    Point3DRDDFromV2 version suitable for py4j.

    Note that pyspark works with Python wrappers around the *Java* version of Spark objects, not around the *Scala* version. Therefore, on the Scala side we expose the method Point3DRDDFromV2PythonHelper, a modified version of Point3DRDDFromV2 whose options argument is a java.util.HashMap in order to connect smoothly to a dictionary on the Python side.
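
    As a rough sketch of the idea (not the library's actual implementation), such a helper only needs to convert the incoming java.util.HashMap into an immutable Scala Map and delegate to Point3DRDDFromV2:

    import java.util.{HashMap => JHashMap}
    import scala.collection.JavaConverters._
    import org.apache.spark.rdd.RDD
    import org.apache.spark.sql.SparkSession
    // Package path for Point3D assumed for the sake of the sketch.
    import com.astrolabsoftware.spark3d.geometryObjects.Point3D

    // Sketch only: turn the py4j-provided HashMap (a Python dict on the other
    // side) into a Scala Map, then forward the call to the regular Scala API.
    def point3DRDDFromV2PythonHelperSketch(
        spark: SparkSession, filename: String, colnames: String,
        isSpherical: Boolean, format: String,
        options: JHashMap[String, String]): RDD[Point3D] = {
      Loader.Point3DRDDFromV2(
        spark, filename, colnames, isSpherical, format, options.asScala.toMap)
    }

    The only difference from the public method is the HashMap argument type, which py4j maps onto a Python dictionary without extra conversion code on the Python side.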

  6. def SphereRDDFromV2(spark: SparkSession, filename: String, colnames: String, isSpherical: Boolean, format: String, options: Map[String, String] = Map("" -> "")): RDD[ShellEnvelope]


    Construct an RDD[ShellEnvelope] from any data source registered in Spark. For more information about available official connectors, see https://spark-packages.org/?q=tags%3A%22Data%20Sources%22

    That currently includes CSV, JSON, TXT, FITS, ROOT, HDF5, Avro, Parquet...

    // Here is an example with a CSV file containing
    // 3 Cartesian coordinate columns + 1 radius column labeled x,y,z,radius.

    // Filename
    val fn = "path/to/file.csv"
    // Spark data source
    val format = "csv"
    // Options to pass to the DataFrameReader - optional
    val options = Map("header" -> "true")

    // Load the data as RDD[ShellEnvelope]
    // (isSpherical = false since the centers are given in Cartesian coordinates)
    val rdd = Loader.SphereRDDFromV2(spark, fn, "x,y,z,radius", false, format, options)
    spark: (SparkSession) The Spark session.

    filename: (String) File name where the data is stored. The extension must be explicitly written (.csv, .json, or .txt).

    colnames: (String) Comma-separated names of the (x, y, z, r) columns to read. Example: "Z_COSMO,RA,Dec,Radius".

    isSpherical: (Boolean) If true, it assumes that the coordinates of the center of the ShellEnvelope are (r, theta, phi). Otherwise, it assumes Cartesian coordinates (x, y, z). Default is false.

    format: (String) The name of the data source as registered in Spark. For example:

    • text
    • csv
    • json
    • com.astrolabsoftware.sparkfits
    • org.dianahep.sparkroot
    • gov.llnl.spark.hdf or hdf5

    options: (Map[String, String]) Options to pass to the DataFrameReader. Default is no options.

    returns: (RDD[ShellEnvelope])
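
    For the spherical case suggested by the colnames example above, here is a hedged variant of the CSV example; the file name and column names are illustrative only.

    // Hypothetical example: sphere centers given in spherical coordinates
    // (r, theta, phi) plus a radius column, hence isSpherical = true.
    val rddSph = Loader.SphereRDDFromV2(
      spark, "path/to/file.csv", "Z_COSMO,RA,Dec,Radius", true, "csv", Map("header" -> "true"))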

  7. def SphereRDDFromV2PythonHelper(spark: SparkSession, filename: String, colnames: String, isSpherical: Boolean, format: String, options: HashMap[String, String]): RDD[ShellEnvelope]

    SphereRDDFromV2 version suitable for py4j.

    Note that pyspark works with Python wrappers around the *Java* version of Spark objects, not around the *Scala* version. Therefore, on the Scala side we expose the method SphereRDDFromV2PythonHelper, a modified version of SphereRDDFromV2 whose options argument is a java.util.HashMap in order to connect smoothly to a dictionary on the Python side.
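
    For completeness, a minimal sketch of what the py4j call boils down to on the JVM side; the values are illustrative, and the only difference from SphereRDDFromV2 is the java.util.HashMap of options.

    // Sketch: the Python dictionary of reader options arrives on the JVM side
    // as a java.util.HashMap.
    val jOptions = new java.util.HashMap[String, String]()
    jOptions.put("header", "true")
    val rddFromPy = Loader.SphereRDDFromV2PythonHelper(
      spark, "path/to/file.csv", "x,y,z,radius", false, "csv", jOptions)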

  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  12. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. final def notify(): Unit

    Definition Classes
    AnyRef
  18. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  20. def toString(): String

    Definition Classes
    AnyRef → Any
  21. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  23. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
