com.astrolabsoftware.spark3d.spatial3DRDD
Constructor of Point3DRDD
which is suitable for py4j.
It calls Point3DRDDFromV2PythonHelper instead of Point3DRDDFromV2.
All arguments are the same except options, which is a java.util.HashMap, and storageLevel, which is removed and fixed to StorageLevel.MEMORY_ONLY (users cannot set the storage level in pyspark3d for the moment).
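The difference in the options argument can be illustrated with a small sketch of how a java.util.HashMap coming from py4j might be turned into the immutable Scala Map that the Scala constructor expects. The object and method names below are hypothetical and are not taken from spark3d; scala.collection.JavaConverters is used because it is available in Scala 2.11 through 2.13 (it is deprecated in 2.13 in favour of scala.jdk.CollectionConverters).

```scala
import java.util.{HashMap => JHashMap}
import scala.collection.JavaConverters._

object OptionsConversion {
  // Hypothetical helper (not part of spark3d): copy the Java options map
  // received from py4j into an immutable Scala Map[String, String].
  def toScalaOptions(javaOptions: JHashMap[String, String]): Map[String, String] =
    javaOptions.asScala.toMap
}
```

For instance, a HashMap holding the single entry "header" -> "true" would become Map("header" -> "true").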
Construct a RDD[Point3D] from whatever data source registered in Spark.
For more information about available official connectors:
https://spark-packages.org/?q=tags%3A%22Data%20Sources%22
We currently include: CSV, JSON, TXT, FITS, ROOT, HDF5, Avro, Parquet...
// Here is an example with a CSV file containing
// 3 spherical coordinates columns labeled Z_COSMO,RA,Dec.

// Filename
val fn = "path/to/file.csv"
// Spark datasource
val format = "csv"
// Options to pass to the DataFrameReader - optional
val options = Map("header" -> "true")

// Load the data as RDD[Point3D]
val rdd = new Point3DRDD(spark, fn, "Z_COSMO,RA,Dec", true, format, options)
: (SparkSession) The spark session
: (String) File name where the data is stored.
: (String) Comma-separated names of (x, y, z) columns. Example: "Z_COSMO,RA,Dec".
: (Boolean) If true, it assumes that the coordinates of the Point3D are (r, theta, phi). Otherwise, it assumes cartesian coordinates (x, y, z).
: (String) The name of the data source as registered in Spark. For example:
: (Map[String, String]) Options to pass to the DataFrameReader. Default is no options.
: (StorageLevel) Storage level for the raw RDD (unpartitioned). Default is StorageLevel.NONE. See https://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence for more information.
(RDD[Point3D])
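To make the spherical flag more concrete, here is a minimal sketch of how (r, theta, phi) relates to cartesian (x, y, z). The documentation above only names the triplet; the angular convention used here (theta as the polar angle measured from the +z axis, phi as the azimuth in the x-y plane, both in radians) is an assumption and may differ from the one spark3d uses. The object and method names are hypothetical.

```scala
object SphericalCoordinates {
  // ASSUMPTION: theta is the polar angle from +z and phi the azimuth in the
  // x-y plane, both in radians. This convention is not confirmed by the source.
  def toCartesian(r: Double, theta: Double, phi: Double): (Double, Double, Double) = {
    val x = r * math.sin(theta) * math.cos(phi)
    val y = r * math.sin(theta) * math.sin(phi)
    val z = r * math.cos(theta)
    (x, y, z)
  }
}
```

Under this convention, a point with theta = 0 lies on the +z axis at distance r from the origin.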
Repartition a RDD[T] according to a custom partitioner.
: (SpatialPartitioner) Instance of SpatialPartitioner or any extension of it.
(RDD[T]) Repartitioned RDD[T].
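The idea of repartitioning by a custom spatial partitioner can be sketched without Spark, on a local collection. Everything below is a hypothetical stand-in: the Point type, the SimplePartitioner trait and the octant scheme are illustrative only and do not reproduce spark3d's SpatialPartitioner. The sketch assigns each point a partition index and groups the points by that index.

```scala
object OctantPartitioningSketch {
  // Hypothetical point type used only for this illustration.
  final case class Point(x: Double, y: Double, z: Double)

  // Hypothetical stand-in for a spatial partitioner.
  trait SimplePartitioner {
    def numPartitions: Int
    def partitionOf(p: Point): Int
  }

  // Assigns a point to one of 8 octants from the signs of its coordinates.
  object OctantPartitioner extends SimplePartitioner {
    val numPartitions: Int = 8
    def partitionOf(p: Point): Int = {
      val bx = if (p.x >= 0) 1 else 0
      val by = if (p.y >= 0) 2 else 0
      val bz = if (p.z >= 0) 4 else 0
      bx + by + bz
    }
  }

  // Local analogue of repartitioning: group points by their partition index.
  def repartition(points: Seq[Point], partitioner: SimplePartitioner): Map[Int, Seq[Point]] =
    points.groupBy(partitioner.partitionOf)
}
```

On an actual RDD the grouping step would be replaced by a distributed shuffle keyed on the partition index.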
RDD containing the initial data formatted as T.
Apply a spatial partitioning to this.rawRDD, and return a RDD[T] with the new partitioning. The available partitioning types are listed in utils/GridType. By default, the outgoing level of parallelism is the same as the incoming one (i.e. same number of partitions).
: (String) Type of partitioning to apply. See utils/GridType.
: (Int) Number of partitions for the partitioned RDD. By default (-1), the number of partitions is that of the raw RDD. Set this parameter explicitly to force a different number, but be aware of the shuffling involved.
(RDD[T]) RDD whose elements are T (Point3D, Sphere, etc...)
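The default handling of the partition count described above can be sketched as follows. The object and method names are hypothetical; only the sentinel value -1 and its meaning come from the parameter description.

```scala
object PartitionCount {
  // -1 is the sentinel meaning "keep the partition count of the raw RDD";
  // any other value is taken as the requested number of partitions.
  def resolve(requested: Int, rawNumPartitions: Int): Int =
    if (requested == -1) rawNumPartitions else requested
}
```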
Apply any Spatial Partitioner to this.rawRDD[T], and return a RDD[T] with the new partitioning.
: (SpatialPartitioner) Spatial partitioner as defined in utils.GridType
(RDD[T]) RDD whose elements are T (Point3D, Sphere, etc...)
Constructor of spatialPartitioning
which is suitable for py4j.
py4j does not handle generics, so we explicitly specify the types here.
See discussion here: https://github.com/bartdag/py4j/issues/328
Apply a spatial partitioning to this.rawRDD, and return a RDD[Point3D] with the new partitioning. The available partitioning types are listed in utils/GridType. By default, the outgoing level of parallelism is the same as the incoming one (i.e. same number of partitions).
: (String) Type of partitioning to apply. See utils/GridType.
: (Int) Number of partitions for the partitioned RDD. By default (-1), the number of partitions is that of the raw RDD. Set this parameter explicitly to force a different number, but be aware of the shuffling involved.
(RDD[Point3D]) RDD whose elements are Point3D.
Constructor of spatialPartitioning
which is suitable for py4j.
py4j does not handle generics, so we explicitly specify the types here.
See discussion here: https://github.com/bartdag/py4j/issues/328
Apply any Spatial Partitioner to this.rawRDD[Point3D], and return a RDD[Point3D] with the new partitioning.
: (SpatialPartitioner) Spatial partitioner as defined in utils.GridType
(RDD[Point3D]) RDD whose elements are Point3D
Return a RDD whose elements are the lists of center coordinates.
: (RDD[T]) Input RDD[T]
(RDD[List[Double]]) RDD whose elements are the lists of center coordinates.
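As a rough illustration of the output shape, the sketch below maps a local collection of shapes to lists of center coordinates. The HasCenter trait and the Pt and Ball classes are hypothetical stand-ins for the real element types (Point3D, Sphere, etc.), and a plain Seq is used in place of an RDD.

```scala
object CenterCoordinates {
  // Hypothetical minimal shape interface; not the spark3d type hierarchy.
  trait HasCenter { def center: (Double, Double, Double) }

  final case class Pt(x: Double, y: Double, z: Double) extends HasCenter {
    def center: (Double, Double, Double) = (x, y, z)
  }

  final case class Ball(cx: Double, cy: Double, cz: Double, radius: Double) extends HasCenter {
    def center: (Double, Double, Double) = (cx, cy, cz)
  }

  // Each shape becomes a three-element List(x, y, z) holding its center.
  def toCenterCoordinates(shapes: Seq[HasCenter]): Seq[List[Double]] =
    shapes.map { s =>
      val (x, y, z) = s.center
      List(x, y, z)
    }
}
```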
Constructor of toCenterCoordinateRDD which is suitable for py4j, i.e. it replaces Scala Lists with Java Lists.
Return a RDD whose elements are the lists of center coordinates.
: (RDD[T]) Input RDD[T]
(RDD[java.util.List[Double]]) RDD whose elements are the (Java) lists of center coordinates.
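The replacement of Scala Lists by Java Lists can be sketched on a local collection as below. The object and method names are hypothetical; the conversion itself relies on asJava from scala.collection.JavaConverters (available in Scala 2.11 through 2.13, deprecated in 2.13 in favour of scala.jdk.CollectionConverters), which wraps the Scala list rather than copying it.

```scala
import scala.collection.JavaConverters._

object JavaListConversion {
  // Hypothetical helper: expose each Scala List[Double] as a java.util.List,
  // which py4j can hand over to Python.
  def toJavaLists(rows: Seq[List[Double]]): Seq[java.util.List[Double]] =
    rows.map(_.asJava)
}
```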