Class com.sparkfits.fits.FitsContext

implicit class FitsContext extends Serializable

Adds a method, readfits, to SparkSession that allows reading FITS data. Note that, for the moment, only FITS tables are supported; FITS images will be added later.

The interpreter session below shows how to use basic functionalities:

scala> val fn = "src/test/resources/test_file.fits"
scala> val df = spark.readfits
 .option("datatype", "table")
 .option("HDU", 1)
 .option("printHDUHeader", true)
 .load(fn)
+------ HEADER (HDU=1) ------+
XTENSION= BINTABLE           / binary table extension
BITPIX  =                    8 / array data type
NAXIS   =                    2 / number of array dimensions
NAXIS1  =                   34 / length of dimension 1
NAXIS2  =                20000 / length of dimension 2
PCOUNT  =                    0 / number of group parameters
GCOUNT  =                    1 / number of groups
TFIELDS =                    5 / number of table fields
TTYPE1  = target
TFORM1  = 10A
TTYPE2  = RA
TFORM2  = E
TTYPE3  = Dec
TFORM3  = D
TTYPE4  = Index
TFORM4  = K
TTYPE5  = RunId
TFORM5  = J
END
+----------------------------+
df: org.apache.spark.sql.DataFrame = [target: string, RA: float ... 3 more fields]

scala> df.printSchema
root
 |-- target: string (nullable = true)
 |-- RA: float (nullable = true)
 |-- Dec: double (nullable = true)
 |-- Index: long (nullable = true)
 |-- RunId: integer (nullable = true)

scala> df.show(5)
+----------+---------+--------------------+-----+-----+
|    target|       RA|                 Dec|Index|RunId|
+----------+---------+--------------------+-----+-----+
|NGC0000000| 3.448297| -0.3387486324784641|    0|    1|
|NGC0000001| 4.493667| -1.4414990980543227|    1|    1|
|NGC0000002| 3.787274|  1.3298379564211742|    2|    1|
|NGC0000003| 3.423602|-0.29457151504987844|    3|    1|
|NGC0000004|2.6619017|  1.3957536426732444|    4|    1|
+----------+---------+--------------------+-----+-----+
only showing top 5 rows

Linear Supertypes

Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new FitsContext(spark: SparkSession)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def checkSchema(listOfFitsFiles: List[String]): Unit

    Check that the schemas of the different FITS files to be added are the same. Throws an AssertionError if they are not.

    listOfFitsFiles

    : (List[String]) List of files as a list of String.
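
    Conceptually, the check reduces to comparing the per-file schemas against the first one. A minimal sketch of the idea, assuming a hypothetical helper `schemaOf` that returns the schema inferred from a file's HDU header (this is not the library's actual implementation):

    ```scala
    // Sketch only. `schemaOf` is a hypothetical helper returning the schema
    // inferred from a file's HDU header, e.g. as a String.
    def checkSchemaSketch(listOfFitsFiles: List[String])(schemaOf: String => String): Unit = {
      val schemas = listOfFitsFiles.map(schemaOf)
      // All schemas must match the first one; fail loudly otherwise.
      assert(schemas.forall(_ == schemas.head),
        "The input FITS files do not share the same schema!")
    }
    ```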

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. val conf: Configuration

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def getListOfFiles(it: RemoteIterator[LocatedFileStatus], extensions: List[String] = List(".fits")): List[String]

    Recursively load all FITS files inside a directory.

    it

    : (RemoteIterator[LocatedFileStatus]) Iterator from a Hadoop Path containing information about files.

    extensions

    : (List[String]) List of accepted extensions. Currently only .fits is available. Default is List(".fits").

    returns

    List of files as a list of String.
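
    The core of the method is walking the Hadoop iterator and keeping the paths whose names carry an accepted extension. A hedged sketch using the standard Hadoop FS types (not necessarily the library's exact code):

    ```scala
    import org.apache.hadoop.fs.{LocatedFileStatus, RemoteIterator}

    // Sketch only: accumulate the paths whose name ends with an accepted extension.
    def listFitsFiles(it: RemoteIterator[LocatedFileStatus],
                      extensions: List[String] = List(".fits")): List[String] = {
      var files = List.empty[String]
      while (it.hasNext) {
        val path = it.next.getPath.toString
        if (extensions.exists(path.endsWith)) files = path :: files
      }
      files.reverse
    }
    ```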

  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. def load(fns: List[String]): DataFrame

    Load the HDU data from several FITS files into a single DataFrame. The structure of the HDUs must be the same, that is, they must contain the same number of columns, with the same names and element types. The schema of the DataFrame is directly inferred from the header of the FITS HDU.

    fns

    : (List[String]) List of filenames with the same structure.

    returns

    (DataFrame) Always a single DataFrame, made from the HDU of one FITS file or from the same kind of HDU across several FITS files.
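
    For instance, to take the union of two files sharing the same HDU structure (file names hypothetical, option names as in the session example above):

    ```scala
    // Hypothetical file names; both files must share the same HDU structure.
    val fns = List("file://path/to/data/cat_run1.fits",
                   "file://path/to/data/cat_run2.fits")
    val df = spark.readfits
      .option("datatype", "table")
      .option("HDU", 1)
      .load(fns)  // one DataFrame, union of the HDU data of both files
    ```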

  16. def load(fn: String): DataFrame

    Create a DataFrame from the data of one HDU. The input can be either the path to one FITS file (path + filename), or the path to a directory containing FITS files. In the latter case, the code will load all FITS files listed inside this directory and take the union of the HDU data. Needless to say, the FITS files must have the same structure, otherwise the union is impossible. The input must be a String in Hadoop format:

    • (local) file://path/to/data
    • (HDFS) hdfs://<IP>:<PORT>//path/to/data

    The schema of the DataFrame is directly inferred from the header of the FITS HDU.

    fn

    : (String) Filename of the FITS file to be read, or a directory containing FITS files with the same HDU structure.

    returns

    (DataFrame) Always a single DataFrame, made from the HDU of one FITS file or from the same kind of HDU across several FITS files.
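
    For instance (paths hypothetical; <IP> and <PORT> to be filled in by the user):

    ```scala
    // Single file, local directory, and HDFS directory (hypothetical paths).
    val dfOne  = spark.readfits.option("datatype", "table").option("HDU", 1)
      .load("file://path/to/data/cat.fits")
    val dfDir  = spark.readfits.option("datatype", "table").option("HDU", 1)
      .load("file://path/to/data")
    val dfHdfs = spark.readfits.option("datatype", "table").option("HDU", 1)
      .load("hdfs://<IP>:<PORT>//path/to/data")
    ```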

  17. def loadOne(fn: String): DataFrame

    Load the BinaryTableHDU data contained in one HDU as a DataFrame. The schema of the DataFrame is directly inferred from the header of the FITS HDU.

    fn

    : (String) Path + filename of the FITS file to be read.

    returns

    : DataFrame made from one single HDU.

  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def option(key: String, value: Double): FitsContext

    Adds an input option for reading the underlying data source. (key, Double)

    key

    : (String) Name of the option.

    value

    : (Double) Value of the option.

  22. def option(key: String, value: Long): FitsContext

    Adds an input option for reading the underlying data source. (key, Long)

    key

    : (String) Name of the option.

    value

    : (Long) Value of the option.

  23. def option(key: String, value: Boolean): FitsContext

    Adds an input option for reading the underlying data source. (key, Boolean)

    key

    : (String) Name of the option.

    value

    : (Boolean) Value of the option.

  24. def option(key: String, value: String): FitsContext

    Adds an input option for reading the underlying data source.

    In general you can set the following option(s):

    • option("HDU", <Int>)
    • option("datatype", <String>)
    • option("printHDUHeader", <Boolean>)

    Note that values passed as Boolean, Long, or Double will first be converted to String and decoded later on.

    key

    : (String) Name of the option.

    value

    : (String) Value of the option.
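
    Options can be chained, and, as noted above, non-String values are stored as Strings internally:

    ```scala
    val fc = spark.readfits
      .option("datatype", "table")     // String value
      .option("HDU", 1)                // numeric value, stored as "1"
      .option("printHDUHeader", true)  // Boolean value, stored as "true"
    // `fc` is a FitsContext; call .load(...) on it to get a DataFrame.
    ```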

  25. def readfits: FitsContext

    Replaces the usual Spark 2.X syntax, spark.read.format("fits"), with spark.readfits. This is a hack to avoid touching the DataFrameReader class, whose constructor is private... If you have a better idea, bug me!

    returns

    FitsContext

  26. def schema(schema: StructType): FitsContext

    Adds a schema to our data. It will overwrite the schema inferred from the HDU header. Useful if the header is corrupted.

    schema

    : (StructType) The schema for the data (StructType(List(StructField)))

    returns

    The FitsContext (to chain operations).
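
    A usage sketch, overriding the inferred schema with a manual one (column names and types taken from the session example above; path hypothetical):

    ```scala
    import org.apache.spark.sql.types._

    // Manual schema matching the example table shown earlier.
    val mySchema = StructType(List(
      StructField("target", StringType, nullable = true),
      StructField("RA", FloatType, nullable = true),
      StructField("Dec", DoubleType, nullable = true),
      StructField("Index", LongType, nullable = true),
      StructField("RunId", IntegerType, nullable = true)
    ))

    val df = spark.readfits
      .option("datatype", "table")
      .option("HDU", 1)
      .schema(mySchema)  // overrides the header-inferred schema
      .load("file://path/to/data/cat.fits")
    ```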

  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. def toString(): String

    Definition Classes
    AnyRef → Any
  29. var verbosity: Boolean

  30. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any
