Object

it.agilelab.bigdata.wasp.consumers.spark.strategies.gdpr.utils.hdfs

HdfsUtils

object HdfsUtils extends Logging

Linear Supertypes
Logging, AnyRef, Any

Type Members

  1. implicit class StringPrefix extends AnyRef

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def backupFiles(fs: FileSystem)(filesToBackup: Seq[Path], backupParentDir: Path, dataPath: Path): Try[Path]

    Creates a new directory inside backupParentDir, named "backup_{randomUUID}". Each file in filesToBackup is copied into this directory, preserving any HDFS partitioning present in its path. The new file path is built by removing the base directory (dataPath) from the file path and replacing it with the path of the backup directory.

    Example:

    filesToBackup = ["/user/data/p1=a/p2=b/file.parquet"]
    backupParentDir = "/user"
    dataPath = "/user/data"

    • This function creates: backupDir = "/user/backup_123"
    • It then copies the file into this directory, replacing the prefix "/user/data" with "/user/backup_123": "/user/backup_123/p1=a/p2=b/file.parquet"
    filesToBackup

    Files that should be copied in the backup directory

    backupParentDir

    Base path where to create the backup directory

    dataPath

    Path containing the data that will be backed up

    returns

    Path of the newly created backup directory
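
The path rewriting described for backupFiles can be sketched with plain strings. This is only an illustration under stated assumptions, not the library's implementation: `BackupPathSketch`, `replacePrefix`, and `planBackup` are hypothetical names, no files are copied, and the real method works on Hadoop `Path` values and returns a `Try[Path]`.

```scala
import java.util.UUID

// Hypothetical, string-only sketch of the path rewriting described above.
// It does not touch HDFS; it only computes where each file would land.
object BackupPathSketch {

  // Swap the leading `oldPrefix` of `filePath` for `newPrefix`,
  // keeping everything after it (including partition directories).
  def replacePrefix(filePath: String, oldPrefix: String, newPrefix: String): String = {
    val base = oldPrefix.stripSuffix("/")
    require(filePath.startsWith(base + "/"), s"$filePath is not under $oldPrefix")
    newPrefix.stripSuffix("/") + filePath.substring(base.length)
  }

  // Build the backup directory name and the target path of every file.
  // `id` stands in for the random UUID so the result can be made deterministic.
  def planBackup(
      filesToBackup: Seq[String],
      backupParentDir: String,
      dataPath: String,
      id: String = UUID.randomUUID().toString
  ): (String, Seq[String]) = {
    val backupDir = s"${backupParentDir.stripSuffix("/")}/backup_$id"
    val targets = filesToBackup.map(f => replacePrefix(f, dataPath, backupDir))
    (backupDir, targets)
  }
}
```

Calling `BackupPathSketch.planBackup(Seq("/user/data/p1=a/p2=b/file.parquet"), "/user", "/user/data", id = "123")` reproduces the documented example: the backup directory is "/user/backup_123" and the file maps to "/user/backup_123/p1=a/p2=b/file.parquet".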

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. def deletePath(fs: FileSystem)(sourcePath: Path): Try[Unit]

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. def findPartitionColumns(uri: String): List[(String, String)]

  12. def foldIterator[T, B](iterator: RemoteIterator[T], acc: Try[B])(f: (B, T) ⇒ B)(exitPath: (B) ⇒ Boolean): Try[B]

  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def getRawModelPathToToLoad(rawModel: RawModel, sc: SparkContext): String

  15. def getRawModelPathToWrite(rawModel: RawModel): String

  16. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  17. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  18. val logger: WaspLogger

    Attributes
    protected
    Definition Classes
    Logging
  19. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  20. final def notify(): Unit

    Definition Classes
    AnyRef
  21. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  22. def readRawModel(rawModel: RawModel, spark: SparkSession): Try[DataFrame]

  23. def replacePathPrefix(filePath: Path, prefixPathToChange: Path, newPrefix: Path): Path

  24. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  25. def toString(): String

    Definition Classes
    AnyRef → Any
  26. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. def writeRawModel(rawModel: RawModel, df: DataFrame): Try[Unit]
