org.hammerlab.magic
A SequenceFileInputFormat that is guaranteed to load a file with the same splits it was written out with.
Gunzip a file in HDFS.
Untar (and optionally unzip as well, where appropriate) an HDFS file.
No fancy parallelism is used, just a scan through the entire file on the driver node.
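The gunzip and untar helpers above are described as plain sequential scans on a single node. A minimal sketch of that pattern follows, using only java.util.zip on in-memory streams rather than HDFS. The class and method names (GunzipSketch, gunzip, gzip) are hypothetical and are not this package's API. Against HDFS, the streams would presumably come from Hadoop's FileSystem.open and FileSystem.create instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration; not part of org.hammerlab.magic.
public class GunzipSketch {

    // Decompress a gzip stream in one sequential pass, with no parallelism.
    // Returns the number of decompressed bytes written to `out`.
    static long gunzip(InputStream compressed, OutputStream out) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (GZIPInputStream gz = new GZIPInputStream(compressed)) {
            int n;
            while ((n = gz.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }

    // Helper used only to build gzip input for the round trip below.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(raw);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello, hdfs".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        long written = gunzip(new ByteArrayInputStream(gzip(original)), restored);
        System.out.println(written + " bytes: " + restored.toString("UTF-8"));
    }
}
```

Because the whole file passes through one reader on one machine, the running time grows linearly with file size. That is the trade-off the "no fancy parallelism" note points at.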