Adapted from http://stackoverflow.com/a/7556307/544236.
Untar an input file into an output file.
The output file is created in the output folder, having the same name as the input file, minus the '.tar' extension.
the input .tar file
the output directory.
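As an illustration of the behaviour described above, here is a minimal sketch in Python using the standard `tarfile` module. It is not the original implementation: the function name `untar`, its parameters, and the choice to extract each regular file beneath the output directory are assumptions made for the example.

```python
import shutil
import tarfile
from pathlib import Path


def untar(input_file, output_dir):
    """Extract the regular files of an uncompressed tar beneath output_dir.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    out_root = Path(output_dir).resolve()
    out_root.mkdir(parents=True, exist_ok=True)
    written = []
    # Mode "r:" accepts only uncompressed tar archives.
    with tarfile.open(input_file, mode="r:") as archive:
        for member in archive:
            if not member.isfile():
                continue  # this sketch skips directories, links, and devices
            target = (out_root / member.name).resolve()
            # Refuse entries whose path would escape the output directory.
            if out_root not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
            target.parent.mkdir(parents=True, exist_ok=True)
            with archive.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

A call such as `untar("data.tar", "out")` would write the archive's files under `out/`; both names are placeholders.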
Untar (and optionally unzip as well, where appropriate) an HDFS file.
No parallelism is used; the entire file is scanned sequentially on the driver node.
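The single sequential scan, with optional decompression, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original code: `untar_stream` and the `handle_entry` callback are hypothetical names, and the input is any readable binary stream. Opening the file on HDFS is left out, since the client used for that is not specified here.

```python
import tarfile


def untar_stream(stream, handle_entry):
    """Scan a tar stream once, front to back, on the calling process.

    handle_entry(name, data) is a hypothetical callback invoked for each
    regular file. Returns the number of files handled.
    """
    count = 0
    # Mode "r|*" reads strictly forward (no seeking) and detects gzip, bzip2,
    # or xz compression transparently, so plain and compressed tars both work.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and links in this sketch
            # In stream mode each entry must be read before moving to the next.
            data = archive.extractfile(member).read()
            handle_entry(member.name, data)
            count += 1
    return count
```

Because the stream is never rewound, this works on inputs that do not support seeking, at the cost of processing entries strictly one after another.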