Class NestedJarHandler


  • public class NestedJarHandler
    extends Object
    Unzip a jarfile within a jarfile to a temporary file on disk. Also handles the download of jars from http(s) URLs to temp files.

    Somewhat paradoxically, the fastest way to support scanning zipfiles-within-zipfiles is to unzip the inner zipfile to a temporary file on disk, because an inner zipfile can otherwise only be read using ZipInputStream, not ZipFile (the ZipFile constructors only accept a File or a file path, not a stream). ZipInputStream reads entries sequentially and has no methods for reading the zip directory up front, so when using ZipInputStream rather than ZipFile, you have to decompress the entire zipfile to read all the directory entries. However, the zipfile may contain many non-whitelisted entries, so this could be a lot of wasted work.

    FastClasspathScanner makes two passes: the first reads the zipfile directory, to which whitelist and blacklist criteria are applied (this is a fast operation when using ZipFile), and the second reads only whitelisted (non-blacklisted) entries. In the general case, the ZipFile API is therefore going to be faster than ZipInputStream, so decompressing the inner zipfile to disk is the only efficient option.
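
    The two-pass idea described above can be sketched with the standard java.util.zip API. This is only an illustration, not FastClasspathScanner's actual implementation: the class name TwoPassZipScan, its methods, and the simple prefix-based whitelist are all hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; not part of FastClasspathScanner.
public class TwoPassZipScan {
    /** Pass 1: read only the zip directory and keep file names under the given prefix. */
    static List<String> listWhitelisted(File zip, String whitelistPrefix) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                ZipEntry e = entries.nextElement();
                // Assumption: the whitelist is a single path prefix.
                if (!e.isDirectory() && e.getName().startsWith(whitelistPrefix)) {
                    names.add(e.getName());
                }
            }
        }
        return names;
    }

    /** Pass 2: decompress only the selected entries and return the total bytes read. */
    static long readSelected(File zip, List<String> names) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        try (ZipFile zf = new ZipFile(zip)) {
            for (String name : names) {
                try (InputStream in = zf.getInputStream(zf.getEntry(name))) {
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    /** Builds a small throwaway zipfile so the demo is self-contained. */
    static File makeSampleZip() throws IOException {
        File f = File.createTempFile("sample", ".zip");
        f.deleteOnExit();
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(f))) {
            for (String name : new String[] {"com/example/A.class", "org/other/B.class"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(name.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return f;
    }

    public static void main(String[] args) throws IOException {
        File zip = makeSampleZip();
        List<String> names = listWhitelisted(zip, "com/example/");
        System.out.println(names + " -> " + readSelected(zip, names) + " bytes");
    }
}
```

    Entries outside the whitelisted prefix are never decompressed here, which is the saving that ZipInputStream cannot offer for an inner zipfile.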

    • Method Detail

      • getInnermostNestedJar

        public Map.Entry<File,Set<String>> getInnermostNestedJar(String nestedJarPath,
                                                                 LogNode log)
                                                          throws Exception
        Get a File for a given (possibly nested) jarfile path, unzipping the first N-1 segments of an N-segment '!'-delimited path to temporary files, then returning the File reference for the N-th temporary file.

        If the path does not contain '!', returns the File represented by the path.

        All path segments should end in a jarfile extension, e.g. ".jar" or ".zip".

        Returns:
        An Entry<File, Set<String>>, where the File is the innermost jar, and the Set<String> is the set of all relative paths of scanning roots within the innermost jar (may be empty, or may contain strings like "target/classes" or similar). If there was an issue with the path, returns null.
        Throws:
        Exception
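
        As a rough illustration of the '!'-delimited path format described above, the following sketch splits a nested path into its segments. It does not show how getInnermostNestedJar is implemented: the class NestedPathSegments and its methods are hypothetical, and stripping a leading '/' from inner segments is an assumption made here for paths written in the "outer.jar!/inner.jar" style.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of NestedJarHandler.
public class NestedPathSegments {
    /** Splits "a.jar!/b.jar!/c.jar" into ["a.jar", "b.jar", "c.jar"]. */
    static List<String> splitSegments(String nestedJarPath) {
        List<String> segments = new ArrayList<>();
        String[] parts = nestedJarPath.split("!");
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            // Assumption: inner segments may start with '/', which is dropped.
            if (i > 0 && part.startsWith("/")) {
                part = part.substring(1);
            }
            segments.add(part);
        }
        return segments;
    }

    /** Number of segments that would need unzipping to a temporary file (N-1 of N). */
    static int segmentsToUnzip(String nestedJarPath) {
        return splitSegments(nestedJarPath).size() - 1;
    }

    public static void main(String[] args) {
        String path = "/libs/outer.jar!/BOOT-INF/lib/inner.jar";
        System.out.println(splitSegments(path));
        System.out.println(segmentsToUnzip(path));
    }
}
```

        A path with no '!' yields a single segment and nothing to unzip, matching the case where the File represented by the path is returned directly.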
      • getOutermostJar

        public File getOutermostJar(File jarFile)
        Given a File reference for an inner nested jarfile, find the outermost jarfile it was extracted from.
      • sanitizeFilename

        public String sanitizeFilename(String filename)
      • unzipToTempDir

        public File unzipToTempDir(File jarFile,
                                   String packageRoot,
                                   LogNode log)
                            throws IOException
        Unzip a given package root within a zipfile to a temporary directory, starting several additional threads to perform the unzip in parallel, then return the temporary directory. The temporary directory and all of its contents will be removed when NestedJarHandler.close(LogNode) is called.

        N.B. standalone code for parallel unzip can be found at https://github.com/lukehutch/quickunzip

        Throws:
        IOException
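
        A parallel unzip of one package root might look roughly like the sketch below. It is not the code used by unzipToTempDir or quickunzip: the class ParallelUnzipSketch, its method names, the fixed-size thread pool, and opening one ZipFile per task are all assumptions chosen to keep the example short. Unlike the real method, the sketch does not clean up the temporary directory.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch; not part of NestedJarHandler or quickunzip.
public class ParallelUnzipSketch {
    /** Extracts all files under packageRoot into a new temp dir, one task per entry. */
    static Path unzipPackageRoot(File zip, String packageRoot, int numThreads) throws Exception {
        String prefix = packageRoot.isEmpty() || packageRoot.endsWith("/")
                ? packageRoot : packageRoot + "/";
        Path tempDir = Files.createTempDirectory("unzip-sketch");
        // Read the zip directory once to find the entries to extract.
        List<String> names = new ArrayList<>();
        try (ZipFile zf = new ZipFile(zip)) {
            for (ZipEntry e : Collections.list(zf.entries())) {
                if (!e.isDirectory() && e.getName().startsWith(prefix)) {
                    names.add(e.getName());
                }
            }
        }
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String name : names) {
                futures.add(pool.submit(() -> {
                    Path target = tempDir.resolve(name.substring(prefix.length())).normalize();
                    // Refuse entries that would be written outside the temp dir.
                    if (!target.startsWith(tempDir)) {
                        throw new IOException("Entry escapes target dir: " + name);
                    }
                    Files.createDirectories(target.getParent());
                    // Assumption: each task opens its own ZipFile to avoid shared state.
                    try (ZipFile zf = new ZipFile(zip);
                         InputStream in = zf.getInputStream(zf.getEntry(name))) {
                        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // Propagates any failure from a worker task.
            }
        } finally {
            pool.shutdown();
        }
        return tempDir;
    }

    /** Builds a small throwaway zipfile so the demo is self-contained. */
    static File makeSampleZip() throws IOException {
        File f = File.createTempFile("sample", ".zip");
        f.deleteOnExit();
        try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(f))) {
            for (String name : new String[] {"target/classes/a/A.txt", "other/B.txt"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write("data".getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return f;
    }

    public static void main(String[] args) throws Exception {
        Path dir = unzipPackageRoot(makeSampleZip(), "target/classes", 4);
        System.out.println(Files.exists(dir.resolve("a/A.txt")));
    }
}
```

        Submitting one task per entry keeps the sketch simple; batching entries per thread would reduce the cost of reopening the zipfile.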
      • close

        public void close(LogNode log)
        Delete temporary files and release other resources.