Class FilenameUtils


  • public class FilenameUtils
    extends Object
    General file name and file path manipulation utilities.

    When dealing with file names you can hit problems when moving from a Windows-based development machine to a Unix-based production machine. This class aims to help avoid those problems.

    NOTE: You may be able to avoid using this class entirely by using JDK File objects and the two-argument constructor File(File,String).
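
    As a minimal sketch of the JDK-only alternative mentioned in the note, the two-argument constructor joins a parent directory and a child name with the separator of the running platform. The class name FileJoinSketch, the helper join and the sample names are illustrative assumptions, not part of this class.

```java
import java.io.File;

final class FileJoinSketch {
    /** Joins a parent directory and a child name using the platform separator. */
    static String join(String parentDir, String child) {
        File parent = new File(parentDir);
        // File(File, String) places File.separator between parent and child.
        return new File(parent, child).getPath();
    }

    public static void main(String[] args) {
        // The separator in the printed path depends on the running platform.
        System.out.println(join("dev", "file.txt"));
    }
}
```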

    Most methods on this class are designed to work the same on both Unix and Windows. Those that behave differently include 'System', 'Unix' or 'Windows' in their name.

    Most methods recognize both separators (forward slash and backslash), and both sets of prefixes. See the Javadoc of each method for details.

    This class defines six components within a file name (example C:\dev\project\file.txt):

    • the prefix - C:\
    • the path - dev\project\
    • the full path - C:\dev\project\
    • the name - file.txt
    • the base name - file
    • the extension - txt
    Note that this class works best if directory names end with a separator. If you omit the last separator, it is impossible to determine whether the name refers to a file or a directory, so this class treats it as a file.
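
    The component split above can be sketched with plain string operations. This is a simplified illustration under stated assumptions, not the implementation of this class: the class NameComponentsSketch and its method names are hypothetical, and the prefix and path components are not separated here because that needs the prefix rules described further down.

```java
final class NameComponentsSketch {
    /** Index of the last separator of either style, or -1 if there is none. */
    static int indexOfLastSeparator(String fileName) {
        return Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
    }

    /** Everything up to and including the last separator. */
    static String fullPath(String fileName) {
        return fileName.substring(0, indexOfLastSeparator(fileName) + 1);
    }

    /** Everything after the last separator. */
    static String name(String fileName) {
        return fileName.substring(indexOfLastSeparator(fileName) + 1);
    }

    /** The name without its final dot and extension. */
    static String baseName(String fileName) {
        String name = name(fileName);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? name : name.substring(0, dot);
    }

    /** The text after the final dot of the name, or "" if the name has no dot. */
    static String extension(String fileName) {
        String name = name(fileName);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }
}
```

    For C:\dev\project\file.txt these helpers give the full path C:\dev\project\, the name file.txt, the base name file and the extension txt. For C:\dev\project (no trailing separator) the name is project, which shows why a directory written without its final separator is read as a file.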

    This class only supports Unix and Windows style names. Prefixes are matched as follows:

     Windows:
     a\b\c.txt           --> ""          --> relative
     \a\b\c.txt          --> "\"         --> current drive absolute
     C:a\b\c.txt         --> "C:"        --> drive relative
     C:\a\b\c.txt        --> "C:\"       --> absolute
     \\server\a\b\c.txt  --> "\\server\" --> UNC
    
     Unix:
     a/b/c.txt           --> ""          --> relative
     /a/b/c.txt          --> "/"         --> absolute
     ~/a/b/c.txt         --> "~/"        --> current user
     ~                   --> "~/"        --> current user (slash added)
     ~user/a/b/c.txt     --> "~user/"    --> named user
     ~user               --> "~user/"    --> named user (slash added)
     
    Both prefix styles are always matched, irrespective of the machine that you are currently running on.

    Origin of code: Excalibur, Alexandria, Tomcat, Commons-Utils.

    Since:
    1.1
    • Method Detail

      • normalize

        public static String normalize(String fileName)
        Normalizes a path, removing double and single dot path steps.

        This method normalizes a path to a standard format. The input may contain separators in either Unix or Windows format. The output will contain separators in the format of the system.

        A trailing slash will be retained. A double slash will be merged to a single slash (but UNC names are handled). A single dot path segment will be removed. A double dot will cause that path segment and the one before to be removed. If the double dot has no parent path segment to work with, null is returned.

        The output will be the same on both Unix and Windows except for the separator character.

         /foo//               -->   /foo/
         /foo/./              -->   /foo/
         /foo/../bar          -->   /bar
         /foo/../bar/         -->   /bar/
         /foo/../bar/../baz   -->   /baz
         //foo//./bar         -->   /foo/bar
         /../                 -->   null
         ../foo               -->   null
         foo/bar/..           -->   foo/
         foo/../../bar        -->   null
         foo/../bar           -->   bar
         //server/foo/../bar  -->   //server/bar
         //server/../bar      -->   null
         C:\foo\..\bar        -->   C:\bar
         C:\..\bar            -->   null
         ~/foo/../bar/        -->   ~/bar/
         ~/../bar             -->   null
         
        (Note: the separator in the returned path is that of the running system, Windows or Unix.)
        Parameters:
        fileName - the fileName to normalize, null returns null
        Returns:
        the normalized fileName, or null if invalid. Null bytes inside the string are removed.
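
        The dot-segment rules above can be sketched as follows. This is a simplified illustration, not the implementation of this class: the class DotSegmentSketch is hypothetical, the output separator is passed explicitly instead of being taken from the system, and only an empty prefix or a single leading separator is recognised (drive, UNC and tilde prefixes are out of scope).

```java
import java.util.ArrayList;
import java.util.List;

final class DotSegmentSketch {
    /**
     * Resolves "." and ".." segments; returns null if ".." has no parent segment.
     * Assumption: the only prefixes handled are "" and one leading separator.
     */
    static String normalize(String path, char separator) {
        if (path == null) {
            return null;
        }
        String unified = path.replace('\\', '/');   // accept either input style
        boolean absolute = unified.startsWith("/");
        String[] segments = unified.split("/");     // trailing empty segments are dropped
        String last = segments.length == 0 ? "" : segments[segments.length - 1];
        // A trailing separator, "." or ".." means the result names a directory.
        boolean directory = unified.endsWith("/") || last.equals(".") || last.equals("..");

        List<String> kept = new ArrayList<>();
        for (String segment : segments) {
            if (segment.isEmpty() || segment.equals(".")) {
                continue;                            // merged separator or single dot
            }
            if (segment.equals("..")) {
                if (kept.isEmpty()) {
                    return null;                     // no parent segment to remove
                }
                kept.remove(kept.size() - 1);        // drop the segment before ".."
            } else {
                kept.add(segment);
            }
        }

        String sep = String.valueOf(separator);
        StringBuilder out = new StringBuilder(absolute ? sep : "");
        out.append(String.join(sep, kept));
        if (directory && !kept.isEmpty()) {
            out.append(sep);                         // retain the trailing separator
        }
        return out.toString();
    }
}
```

        Keeping the surviving segments in a list makes the "no parent segment" case explicit: a double dot that arrives when the list is empty has nothing to remove, which is the situation in which null is returned.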
      • normalizeNoEndSeparator

        public static String normalizeNoEndSeparator(String fileName,
                                                     boolean unixSeparator)
        Normalizes a path, removing double and single dot path steps, and removing any final directory separator.

        This method normalizes a path to a standard format. The input may contain separators in either Unix or Windows format. The output will contain separators in the format specified.

        A trailing slash will be removed. A double slash will be merged to a single slash (but UNC names are handled). A single dot path segment will be removed. A double dot will cause that path segment and the one before to be removed. If the double dot has no parent path segment to work with, null is returned.

        The output will be the same on both Unix and Windows, including the separator character.

         /foo//               -->   /foo
         /foo/./              -->   /foo
         /foo/../bar          -->   /bar
         /foo/../bar/         -->   /bar
         /foo/../bar/../baz   -->   /baz
         //foo//./bar         -->   /foo/bar
         /../                 -->   null
         ../foo               -->   null
         foo/bar/..           -->   foo
         foo/../../bar        -->   null
         foo/../bar           -->   bar
         //server/foo/../bar  -->   //server/bar
         //server/../bar      -->   null
         C:\foo\..\bar        -->   C:\bar
         C:\..\bar            -->   null
         ~/foo/../bar/        -->   ~/bar
         ~/../bar             -->   null
         
        Parameters:
        fileName - the fileName to normalize, null returns null
        unixSeparator - true if a Unix separator should be used or false if a Windows separator should be used.
        Returns:
        the normalized fileName, or null if invalid. Null bytes inside the string are removed.
        Since:
        2.0
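
        The two differences from normalize, an explicit separator style and no final separator, can be sketched on their own. This is an illustration under assumptions, not the implementation of this class: the class EndSeparatorSketch and the method withoutEndSeparator are hypothetical, the input is assumed to have its dot segments already resolved, and only a lone separator is protected as a prefix (a prefix such as C:\ is not).

```java
final class EndSeparatorSketch {
    /** Rewrites separators to the requested style and drops one trailing separator. */
    static String withoutEndSeparator(String normalized, boolean unixSeparator) {
        if (normalized == null) {
            return null;
        }
        char wanted = unixSeparator ? '/' : '\\';
        char other = unixSeparator ? '\\' : '/';
        String result = normalized.replace(other, wanted);
        // Keep a lone separator: it is the root prefix, not a trailing separator.
        if (result.length() > 1 && result.charAt(result.length() - 1) == wanted) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }
}
```

        Because the separator style comes from the argument rather than from the running system, the same call returns the same string on every platform.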
      • separatorsToUnix

        public static String separatorsToUnix(String path)
        Converts all separators to the Unix separator, the forward slash.
        Parameters:
        path - the path to be changed, null ignored
        Returns:
        the updated path
      • separatorsToWindows

        public static String separatorsToWindows(String path)
        Converts all separators to the Windows separator, the backslash.
        Parameters:
        path - the path to be changed, null ignored
        Returns:
        the updated path
      • separatorsToSystem

        public static String separatorsToSystem(String path)
        Converts all separators to the system separator.
        Parameters:
        path - the path to be changed, null ignored
        Returns:
        the updated path
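
        The three separator conversions above amount to character replacement. The following is a minimal sketch, not the implementation of this class: the class SeparatorSketch and its method names are hypothetical, and "null ignored" is assumed to mean that a null path is returned unchanged.

```java
import java.io.File;

final class SeparatorSketch {
    /** Replaces every backslash with a forward slash; null stays null. */
    static String toUnix(String path) {
        return path == null ? null : path.replace('\\', '/');
    }

    /** Replaces every forward slash with a backslash; null stays null. */
    static String toWindows(String path) {
        return path == null ? null : path.replace('/', '\\');
    }

    /** Picks one of the two conversions based on the separator of the running JVM. */
    static String toSystem(String path) {
        return File.separatorChar == '\\' ? toWindows(path) : toUnix(path);
    }
}
```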
      • getPrefixLength

        public static int getPrefixLength(String fileName)
        Returns the length of the fileName prefix, such as C:/ or ~/.

        This method will handle a file in either Unix or Windows format.

        The prefix length includes the first slash in the full fileName if applicable. Thus, the length returned can be greater than the length of the input string: for the input ~ the result is 2, because the slash that would be added is counted.

         Windows:
         a\b\c.txt           --> 0           --> relative
         \a\b\c.txt          --> 1           --> current drive absolute
         C:a\b\c.txt         --> 2           --> drive relative
         C:\a\b\c.txt        --> 3           --> absolute
         \\server\a\b\c.txt  --> 9           --> UNC
         \\\a\b\c.txt        --> -1          --> error
        
         Unix:
         a/b/c.txt           --> 0           --> relative
         /a/b/c.txt          --> 1           --> absolute
         ~/a/b/c.txt         --> 2           --> current user
         ~                   --> 2           --> current user (slash added)
         ~user/a/b/c.txt     --> 6           --> named user
         ~user               --> 6           --> named user (slash added)
         //server/a/b/c.txt  --> 9
         ///a/b/c.txt        --> -1          --> error
         C:                  --> 0           --> valid filename as only null byte and / are reserved characters
         

        The output will be the same irrespective of the machine that the code is running on, i.e., both Unix and Windows prefixes are always matched. Note that a leading // (or \\) indicates a UNC name on Windows. It must be followed by a server name, so double slashes at the start of the fileName are not collapsed to a single slash.

        Parameters:
        fileName - the fileName to find the prefix in, null returns -1
        Returns:
        the length of the prefix, -1 if invalid or null
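
        The prefix rules in the tables above can be sketched as a chain of checks. This is a simplified illustration, not the implementation of this class: the class PrefixLengthSketch and its helpers are hypothetical, any ASCII letter followed by a colon is assumed to be a drive prefix, and the bare "C:" row of the Unix table is therefore not reproduced.

```java
final class PrefixLengthSketch {
    private static boolean isSeparator(char c) {
        return c == '/' || c == '\\';
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Index of the first separator of either style at or after 'from', or -1. */
    private static int nextSeparator(String s, int from) {
        for (int i = from; i < s.length(); i++) {
            if (isSeparator(s.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    static int prefixLength(String fileName) {
        if (fileName == null) {
            return -1;
        }
        if (fileName.isEmpty()) {
            return 0;
        }
        char first = fileName.charAt(0);
        if (first == '~') {
            int sep = nextSeparator(fileName, 1);
            // No separator: count the slash that would be added, so "~" gives 2.
            return sep < 0 ? fileName.length() + 1 : sep + 1;
        }
        if (fileName.length() >= 2 && isAsciiLetter(first) && fileName.charAt(1) == ':') {
            boolean rooted = fileName.length() > 2 && isSeparator(fileName.charAt(2));
            return rooted ? 3 : 2;   // absolute ("C:\") or drive relative ("C:")
        }
        if (fileName.length() >= 2 && isSeparator(first) && isSeparator(fileName.charAt(1))) {
            int sep = nextSeparator(fileName, 2);
            // An empty server name or a missing separator after it is an error.
            return sep <= 2 ? -1 : sep + 1;
        }
        return isSeparator(first) ? 1 : 0;   // rooted path or relative path
    }
}
```

        The tilde branch is where the result can exceed the input length: for ~user the sketch returns 6 although the string has five characters.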