Class UnicodeBOMInputStream

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class UnicodeBOMInputStream
    extends InputStream
    The UnicodeBOMInputStream class wraps any InputStream and detects whether the stream begins with a Unicode BOM (Byte Order Mark), as described in RFC 3629 - UTF-8, a transformation format of ISO 10646.

    The Unicode FAQ defines 5 types of BOMs:

    •  00 00 FE FF  = UTF-32, big-endian
       
    •  FF FE 00 00  = UTF-32, little-endian
       
    •  FE FF        = UTF-16, big-endian
       
    •  FF FE        = UTF-16, little-endian
       
    •  EF BB BF     = UTF-8
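
    The table above can be turned into a prefix check on the first bytes of a stream. The following is a minimal sketch, not this class's actual implementation: the BomSniffer class and its detect method are hypothetical names used only for illustration. It assumes the 4-byte marks should be tested before the 2-byte ones, because FF FE 00 00 (UTF-32, little-endian) begins with FF FE (UTF-16, little-endian).

```java
final class BomSniffer {
    /**
     * Hypothetical helper: returns a label for the BOM found at the start
     * of {@code head}, or "NONE" if no known BOM is present.
     */
    static String detect(byte[] head) {
        // Longer marks first: FF FE 00 00 would otherwise match FF FE.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return "NONE";
    }

    /** Compares the leading bytes of {@code data} with {@code prefix} as unsigned values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            // Java bytes are signed, so mask to 0..255 before comparing.
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

    Under this ordering, a UTF-16 little-endian stream whose first character is U+0000 would be reported as UTF-32, little-endian; the byte patterns alone cannot tell the two apart.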
       

    Use the getBOM() method to check whether a BOM was detected.

    Use the skipBOM() method to remove the detected BOM from the wrapped InputStream object.
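
    The signatures of getBOM() and skipBOM() are not listed in this section, so the sketch below does not call them. Instead it illustrates, with java.io classes only, the effect that skipping a BOM has on a stream: the leading mark is consumed and every byte after it is still available to the caller. BomSkipper and skipLeadingBom are hypothetical names, and this is an assumption about the general technique, not the class's own code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

final class BomSkipper {
    /** Hypothetical helper: returns a stream positioned just past any leading BOM. */
    static InputStream skipLeadingBom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 4);
        byte[] head = new byte[4];

        // Read up to four bytes; short streams simply yield fewer.
        int n = 0;
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        int bomLength = bomLength(head, n);
        // Push back everything after the BOM so the caller reads it normally.
        pushback.unread(head, bomLength, n - bomLength);
        return pushback;
    }

    /** Returns the length of the BOM at the start of {@code b}, or 0 if there is none. */
    private static int bomLength(byte[] b, int n) {
        int[][] boms = {
            {0x00, 0x00, 0xFE, 0xFF}, // UTF-32, big-endian
            {0xFF, 0xFE, 0x00, 0x00}, // UTF-32, little-endian
            {0xEF, 0xBB, 0xBF},       // UTF-8
            {0xFE, 0xFF},             // UTF-16, big-endian
            {0xFF, 0xFE},             // UTF-16, little-endian
        };
        for (int[] bom : boms) {
            if (n < bom.length) {
                continue;
            }
            boolean match = true;
            for (int i = 0; i < bom.length; i++) {
                if ((b[i] & 0xFF) != bom[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return bom.length;
            }
        }
        return 0;
    }
}
```

    A stream without a BOM passes through unchanged, since all bytes read during detection are pushed back.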

    Version:
    1.0
    Author:
    Gregory Pakosz