codecs (Jython API documentation)

java.lang.Object
- org.python.core.codecs

```
public class codecs
extends java.lang.Object
```
This class implements the codec registry and utility methods supporting codecs, such as those providing the standard replacement strategies ("ignore", "backslashreplace", etc.). The _codecs module relies heavily on apparatus implemented here, and therefore so does the Python codecs module (in Lib/codecs.py). It corresponds approximately to CPython's Python/codecs.c.
The class also contains the inner methods of the standard Unicode codecs, available for transcoding text at the Java level. These are also exposed through the _codecs module. In CPython, the corresponding implementations are found in Objects/unicodeobject.c.

Since:

Jython 2.0

Field Summary

Fields
Modifier and Type	Field and Description
`static java.lang.String`	`BACKSLASHREPLACE`
`static java.lang.String`	`IGNORE`
`static java.lang.String`	`REPLACE`
`static java.lang.String`	`XMLCHARREFREPLACE`

Constructor Summary

Constructors
Constructor and Description
`codecs()`

Method Summary

Methods
Modifier and Type	Method and Description
`static PyObject`	`backslashreplace_errors(PyObject[] args, java.lang.String[] kws)`
`static java.lang.StringBuilder`	`backslashreplace(int start, int end, java.lang.String toReplace)`
`static int`	`calcNewPosition(int size, PyObject errorTuple)` Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position.
`static PyObject`	`decode(PyString v, java.lang.String encoding, java.lang.String errors)` Decode the bytes `v` using the codec registered for the `encoding`.
`static PyObject`	`decoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason)` Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through `register_error(String, PyObject)`.
`static java.lang.String`	`encode(PyString v, java.lang.String encoding, java.lang.String errors)` Encode `v` using the codec registered for the `encoding`.
`static PyObject`	`encoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toEncode, int start, int end, java.lang.String reason)` Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through `register_error(String, PyObject)`.
`static java.lang.String`	`getDefaultEncoding()`
`static PyObject`	`ignore_errors(PyObject[] args, java.lang.String[] kws)`
`static int`	`insertReplacementAndGetResume(java.lang.StringBuilder partialDecode, java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason)` Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception).
`static PyObject`	`lookup_error(java.lang.String handlerName)`
`static PyTuple`	`lookup(java.lang.String encoding)`
`static java.lang.String`	`PyUnicode_DecodeASCII(java.lang.String str, int size, java.lang.String errors)`
`static PyUnicode`	`PyUnicode_DecodeIDNA(java.lang.String input, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_DecodeLatin1(java.lang.String str, int size, java.lang.String errors)`
`static PyUnicode`	`PyUnicode_DecodePunycode(java.lang.String input, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_DecodeRawUnicodeEscape(java.lang.String str, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_DecodeUTF7(java.lang.String bytes, java.lang.String errors)` Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object.
`static java.lang.String`	`PyUnicode_DecodeUTF7Stateful(java.lang.String bytes, java.lang.String errors, int[] consumed)` Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object, and the amount of input consumed.
`static java.lang.String`	`PyUnicode_DecodeUTF8(java.lang.String str, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_DecodeUTF8Stateful(java.lang.String str, java.lang.String errors, int[] consumed)`
`static java.lang.String`	`PyUnicode_EncodeASCII(java.lang.String str, int size, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_EncodeIDNA(PyUnicode input, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_EncodeLatin1(java.lang.String str, int size, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_EncodePunycode(PyUnicode input, java.lang.String errors)`
`static java.lang.String`	`PyUnicode_EncodeRawUnicodeEscape(java.lang.String str, java.lang.String errors, boolean modifed)`
`static java.lang.String`	`PyUnicode_EncodeUTF7(java.lang.String unicode, boolean base64SetO, boolean base64WhiteSpace, java.lang.String errors)` Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String.
`static java.lang.String`	`PyUnicode_EncodeUTF8(java.lang.String str, java.lang.String errors)`
`static void`	`register_error(java.lang.String name, PyObject error)`
`static void`	`register(PyObject search_function)`
`static PyObject`	`replace_errors(PyObject[] args, java.lang.String[] kws)`
`static void`	`setDefaultEncoding(java.lang.String encoding)`
`static PyObject`	`strict_errors(PyObject[] args, java.lang.String[] kws)`
`static PyObject`	`xmlcharrefreplace_errors(PyObject[] args, java.lang.String[] kws)`
`static java.lang.StringBuilder`	`xmlcharrefreplace(int start, int end, java.lang.String toReplace)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - BACKSLASHREPLACE
```
public static final java.lang.String BACKSLASHREPLACE
```
    See Also:
    Constant Field Values
  - IGNORE
```
public static final java.lang.String IGNORE
```
    See Also:
    Constant Field Values
  - REPLACE
```
public static final java.lang.String REPLACE
```
    See Also:
    Constant Field Values
  - XMLCHARREFREPLACE
```
public static final java.lang.String XMLCHARREFREPLACE
```
    See Also:
    Constant Field Values
- Constructor Detail
  - codecs
```
public codecs()
```
- Method Detail
  - getDefaultEncoding
```
public static java.lang.String getDefaultEncoding()
```
  - setDefaultEncoding
```
public static void setDefaultEncoding(java.lang.String encoding)
```
  - lookup_error
```
public static PyObject lookup_error(java.lang.String handlerName)
```
  - register_error
```
public static void register_error(java.lang.String name,
                  PyObject error)
```
  - register
```
public static void register(PyObject search_function)
```
  - lookup
```
public static PyTuple lookup(java.lang.String encoding)
```
  - decode
```
public static PyObject decode(PyString v,
              java.lang.String encoding,
              java.lang.String errors)
```
    Decode the bytes v using the codec registered for the encoding. The encoding defaults to the system default encoding (see getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that decoding errors raise a ValueError. This method is exposed through the _codecs module as _codecs.decode(PyString, String, String).
    
    Parameters:
    v - bytes to be decoded
    encoding - name of encoding (to look up in codec registry)
    errors - error policy name (e.g. "ignore", "replace")
    
    Returns:
    Unicode string decoded from bytes
  - encode
```
public static java.lang.String encode(PyString v,
                      java.lang.String encoding,
                      java.lang.String errors)
```
    Encode v using the codec registered for the encoding. The encoding defaults to the system default encoding (see getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict' meaning that encoding errors raise a ValueError.
    
    Parameters:
    v - unicode string to be encoded
    encoding - name of encoding (to look up in codec registry)
    errors - error policy name (e.g. "ignore")
    
    Returns:
    bytes object encoding v
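    The effect of the error policy names can be illustrated without Jython. The following is a sketch only, not this class's implementation: it uses the standard java.nio.charset encoder, whose IGNORE, REPLACE and REPORT actions behave analogously to "ignore", "replace" and "strict". The class and method names are hypothetical, and the unchecked exception stands in for the ValueError raised under 'strict'.

```java
final class ErrorPolicySketch {
    /** Encode text as ASCII, applying the given action to characters that cannot be encoded. */
    static String encodeAscii(String text, java.nio.charset.CodingErrorAction action) {
        java.nio.charset.CharsetEncoder encoder =
                java.nio.charset.StandardCharsets.US_ASCII.newEncoder()
                        .onUnmappableCharacter(action)
                        .onMalformedInput(action);
        try {
            java.nio.ByteBuffer bytes = encoder.encode(java.nio.CharBuffer.wrap(text));
            // Present the encoded bytes as a String holding one char per byte.
            return java.nio.charset.StandardCharsets.ISO_8859_1.decode(bytes).toString();
        } catch (java.nio.charset.CharacterCodingException e) {
            // REPORT is the analogue of 'strict': the error is raised to the caller.
            throw new IllegalArgumentException("cannot encode input as ASCII", e);
        }
    }
}
```

    With the input "caf\u00e9", IGNORE drops the unencodable character, REPLACE substitutes the encoder's replacement byte, and REPORT raises the exception.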
  - strict_errors
```
public static PyObject strict_errors(PyObject[] args,
                     java.lang.String[] kws)
```
  - ignore_errors
```
public static PyObject ignore_errors(PyObject[] args,
                     java.lang.String[] kws)
```
  - replace_errors
```
public static PyObject replace_errors(PyObject[] args,
                      java.lang.String[] kws)
```
  - xmlcharrefreplace_errors
```
public static PyObject xmlcharrefreplace_errors(PyObject[] args,
                                java.lang.String[] kws)
```
  - xmlcharrefreplace
```
public static java.lang.StringBuilder xmlcharrefreplace(int start,
                                        int end,
                                        java.lang.String toReplace)
```
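    No description is given for this method. As a rough guide, the Python-level "xmlcharrefreplace" policy substitutes a decimal XML character reference for each character that cannot be encoded. The sketch below shows that substitution for the range [start, end) in plain Java; the class and method names are hypothetical, and treating a surrogate pair as one code point is an assumption, not a statement about the Jython implementation.

```java
final class XmlCharRefSketch {
    /** Build decimal XML character references for the code points in text[start, end). */
    static StringBuilder xmlCharRefs(int start, int end, String text) {
        StringBuilder out = new StringBuilder();
        int i = start;
        while (i < end) {
            int cp = text.codePointAt(i);       // combines a surrogate pair into one code point
            out.append("&#").append(cp).append(';');
            i += Character.charCount(cp);       // advance by one or two chars
        }
        return out;
    }
}
```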
  - backslashreplace_errors
```
public static PyObject backslashreplace_errors(PyObject[] args,
                               java.lang.String[] kws)
```
  - backslashreplace
```
public static java.lang.StringBuilder backslashreplace(int start,
                                       int end,
                                       java.lang.String toReplace)
```
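    No description is given for this method. For orientation, the Python-level "backslashreplace" policy substitutes a backslash escape whose width depends on the code point: two hex digits below U+0100, four below U+10000, and eight otherwise. The sketch below reproduces that rule in plain Java; the names are hypothetical and the handling of surrogate pairs is an assumption.

```java
final class BackslashReplaceSketch {
    /** Build backslash escapes for the code points in text[start, end). */
    static StringBuilder escapes(int start, int end, String text) {
        StringBuilder out = new StringBuilder();
        int i = start;
        while (i < end) {
            int cp = text.codePointAt(i);
            if (cp < 0x100) {
                out.append(String.format("\\x%02x", cp));   // two hex digits
            } else if (cp < 0x10000) {
                out.append(String.format("\\u%04x", cp));   // four hex digits
            } else {
                out.append(String.format("\\U%08x", cp));   // eight hex digits
            }
            i += Character.charCount(cp);
        }
        return out;
    }
}
```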
  - PyUnicode_DecodeUTF7Stateful
```
public static java.lang.String PyUnicode_DecodeUTF7Stateful(java.lang.String bytes,
                                            java.lang.String errors,
                                            int[] consumed)
```
    Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object, and the amount of input consumed. The only state preserved is the read position, i.e. how many bytes have been consumed. So if the input ends part way through a Base64 sequence, the data reported as consumed is only that up to, and not including, the Base64 start marker ('+'). Performance will be poor (quadratic cost) on runs of Base64 data long enough to exceed the input quantum in incremental decoding. The returned Java String is a UTF-16 representation of the Unicode result, in line with Java conventions. Unicode characters above the BMP are represented as surrogate pairs.
    
    Parameters:
    bytes - input represented as String (Jython PyString convention)
    errors - error policy name (e.g. "ignore", "replace")
    consumed - returns number of bytes consumed in element 0, or is null if a "final" call
    
    Returns:
    unicode result (as UTF-16 Java String)
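    The rule for the reported read position can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the Jython decoder: it decodes nothing, treats any character outside the Base64 alphabet as ending a run, and only computes the value that would be stored in consumed[0]. The class and method names are hypothetical.

```java
final class Utf7ConsumedSketch {
    private static final String BASE64 =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Number of leading bytes that may be reported as consumed. */
    static int consumable(String bytes) {
        int i = 0;
        while (i < bytes.length()) {
            if (bytes.charAt(i) != '+') {
                i++;                            // directly encoded character
                continue;
            }
            int j = i + 1;                      // first char after the start marker
            while (j < bytes.length() && BASE64.indexOf(bytes.charAt(j)) >= 0) {
                j++;
            }
            if (j == bytes.length()) {
                return i;                       // run not terminated: stop before '+'
            }
            if (bytes.charAt(j) == '-') {
                j++;                            // explicit terminator is absorbed
            }
            i = j;
        }
        return i;
    }
}
```

    Because the position falls back to the '+' marker, the next call has to rescan the whole unfinished run, which is where the quadratic cost on long Base64 runs comes from.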
  - PyUnicode_DecodeUTF7
```
public static java.lang.String PyUnicode_DecodeUTF7(java.lang.String bytes,
                                    java.lang.String errors)
```
    Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object. The returned Java String is a UTF-16 representation of the Unicode result, in line with Java conventions. Unicode characters above the BMP are represented as surrogate pairs.
    
    Parameters:
    bytes - input represented as String (Jython PyString convention)
    errors - error policy name (e.g. "ignore", "replace")
    
    Returns:
    unicode result (as UTF-16 Java String)
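    The surrogate-pair representation mentioned above is ordinary Java String behaviour and can be checked with the standard library alone. The helper below is a hypothetical illustration, not part of this class.

```java
final class SurrogatePairSketch {
    /** Build the UTF-16 Java String for a single Unicode code point. */
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }
}
```

    For a code point above the BMP such as U+1F600, the resulting String has length() 2 but codePointCount(0, 2) of 1, so callers counting characters in a decoded result should count code points rather than chars.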
  - PyUnicode_EncodeUTF7
```
public static java.lang.String PyUnicode_EncodeUTF7(java.lang.String unicode,
                                    boolean base64SetO,
                                    boolean base64WhiteSpace,
                                    java.lang.String errors)
```
    Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String. (String representation for byte data is chosen so that it may immediately become a PyString.) This method differs from the CPython equivalent (in Objects/unicodeobject.c), which works with an array of code units that are, in a wide build, Unicode code points.
    
    Parameters:
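    unicode - Unicode string to be encoded (as UTF-16 Java String)
    base64SetO - true if characters in "Set O" (the optional direct characters) should be Base64-encoded
    base64WhiteSpace - true if white-space characters should be Base64-encoded
    errors - error policy name (e.g. "ignore", "replace")
    
    Returns:
    UTF-7 bytes represented as String (Jython PyString convention)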
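    The "bytes as String" convention used for the result, where each char carries one byte in its low eight bits, can be sketched in plain Java. The helper names are hypothetical; the conversion shown is simply ISO-8859-1 decoding, which maps each byte to the char with the same value.

```java
final class ByteStringSketch {
    /** Represent raw bytes as a String holding one char (0..255) per byte. */
    static String toByteString(byte[] data) {
        return new String(data, java.nio.charset.StandardCharsets.ISO_8859_1);
    }

    /** Recover the raw bytes by keeping the low byte of each char. */
    static byte[] toBytes(String byteString) {
        byte[] out = new byte[byteString.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) byteString.charAt(i);
        }
        return out;
    }
}
```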
  - PyUnicode_DecodeUTF8
```
public static java.lang.String PyUnicode_DecodeUTF8(java.lang.String str,
                                    java.lang.String errors)
```
  - PyUnicode_DecodeUTF8Stateful
```
public static java.lang.String PyUnicode_DecodeUTF8Stateful(java.lang.String str,
                                            java.lang.String errors,
                                            int[] consumed)
```
  - PyUnicode_EncodeUTF8
```
public static java.lang.String PyUnicode_EncodeUTF8(java.lang.String str,
                                    java.lang.String errors)
```
  - PyUnicode_DecodeASCII
```
public static java.lang.String PyUnicode_DecodeASCII(java.lang.String str,
                                     int size,
                                     java.lang.String errors)
```
  - PyUnicode_DecodeLatin1
```
public static java.lang.String PyUnicode_DecodeLatin1(java.lang.String str,
                                      int size,
                                      java.lang.String errors)
```
  - PyUnicode_EncodeASCII
```
public static java.lang.String PyUnicode_EncodeASCII(java.lang.String str,
                                     int size,
                                     java.lang.String errors)
```
  - PyUnicode_EncodeLatin1
```
public static java.lang.String PyUnicode_EncodeLatin1(java.lang.String str,
                                      int size,
                                      java.lang.String errors)
```
  - PyUnicode_EncodeRawUnicodeEscape
```
public static java.lang.String PyUnicode_EncodeRawUnicodeEscape(java.lang.String str,
                                                java.lang.String errors,
                                                boolean modifed)
```
  - PyUnicode_DecodeRawUnicodeEscape
```
public static java.lang.String PyUnicode_DecodeRawUnicodeEscape(java.lang.String str,
                                                java.lang.String errors)
```
  - PyUnicode_EncodePunycode
```
public static java.lang.String PyUnicode_EncodePunycode(PyUnicode input,
                                        java.lang.String errors)
```
  - PyUnicode_DecodePunycode
```
public static PyUnicode PyUnicode_DecodePunycode(java.lang.String input,
                                 java.lang.String errors)
```
  - PyUnicode_EncodeIDNA
```
public static java.lang.String PyUnicode_EncodeIDNA(PyUnicode input,
                                    java.lang.String errors)
```
  - PyUnicode_DecodeIDNA
```
public static PyUnicode PyUnicode_DecodeIDNA(java.lang.String input,
                             java.lang.String errors)
```
  - encoding_error
```
public static PyObject encoding_error(java.lang.String errors,
                      java.lang.String encoding,
                      java.lang.String toEncode,
                      int start,
                      int end,
                      java.lang.String reason)
```
    Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through register_error(String, PyObject). The return value is the return from the error handler, indicating the replacement codec input and the position at which to resume encoding. Invokes the mechanism described in PEP 293.
    
    Parameters:
    errors - name of the error policy (or null meaning "strict")
    encoding - name of encoding that encountered the error
    toEncode - unicode string being encoded
    start - index of first char it couldn't encode
    end - index+1 of last char it couldn't encode (usually becomes the resume point)
    reason - contribution to error message if any
    
    Returns:
    must be a tuple (replacement_unicode, resume_index)
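    How a codec might consume the (replacement, resume_index) pair can be sketched without Jython types. Everything named here is hypothetical: Replacement stands in for the returned tuple, Handler for the registered Python callable, and the ASCII encoding loop is only an illustration of the PEP 293 callback pattern, not the code in this class.

```java
final class ErrorCallbackSketch {
    /** Hypothetical stand-in for the (replacement_unicode, resume_index) tuple. */
    static final class Replacement {
        final String text;
        final int resume;
        Replacement(String text, int resume) {
            this.text = text;
            this.resume = resume;
        }
    }

    /** Hypothetical handler type; in Jython the handler is a Python callable. */
    interface Handler {
        Replacement handle(String input, int start, int end);
    }

    /** Encode to ASCII, delegating each run of unencodable chars to the handler. */
    static String encodeAscii(String input, Handler handler) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c < 128) {
                out.append(c);
                i++;
                continue;
            }
            int end = i + 1;                    // index+1 of last char in the bad run
            while (end < input.length() && input.charAt(end) >= 128) {
                end++;
            }
            Replacement r = handler.handle(input, i, end);
            if (r.resume < 0 || r.resume > input.length()) {
                throw new IndexOutOfBoundsException("resume position out of range");
            }
            out.append(r.text);                 // replacement goes to the output
            i = r.resume;                       // continue where the handler says
        }
        return out.toString();
    }
}
```

    A handler that returns "?" together with the given end index reproduces the usual "replace" behaviour, one replacement per run. A handler that returns a resume position at or before start would make this simple loop revisit the same input.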
  - insertReplacementAndGetResume
```
public static int insertReplacementAndGetResume(java.lang.StringBuilder partialDecode,
                                java.lang.String errors,
                                java.lang.String encoding,
                                java.lang.String toDecode,
                                int start,
                                int end,
                                java.lang.String reason)
```
    Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception).
    
    Parameters:
    partialDecode - output buffer of unicode (as UTF-16) that the codec is building
    errors - name of the error policy (or null meaning "strict")
    encoding - name of encoding that encountered the error
    toDecode - bytes being decoded
    start - index of first byte it couldn't decode
    end - index+1 of last byte it couldn't decode (usually becomes the resume point)
    reason - contribution to error message if any
    
    Returns:
    the resume position: index of next byte to decode
  - decoding_error
```
public static PyObject decoding_error(java.lang.String errors,
                      java.lang.String encoding,
                      java.lang.String toDecode,
                      int start,
                      int end,
                      java.lang.String reason)
```
    Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through register_error(String, PyObject). The return value is the return from the error handler, indicating the replacement codec output and the position at which to resume decoding. Invokes the mechanism described in PEP 293.
    
    Parameters:
    errors - name of the error policy (or null meaning "strict")
    encoding - name of encoding that encountered the error
    toDecode - bytes being decoded
    start - index of first byte it couldn't decode
    end - index+1 of last byte it couldn't decode (usually becomes the resume point)
    reason - contribution to error message if any
    
    Returns:
    must be a tuple (replacement_unicode, resume_index)
  - calcNewPosition
```
public static int calcNewPosition(int size,
                  PyObject errorTuple)
```
    Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position. Negative indexes in the error handler return are interpreted as "from the end". If the result would be out of bounds in the input, an IndexError exception is raised.
    
    Parameters:
    size - of byte buffer being decoded
    errorTuple - returned from error handler
    
    Returns:
    absolute resume position.
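    The interpretation of the resume position described above can be sketched as follows. This is an illustration in plain Java, not the method itself: it takes the position as an int rather than extracting it from the handler's tuple, uses IndexOutOfBoundsException in place of the Python IndexError, and assumes that a position equal to size (resume at the very end) is accepted.

```java
final class ResumePositionSketch {
    /** Convert a possibly negative resume position into an absolute index. */
    static int resolve(int size, int position) {
        // Negative positions count back from the end of the input.
        int absolute = position < 0 ? size + position : position;
        if (absolute < 0 || absolute > size) {
            throw new IndexOutOfBoundsException(
                    "position " + position + " out of range for size " + size);
        }
        return absolute;
    }
}
```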
