public class codecs
extends java.lang.Object
Support for the Python codecs module (in Lib/codecs.py). It corresponds approximately to CPython's Python/codecs.c.
The class also contains the inner methods of the standard Unicode codecs, available for transcoding of text at the Java level. These are also exposed through the _codecs module. In CPython, the implementations are found in Objects/unicodeobject.c.
Modifier and Type | Field and Description |
---|---|
static java.lang.String | BACKSLASHREPLACE |
static java.lang.String | IGNORE |
static java.lang.String | REPLACE |
static java.lang.String | XMLCHARREFREPLACE |
Constructor and Description |
---|
codecs() |
Modifier and Type | Method and Description |
---|---|
static PyObject | backslashreplace_errors(PyObject[] args, java.lang.String[] kws) |
static java.lang.StringBuilder | backslashreplace(int start, int end, java.lang.String toReplace) |
static int | calcNewPosition(int size, PyObject errorTuple): Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position. |
static PyObject | decode(PyString v, java.lang.String encoding, java.lang.String errors): Decode the bytes v using the codec registered for the encoding. |
static PyObject | decoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason): Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through register_error(String, PyObject). |
static java.lang.String | encode(PyString v, java.lang.String encoding, java.lang.String errors): Encode v using the codec registered for the encoding. |
static PyObject | encoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toEncode, int start, int end, java.lang.String reason): Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through register_error(String, PyObject). |
static java.lang.String | getDefaultEncoding() |
static PyObject | ignore_errors(PyObject[] args, java.lang.String[] kws) |
static int | insertReplacementAndGetResume(java.lang.StringBuilder partialDecode, java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason): Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception). |
static PyObject | lookup_error(java.lang.String handlerName) |
static PyTuple | lookup(java.lang.String encoding) |
static java.lang.String | PyUnicode_DecodeASCII(java.lang.String str, int size, java.lang.String errors) |
static PyUnicode | PyUnicode_DecodeIDNA(java.lang.String input, java.lang.String errors) |
static java.lang.String | PyUnicode_DecodeLatin1(java.lang.String str, int size, java.lang.String errors) |
static PyUnicode | PyUnicode_DecodePunycode(java.lang.String input, java.lang.String errors) |
static java.lang.String | PyUnicode_DecodeRawUnicodeEscape(java.lang.String str, java.lang.String errors) |
static java.lang.String | PyUnicode_DecodeUTF7(java.lang.String bytes, java.lang.String errors): Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return the (Jython internal representation of the) unicode object. |
static java.lang.String | PyUnicode_DecodeUTF7Stateful(java.lang.String bytes, java.lang.String errors, int[] consumed): Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return the (Jython internal representation of the) unicode object, and the amount of input consumed. |
static java.lang.String | PyUnicode_DecodeUTF8(java.lang.String str, java.lang.String errors) |
static java.lang.String | PyUnicode_DecodeUTF8Stateful(java.lang.String str, java.lang.String errors, int[] consumed) |
static java.lang.String | PyUnicode_EncodeASCII(java.lang.String str, int size, java.lang.String errors) |
static java.lang.String | PyUnicode_EncodeIDNA(PyUnicode input, java.lang.String errors) |
static java.lang.String | PyUnicode_EncodeLatin1(java.lang.String str, int size, java.lang.String errors) |
static java.lang.String | PyUnicode_EncodePunycode(PyUnicode input, java.lang.String errors) |
static java.lang.String | PyUnicode_EncodeRawUnicodeEscape(java.lang.String str, java.lang.String errors, boolean modifed) |
static java.lang.String | PyUnicode_EncodeUTF7(java.lang.String unicode, boolean base64SetO, boolean base64WhiteSpace, java.lang.String errors): Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String. |
static java.lang.String | PyUnicode_EncodeUTF8(java.lang.String str, java.lang.String errors) |
static void | register_error(java.lang.String name, PyObject error) |
static void | register(PyObject search_function) |
static PyObject | replace_errors(PyObject[] args, java.lang.String[] kws) |
static void | setDefaultEncoding(java.lang.String encoding) |
static PyObject | strict_errors(PyObject[] args, java.lang.String[] kws) |
static PyObject | xmlcharrefreplace_errors(PyObject[] args, java.lang.String[] kws) |
static java.lang.StringBuilder | xmlcharrefreplace(int start, int end, java.lang.String toReplace) |
public static final java.lang.String BACKSLASHREPLACE
public static final java.lang.String IGNORE
public static final java.lang.String REPLACE
public static final java.lang.String XMLCHARREFREPLACE
public static java.lang.String getDefaultEncoding()
public static void setDefaultEncoding(java.lang.String encoding)
public static PyObject lookup_error(java.lang.String handlerName)
public static void register_error(java.lang.String name, PyObject error)
public static void register(PyObject search_function)
public static PyTuple lookup(java.lang.String encoding)
public static PyObject decode(PyString v, java.lang.String encoding, java.lang.String errors)
Decode the bytes v using the codec registered for the encoding. The encoding defaults to the system default encoding (see getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that decoding errors raise a ValueError.
This method is exposed through the _codecs module as _codecs.decode(PyString, String, String).
Parameters:
v - bytes to be decoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore", "replace")
Returns:
the result of decoding the bytes
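The effect of the error policy names can be illustrated without Jython on the classpath. The following is a minimal sketch that maps the names "strict", "ignore" and "replace" onto the JDK's own CodingErrorAction values. The class ErrorPolicySketch and its methods are hypothetical, the mapping is an assumption of this sketch rather than Jython's implementation, and IllegalArgumentException stands in for Python's ValueError.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not part of the Jython API.
final class ErrorPolicySketch {

    /** Map a Python-style policy name onto the nearest JDK action (assumed mapping). */
    static CodingErrorAction actionFor(String errors) {
        if (errors == null || errors.equals("strict")) {
            return CodingErrorAction.REPORT;   // raise on the first bad byte
        } else if (errors.equals("ignore")) {
            return CodingErrorAction.IGNORE;   // drop the bad bytes
        } else if (errors.equals("replace")) {
            return CodingErrorAction.REPLACE;  // substitute U+FFFD
        }
        throw new IllegalArgumentException("unknown error handler name '" + errors + "'");
    }

    /** Decode UTF-8 bytes under the named policy. */
    static String decodeUtf8(byte[] bytes, String errors) {
        CodingErrorAction action = actionFor(errors);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Stand-in for the ValueError raised under the 'strict' policy.
            throw new IllegalArgumentException("cannot decode input", e);
        }
    }
}
```

With the three bytes 'a', 0xFF, 'b' as input, "ignore" gives "ab", "replace" gives "a", U+FFFD, "b", and "strict" (or null) throws.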
public static java.lang.String encode(PyString v, java.lang.String encoding, java.lang.String errors)
Encode v using the codec registered for the encoding. The encoding defaults to the system default encoding (see getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that encoding errors raise a ValueError.
Parameters:
v - unicode string to be encoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
Returns:
the bytes produced by encoding v
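Several methods of this class accept or return bytes as a java.lang.String in which each character carries one byte in its low eight bits. A minimal sketch of that convention follows, using only the JDK. The class ByteStringSketch and its method names are hypothetical and are not part of the Jython API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not part of the Jython API.
final class ByteStringSketch {

    /** Pack bytes into a String, one byte per char (char values 0..255). */
    static String toByteString(byte[] bytes) {
        // ISO-8859-1 maps each byte value n to the char with code n.
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Recover the bytes from the low byte of each char. */
    static byte[] toBytes(String byteString) {
        byte[] out = new byte[byteString.length()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (byteString.charAt(i) & 0xFF);
        }
        return out;
    }

    /** Encode text as UTF-8 and return the result in the byte-string form. */
    static String encodeUtf8(String text) {
        return toByteString(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

Encoding the single character U+00E9 this way yields a two-character String holding the values 0xC3 and 0xA9.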
public static PyObject xmlcharrefreplace_errors(PyObject[] args, java.lang.String[] kws)
public static java.lang.StringBuilder xmlcharrefreplace(int start, int end, java.lang.String toReplace)
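In Python, the "xmlcharrefreplace" policy substitutes each unencodable character with a decimal XML character reference. The following is a minimal sketch of that substitution with the same parameter shape as the method above. The class XmlCharRefSketch is hypothetical, and treating surrogate pairs as single code points is an assumption of this sketch, not a statement about Jython's implementation.

```java
// Hypothetical illustration only: not part of the Jython API.
final class XmlCharRefSketch {

    /** Build "&#N;" references for the code points in toReplace[start:end]. */
    static StringBuilder xmlCharRefReplace(int start, int end, String toReplace) {
        StringBuilder sb = new StringBuilder();
        int i = start;
        while (i < end) {
            int codePoint = toReplace.codePointAt(i);
            sb.append("&#").append(codePoint).append(';');  // decimal reference
            i += Character.charCount(codePoint);            // 2 for a surrogate pair
        }
        return sb;
    }
}
```

For the euro sign (U+20AC) at index 1 of a three-character string, the range 1 to 2 produces `&#8364;`.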
public static PyObject backslashreplace_errors(PyObject[] args, java.lang.String[] kws)
public static java.lang.StringBuilder backslashreplace(int start, int end, java.lang.String toReplace)
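In Python, the "backslashreplace" policy substitutes each unencodable character with a backslashed escape: two hex digits after \x for code points below 0x100, four after a lower-case u below 0x10000, and eight after an upper-case U otherwise. The following is a minimal sketch of that rule with the same parameter shape as the method above. The class BackslashReplaceSketch is hypothetical, and handling surrogate pairs as single code points is an assumption of this sketch.

```java
// Hypothetical illustration only: not part of the Jython API.
final class BackslashReplaceSketch {

    /** Build backslash escapes for the code points in toReplace[start:end]. */
    static StringBuilder backslashReplace(int start, int end, String toReplace) {
        StringBuilder sb = new StringBuilder();
        int i = start;
        while (i < end) {
            int codePoint = toReplace.codePointAt(i);
            if (codePoint < 0x100) {
                sb.append(String.format("\\x%02x", codePoint));  // two hex digits
            } else if (codePoint < 0x10000) {
                sb.append(String.format("\\u%04x", codePoint));  // four hex digits
            } else {
                sb.append(String.format("\\U%08x", codePoint));  // eight hex digits
            }
            i += Character.charCount(codePoint);
        }
        return sb;
    }
}
```

U+00E9 becomes the four characters `\xe9`, and U+20AC becomes the six characters `\u20ac`.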
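public static java.lang.String PyUnicode_DecodeUTF7Stateful(java.lang.String bytes, java.lang.String errors, int[] consumed)
Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return the (Jython internal representation of the) unicode object, and the amount of input consumed.
Parameters:
bytes - input represented as String (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
consumed - returns number of bytes consumed in element 0, or is null if a "final" call
public static java.lang.String PyUnicode_DecodeUTF7(java.lang.String bytes, java.lang.String errors)
Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return the (Jython internal representation of the) unicode object.
Parameters:
bytes - input represented as String (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
public static java.lang.String PyUnicode_EncodeUTF7(java.lang.String unicode, boolean base64SetO, boolean base64WhiteSpace, java.lang.String errors)
Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String. This differs from the CPython equivalent (in Objects/unicodeobject.c), which works with an array of code points that are, in a wide build, Unicode code points.
Parameters:
unicode -
base64SetO -
base64WhiteSpace -
errors -
public static java.lang.String PyUnicode_DecodeUTF8(java.lang.String str, java.lang.String errors)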
public static java.lang.String PyUnicode_DecodeUTF8Stateful(java.lang.String str, java.lang.String errors, int[] consumed)
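The stateful decoders report progress through an int[] consumed argument: element 0 receives the number of bytes consumed, and a null array marks a "final" call. The following is a minimal sketch of that calling pattern, built on the JDK's CharsetDecoder rather than on Jython's decoder. The class StatefulDecodeSketch is hypothetical, the error policy argument is omitted, and IllegalArgumentException stands in for the Python exception.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: not part of the Jython API.
final class StatefulDecodeSketch {

    /**
     * Decode UTF-8 held in a byte-string (one byte per char). When consumed is
     * non-null, an incomplete trailing sequence is left undecoded and
     * consumed[0] reports how many bytes were used. When consumed is null the
     * call is final, and an incomplete trailing sequence is an error.
     */
    static String decodeUtf8Stateful(String byteString, int[] consumed) {
        byte[] bytes = byteString.getBytes(StandardCharsets.ISO_8859_1);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never produces more UTF-16 chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(bytes.length);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);

        boolean finalCall = (consumed == null);
        CoderResult result = decoder.decode(in, out, finalCall);
        if (result.isError()) {
            throw new IllegalArgumentException("invalid UTF-8 at byte " + in.position());
        }
        if (consumed != null) {
            consumed[0] = in.position();  // bytes actually decoded
        }
        out.flip();
        return out.toString();
    }
}
```

Given the two bytes 'a' and 0xC3 (the first half of a two-byte sequence) and a non-null array, the sketch returns "a" and sets consumed[0] to 1, so the caller can retry from the 0xC3 byte once more input arrives.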
public static java.lang.String PyUnicode_EncodeUTF8(java.lang.String str, java.lang.String errors)
public static java.lang.String PyUnicode_DecodeASCII(java.lang.String str, int size, java.lang.String errors)
public static java.lang.String PyUnicode_DecodeLatin1(java.lang.String str, int size, java.lang.String errors)
public static java.lang.String PyUnicode_EncodeASCII(java.lang.String str, int size, java.lang.String errors)
public static java.lang.String PyUnicode_EncodeLatin1(java.lang.String str, int size, java.lang.String errors)
public static java.lang.String PyUnicode_EncodeRawUnicodeEscape(java.lang.String str, java.lang.String errors, boolean modifed)
public static java.lang.String PyUnicode_DecodeRawUnicodeEscape(java.lang.String str, java.lang.String errors)
public static java.lang.String PyUnicode_EncodePunycode(PyUnicode input, java.lang.String errors)
public static PyUnicode PyUnicode_DecodePunycode(java.lang.String input, java.lang.String errors)
public static java.lang.String PyUnicode_EncodeIDNA(PyUnicode input, java.lang.String errors)
public static PyUnicode PyUnicode_DecodeIDNA(java.lang.String input, java.lang.String errors)
public static PyObject encoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toEncode, int start, int end, java.lang.String reason)
Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through register_error(String, PyObject). The return value is the return from the error handler indicating the replacement codec input and the position at which to resume encoding. Invokes the mechanism described in PEP-293.
Parameters:
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toEncode - unicode string being encoded
start - index of first char it couldn't encode
end - index+1 of last char it couldn't encode (usually becomes the resume point)
reason - contribution to error message if any
Returns:
a tuple (replacement_unicode, resume_index)
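The handler protocol described here (a named handler receives the error details and returns replacement text plus a resume index) can be sketched in plain Java. Everything in the block is hypothetical: the Handler interface, the Result pair, the registry map and the toy ASCII encoder are stand-ins chosen for illustration, and they make no claim about how Jython stores or calls handlers. The sketch reports one character per error and does not validate the resume index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: not part of the Jython API.
final class ErrorHandlerSketch {

    /** What a handler returns: replacement text and the index at which to resume. */
    static final class Result {
        final String replacement;
        final int resume;

        Result(String replacement, int resume) {
            this.replacement = replacement;
            this.resume = resume;
        }
    }

    interface Handler {
        Result handle(String encoding, String input, int start, int end, String reason);
    }

    private static final Map<String, Handler> REGISTRY = new HashMap<>();

    static {
        REGISTRY.put("ignore", (encoding, input, start, end, reason) -> new Result("", end));
        REGISTRY.put("replace", (encoding, input, start, end, reason) -> new Result("?", end));
    }

    static void registerError(String name, Handler handler) {
        REGISTRY.put(name, handler);
    }

    /** Look up the named handler and invoke it; null means "strict". */
    static Result encodingError(String errors, String encoding, String toEncode,
            int start, int end, String reason) {
        String name = (errors == null) ? "strict" : errors;
        if (name.equals("strict")) {
            throw new IllegalArgumentException("'" + encoding + "' codec can't encode character at "
                    + start + ": " + reason);
        }
        Handler handler = REGISTRY.get(name);
        if (handler == null) {
            throw new IllegalArgumentException("unknown error handler name '" + name + "'");
        }
        return handler.handle(encoding, toEncode, start, end, reason);
    }

    /** Toy ASCII encoder that consults the handler for each char >= 0x80. */
    static String encodeAscii(String text, String errors) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
                i++;
            } else {
                Result r = encodingError(errors, "ascii", text, i, i + 1,
                        "ordinal not in range(128)");
                out.append(r.replacement);
                i = r.resume;  // continue where the handler says
            }
        }
        return out.toString();
    }
}
```

Encoding "caf" followed by U+00E9 gives "caf" under "ignore" and "caf?" under "replace"; a handler registered under a new name is picked up the same way.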
public static int insertReplacementAndGetResume(java.lang.StringBuilder partialDecode, java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason)
Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception).
Parameters:
partialDecode - output buffer of unicode (as UTF-16) that the codec is building
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toDecode - bytes being decoded
start - index of first byte it couldn't decode
end - index+1 of last byte it couldn't decode (usually becomes the resume point)
reason - contribution to error message if any
public static PyObject decoding_error(java.lang.String errors, java.lang.String encoding, java.lang.String toDecode, int start, int end, java.lang.String reason)
Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through register_error(String, PyObject). The return value is the return from the error handler indicating the replacement codec output and the position at which to resume decoding. Invokes the mechanism described in PEP-293.
Parameters:
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toDecode - bytes being decoded
start - index of first byte it couldn't decode
end - index+1 of last byte it couldn't decode (usually becomes the resume point)
reason - contribution to error message if any
Returns:
a tuple (replacement_unicode, resume_index)
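On the decoding side, the pattern is that the codec keeps a partially built output buffer, and on error the handler appends any replacement to that buffer and tells the codec where to resume. The following is a minimal sketch of that loop for a toy ASCII decoder. The class DecodeResumeSketch is hypothetical, only the built-in policies "strict", "ignore" and "replace" are modelled, and the simplified method omits the encoding and reason arguments of the real method above.

```java
// Hypothetical illustration only: not part of the Jython API.
final class DecodeResumeSketch {

    /**
     * Apply the named policy to the bad bytes in toDecode[start:end]:
     * append any replacement to the output buffer and return the resume index.
     */
    static int insertReplacementAndGetResume(StringBuilder partialDecode, String errors,
            String toDecode, int start, int end) {
        if (errors == null || errors.equals("strict")) {
            throw new IllegalArgumentException("can't decode byte at position " + start);
        } else if (errors.equals("replace")) {
            partialDecode.append('\uFFFD');  // the Unicode replacement character
        } else if (!errors.equals("ignore")) {
            throw new IllegalArgumentException("unknown error handler name '" + errors + "'");
        }
        return end;  // skip past the bytes that could not be decoded
    }

    /** Toy ASCII decoder over a byte-string (one byte per char). */
    static String decodeAscii(String byteString, String errors) {
        StringBuilder partialDecode = new StringBuilder(byteString.length());
        int i = 0;
        while (i < byteString.length()) {
            char b = byteString.charAt(i);
            if (b < 0x80) {
                partialDecode.append(b);
                i++;
            } else {
                i = insertReplacementAndGetResume(partialDecode, errors, byteString, i, i + 1);
            }
        }
        return partialDecode.toString();
    }
}
```

Decoding the bytes 'a', 0xFF, 'b' gives "ab" under "ignore" and 'a', U+FFFD, 'b' under "replace".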
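public static int calcNewPosition(int size, PyObject errorTuple)
Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position. If the resulting position is out of bounds, an IndexError exception is raised.
Parameters:
size - of byte buffer being decoded
errorTuple - returned from error handler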
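The check on the resume position can be sketched as follows, using the rule from PEP 293 that a negative position is relative to the end of the input and that an out-of-bounds result is an error. The class ResumePositionSketch is hypothetical, it takes the resume index directly as an int rather than unpacking a tuple, and IndexOutOfBoundsException stands in for Python's IndexError.

```java
// Hypothetical illustration only: not part of the Jython API.
final class ResumePositionSketch {

    /**
     * Interpret a handler's resume index against an input of the given size.
     * Negative values count back from the end. The result may equal size,
     * which means "resume at end of input".
     */
    static int calcNewPosition(int size, int resume) {
        int newPosition = (resume < 0) ? size + resume : resume;
        if (newPosition < 0 || newPosition > size) {
            throw new IndexOutOfBoundsException(
                    "position " + resume + " from error handler out of bounds");
        }
        return newPosition;
    }
}
```

For an input of length 10, a resume index of -1 is interpreted as 9, 10 is accepted unchanged, and both 11 and -11 are rejected.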