public class codecs extends Object
This class underlies the Python codecs module (in Lib/codecs.py). It corresponds approximately to
CPython's Python/codecs.c.
The class also contains the inner methods of the standard Unicode codecs, available for
transcoding of text at the Java level. These are also exposed through the _codecs
module. In CPython, the implementations are found in Objects/unicodeobject.c.
Modifier and Type | Class and Description |
---|---|
static class | codecs.CodecState |
Modifier and Type | Field and Description |
---|---|
static String | BACKSLASHREPLACE |
static String | IGNORE |
static String | REPLACE |
static String | XMLCHARREFREPLACE |
Constructor and Description |
---|
codecs() |
Modifier and Type | Method and Description |
---|---|
static PyObject | backslashreplace_errors(PyObject[] args, String[] kws) |
static StringBuilder | backslashreplace(int start, int end, String toReplace) |
static int | calcNewPosition(int size, PyObject errorTuple): Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position. |
static PyObject | decode(PyString v, String encoding, String errors): Decode the bytes v using the codec registered for the encoding. |
static PyObject | decoding_error(String errors, String encoding, String toDecode, int start, int end, String reason): Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through register_error(String, PyObject). |
static String | encode(PyString v, String encoding, String errors): Encode v using the codec registered for the encoding. |
static PyObject | encoding_error(String errors, String encoding, String toEncode, int start, int end, String reason): Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through register_error(String, PyObject). |
static String | getDefaultEncoding() |
static PyObject | ignore_errors(PyObject[] args, String[] kws) |
static int | insertReplacementAndGetResume(StringBuilder partialDecode, String errors, String encoding, String toDecode, int start, int end, String reason): Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception). |
static PyObject | lookup_error(String handlerName) |
static PyTuple | lookup(String encoding) |
static String | PyUnicode_DecodeASCII(String str, int size, String errors) |
static PyUnicode | PyUnicode_DecodeIDNA(String input, String errors) |
static String | PyUnicode_DecodeLatin1(String str, int size, String errors) |
static PyUnicode | PyUnicode_DecodePunycode(String input, String errors) |
static String | PyUnicode_DecodeRawUnicodeEscape(String str, String errors) |
static String | PyUnicode_DecodeUTF7(String bytes, String errors): Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object. |
static String | PyUnicode_DecodeUTF7Stateful(String bytes, String errors, int[] consumed): Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object, together with the amount of input consumed. |
static String | PyUnicode_DecodeUTF8(String str, String errors) |
static String | PyUnicode_DecodeUTF8Stateful(String str, String errors, int[] consumed) |
static String | PyUnicode_EncodeASCII(String str, int size, String errors) |
static String | PyUnicode_EncodeIDNA(PyUnicode input, String errors) |
static String | PyUnicode_EncodeLatin1(String str, int size, String errors) |
static String | PyUnicode_EncodePunycode(PyUnicode input, String errors) |
static String | PyUnicode_EncodeRawUnicodeEscape(String str, String errors, boolean modifed) |
static String | PyUnicode_EncodeUTF7(String unicode, boolean base64SetO, boolean base64WhiteSpace, String errors): Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String. |
static String | PyUnicode_EncodeUTF8(String str, String errors) |
static void | register_error(String name, PyObject error) |
static void | register(PyObject search_function) |
static PyObject | replace_errors(PyObject[] args, String[] kws) |
static void | setDefaultEncoding(String encoding) |
static PyObject | strict_errors(PyObject[] args, String[] kws) |
static PyObject | xmlcharrefreplace_errors(PyObject[] args, String[] kws) |
static StringBuilder | xmlcharrefreplace(int start, int end, String toReplace) |
public static final String BACKSLASHREPLACE
public static final String IGNORE
public static final String REPLACE
public static final String XMLCHARREFREPLACE
public static String getDefaultEncoding()
public static void setDefaultEncoding(String encoding)
public static void register(PyObject search_function)
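The registry behind register and lookup can be pictured with a small stand-alone sketch. It assumes the usual Python codecs protocol: search functions are tried in registration order, each returns a codec entry or null, and successful results are cached. The class CodecRegistrySketch, the use of java.util.function.Function in place of a PyObject callable, and the lower-casing of names are illustrative assumptions, not the Jython implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a codec registry; not part of Jython.
final class CodecRegistrySketch {
    // Search functions, consulted in the order they were registered.
    private final List<Function<String, Object>> searchPath = new ArrayList<>();
    // Cache of successful lookups, keyed by normalized encoding name.
    private final Map<String, Object> cache = new HashMap<>();

    void register(Function<String, Object> searchFunction) {
        searchPath.add(searchFunction);
    }

    Object lookup(String encoding) {
        // Assumption: normalization is reduced to lower-casing here.
        String key = encoding.toLowerCase(Locale.ROOT);
        Object cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        for (Function<String, Object> search : searchPath) {
            Object found = search.apply(key);
            if (found != null) {
                cache.put(key, found);
                return found;
            }
        }
        // Python raises LookupError here; this sketch uses a Java exception.
        throw new IllegalArgumentException("unknown encoding: " + encoding);
    }
}
```

A search function that recognizes no name simply returns null, so the next function in the list gets a chance.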
public static PyObject decode(PyString v, String encoding, String errors)
Decode the bytes v using the codec registered for the encoding.
The encoding defaults to the system default encoding (see getDefaultEncoding()).
The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)).
The default error policy is 'strict', meaning that encoding errors raise a ValueError.
This method is exposed through the _codecs module as _codecs.decode(PyString, String, String).
Parameters:
v - bytes to be decoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore", "replace")
public static String encode(PyString v, String encoding, String errors)
Encode v using the codec registered for the encoding.
The encoding defaults to the system default encoding (see getDefaultEncoding()).
The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)).
The default error policy is 'strict', meaning that encoding errors raise a ValueError.
Parameters:
v - unicode string to be encoded
encoding - name of encoding (to look up in codec registry)
errors - error policy name (e.g. "ignore")
public static PyObject xmlcharrefreplace_errors(PyObject[] args, String[] kws)
public static StringBuilder xmlcharrefreplace(int start, int end, String toReplace)
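As an illustration of the "xmlcharrefreplace" policy, the sketch below builds the replacement text for a range of an input string in the standard Python form, a decimal XML character reference per code point. The class name XmlCharRefSketch is hypothetical, and iterating by code point (so that a surrogate pair yields one reference) is an assumption of this sketch rather than a statement about the Jython method.

```java
// Hypothetical stand-alone model of the "xmlcharrefreplace" replacement text.
final class XmlCharRefSketch {
    static StringBuilder xmlcharrefreplace(int start, int end, String toReplace) {
        StringBuilder out = new StringBuilder();
        int i = start;
        while (i < end) {
            // Read a whole code point so surrogate pairs are not split.
            int codePoint = toReplace.codePointAt(i);
            out.append("&#").append(codePoint).append(';');
            i += Character.charCount(codePoint);
        }
        return out;
    }
}
```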
public static PyObject backslashreplace_errors(PyObject[] args, String[] kws)
public static StringBuilder backslashreplace(int start, int end, String toReplace)
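The "backslashreplace" policy can be sketched in the same way. The replacement uses the Python escape forms \xNN, \uNNNN and \UNNNNNNNN, chosen by the size of the code point. BackslashReplaceSketch is a hypothetical name, and the code-point iteration is again an assumption of the sketch.

```java
// Hypothetical stand-alone model of the "backslashreplace" replacement text.
final class BackslashReplaceSketch {
    static StringBuilder backslashreplace(int start, int end, String toReplace) {
        StringBuilder out = new StringBuilder();
        int i = start;
        while (i < end) {
            int codePoint = toReplace.codePointAt(i);
            if (codePoint < 0x100) {
                out.append(String.format("\\x%02x", codePoint));
            } else if (codePoint < 0x10000) {
                out.append(String.format("\\u%04x", codePoint));
            } else {
                out.append(String.format("\\U%08x", codePoint));
            }
            i += Character.charCount(codePoint);
        }
        return out;
    }
}
```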
public static String PyUnicode_DecodeUTF7Stateful(String bytes, String errors, int[] consumed)
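Decode (perhaps partially) a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object, together with the amount of input consumed.
Parameters:
bytes - input represented as String (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
consumed - returns number of bytes consumed in element 0, or is null if a "final" call
public static String PyUnicode_DecodeUTF7(String bytes, String errors)
Decode completely a sequence of bytes representing the UTF-7 encoded form of a Unicode string and return (the Jython internal representation of) the unicode object.
Parameters:
bytes - input represented as String (Jython PyString convention)
errors - error policy name (e.g. "ignore", "replace")
public static String PyUnicode_EncodeUTF7(String unicode, boolean base64SetO, boolean base64WhiteSpace, String errors)
Encode a UTF-16 Java String as UTF-7 bytes represented by the low bytes of the characters in a String. This method differs from the CPython equivalent (in Objects/unicodeobject.c), which works with an array of code points that are, in a wide build, Unicode code points.
Parameters:
unicode - to be encoded
base64SetO - true if characters in "set O" should be translated to base64
base64WhiteSpace - true if white-space characters should be translated to base64
errors - error policy name (e.g. "ignore", "replace")
public static String PyUnicode_DecodeUTF8Stateful(String str, String errors, int[] consumed)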
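The role of the consumed argument in the stateful decoders can be illustrated with the JDK's own UTF-8 decoder. This is only a sketch of the calling convention (a non-null array receives the byte count in element 0, and null marks a "final" call); StatefulUtf8Sketch and decodePartial are hypothetical names, and the fixed "replace" behaviour on malformed input is a simplification.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a "stateful" decode call; not Jython code.
final class StatefulUtf8Sketch {
    static String decodePartial(String byteString, int[] consumed) {
        // Bytes arrive as chars 0..255, so Latin-1 recovers the raw bytes.
        byte[] raw = byteString.getBytes(StandardCharsets.ISO_8859_1);
        ByteBuffer in = ByteBuffer.wrap(raw);
        // UTF-8 never yields more UTF-16 units than it has bytes.
        CharBuffer out = CharBuffer.allocate(raw.length);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        boolean finalCall = (consumed == null);
        // On a non-final call an incomplete trailing sequence is left unread.
        decoder.decode(in, out, finalCall);
        if (finalCall) {
            decoder.flush(out);
        } else {
            consumed[0] = in.position();
        }
        out.flip();
        return out.toString();
    }
}
```

A caller that receives fewer consumed bytes than it supplied would keep the remainder and prepend it to the next chunk.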
public static String PyUnicode_DecodeASCII(String str, int size, String errors)
public static String PyUnicode_DecodeLatin1(String str, int size, String errors)
public static String PyUnicode_EncodeASCII(String str, int size, String errors)
public static String PyUnicode_EncodeLatin1(String str, int size, String errors)
public static String PyUnicode_EncodeRawUnicodeEscape(String str, String errors, boolean modifed)
public static String PyUnicode_DecodeRawUnicodeEscape(String str, String errors)
public static String PyUnicode_EncodePunycode(PyUnicode input, String errors)
public static PyUnicode PyUnicode_DecodePunycode(String input, String errors)
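Several of the methods above take or return byte data as a String, with one byte carried in the low 8 bits of each character. A minimal sketch of converting between that form and a Java byte[] follows, assuming every character is in the range 0 to 255 so that the Latin-1 charset gives an exact one-to-one mapping. ByteStringSketch and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helpers for the "bytes held in a String" convention.
final class ByteStringSketch {
    // Each byte becomes one char with the same value (0..255).
    static String toByteString(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Inverse mapping; chars above 0xFF are not valid byte-string content.
    static byte[] toBytes(String byteString) {
        return byteString.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For example, the two UTF-8 bytes of U+00E9 become a String of length 2 holding the characters U+00C3 and U+00A9.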
public static PyObject encoding_error(String errors, String encoding, String toEncode, int start, int end, String reason)
Invoke a user-defined error-handling mechanism, for errors encountered during encoding, as registered through register_error(String, PyObject). The return value is the return from the error handler indicating the replacement codec input and the position at which to resume encoding. Invokes the mechanism described in PEP-293.
Parameters:
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toEncode - unicode string being encoded
start - index of first char it couldn't encode
end - index+1 of last char it couldn't encode (usually becomes the resume point)
reason - contribution to error message if any
Returns:
(replacement_unicode, resume_index)
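How an encoder might use a handler that returns a replacement and a resume index can be sketched with a toy ASCII encoder. Everything here is hypothetical scaffolding: EncodeErrorSketch, its nested Replacement and EncodeError types, and the Function-based handler stand in for the PyObject tuple and the registered Python callable. The sketch also assumes the replacement text is already ASCII and that the handler moves the position forward.

```java
import java.util.function.Function;

// Hypothetical toy encoder showing the PEP 293 style handler round trip.
final class EncodeErrorSketch {
    /** Stand-in for the (replacement, resume_index) tuple. */
    static final class Replacement {
        final String text;
        final int resume;
        Replacement(String text, int resume) { this.text = text; this.resume = resume; }
    }

    /** Stand-in for the error details: the range [start, end) that failed. */
    static final class EncodeError {
        final String input;
        final int start;
        final int end;
        EncodeError(String input, int start, int end) {
            this.input = input; this.start = start; this.end = end;
        }
    }

    static String encodeAscii(String input, Function<EncodeError, Replacement> handler) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c < 0x80) {
                out.append(c);
                i++;
                continue;
            }
            // Collect the whole run of unencodable chars before calling the handler.
            int end = i + 1;
            while (end < input.length() && input.charAt(end) >= 0x80) {
                end++;
            }
            Replacement r = handler.apply(new EncodeError(input, i, end));
            if (r.resume <= i || r.resume > input.length()) {
                // Restriction of this sketch only, to guarantee termination.
                throw new IllegalStateException("handler must advance the position");
            }
            out.append(r.text);
            i = r.resume;
        }
        return out.toString();
    }
}
```

A handler equivalent to "replace" returns "?" with the resume index set to end; one equivalent to "ignore" returns an empty string with the same index.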
public static int insertReplacementAndGetResume(StringBuilder partialDecode, String errors, String encoding, String toDecode, int start, int end, String reason)
Handler for errors encountered during decoding, adjusting the output buffer contents and returning the correct position to resume decoding (if the handler does not simply raise an exception).
Parameters:
partialDecode - output buffer of unicode (as UTF-16) that the codec is building
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toDecode - bytes being decoded
start - index of first byte it couldn't decode
end - index+1 of last byte it couldn't decode (usually becomes the resume point)
reason - contribution to error message if any
public static PyObject decoding_error(String errors, String encoding, String toDecode, int start, int end, String reason)
Invoke a user-defined error-handling mechanism, for errors encountered during decoding, as registered through register_error(String, PyObject). The return value is the return from the error handler indicating the replacement codec output and the position at which to resume decoding. Invokes the mechanism described in PEP-293.
Parameters:
errors - name of the error policy (or null meaning "strict")
encoding - name of encoding that encountered the error
toDecode - bytes being decoded
start - index of first byte it couldn't decode
end - index+1 of last byte it couldn't decode (usually becomes the resume point)
reason - contribution to error message if any
Returns:
(replacement_unicode, resume_index)
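On the decoding side, the pattern of adjusting the output buffer and returning a resume position can be sketched for the built-in policies. The class DecodeErrorSketch is hypothetical and its method takes a reduced parameter list. The behaviour shown ("ignore" appends nothing, "replace" appends U+FFFD, both resume at end, and "strict" or null raises) follows the usual Python semantics and is assumed, not taken from the Jython source.

```java
// Hypothetical toy decoder showing buffer adjustment plus resume position.
final class DecodeErrorSketch {
    static int insertReplacementAndGetResume(StringBuilder partialDecode, String errors,
                                             int start, int end) {
        if (errors == null || errors.equals("strict")) {
            // Python would raise UnicodeDecodeError (a ValueError) here.
            throw new IllegalArgumentException(
                    "can't decode bytes in position " + start + "-" + (end - 1));
        }
        if (errors.equals("ignore")) {
            return end;                       // drop the bad bytes
        }
        if (errors.equals("replace")) {
            partialDecode.append('\uFFFD');   // Unicode replacement character
            return end;
        }
        throw new IllegalArgumentException("unknown error policy: " + errors);
    }

    // Toy ASCII decoder over a byte-string (one byte per char).
    static String decodeAscii(String byteString, String errors) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < byteString.length()) {
            char b = byteString.charAt(i);
            if (b < 0x80) {
                out.append(b);
                i++;
            } else {
                i = insertReplacementAndGetResume(out, errors, i, i + 1);
            }
        }
        return out.toString();
    }
}
```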
public static int calcNewPosition(int size, PyObject errorTuple)
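Given the return from some codec error handler (invoked while encoding or decoding), which specifies a resume position, and the length of the input being encoded or decoded, check and interpret the resume position. If the resume position is out of bounds, an IndexError exception is raised.
Parameters:
size - of byte buffer being decoded
errorTuple - returned from error handler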
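The check can be sketched as follows, assuming the Python convention that a negative resume position counts back from the end of the input and that a position equal to the input length (resume at the very end) is allowed. ResumePositionSketch is a hypothetical name, the resume position is passed directly as an int rather than extracted from a tuple, and IndexOutOfBoundsException stands in for Python's IndexError.

```java
// Hypothetical model of interpreting an error handler's resume position.
final class ResumePositionSketch {
    static int calcNewPosition(int size, int resume) {
        // Negative positions are taken relative to the end of the input.
        int position = resume < 0 ? size + resume : resume;
        if (position < 0 || position > size) {
            throw new IndexOutOfBoundsException(
                    "position " + resume + " from error handler out of bounds");
        }
        return position;
    }
}
```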