public class PyUnicode extends PyString implements java.lang.Iterable<java.lang.Integer>
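Nested classes/interfaces inherited from class PySequence: PySequence.DefaultIndexDelegate
Nested classes/interfaces inherited from class PyObject: PyObject.ConversionException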
Modifier and Type | Field and Description |
---|---|
static PyType | TYPE |
Fields inherited from class PySequence: delegator
Fields inherited from class PyObject: attributes, gcMonitorGlobal, objtype
Constructor and Description |
---|
PyUnicode() |
PyUnicode(char c) |
PyUnicode(java.util.Collection<java.lang.Integer> ucs4) |
PyUnicode(int codepoint) |
PyUnicode(int[] codepoints) |
PyUnicode(java.util.Iterator<java.lang.Integer> iter) |
PyUnicode(PyString pystring) |
PyUnicode(PyType subtype, PyString pystring) |
PyUnicode(PyType subtype, java.lang.String string) |
PyUnicode(java.lang.String string) Construct a PyUnicode interpreting the Java String argument as UTF-16. |
PyUnicode(java.lang.String string, boolean isBasic) Construct a PyUnicode interpreting the Java String argument as UTF-16. |
Modifier and Type | Method and Description |
---|---|
PyObject | __add__(PyObject other) Equivalent to the standard Python __add__ method. |
PyComplex | __complex__() Equivalent to the standard Python __complex__ method. |
boolean | __contains__(PyObject o) Equivalent to the standard Python __contains__ method. |
PyObject | __eq__(PyObject other) Equivalent to the standard Python __eq__ method. |
PyObject | __format__(PyObject formatSpec) |
PyObject | __ge__(PyObject other) Equivalent to the standard Python __ge__ method. |
PyObject | __gt__(PyObject other) Equivalent to the standard Python __gt__ method. |
PyObject | __le__(PyObject other) Equivalent to the standard Python __le__ method. |
int | __len__() Equivalent to the standard Python __len__ method. |
PyObject | __lt__(PyObject other) Equivalent to the standard Python __lt__ method. |
PyObject | __mod__(PyObject other) Equivalent to the standard Python __mod__ method. |
PyObject | __ne__(PyObject other) Equivalent to the standard Python __ne__ method. |
PyString | __repr__() Equivalent to the standard Python __repr__ method. |
PyString | __str__() Equivalent to the standard Python __str__ method. |
PyUnicode | __unicode__() |
protected int | _findLeft(int right) Helper for strip, lstrip implementation, when stripping whitespace. |
protected int | _findRight() Helper for strip, rstrip implementation, when stripping whitespace. |
double | atof() Convert this PyString to a floating-point value according to Python rules. |
int | atoi(int base) |
PyLong | atol(int base) |
static java.lang.String | checkEncoding(java.lang.String s) |
PyString | createInstance(java.lang.String str) Create an instance of the same type as this object, from the Java String given as argument. |
protected PyString | createInstance(java.lang.String string, boolean isBasic) Create an instance of the same type as this object, from the Java String given as argument. |
boolean | endswith(PyObject suffix, PyObject start, PyObject end) Equivalent to the Python unicode.endswith method, testing whether a string ends with a specified suffix, where a sub-range is specified by [start:end]. |
static PyUnicode | fromInterned(java.lang.String interned) Creates a PyUnicode from an already interned String. |
protected PyString | fromSubstring(int begin, int end) Return a new object of the same type as this one equal to the slice [begin:end]. |
PyBuffer | getBuffer(int flags) PyUnicode implements the interface BufferProtocol technically by inheritance from PyString, but does not provide a buffer (in CPython). |
int | getCodePointCount() |
int | getInt(int i) |
protected PyObject | getslice(int start, int stop, int step) Returns a range of elements from the sequence. |
boolean | isBasicPlane() Determine whether the string consists entirely of basic-plane characters. |
java.util.Iterator<java.lang.Integer> | iterator() |
PyString | join(PyObject seq) |
java.util.Iterator<java.lang.Integer> | newSubsequenceIterator() Get an iterator over the code point sequence. |
java.util.Iterator<java.lang.Integer> | newSubsequenceIterator(int start, int stop, int step) Get an iterator over a slice of the code point sequence. |
PyTuple | partition(PyObject sep) Equivalent to Python str.partition(), splits the PyString at the first occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator. |
protected PyObject | pyget(int i) Returns the element of the sequence at the given index. |
PyTuple | rpartition(PyObject sep) Equivalent to Python str.rpartition(), splits the PyString at the last occurrence of sepObj returning a PyTuple containing the part before the separator, the separator itself, and the part after the separator. |
protected PyList | rsplitfields(int maxsplit) Helper function for .rsplit, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. |
protected PyList | splitfields(int maxsplit) Helper function for .split, in str and (when overridden) in unicode, splitting on white space and returning a list of the separated parts. |
boolean | startswith(PyObject prefix, PyObject start, PyObject end) Equivalent to the Python unicode.startswith method, testing whether a string starts with a specified prefix, where a sub-range is specified by [start:end]. |
java.lang.String | substring(int start, int end) Return a substring of this object as a Java String. |
int[] | toCodePoints() |
protected int[] | translateIndices(PyObject start, PyObject end) Many of the string methods deal with slices specified using Python slice semantics: endpoints, which are PyObjects, may be null or None (meaning default to one end or the other) or may be negative (meaning "from the end"). |
Methods inherited from class PyString:
__cmp__, __float__, __getnewargs__, __int__, __invert__, __long__, __mul__, __neg__, __pos__, __rmul__, __tojava__, _count, _find, _lstrip, _lstrip, _replace, _rfind, _rsplit, _rstrip, _rstrip, _split, _strip, _strip, asDouble, asInt, asLong, asName, asString, asString, asU16BytesOrError, atoi, atol, buildFormattedString, capitalize, center, charAt, checkIndex, count, count, count, count, count, count, decode_UnicodeEscape, decode, decode, decode, encode_UnicodeEscape, encode, encode, encode, endswith, endswith, expandtabs, expandtabs, find, find, find, find, find, find, getString, hashCode, index, index, index, index, index, index, internedString, isalnum, isalpha, isdecimal, isdigit, islower, isnumeric, isspace, istitle, isunicode, isupper, length, ljust, ljust, lower, lstrip, lstrip, lstrip, repeat, replace, replace, rfind, rfind, rfind, rfind, rfind, rfind, rindex, rindex, rindex, rindex, rindex, rindex, rjust, rsplit, rsplit, rsplit, rsplit, rsplit, rstrip, rstrip, rstrip, split, split, split, split, split, splitlines, splitlines, startswith, startswith, str___mod__, strip, strip, strip, subSequence, swapcase, title, toBytes, toString, translate, translate, translate, translate, unsupportedopMessage, upper, zfill
Methods inherited from class PySequence:
__delitem__, __delslice__, __finditem__, __finditem__, __getitem__, __getslice__, __iter__, __nonzero__, __setitem__, __setitem__, __setslice__, boundToSequence, cmp, del, delRange, delslice, fastSequence, isMappingType, isNumberType, isSequenceType, isSubType, pyset, runsupportedopMessage, setslice, sliceLength
Methods inherited from class PyObject:
__abs__, __and__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __call__, __coerce__, __coerce_ex__, __delattr__, __delattr__, __delete__, __delitem__, __delslice__, __dir__, __div__, __divmod__, __ensure_finalizer__, __findattr__, __findattr__, __findattr_ex__, __finditem__, __floordiv__, __get__, __getattr__, __getattr__, __getitem__, __getslice__, __hash__, __hex__, __iadd__, __iand__, __idiv__, __idivmod__, __ifloordiv__, __ilshift__, __imod__, __imul__, __index__, __ior__, __ipow__, __irshift__, __isub__, __iternext__, __itruediv__, __ixor__, __lshift__, __not__, __oct__, __or__, __pow__, __pow__, __radd__, __rand__, __rawdir__, __rdiv__, __rdivmod__, __reduce__, __reduce_ex__, __reduce_ex__, __rfloordiv__, __rlshift__, __rmod__, __ror__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__, __rxor__, __set__, __setattr__, __setattr__, __setitem__, __setslice__, __sub__, __truediv__, __trunc__, __xor__, _add, _and, _callextra, _cmp, _div, _divmod, _doget, _doget, _doset, _eq, _floordiv, _ge, _gt, _iadd, _iand, _idiv, _idivmod, _ifloordiv, _ilshift, _imod, _imul, _in, _ior, _ipow, _irshift, _is, _isnot, _isub, _itruediv, _ixor, _jcall, _jcallexc, _jthrow, _le, _lshift, _lt, _mod, _mul, _ne, _notin, _or, _pow, _rshift, _sub, _truediv, _unsupportedop, _xor, adaptToCoerceTuple, asIndex, asIndex, asInt, asIterable, asLong, asName, asStringOrNull, asStringOrNull, bit_length, conjugate, delDict, delType, dispatch__init__, equals, fastGetClass, fastGetDict, finalize, getDict, getJavaProxy, getType, impAttr, implementsDescrDelete, implementsDescrGet, implementsDescrSet, invoke, invoke, invoke, invoke, invoke, invoke, isCallable, isDataDescr, isIndex, isInteger, mergeClassDict, mergeDictAttr, mergeListAttr, noAttributeError, object___subclasshook__, readonlyAttributeError, setDict, setType
public static final PyType TYPE
public PyUnicode()
public PyUnicode(java.lang.String string)
Construct a PyUnicode interpreting the Java String argument as UTF-16.
Parameters:
string - UTF-16 string encoding the characters (as Java).
public PyUnicode(java.lang.String string, boolean isBasic)
Construct a PyUnicode interpreting the Java String argument as UTF-16.
Parameters:
string - UTF-16 string encoding the characters (as Java).
isBasic - true if it is known that only BMP characters are present.
public PyUnicode(PyType subtype, java.lang.String string)
public PyUnicode(PyString pystring)
public PyUnicode(char c)
public PyUnicode(int codepoint)
public PyUnicode(int[] codepoints)
public PyUnicode(java.util.Iterator<java.lang.Integer> iter)
public PyUnicode(java.util.Collection<java.lang.Integer> ucs4)
public int[] toCodePoints()
Overrides:
toCodePoints in class PyString
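As an illustration of what a code point array means for a UTF-16 Java String, the following is a minimal sketch in plain Java. It does not use Jython; the class name CodePointArraySketch and its two helpers are hypothetical, and the sketch only mirrors the idea behind toCodePoints() and the PyUnicode(int[] codepoints) constructor, not their actual implementation.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of Jython.
final class CodePointArraySketch {

    /** Decode a UTF-16 Java String into an array of Unicode code points. */
    static int[] toCodePoints(String s) {
        return s.codePoints().toArray();
    }

    /** Encode an array of code points back into a UTF-16 Java String. */
    static String fromCodePoints(int[] codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // 'a', U+10002 (a surrogate pair in UTF-16), 'b'
        String s = "a\uD800\uDC02b";
        int[] cps = toCodePoints(s);
        System.out.println(s.length() + " chars, " + cps.length + " code points");
        System.out.println(Arrays.toString(cps));
        System.out.println(fromCodePoints(cps).equals(s));
    }
}
```

The array has one element per character as Python sees it, so a supplementary character occupies one slot even though it takes two Java chars.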
public PyBuffer getBuffer(int flags) throws java.lang.ClassCastException
PyUnicode implements the interface BufferProtocol technically by inheritance from PyString,
but does not provide a buffer (in CPython). We therefore arrange that all calls to getBuffer
raise an error.
Specified by:
getBuffer in interface BufferProtocol
Overrides:
getBuffer in class PyString
Parameters:
flags - consumer requirements
Returns:
never returns normally; a ClassCastException is always thrown
Throws:
java.lang.ClassCastException - when the object only formally implements BufferProtocol
protected int[] translateIndices(PyObject start, PyObject end)
Many of the string methods deal with slices specified using Python slice semantics: endpoints,
which are PyObjects, may be null or None (meaning default to one end or the other) or may be
negative (meaning "from the end"). Meanwhile, the implementation methods need integer indices
that lie within the array and satisfy 0<=start<=end<=N, where N is the length of the array.
This method first translates the Python slice endpoints start and end according to the slice
semantics for null and negative values, and stores these in elements [2] and [3] of the result.
Then, since the end points of the range may lie outside this sequence's bounds (in either
direction), it reduces them to the nearest points satisfying 0<=start<=end<=N, and stores these
in elements [0] and [1] of the result.
In the PyUnicode version, the arguments are code point indices, such as are received from the
Python caller, while the first two elements of the returned array have been translated to UTF-16
indices in the implementation string.
Overrides:
translateIndices in class PyString
Parameters:
start - Python start of slice
end - Python end of slice
public java.lang.String substring(int start, int end)
Return a substring of this object as a Java String. The indices are code point indices, not
UTF-16 (char) indices. For example:
PyUnicode u = new PyUnicode("..𐀂𐀃...");
// (Python) u = u'..\U00010002\U00010003...'
String s = u.substring(2, 4);  // = "𐀂𐀃" (Java)
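The slice translation described for translateIndices and the code point indexing used by substring can be sketched together in plain Java. This is a minimal sketch under stated assumptions, not Jython's implementation: the class SliceIndexSketch and its methods are hypothetical, Integer stands in for PyObject endpoints (null meaning "missing"), and the way an inverted range is collapsed (start is reduced to end) is an assumption.

```java
// Hypothetical helper class, not part of Jython.
final class SliceIndexSketch {

    /**
     * Resolve Python-style slice endpoints against a sequence of length n.
     * Elements [2] and [3] hold the endpoints after null/negative handling;
     * elements [0] and [1] hold them clamped so that 0 <= start <= end <= n.
     */
    static int[] translate(Integer start, Integer end, int n) {
        int s = (start == null) ? 0 : start;   // missing start defaults to 0
        int e = (end == null) ? n : end;       // missing end defaults to n
        if (s < 0) s += n;                     // negative means "from the end"
        if (e < 0) e += n;
        int ce = Math.max(0, Math.min(e, n));  // clamp end into [0, n]
        int cs = Math.max(0, Math.min(s, ce)); // clamp start into [0, end] (assumption)
        return new int[] {cs, ce, s, e};
    }

    /** Substring addressed by code point indices with Python slice semantics. */
    static String substring(String text, Integer start, Integer end) {
        int n = text.codePointCount(0, text.length());
        int[] r = translate(start, end, n);
        // Convert code point indices to UTF-16 (char) indices.
        int from = text.offsetByCodePoints(0, r[0]);
        int to = text.offsetByCodePoints(from, r[1] - r[0]);
        return text.substring(from, to);
    }

    public static void main(String[] args) {
        String text = "..\uD800\uDC02\uD800\uDC03..."; // u'..\U00010002\U00010003...'
        System.out.println(substring(text, 2, 4).length()); // two code points, four chars
        System.out.println(substring(text, -3, null));
    }
}
```

In this sketch the clamped values are kept as code point indices and converted to UTF-16 offsets only when the substring is taken, whereas the text above says the PyUnicode version returns UTF-16 indices directly in elements [0] and [1].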
public static PyUnicode fromInterned(java.lang.String interned)
public boolean isBasicPlane()
Determine whether the string consists entirely of basic-plane characters. For a PyString, of
course, it is always true, but this is useful in cases where either a PyString or a PyUnicode is
acceptable.
Overrides:
isBasicPlane in class PyString
public int getCodePointCount()
public static java.lang.String checkEncoding(java.lang.String s)
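public PyString createInstance(java.lang.String str)
Description copied from class: PyString
Create an instance of the same type as this object, from the Java String given as argument.
Overrides:
createInstance in class PyString
Parameters:
str - to wrap
Returns:
an instance wrapping str
protected PyString createInstance(java.lang.String string, boolean isBasic)
Description copied from class: PyString
Create an instance of the same type as this object, from the Java String given as argument.
Overrides:
createInstance in class PyString
Parameters:
string - UTF-16 string encoding the characters (as Java).
isBasic - true if it is known that only BMP characters are present.
Returns:
an instance wrapping the given string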
public PyObject __mod__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __mod__ method.
public PyUnicode __unicode__()
Overrides:
__unicode__ in class PyString
public PyString __str__()
Description copied from class: PyObject
Equivalent to the standard Python __str__ method. The default implementation (in PyObject) calls
PyObject.__repr__(), making it unnecessary to override __str__ in sub-classes of PyObject where
both forms are the same. A common choice is to provide the same implementation to __str__ and
toString, for consistency in the printed form of objects between Python and Java.
public int __len__()
Description copied from class: PyObject
Equivalent to the standard Python __len__ method.
public PyString __repr__()
Description copied from class: PyObject
Equivalent to the standard Python __repr__ method. Each sub-class of PyObject is likely to
re-define this method to provide for its own reproduction.
protected PyObject getslice(int start, int stop, int step)
Description copied from class: PySequence
Returns a range of elements from the sequence.
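public PyObject __eq__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __eq__ method.
public PyObject __ne__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __ne__ method.
public PyObject __lt__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __lt__ method.
public PyObject __le__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __le__ method.
public PyObject __gt__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __gt__ method.
public PyObject __ge__(PyObject other)
Description copied from class: PyObject
Equivalent to the standard Python __ge__ method.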
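protected PyObject pyget(int i)
Description copied from class: PySequence
Returns the element of the sequence at the given index. This is an extension point called by
PySequence in its implementation of PySequence.__getitem__(org.python.core.PyObject). It is
guaranteed by PySequence that when it calls pyget(int) the index is within the bounds of the
array. Any other clients must make the same guarantee.
public java.util.Iterator<java.lang.Integer> newSubsequenceIterator()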
public java.util.Iterator<java.lang.Integer> newSubsequenceIterator(int start, int stop, int step)
public boolean __contains__(PyObject o)
Description copied from class: PyObject
Equivalent to the standard Python __contains__ method.
Overrides:
__contains__ in class PyString
Parameters:
o - the element to search for in this container.
protected int _findLeft(int right)
Helper for strip, lstrip implementation, when stripping whitespace.
protected int _findRight()
Helper for strip, rstrip implementation, when stripping whitespace.
Overrides:
_findRight in class PyString
public PyTuple partition(PyObject sep)
Description copied from class: PyString
Equivalent to Python str.partition(), splits the PyString at the first occurrence of the
separator sep, returning a PyTuple containing the part before the separator, the separator
itself, and the part after the separator.
Overrides:
partition in class PyString
Parameters:
sep - str, unicode or object implementing BufferProtocol
public PyTuple rpartition(PyObject sep)
Description copied from class: PyString
Equivalent to Python str.rpartition(), splits the PyString at the last occurrence of the
separator sep, returning a PyTuple containing the part before the separator, the separator
itself, and the part after the separator.
Overrides:
rpartition in class PyString
Parameters:
sep - str, unicode or object implementing BufferProtocol
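The three-part result of partition and rpartition can be illustrated with a minimal plain-Java sketch. The class PartitionSketch is hypothetical and works on java.lang.String rather than PyObject. The behaviour when the separator is absent or empty follows Python's documented str.partition and str.rpartition semantics and is not stated on this page.

```java
// Hypothetical helper class, not part of Jython.
final class PartitionSketch {

    /** Split at the first occurrence of sep: {before, sep, after}. */
    static String[] partition(String s, String sep) {
        if (sep.isEmpty()) {
            throw new IllegalArgumentException("empty separator"); // Python raises ValueError
        }
        int i = s.indexOf(sep);
        if (i < 0) {
            return new String[] {s, "", ""}; // not found: whole string comes first
        }
        return new String[] {s.substring(0, i), sep, s.substring(i + sep.length())};
    }

    /** Split at the last occurrence of sep: {before, sep, after}. */
    static String[] rpartition(String s, String sep) {
        if (sep.isEmpty()) {
            throw new IllegalArgumentException("empty separator");
        }
        int i = s.lastIndexOf(sep);
        if (i < 0) {
            return new String[] {"", "", s}; // not found: whole string comes last
        }
        return new String[] {s.substring(0, i), sep, s.substring(i + sep.length())};
    }

    public static void main(String[] args) {
        System.out.println(String.join("|", partition("a=b=c", "=")));
        System.out.println(String.join("|", rpartition("a=b=c", "=")));
    }
}
```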
protected PyList splitfields(int maxsplit)
.split
, in str
and (when overridden) in
unicode
, splitting on white space and returning a list of the separated parts.
If there are more than maxsplit
feasible splits the last element of the list is
the remainder of the original (this) string. The split sections will be PyUnicode
and use the Python
unicode
definition of "space".splitfields
in class PyString
maxsplit
- limit on the number of splits (if >=0)PyList
of split sectionsprotected PyList rsplitfields(int maxsplit)
Helper function for .rsplit, in str and (when overridden) in unicode, splitting on white space
and returning a list of the separated parts. If there are more than maxsplit feasible splits the
first element of the list is the remainder of the original (this) string. The split sections will
be PyUnicode and use the Python unicode definition of "space".
Overrides:
rsplitfields in class PyString
Parameters:
maxsplit - limit on the number of splits (if >=0)
Returns:
PyList of split sections
protected PyString fromSubstring(int begin, int end)
Description copied from class: PyString
Return a new object of the same type as this one equal to the slice [begin:end]. (Python
end-relative indexes etc. are not supported.) Subclasses (fromSubstring(int, int)) override this
to return their own type.
Overrides:
fromSubstring in class PyString
Parameters:
begin - first included character.
end - first excluded character.
public boolean startswith(PyObject prefix, PyObject start, PyObject end)
Equivalent to the Python unicode.startswith method, testing whether a string starts with a
specified prefix, where a sub-range is specified by [start:end]. Arguments start and end are
interpreted as in slice notation, with null or Py.None representing "missing". prefix can also be
a tuple of prefixes to look for.
Overrides:
startswith in class PyString
Parameters:
prefix - string to check for (or a PyTuple of them).
start - start of slice.
end - end of slice.
Returns:
true if this string slice starts with a specified prefix, otherwise false.
public boolean endswith(PyObject suffix, PyObject start, PyObject end)
Equivalent to the Python unicode.endswith method, testing whether a string ends with a specified
suffix, where a sub-range is specified by [start:end]. Arguments start and end are interpreted as
in slice notation, with null or Py.None representing "missing". suffix can also be a tuple of
suffixes to look for.
public PyObject __format__(PyObject formatSpec)
Overrides:
__format__ in class PyString
public java.util.Iterator<java.lang.Integer> iterator()
Specified by:
iterator in interface java.lang.Iterable<java.lang.Integer>
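Because the class is declared as Iterable<Integer>, it can be used in a Java for-each loop that yields one Integer per code point. The sketch below shows the same pattern over a plain Java String. The class CodePointIterableSketch is hypothetical and does not reproduce PyUnicode's own iterator.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for an Iterable<Integer> over code points, not part of Jython.
final class CodePointIterableSketch implements Iterable<Integer> {
    private final String text;

    CodePointIterableSketch(String text) {
        this.text = text;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int pos = 0; // UTF-16 index of the next code point

            @Override
            public boolean hasNext() {
                return pos < text.length();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int cp = text.codePointAt(pos);
                pos += Character.charCount(cp); // advance by 1 or 2 chars
                return cp;
            }
        };
    }

    public static void main(String[] args) {
        for (int cp : new CodePointIterableSketch("a\uD800\uDC02b")) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```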
public PyComplex __complex__()
Description copied from class: PyObject
Equivalent to the standard Python __complex__ method.
Overrides:
__complex__ in class PyString