CaseCanonicalize (Closure Compiler v20190106 API)

java.lang.Object
- com.google.javascript.jscomp.regex.CaseCanonicalize

```
public final class CaseCanonicalize
extends java.lang.Object
```
Implements the ECMAScript 5 Canonicalize operation used to specify how case-insensitive regular expressions match.
From section 15.10.2.8,
The abstract operation Canonicalize takes a character parameter ch and performs the following steps:
1. If IgnoreCase is false, return ch.
2. Let u be ch converted to upper case as if by calling the standard built-in method String.prototype.toUpperCase on the one-character String ch.
3. If u does not consist of a single character, return ch.
4. Let cu be u's character.
5. If ch's code unit value is greater than or equal to decimal 128 and cu's code unit value is less than decimal 128, then return ch.
6. Return cu.
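The steps above can be sketched in Java as follows. This is an illustration only, not the code of this class: the name `CanonicalizeSketch` and the `ignoreCase` parameter are hypothetical, and Java's `String.toUpperCase(Locale.ROOT)` stands in for `String.prototype.toUpperCase`, so results may differ for characters whose case mappings changed after Unicode 3.0.0.

```java
import java.util.Locale;

final class CanonicalizeSketch {
    /** Applies the Canonicalize steps, in order, to one UTF-16 code unit. */
    static char canonicalize(char ch, boolean ignoreCase) {
        if (!ignoreCase) {
            return ch; // IgnoreCase is false: nothing to do
        }
        // Assumption: Locale.ROOT upper-casing approximates the ES5 toUpperCase.
        String u = String.valueOf(ch).toUpperCase(Locale.ROOT);
        if (u.length() != 1) {
            return ch; // multi-character result, e.g. U+00DF upper-cases to "SS"
        }
        char cu = u.charAt(0);
        if (ch >= 128 && cu < 128) {
            return ch; // never fold a non-ASCII code unit onto an ASCII one
        }
        return cu;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize('a', true) == 'A');            // true
        System.out.println(canonicalize('a', false) == 'a');           // true
        System.out.println(canonicalize('\u0131', true) == '\u0131');  // true
    }
}
```

The last call shows the effect of the 128 check: U+0131 (dotless i) upper-cases to ASCII `I`, so it is returned unchanged and `/I/i` does not match it.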
Author:

Mike Samuel

Field Summary

Fields
| Modifier and Type | Field and Description |
| --- | --- |
| `static com.google.javascript.jscomp.regex.CharRanges` | `CASE_SENSITIVE` Set of code units that are case-insensitively equivalent to some other code unit according to the ECMAScript Canonicalize operation described in section 15.10.2.8. |

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `static char` | `caseCanonicalize(char ch)` Returns the case canonical version of the given code-unit. |
| `static java.lang.String` | `caseCanonicalize(java.lang.String s)` Returns the case canonical version of the given string. |
| `static com.google.javascript.jscomp.regex.CharRanges` | `expandToAllMatched(com.google.javascript.jscomp.regex.CharRanges ranges)` Given a character range that may include case sensitive code-units, such as `[0-9B-M]`, returns the character range that includes all the code-units in the input and those that are case-insensitively equivalent to a code-unit in the input. |
| `static com.google.javascript.jscomp.regex.CharRanges` | `reduceToMinimum(com.google.javascript.jscomp.regex.CharRanges ranges)` Given a character range that may include case sensitive code-units, such as `[0-9B-M]`, returns the character range that includes the minimal set of code units such that for every code unit in the input there is a case-insensitively equivalent canonical code unit in the output. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - CASE_SENSITIVE
```
public static final com.google.javascript.jscomp.regex.CharRanges CASE_SENSITIVE
```
    Set of code units that are case-insensitively equivalent to some other code unit according to the ECMAScript Canonicalize operation described in section 15.10.2.8. The case-sensitive characters are the ones that canonicalize to a character other than themselves or have a character that canonicalizes to them. Canonicalize is based on the definition of String.prototype.toUpperCase, which is itself based on Unicode 3.0.0 as specified in UnicodeData-3.0.0 and SpecialCasings-2.txt.
    This table was generated by running the following script in Chrome:
```
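// Enumerate every UTF-16 code unit and print each (code unit, canonical form) pair.
for (var cc = 0; cc < 0x10000; ++cc) {
  var ch = String.fromCharCode(cc);
  var u = ch.toUpperCase();
  // Canonicalize changes ch only if it upper-cases to exactly one different code unit.
  if (ch !== u && u.length === 1) {
    var cu = u.charCodeAt(0);
    // Skip mappings from a non-ASCII code unit (>= 128) to an ASCII one (< 128).
    if (cc < 128 || cu >= 128) {
      console.log('0x' + cc.toString(16) + ', 0x' + cu.toString(16) + ',');
    }
  }
}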
```
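The definition of a case-sensitive code unit given above (one that canonicalizes to something else, or is the canonical form of something else) can be sketched in Java. This is a rough illustration under assumptions: `CaseSensitiveSketch` is a hypothetical name, a `java.util.BitSet` stands in for `CharRanges`, and Java's current Unicode tables replace Unicode 3.0.0, so the resulting set is not guaranteed to equal `CASE_SENSITIVE`.

```java
import java.util.BitSet;
import java.util.Locale;

final class CaseSensitiveSketch {
    /** Canonicalize with IgnoreCase assumed true; Locale.ROOT approximates ES5. */
    static char canonicalize(char ch) {
        String u = String.valueOf(ch).toUpperCase(Locale.ROOT);
        if (u.length() != 1) return ch;
        char cu = u.charAt(0);
        return (ch >= 128 && cu < 128) ? ch : cu;
    }

    /** Collects both ends of every mapping that changes a code unit. */
    static BitSet caseSensitive() {
        BitSet set = new BitSet(0x10000);
        for (int c = 0; c < 0x10000; c++) {
            char canon = canonicalize((char) c);
            if (canon != c) {
                set.set(c);     // canonicalizes to a different code unit
                set.set(canon); // some other code unit canonicalizes to it
            }
        }
        return set;
    }

    public static void main(String[] args) {
        BitSet set = caseSensitive();
        System.out.println(set.get('a') && set.get('A')); // true
        System.out.println(set.get('1'));                 // false
    }
}
```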
- Method Detail
  - caseCanonicalize
```
public static java.lang.String caseCanonicalize(java.lang.String s)
```
    Returns the case canonical version of the given string.
  - caseCanonicalize
```
public static char caseCanonicalize(char ch)
```
    Returns the case canonical version of the given code-unit. ECMAScript 5 explicitly says that code-units are to be treated as their code-point equivalent, even surrogates.
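The code-unit treatment described above matters for characters outside the Basic Multilingual Plane. The sketch below is a hypothetical illustration (the class `CodeUnitCanonicalizeSketch` is not part of Closure Compiler, and Java's `toUpperCase(Locale.ROOT)` is assumed as a stand-in for the ES5 operation). It canonicalizes a string one code unit at a time, so each half of a surrogate pair is handled on its own and stays unchanged.

```java
import java.util.Locale;

final class CodeUnitCanonicalizeSketch {
    /** Canonicalize for a single code unit, IgnoreCase assumed true. */
    static char canonicalize(char ch) {
        String u = String.valueOf(ch).toUpperCase(Locale.ROOT);
        if (u.length() != 1) return ch;
        char cu = u.charAt(0);
        return (ch >= 128 && cu < 128) ? ch : cu;
    }

    /** Canonicalizes each code unit independently, ignoring surrogate pairing. */
    static String canonicalize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(canonicalize(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String deseret = "\uD801\uDC28"; // U+10428, a lower-case Deseret letter
        System.out.println(canonicalize("abc"));                              // ABC
        System.out.println(canonicalize(deseret).equals(deseret));            // true
        System.out.println(deseret.toUpperCase(Locale.ROOT).equals(deseret)); // false
    }
}
```

A code-point-aware upper-casing changes the Deseret letter, while the per-code-unit version leaves it alone because a lone surrogate has no case mapping.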
  - expandToAllMatched
```
public static com.google.javascript.jscomp.regex.CharRanges expandToAllMatched(com.google.javascript.jscomp.regex.CharRanges ranges)
```
    Given a character range that may include case sensitive code-units, such as [0-9B-M], returns the character range that includes all the code-units in the input and those that are case-insensitively equivalent to a code-unit in the input.
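As a rough sketch of the expansion described above, the following uses a `java.util.BitSet` in place of `CharRanges`. The class and method names are hypothetical, the approach (brute force over all code units) is chosen for clarity and is not claimed to be how the library does it, and Java's `toUpperCase(Locale.ROOT)` is assumed as an approximation of the ES5 Canonicalize operation.

```java
import java.util.BitSet;
import java.util.Locale;

final class ExpandSketch {
    /** Canonicalize for a single code unit, IgnoreCase assumed true. */
    static char canonicalize(char ch) {
        String u = String.valueOf(ch).toUpperCase(Locale.ROOT);
        if (u.length() != 1) return ch;
        char cu = u.charAt(0);
        return (ch >= 128 && cu < 128) ? ch : cu;
    }

    /** Returns the input plus every code unit sharing a canonical form with it. */
    static BitSet expand(BitSet input) {
        BitSet canon = new BitSet(0x10000);
        for (int c = input.nextSetBit(0); c >= 0; c = input.nextSetBit(c + 1)) {
            canon.set(canonicalize((char) c));
        }
        BitSet out = (BitSet) input.clone();
        for (int c = 0; c < 0x10000; c++) {
            if (canon.get(canonicalize((char) c))) {
                out.set(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet input = new BitSet();
        input.set('0', '9' + 1); // [0-9]
        input.set('B', 'M' + 1); // [B-M]
        BitSet out = expand(input);
        System.out.println(out.get('b') && out.get('m')); // true
        System.out.println(out.get('a'));                 // false
    }
}
```

For the `[0-9B-M]` example, this sketch yields the equivalent of `[0-9B-Mb-m]`.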
  - reduceToMinimum
```
public static com.google.javascript.jscomp.regex.CharRanges reduceToMinimum(com.google.javascript.jscomp.regex.CharRanges ranges)
```
    Given a character range that may include case sensitive code-units, such as [0-9B-M], returns the character range that includes the minimal set of code units such that for every code unit in the input there is a case-insensitively equivalent canonical code unit in the output.
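The reduction described above can be sketched by replacing each member with its canonical form. As before, this is an illustration under assumptions: `ReduceSketch` and `reduce` are hypothetical names, a `java.util.BitSet` stands in for `CharRanges`, and Java's `toUpperCase(Locale.ROOT)` approximates the ES5 Canonicalize operation.

```java
import java.util.BitSet;
import java.util.Locale;

final class ReduceSketch {
    /** Canonicalize for a single code unit, IgnoreCase assumed true. */
    static char canonicalize(char ch) {
        String u = String.valueOf(ch).toUpperCase(Locale.ROOT);
        if (u.length() != 1) return ch;
        char cu = u.charAt(0);
        return (ch >= 128 && cu < 128) ? ch : cu;
    }

    /** Keeps one canonical representative per case-insensitive equivalence class. */
    static BitSet reduce(BitSet input) {
        BitSet out = new BitSet(0x10000);
        for (int c = input.nextSetBit(0); c >= 0; c = input.nextSetBit(c + 1)) {
            out.set(canonicalize((char) c));
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet input = new BitSet();
        input.set('0', '9' + 1); // [0-9]
        input.set('B', 'M' + 1); // [B-M]
        input.set('b', 'm' + 1); // [b-m]
        BitSet out = reduce(input);
        System.out.println(out.get('B'));      // true
        System.out.println(out.get('b'));      // false
        System.out.println(out.cardinality()); // 22
    }
}
```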
