public final class CSVFormat extends Object implements Serializable
You can use one of the predefined formats: DEFAULT, EXCEL, MYSQL, RFC4180, or TDF.
For example:
CSVParser parser = CSVFormat.EXCEL.parse(reader);
The CSVParser class provides static methods to parse other input types, for example:
CSVParser parser = CSVParser.parse(file, StandardCharsets.US_ASCII, CSVFormat.EXCEL);
You can extend a format by calling its with methods (withNullString, withIgnoreSurroundingSpaces, and so on). For example:
CSVFormat.EXCEL.withNullString("N/A").withIgnoreSurroundingSpaces(true);
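The effect of the two settings above can be seen by parsing a short string. The following is a minimal sketch rather than part of the library documentation: the class name ExtendedFormatExample, the helper valueAt, and the sample input are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class ExtendedFormatExample {

    // EXCEL plus two extra settings. Each with method returns a new format,
    // so CSVFormat.EXCEL itself is left unchanged.
    static final CSVFormat FORMAT = CSVFormat.EXCEL
            .withNullString("N/A")
            .withIgnoreSurroundingSpaces(true);

    // Parses the first record of the given text and returns the value at index.
    static String valueAt(String csvLine, int index) {
        try (CSVParser parser = FORMAT.parse(new StringReader(csvLine))) {
            CSVRecord record = parser.iterator().next();
            return record.get(index);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = "x,N/A, z ";
        System.out.println(valueAt(line, 1)); // the "N/A" field is read as null
        System.out.println(valueAt(line, 2)); // surrounding spaces are ignored
    }
}
```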
To define the column names you want to use to access records, write:
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3");
Calling withHeader(String...) lets you use the given names to address values in a CSVRecord, and assumes that your CSV source does not contain a first record that also defines column names.
If it does, then you are overriding this metadata with your names and you should skip the first record by calling withSkipHeaderRecord(boolean) with true.
You can use a format directly to parse a reader. For example, to parse an Excel file using your own column names, write:
Reader in = ...;
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3").parse(in);
For other input types, like resources, files, and URLs, use the static methods on CSVParser.
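When the source has its own header line but you want to address columns by the names passed to withHeader(String...), the two calls described above can be combined. This is a sketch under stated assumptions: the class ExplicitHeaderExample, the helper col2Values, and the sample data are hypothetical, and the input is supplied as a String only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class ExplicitHeaderExample {

    // Returns the "Col2" value of every data record, using the names given to
    // withHeader instead of the names found in the first line of the source.
    static List<String> col2Values(String csv) {
        CSVFormat format = CSVFormat.EXCEL
                .withHeader("Col1", "Col2", "Col3")
                .withSkipHeaderRecord(true); // the source's own header line is not data
        List<String> values = new ArrayList<>();
        try (CSVParser parser = format.parse(new StringReader(csv))) {
            for (CSVRecord record : parser) {
                values.add(record.get("Col2"));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // The first line ("A,B,C") is skipped; the remaining two lines are data.
        System.out.println(col2Values("A,B,C\r\n1,2,3\r\n4,5,6\r\n"));
    }
}
```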
If your source contains a header record, you can simplify your code and safely reference columns by using withHeader(String...) with no arguments:
CSVFormat.EXCEL.withHeader();
This causes the parser to read the first record and use its values as column names.
Then call the CSVRecord get method that takes a String column name argument:
String value = record.get("Col1");
This makes your code impervious to changes in column order in the CSV file.
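A small sketch of this behavior: the same lookup by name returns the same value even when the columns appear in a different order. The class HeaderLookupExample, the helper firstValue, and the sample inputs are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class HeaderLookupExample {

    // Reads the header from the first record, then returns the named column's
    // value from the first data record.
    static String firstValue(String csv, String column) {
        try (CSVParser parser = CSVFormat.EXCEL.withHeader().parse(new StringReader(csv))) {
            CSVRecord record = parser.iterator().next();
            return record.get(column);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same data, different column order; the lookup by name is unaffected.
        System.out.println(firstValue("Col1,Col2\r\na,b\r\n", "Col1"));
        System.out.println(firstValue("Col2,Col1\r\nb,a\r\n", "Col1"));
    }
}
```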
This class is immutable.
Modifier and Type | Field and Description |
---|---|
static CSVFormat | DEFAULT: Standard comma separated format, as for RFC4180 but allowing empty lines. |
static CSVFormat | EXCEL: Excel file format (using a comma as the value delimiter). |
static CSVFormat | MYSQL: Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations. |
static CSVFormat | RFC4180: Comma separated format as defined by RFC 4180. |
static CSVFormat | TDF: Tab-delimited format. |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
String | format(Object... values): Formats the specified values. |
boolean | getAllowMissingColumnNames(): Specifies whether missing column names are allowed when parsing the header line. |
Character | getCommentMarker(): Returns the character marking the start of a line comment. |
char | getDelimiter(): Returns the character delimiting the values (typically ';', ',' or '\t'). |
Character | getEscapeCharacter(): Returns the escape character. |
String[] | getHeader(): Returns a copy of the header array. |
boolean | getIgnoreEmptyLines(): Specifies whether empty lines between records are ignored when parsing input. |
boolean | getIgnoreSurroundingSpaces(): Specifies whether spaces around values are ignored when parsing input. |
String | getNullString(): Gets the String to convert to and from null. |
Character | getQuoteCharacter(): Returns the character used to encapsulate values containing special characters. |
QuoteMode | getQuoteMode(): Returns the quote policy for output fields. |
String | getRecordSeparator(): Returns the record separator delimiting output records. |
boolean | getSkipHeaderRecord(): Returns whether to skip the header record. |
int | hashCode() |
boolean | isCommentMarkerSet(): Specifies whether comments are supported by this format. |
boolean | isEscapeCharacterSet(): Returns whether escapes are being processed. |
boolean | isNullStringSet(): Returns whether a nullString has been defined. |
boolean | isQuoteCharacterSet(): Returns whether a quoteChar has been defined. |
static CSVFormat | newFormat(char delimiter): Creates a new CSV format with the specified delimiter. |
CSVParser | parse(Reader in): Parses the specified content. |
CSVPrinter | print(Appendable out): Prints to the specified output. |
String | toString() |
CSVFormat | withAllowMissingColumnNames(boolean allowMissingColumnNames): Sets the missing column names behavior of the format. |
CSVFormat | withCommentMarker(char commentMarker): Sets the comment start marker of the format to the specified character. |
CSVFormat | withCommentMarker(Character commentMarker): Sets the comment start marker of the format to the specified character. |
CSVFormat | withDelimiter(char delimiter): Sets the delimiter of the format to the specified character. |
CSVFormat | withEscape(char escape): Sets the escape character of the format to the specified character. |
CSVFormat | withEscape(Character escape): Sets the escape character of the format to the specified character. |
CSVFormat | withHeader(String... header): Sets the header of the format. |
CSVFormat | withIgnoreEmptyLines(boolean ignoreEmptyLines): Sets the empty line skipping behavior of the format. |
CSVFormat | withIgnoreSurroundingSpaces(boolean ignoreSurroundingSpaces): Sets the trimming behavior of the format. |
CSVFormat | withNullString(String nullString): Performs conversions to and from null for strings on input and output. |
CSVFormat | withQuote(char quoteChar): Sets the quoteChar of the format to the specified character. |
CSVFormat | withQuote(Character quoteChar): Sets the quoteChar of the format to the specified character. |
CSVFormat | withQuoteMode(QuoteMode quoteModePolicy): Sets the output quote policy of the format to the specified value. |
CSVFormat | withRecordSeparator(char recordSeparator): Sets the record separator of the format to the specified character. |
CSVFormat | withRecordSeparator(String recordSeparator): Sets the record separator of the format to the specified String. |
CSVFormat | withSkipHeaderRecord(boolean skipHeaderRecord): Sets whether to skip the header record. |
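public static final CSVFormat DEFAULT
Standard comma separated format, as for RFC4180 but allowing empty lines.
Settings are: withDelimiter(','), withQuote('"'), withRecordSeparator("\r\n"), withIgnoreEmptyLines(true)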
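public static final CSVFormat RFC4180
Comma separated format as defined by RFC 4180.
Settings are: withDelimiter(','), withQuote('"'), withRecordSeparator("\r\n"), withIgnoreEmptyLines(false)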
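public static final CSVFormat EXCEL
Excel file format (using a comma as the value delimiter). The delimiter Excel actually uses depends on the locale, so you may need to customize this format for your regional settings.
For example, to parse or generate a CSV file on a French system, use the following format:
CSVFormat fmt = CSVFormat.EXCEL.withDelimiter(';');
Settings are: withDelimiter(','), withQuote('"'), withRecordSeparator("\r\n"), withIgnoreEmptyLines(false)
Note: this is currently the same as RFC4180.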
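To illustrate the semicolon-delimited variant, the sketch below formats one record and reads it back. The class SemicolonExcelExample, its helpers formatLine and parseLine, and the sample values are hypothetical and used only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class SemicolonExcelExample {

    // EXCEL settings with ';' as the value delimiter.
    static final CSVFormat FORMAT = CSVFormat.EXCEL.withDelimiter(';');

    // Formats one record as a single line of text.
    static String formatLine(Object... values) {
        return FORMAT.format(values);
    }

    // Parses the first record of the given text into a list of values.
    static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = FORMAT.parse(new StringReader(line))) {
            CSVRecord record = parser.iterator().next();
            for (String value : record) {
                values.add(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String line = formatLine("a", "b;c");
        System.out.println(line);            // the value containing ';' is quoted
        System.out.println(parseLine(line)); // and is read back as a single value
    }
}
```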
public static final CSVFormat TDF
Tab-delimited format.
Settings are: withDelimiter('\t'), withIgnoreSurroundingSpaces(true)
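public static final CSVFormat MYSQL
Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations.
This is a tab-delimited format with a LF character as the line separator. Values are not quoted and special characters are escaped with '\'.
Settings are: withDelimiter('\t'), withQuote(null), withRecordSeparator('\n'), withIgnoreEmptyLines(false), withEscape('\\')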
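public static CSVFormat newFormat(char delimiter)
Creates a new CSV format with the specified delimiter.
Use this method if you want to create a CSVFormat from scratch. All fields but the delimiter will be initialized with null/false.
Parameters: delimiter - the char used for value separation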
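Because a format created from scratch has no quote character, quoting has to be enabled explicitly if the input uses it. The sketch below assumes a pipe-delimited input with double-quoted values; the class PipeFormatExample, the helper parseLine, and the sample data are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Hypothetical example class, not part of Commons CSV.
final class PipeFormatExample {

    // newFormat sets only the delimiter; the quote character is added here.
    static final CSVFormat PIPE = CSVFormat.newFormat('|').withQuote('"');

    // Parses the first record of the given text into a list of values.
    static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        try (CSVParser parser = PIPE.parse(new StringReader(line))) {
            CSVRecord record = parser.iterator().next();
            for (String value : record) {
                values.add(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        // The quoted middle value contains the delimiter and stays in one piece.
        System.out.println(parseLine("a|\"b|c\"|d"));
        // A format straight from newFormat has no quote character set.
        System.out.println(CSVFormat.newFormat('|').isQuoteCharacterSet());
    }
}
```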
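public String format(Object... values)
Formats the specified values.
Parameters: values - the values to format
public Character getCommentMarker()
Returns the character marking the start of a line comment.
Returns: the comment start marker, may be null
public char getDelimiter()
Returns the character delimiting the values (typically ';', ',' or '\t').
public Character getEscapeCharacter()
Returns the escape character.
Returns: the escape character, may be null
public String[] getHeader()
Returns a copy of the header array.
Returns: a copy of the header array; null if disabled, the empty array if to be read from the file
public boolean getAllowMissingColumnNames()
Specifies whether missing column names are allowed when parsing the header line.
Returns: true if missing column names are allowed when parsing the header line, false to throw an IllegalArgumentException.
public boolean getIgnoreEmptyLines()
Specifies whether empty lines between records are ignored when parsing input.
Returns: true if empty lines between records are ignored, false if they are turned into empty records.
public boolean getIgnoreSurroundingSpaces()
Specifies whether spaces around values are ignored when parsing input.
Returns: true if spaces around values are ignored, false if they are treated as part of the value.
public String getNullString()
Gets the String to convert to and from null.
- Reading: converts strings equal to the given nullString to null when reading records.
- Writing: writes null as the given nullString when writing records.
Returns: the String to convert to and from null. No substitution occurs if null.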
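public Character getQuoteCharacter()
Returns the character used to encapsulate values containing special characters.
Returns: the quoteChar character, may be null
public QuoteMode getQuoteMode()
Returns the quote policy for output fields.
public String getRecordSeparator()
Returns the record separator delimiting output records.
public boolean getSkipHeaderRecord()
Returns whether to skip the header record.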
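public boolean isCommentMarkerSet()
Specifies whether comments are supported by this format.
Returns: true if comments are supported, false otherwise
public boolean isEscapeCharacterSet()
Returns whether escapes are being processed.
Returns: true if escapes are processed
public boolean isNullStringSet()
Returns whether a nullString has been defined.
Returns: true if a nullString is defined
public boolean isQuoteCharacterSet()
Returns whether a quoteChar has been defined.
Returns: true if a quoteChar is defined
public CSVParser parse(Reader in) throws IOException
Parses the specified content.
See also the various static parse methods on CSVParser.
Parameters: in - the input stream
Returns: a parser over a stream of CSVRecords.
Throws: IOException - If an I/O error occurs
public CSVPrinter print(Appendable out) throws IOException
Prints to the specified output.
See also CSVPrinter.
Parameters: out - the output
Throws: IOException - thrown if the optional header cannot be printed.
public CSVFormat withCommentMarker(char commentMarker)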
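Sets the comment start marker of the format to the specified character.
Parameters: commentMarker - the comment start marker
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withCommentMarker(Character commentMarker)
Sets the comment start marker of the format to the specified character.
Parameters: commentMarker - the comment start marker, use null to disable
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withDelimiter(char delimiter)
Sets the delimiter of the format to the specified character.
Parameters: delimiter - the delimiter character
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withEscape(char escape)
Sets the escape character of the format to the specified character.
Parameters: escape - the escape character
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withEscape(Character escape)
Sets the escape character of the format to the specified character.
Parameters: escape - the escape character, use null to disable
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withHeader(String... header)
Sets the header of the format. The header can either be parsed automatically from the input file with:
CSVFormat format = aformat.withHeader();
or specified manually with:
CSVFormat format = aformat.withHeader("name", "email", "phone");
Parameters: header - the header, null if disabled, empty if parsed automatically, user specified otherwise.
See Also: withSkipHeaderRecord(boolean)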
public CSVFormat withAllowMissingColumnNames(boolean allowMissingColumnNames)
Sets the missing column names behavior of the format.
Parameters: allowMissingColumnNames - the missing column names behavior, true to allow missing column names in the header line, false to cause an IllegalArgumentException to be thrown.
public CSVFormat withIgnoreEmptyLines(boolean ignoreEmptyLines)
Sets the empty line skipping behavior of the format.
Parameters: ignoreEmptyLines - the empty line skipping behavior, true to ignore the empty lines between the records, false to translate empty lines to empty records.
public CSVFormat withIgnoreSurroundingSpaces(boolean ignoreSurroundingSpaces)
Sets the trimming behavior of the format.
Parameters: ignoreSurroundingSpaces - the trimming behavior, true to remove the surrounding spaces, false to leave the spaces as is.
public CSVFormat withNullString(String nullString)
Performs conversions to and from null for strings on input and output.
- Reading: converts strings equal to the given nullString to null when reading records.
- Writing: writes null as the given nullString when writing records.
Parameters: nullString - the String to convert to and from null. No substitution occurs if null.
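The writing direction of the null conversion can be sketched with format(Object...). The class NullStringExample, the helper write, and the marker "N/A" are hypothetical choices for illustration.

```java
import org.apache.commons.csv.CSVFormat;

// Hypothetical example class, not part of Commons CSV.
final class NullStringExample {

    // EXCEL settings with "N/A" standing in for null values.
    static final CSVFormat FORMAT = CSVFormat.EXCEL.withNullString("N/A");

    // Formats one record; null values are written as the configured nullString.
    static String write(Object... values) {
        return FORMAT.format(values);
    }

    public static void main(String[] args) {
        // The null in the middle is written as N/A.
        System.out.println(write("a", null, "c"));
    }
}
```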
public CSVFormat withQuote(char quoteChar)
Sets the quoteChar of the format to the specified character.
Parameters: quoteChar - the quoteChar character
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withQuote(Character quoteChar)
Sets the quoteChar of the format to the specified character.
Parameters: quoteChar - the quoteChar character, use null to disable
Throws: IllegalArgumentException - thrown if the specified character is a line break
public CSVFormat withQuoteMode(QuoteMode quoteModePolicy)
Sets the output quote policy of the format to the specified value.
Parameters: quoteModePolicy - the quote policy to use for output.
public CSVFormat withRecordSeparator(char recordSeparator)
Sets the record separator of the format to the specified character.
Note: This setting is only used during printing and does not affect parsing. Parsing currently only works for inputs with '\n', '\r' and "\r\n".
Parameters: recordSeparator - the record separator to use for output.
public CSVFormat withRecordSeparator(String recordSeparator)
Sets the record separator of the format to the specified String.
Note: This setting is only used during printing and does not affect parsing. Parsing currently only works for inputs with '\n', '\r' and "\r\n".
Parameters: recordSeparator - the record separator to use for output.
Throws: IllegalArgumentException - if recordSeparator is none of CR, LF or CRLF
public CSVFormat withSkipHeaderRecord(boolean skipHeaderRecord)
Sets whether to skip the header record.
Parameters: skipHeaderRecord - whether to skip the header record.
See Also: withHeader(String...)
Copyright © 2014 The Apache Software Foundation. All rights reserved.