public interface RecrawlAttributeConstants
Modifier and Type | Field and Description |
---|---|
static String | A_CONTENT_DIGEST: content digest |
static String | A_CONTENT_DIGEST_COUNT: number of times we've seen this content digest (1 original + n duplicates) |
static String | A_CONTENT_DIGEST_HISTORY: content digest history map |
static String | A_ETAG_HEADER: header name (and AList key) for ETag |
static String | A_FETCH_HISTORY: fetch history array |
static String | A_LAST_MODIFIED_HEADER: header name (and AList key) for last-modified timestamp |
static String | A_ORIGINAL_DATE: date content payload was written |
static String | A_ORIGINAL_URL: url that the content payload was written for |
static String | A_REFERENCE_LENGTH: reference length (content length or virtual length) |
static String | A_STATUS: key for status (when in history) |
static String | A_WARC_FILE_OFFSET: offset into warc file of warc record with content payload |
static String | A_WARC_FILENAME: warc filename containing the content payload |
static String | A_WARC_RECORD_ID: warc record id of warc record with the content payload |
static String | A_WRITE_TAG: Writer processors of all types are encouraged to put a 'writeTag' (analogous to HTTP 'etag') in the CrawlURI state. |
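
The keys summarized above might be used roughly as follows. This is a minimal sketch, not the crawler's actual code: a plain `Map<String, Object>` stands in for the CrawlURI state, the class `RecrawlStateSketch` and its methods `recordDigest` and `shouldStore` are hypothetical, and the key strings are placeholders rather than the real constant values. It illustrates the "1 original + n duplicates" counting described for A_CONTENT_DIGEST_COUNT and a guard in the spirit of the `onlyStoreIfWriteTagPresent` option.

```java
import java.util.HashMap;
import java.util.Map;

public class RecrawlStateSketch {
    // Placeholder key strings (assumptions for illustration only; the real
    // constant values are defined by RecrawlAttributeConstants).
    static final String A_CONTENT_DIGEST = "content-digest";
    static final String A_CONTENT_DIGEST_COUNT = "content-digest-count";
    static final String A_ORIGINAL_URL = "original-url";
    static final String A_WRITE_TAG = "write-tag";

    /**
     * Records one sighting of a digest in the state map and returns the
     * updated count: 1 for the original, plus 1 for each duplicate.
     */
    static int recordDigest(Map<String, Object> state, String digest, String url) {
        int count = 1;
        if (digest.equals(state.get(A_CONTENT_DIGEST))) {
            // Same payload seen again: bump the duplicate count.
            count = ((Integer) state.getOrDefault(A_CONTENT_DIGEST_COUNT, 1)) + 1;
        } else {
            // New payload: remember the digest and the URL it was written for.
            state.put(A_CONTENT_DIGEST, digest);
            state.put(A_ORIGINAL_URL, url);
        }
        state.put(A_CONTENT_DIGEST_COUNT, count);
        return count;
    }

    /**
     * Hypothetical guard: when the flag is set, state is persisted only if
     * some writer has left a write tag in it.
     */
    static boolean shouldStore(Map<String, Object> state, boolean onlyStoreIfWriteTagPresent) {
        return !onlyStoreIfWriteTagPresent || state.containsKey(A_WRITE_TAG);
    }

    public static void main(String[] args) {
        Map<String, Object> state = new HashMap<>();
        recordDigest(state, "sha1:AAA", "http://example.com/a");
        recordDigest(state, "sha1:AAA", "http://example.com/b");
        System.out.println("count = " + state.get(A_CONTENT_DIGEST_COUNT));
        System.out.println("store? " + shouldStore(state, true));
    }
}
```

Tracking a single digest per map keeps the sketch short; the A_CONTENT_DIGEST_HISTORY description suggests the real state keeps a history map instead.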
static final String A_FETCH_HISTORY
static final String A_CONTENT_DIGEST
static final String A_LAST_MODIFIED_HEADER
static final String A_ETAG_HEADER
static final String A_STATUS
static final String A_REFERENCE_LENGTH
static final String A_CONTENT_DIGEST_HISTORY
static final String A_ORIGINAL_URL
static final String A_WARC_RECORD_ID
static final String A_WARC_FILENAME
static final String A_WARC_FILE_OFFSET
static final String A_ORIGINAL_DATE
static final String A_CONTENT_DIGEST_COUNT
static final String A_WRITE_TAG
PersistLogProcessor/PersistStoreProcessor have an option, AbstractPersistProcessor.onlyStoreIfWriteTagPresent, which defaults to true.

Copyright © 2003–2019 Internet Archive. All rights reserved.