Package | Description |
---|---|
org.archive.modules |
The beginnings of a refactored settings framework.
|
org.archive.modules.extractor |
Modifier and Type | Method and Description |
---|---|
LinkContext |
CrawlURI.getViaContext() |
Modifier and Type | Method and Description |
---|---|
CrawlURI |
CrawlURI.createCrawlURI(String destination,
LinkContext context,
Hop hop) |
CrawlURI |
CrawlURI.createCrawlURI(UURI destination,
LinkContext context,
Hop hop)
Utility method for creating CrawlURIs that were found as outlinks from this CrawlURI.
|
CrawlURI |
CrawlURI.createCrawlURI(UURI destination,
LinkContext context,
Hop hop,
int scheduling,
boolean seed)
Utility method for creating CrawlURIs found while extracting
links from this CrawlURI.
|
Constructor and Description |
---|
CrawlURI(UURI u,
String pathFromSeed,
UURI via,
LinkContext viaContext) |
Modifier and Type | Class and Description |
---|---|
class |
HTMLLinkContext
XPath-like context for HTML-discovered URIs.
|
static class |
LinkContext.SimpleLinkContext
Class for representing handy default LinkContext values.
|
Modifier and Type | Field and Description |
---|---|
static LinkContext |
LinkContext.EMBED_MISC
Stand-in value for embeds without other context.
|
static LinkContext |
LinkContext.INFERRED_MISC
Stand-in value for inferred URLs without other context.
|
static LinkContext |
LinkContext.JS_MISC
Stand-in value for JavaScript-discovered URLs without other context.
|
static LinkContext |
LinkContext.MANIFEST_MISC
Stand-in value for manifest URLs without other context.
|
static LinkContext |
LinkContext.NAVLINK_MISC
Stand-in value for navlink URLs without other context.
|
static LinkContext |
LinkContext.PREREQ_MISC
Stand-in value for prerequisite URLs without other context.
|
static LinkContext |
LinkContext.SPECULATIVE_MISC
Stand-in value for speculative/aggressively extracted URLs without
other context.
|
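The `*_MISC` fields above are shared fallback values for links whose specific context is unknown. The following is a minimal sketch of that pattern, not the Heritrix implementation: `SketchContext` and `orFallback` are hypothetical names invented for illustration, and the real `LinkContext` class may be structured differently.

```java
import java.util.Objects;

// Hypothetical stand-in for illustration only; this is NOT the Heritrix LinkContext class.
final class SketchContext {
    // Shared fallback used when an embed has no more specific context.
    static final SketchContext EMBED_MISC = new SketchContext("EMBED_MISC");

    private final String description;

    SketchContext(String description) {
        this.description = Objects.requireNonNull(description);
    }

    // Returns a specific context when one is known, otherwise the shared fallback.
    // (Hypothetical helper; no such method is listed in the API above.)
    static SketchContext orFallback(String knownContext, SketchContext fallback) {
        return (knownContext == null || knownContext.isEmpty())
                ? fallback
                : new SketchContext(knownContext);
    }

    @Override
    public String toString() {
        return description;
    }
}
```

Because the fallback is a single shared instance, callers in this sketch can compare against it by identity to tell "no specific context" apart from a real one.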
Modifier and Type | Method and Description |
---|---|
static void |
Extractor.add(CrawlURI uri,
int max,
String newUri,
LinkContext context,
Hop hop) |
protected CrawlURI |
Extractor.addOutlink(CrawlURI curi,
String uri,
LinkContext context,
Hop hop)
Creates and adds a 'Link' to the CrawlURI with the given URI, context, and hop type.
|
protected void |
Extractor.addOutlink(CrawlURI curi,
UURI uuri,
LinkContext context,
Hop hop) |
static CrawlURI |
Extractor.addRelativeToBase(CrawlURI uri,
int max,
CharSequence newUri,
LinkContext context,
Hop hop)
Adds an outlink to uri, resolving newUri relative to uri.getBaseURI().
|
static CrawlURI |
Extractor.addRelativeToVia(CrawlURI uri,
int max,
String newUri,
LinkContext context,
Hop hop)
Adds an outlink to uri, resolving newUri relative to uri.getVia().
|
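`addRelativeToBase` and `addRelativeToVia` differ only in which URI the discovered link is resolved against. The sketch below illustrates relative resolution in general. It assumes `java.net.URI` as a stand-in for Heritrix's `UURI`, and `RelativeOutlinkSketch` and `resolve` are hypothetical names. Heritrix's own resolution and normalization rules may differ from what `java.net.URI` does.

```java
import java.net.URI;

// Stand-in sketch: uses java.net.URI, not Heritrix's UURI or Extractor.
final class RelativeOutlinkSketch {
    private RelativeOutlinkSketch() {}

    // Resolves a discovered link against a base (or via) URI and returns it as a string.
    static String resolve(String base, String discovered) {
        return URI.create(base).resolve(discovered).toString();
    }

    public static void main(String[] args) {
        // A path-relative link is resolved against the base document's directory.
        System.out.println(resolve("http://example.com/a/b.html", "img/c.png"));
        // A root-relative link replaces the whole path of the base.
        System.out.println(resolve("http://example.com/a/b.html", "/d.html"));
    }
}
```

The same discovered string can therefore yield different absolute URIs depending on whether the base URI or the via URI is used as the reference point.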
Copyright © 2003–2022 Internet Archive. All rights reserved.