public class DefaultFetchFilter extends FetchFilter
Nested classes inherited from class FetchFilter: FetchFilter.FetchStatus
Fields inherited from class FetchFilter: log
Constructor and Description |
---|
DefaultFetchFilter() |
Modifier and Type | Method and Description |
---|---|
void | addScopeRegex(String scope) Adds a new domain to the scope list of the spider process. |
FetchFilter.FetchStatus | checkFilter(org.apache.commons.httpclient.URI uri) Checks whether the URI must be ignored and not processed, and returns the filter status. |
void | setDomainsAlwaysInScope(List<DomainAlwaysInScopeMatcher> domainsAlwaysInScope) Sets the domains that will be considered as always in scope. |
void | setExcludeRegexes(List<String> excl) Sets the regexes used to check whether a URI should be skipped. |
void | setScanContext(Context scanContext) Sets the scan context. |
Methods inherited from class FetchFilter: getLogger
public FetchFilter.FetchStatus checkFilter(org.apache.commons.httpclient.URI uri)
checkFilter in class FetchFilter
uri - the URI to be processed
public void addScopeRegex(String scope)
scope - the scope
public void setDomainsAlwaysInScope(List<DomainAlwaysInScopeMatcher> domainsAlwaysInScope)
domainsAlwaysInScope - the list containing all domains that are always in scope
public void setExcludeRegexes(List<String> excl)
excl - the new exclude regexes
public void setScanContext(Context scanContext)
scanContext - the new scan context
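
The methods above describe a filter that is configured with scope and exclude regexes and then asked for a status per URI. The following is a minimal, self-contained sketch of that idea, not the actual DefaultFetchFilter implementation. The class name SketchFetchFilter, the Status enum and its values, the use of java.net.URI in place of org.apache.commons.httpclient.URI, and the order of the checks are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a fetch filter; it does not reproduce DefaultFetchFilter. */
class SketchFetchFilter {

    /** Illustrative statuses; the real FetchFilter.FetchStatus values may differ. */
    enum Status { VALID, OUT_OF_SCOPE, EXCLUDED }

    private final List<Pattern> scopes = new ArrayList<>();
    private List<Pattern> excludes = new ArrayList<>();

    /** Adds a regex that the host of a URI must fully match to be in scope. */
    void addScopeRegex(String scope) {
        scopes.add(Pattern.compile(scope, Pattern.CASE_INSENSITIVE));
    }

    /** Replaces the regexes used to decide whether a URI is skipped. */
    void setExcludeRegexes(List<String> excl) {
        List<Pattern> compiled = new ArrayList<>();
        if (excl != null) {
            for (String regex : excl) {
                compiled.add(Pattern.compile(regex));
            }
        }
        excludes = compiled;
    }

    /** Assumed order: scope check on the host first, then exclude regexes on the full URI. */
    Status checkFilter(URI uri) {
        String host = uri.getHost();
        boolean inScope = false;
        if (host != null) {
            for (Pattern p : scopes) {
                if (p.matcher(host).matches()) {
                    inScope = true;
                    break;
                }
            }
        }
        if (!inScope) {
            return Status.OUT_OF_SCOPE;
        }
        String text = uri.toString();
        for (Pattern p : excludes) {
            if (p.matcher(text).matches()) {
                return Status.EXCLUDED;
            }
        }
        return Status.VALID;
    }

    public static void main(String[] args) {
        SketchFetchFilter filter = new SketchFetchFilter();
        filter.addScopeRegex("example\\.com");
        filter.setExcludeRegexes(Arrays.asList(".*logout.*"));
        System.out.println(filter.checkFilter(URI.create("http://example.com/index")));
        System.out.println(filter.checkFilter(URI.create("http://example.com/logout")));
        System.out.println(filter.checkFilter(URI.create("http://other.org/")));
    }
}
```

In this sketch a URI is rejected as soon as one check fails, so the caller only has to compare the returned status with VALID before fetching. The always-in-scope domains and the scan context are left out to keep the example short.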