public static class AsyncFetcher.Builder extends Object
Modifier and Type | Method and Description |
---|---|
AsyncFetcher | build(): Builds the fetcher with the options specified. |
AsyncFetcher.Builder | disableCompression(): Disables requesting compressed pages and decompressing pages after they are fetched. |
AsyncFetcher.Builder | disableCookies(): Disables cookie storage. |
AsyncFetcher.Builder | register(@NotNull Callback callback): Registers callbacks that will be called when a page has been fetched. |
AsyncFetcher.Builder | setConnectionRequestTimeout(int connectionRequestTimeout): Sets the timeout in milliseconds used when requesting a connection from the connection manager. |
AsyncFetcher.Builder | setConnectTimeout(int connectTimeout): Sets the timeout in milliseconds until a connection is established. |
AsyncFetcher.Builder | setFileManager(@NotNull FileManager fileManager): Sets the FileManager to be used. |
AsyncFetcher.Builder | setHeaders(@NotNull Map<String,String> headers): Sets the headers to be used when fetching items. |
AsyncFetcher.Builder | setMaxConnections(int maxConnections): Sets the maximum number of connections allowed at any one time. |
AsyncFetcher.Builder | setMaxRouteConnections(int maxRouteConnections): Sets the maximum number of connections allowed at any one time for a particular route (host). |
AsyncFetcher.Builder | setNumIoThreads(int numIoThreads): Sets the number of httpclient dispatcher threads. |
AsyncFetcher.Builder | setProxyProvider(@NotNull ProxyProvider proxyProvider): Sets the ProxyProvider to be used. |
AsyncFetcher.Builder | setRedirectStrategy(org.apache.http.client.RedirectStrategy redirectStrategy): Sets the redirection strategy for a response received by the fetcher. |
AsyncFetcher.Builder | setSocketTimeout(int socketTimeout): Defines the socket timeout (SO_TIMEOUT) in milliseconds, which is the timeout for waiting for data or, put differently, the maximum period of inactivity between two consecutive data packets. |
AsyncFetcher.Builder | setSslContext(SSLContext sslContext): Sets the SSL context for an encrypted response. |
AsyncFetcher.Builder | setStopCodes(int... codes): Sets a list of stop codes that will interrupt crawling. |
AsyncFetcher.Builder | setThreadFactory(@NotNull ThreadFactory threadFactory): Sets the thread factory that creates the httpclient dispatcher threads. |
AsyncFetcher.Builder | setUserAgent(@NotNull UserAgent userAgent): Sets the UserAgent to be used; if not set, a default will be chosen. |
AsyncFetcher.Builder | setValidator(Validator... validators): Sets multiple validators to be used. |
AsyncFetcher.Builder | setValidator(@NotNull Validator validator): Sets the Validator to be used. |
AsyncFetcher.Builder | setValidatorRouter(@NotNull ValidatorRouter router): Sets the ValidatorRouter to be used. |

public AsyncFetcher.Builder register(@NotNull Callback callback)
Note that blocking callbacks will significantly reduce the rate at which requests are processed. If a callback blocks on I/O, run that work on your own executor.
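
The advice about blocking callbacks can be sketched as a wrapper that hands each invocation to a separate executor and returns immediately. The `PageCallback` interface and `OffloadingCallback` class are hypothetical stand-ins introduced for illustration; the library's own `Callback` signature is not shown in this document and may differ.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class OffloadingCallbackDemo {

    /** Hypothetical stand-in for the library's callback type (an assumption). */
    interface PageCallback {
        void onPageFetched(String url);
    }

    /** Hands each invocation to an executor so the calling thread never blocks. */
    static final class OffloadingCallback implements PageCallback {
        private final PageCallback delegate;
        private final ExecutorService executor;

        OffloadingCallback(PageCallback delegate, ExecutorService executor) {
            this.delegate = delegate;
            this.executor = executor;
        }

        @Override
        public void onPageFetched(String url) {
            // Queue the blocking work and return to the caller at once.
            executor.execute(() -> delegate.onPageFetched(url));
        }
    }

    /** Fires the callback once per URL and returns the URLs the delegate processed. */
    static List<String> run(List<String> urls) {
        List<String> seen = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        PageCallback slow = url -> {
            try {
                Thread.sleep(50); // simulate blocking I/O such as a database write
                seen.add(url);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        PageCallback callback = new OffloadingCallback(slow, pool);
        for (String url : urls) {
            callback.onPageFetched(url);
        }
        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("http://a.example", "http://b.example")).size());
    }
}
```

The pool size of two is arbitrary; in practice it would be sized to the blocking resource, for example the database connection pool.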
callback - A set of FetcherCallback.

public AsyncFetcher.Builder disableCookies()

public AsyncFetcher.Builder setFileManager(@NotNull FileManager fileManager)
If fileManager is set, all items fetched will be saved to storage.
fileManager - file manager to be used.

public AsyncFetcher.Builder setHeaders(@NotNull Map<String,String> headers)
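
A minimal sketch of preparing the `Map<String,String>` argument for `setHeaders`. The header names and values are illustrative only, and the commented builder call assumes a `builder` instance obtained elsewhere.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class HeaderExample {

    /** Builds an unmodifiable header map with a predictable iteration order. */
    static Map<String, String> defaultHeaders() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Accept", "text/html");
        headers.put("Accept-Language", "en-US,en;q=0.8");
        return Collections.unmodifiableMap(headers);
    }

    public static void main(String[] args) {
        Map<String, String> headers = defaultHeaders();
        // Assumed usage: builder.setHeaders(headers);
        headers.forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```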
headers - a map of headers to be used.

public AsyncFetcher.Builder setNumIoThreads(int numIoThreads)
numIoThreads - number of threads.

public AsyncFetcher.Builder setMaxConnections(int maxConnections)
maxConnections - the max allowable connections.

public AsyncFetcher.Builder setMaxRouteConnections(int maxRouteConnections)
maxRouteConnections - the max allowable connections per route.

public AsyncFetcher.Builder setProxyProvider(@NotNull ProxyProvider proxyProvider)
proxyProvider - proxy provider to be used.

public AsyncFetcher.Builder setSslContext(SSLContext sslContext)
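
One way to obtain an `SSLContext` to pass to `setSslContext` is the standard JDK API, sketched below. It uses the JVM's default key and trust managers; a custom trust store would need explicit `TrustManager` instances instead. The commented builder call is an assumed usage.

```java
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;

final class SslContextExample {

    /** Creates a TLS context backed by the JVM's default key and trust managers. */
    static SSLContext defaultTlsContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // Passing null selects the default key managers, trust managers and SecureRandom.
            context.init(null, null, null);
            return context;
        } catch (NoSuchAlgorithmException | KeyManagementException e) {
            throw new IllegalStateException("TLS is unavailable on this JVM", e);
        }
    }

    public static void main(String[] args) {
        SSLContext context = defaultTlsContext();
        // Assumed usage: builder.setSslContext(context);
        System.out.println(context.getProtocol());
    }
}
```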
sslContext - SSLContext to be used.

public AsyncFetcher.Builder setStopCodes(int... codes)
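
Because `setStopCodes` takes `int...`, codes can be passed as separate arguments or as an `int[]`. How the fetcher reacts to a stop code internally is not described here; the hypothetical `StopCodes` class below only illustrates the varargs call shape and a simple membership check.

```java
import java.util.Arrays;

final class StopCodes {
    private final int[] codes;

    /** Accepts codes the same way an int... parameter does. */
    StopCodes(int... codes) {
        this.codes = codes.clone();
        Arrays.sort(this.codes); // binarySearch requires a sorted array
    }

    /** Returns true if the given HTTP status code is one of the stop codes. */
    boolean shouldStop(int statusCode) {
        return Arrays.binarySearch(codes, statusCode) >= 0;
    }

    public static void main(String[] args) {
        StopCodes stopCodes = new StopCodes(429, 403);
        // Assumed usage on the builder: builder.setStopCodes(429, 403);
        System.out.println(stopCodes.shouldStop(403));
        System.out.println(stopCodes.shouldStop(200));
    }
}
```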
codes - A list of stop codes.

public AsyncFetcher.Builder setThreadFactory(@NotNull ThreadFactory threadFactory)
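
A `ThreadFactory` for `setThreadFactory` is typically used to give the dispatcher threads recognisable names. The sketch below relies only on the JDK; the class name, the name prefix and the choice of daemon threads are illustrative assumptions, not library defaults.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedDaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedDaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Names follow the pattern prefix-1, prefix-2, ...
        Thread thread = new Thread(task, prefix + "-" + counter.getAndIncrement());
        thread.setDaemon(true); // daemon threads do not keep the JVM alive
        return thread;
    }

    public static void main(String[] args) {
        ThreadFactory factory = new NamedDaemonThreadFactory("fetcher-io");
        // Assumed usage: builder.setThreadFactory(factory);
        System.out.println(factory.newThread(() -> { }).getName());
    }
}
```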
threadFactory - an instance of ThreadFactory.

public AsyncFetcher.Builder setUserAgent(@NotNull UserAgent userAgent)
userAgent - user agent generator to be used.

public AsyncFetcher.Builder setValidator(@NotNull Validator validator)
Validates the fetched page and retries if the page is not consistent with the specification set by the validator.
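
The validate-and-retry behaviour described above can be pictured with a small stand-alone loop. This is only a sketch: it uses `Supplier` and `Predicate` from the JDK in place of the library's fetch and `Validator` types, and the attempt limit is a made-up parameter rather than a documented setting.

```java
import java.util.Optional;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class ValidateAndRetry {

    /**
     * Calls fetch until the validator accepts the page or the attempts run out.
     * Returns the accepted page, or an empty Optional if every attempt was rejected.
     */
    static Optional<String> fetchValidated(Supplier<String> fetch,
                                           Predicate<String> validator,
                                           int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String page = fetch.get();
            if (validator.test(page)) {
                return Optional.of(page);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated fetch: the first response is empty, later ones are complete.
        Supplier<String> fetch = () -> ++calls[0] == 1 ? "" : "<html>ok</html>";
        Predicate<String> notEmpty = page -> !page.isEmpty();
        System.out.println(fetchValidated(fetch, notEmpty, 3).isPresent());
    }
}
```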
validator - validator to be used.

public AsyncFetcher.Builder setValidator(@NotNull Validator... validators)
Validates the fetched page and retries if the page is not consistent with the specification set by the validators.
validators - validators to be used.

public AsyncFetcher.Builder setRedirectStrategy(org.apache.http.client.RedirectStrategy redirectStrategy)
redirectStrategy - redirection strategy to be used.

public AsyncFetcher.Builder setValidatorRouter(@NotNull ValidatorRouter router)
router - the ValidatorRouter to be used.

public AsyncFetcher.Builder setConnectionRequestTimeout(int connectionRequestTimeout)
connectionRequestTimeout - timeout in milliseconds.

public AsyncFetcher.Builder setConnectTimeout(int connectTimeout)
connectTimeout - timeout in milliseconds.

public AsyncFetcher.Builder setSocketTimeout(int socketTimeout)
Defines the socket timeout (SO_TIMEOUT) in milliseconds, which is the timeout for waiting for data or, put differently, the maximum period of inactivity between two consecutive data packets.
socketTimeout - timeout in milliseconds.

public AsyncFetcher.Builder disableCompression()

public AsyncFetcher build()
Copyright © 2019. All rights reserved.