public static interface SeedUrlConfiguration.Builder extends SdkPojo, CopyableBuilder<SeedUrlConfiguration.Builder,SeedUrlConfiguration>
| Modifier and Type | Method and Description |
|---|---|
| SeedUrlConfiguration.Builder | seedUrls(Collection<String> seedUrls): The list of seed or starting point URLs of the websites you want to crawl. |
| SeedUrlConfiguration.Builder | seedUrls(String... seedUrls): The list of seed or starting point URLs of the websites you want to crawl. |
| SeedUrlConfiguration.Builder | webCrawlerMode(String webCrawlerMode): The web crawler mode, one of HOST_ONLY, SUBDOMAINS, or EVERYTHING. |
| SeedUrlConfiguration.Builder | webCrawlerMode(WebCrawlerMode webCrawlerMode): The web crawler mode, one of HOST_ONLY, SUBDOMAINS, or EVERYTHING. |
Inherited methods: equalsBySdkFields, sdkFields, copy, applyMutation, build
SeedUrlConfiguration.Builder seedUrls(Collection<String> seedUrls)
The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs.
Parameters:
seedUrls - The seed or starting point URLs of the websites you want to crawl, up to a maximum of 100.
SeedUrlConfiguration.Builder seedUrls(String... seedUrls)
The list of seed or starting point URLs of the websites you want to crawl. The list can include a maximum of 100 seed URLs.
Parameters:
seedUrls - The seed or starting point URLs of the websites you want to crawl, up to a maximum of 100.
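The 100-URL limit stated above can be checked on the client before the list is handed to the builder. The following is a minimal sketch using only the Java standard library; `SeedUrlChecks` and `requireWithinLimit` are hypothetical names introduced for illustration and are not part of the SDK, and the source does not say whether the builder itself enforces the limit.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
final class SeedUrlChecks {
    // Limit taken from the documentation: at most 100 seed URLs.
    static final int MAX_SEED_URLS = 100;

    // Returns a defensive copy of the URLs, or throws if the list exceeds the documented limit.
    static List<String> requireWithinLimit(Collection<String> seedUrls) {
        if (seedUrls.size() > MAX_SEED_URLS) {
            throw new IllegalArgumentException(
                "Expected at most " + MAX_SEED_URLS + " seed URLs but got " + seedUrls.size());
        }
        return new ArrayList<>(seedUrls);
    }
}
```

With the SDK on the classpath, the returned list could then be passed to `seedUrls(Collection<String>)`, or converted to an array for the `seedUrls(String...)` overload.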
SeedUrlConfiguration.Builder webCrawlerMode(String webCrawlerMode)
The web crawler mode. You can choose one of the following modes:
- HOST_ONLY – crawl only the host name of the seed URL. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
- SUBDOMAINS – crawl the host name of the seed URL and its subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
- EVERYTHING – crawl the host name, its subdomains, and other domains that the web pages link to.
The default mode is HOST_ONLY.
Parameters:
webCrawlerMode - The mode to use: HOST_ONLY, SUBDOMAINS, or EVERYTHING, as described above. The default mode is HOST_ONLY.
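Because this overload takes the mode as a plain string, a misspelled value is not caught by the compiler. The sketch below shows one way to normalize a string read from configuration before it reaches the builder. It uses a local stand-in enum, `CrawlModeParsing.Mode`, which is an assumption made so the example runs without the SDK; it only mirrors the three documented mode names. Falling back to HOST_ONLY for a missing value is a client-side choice that follows the documented default.

```java
import java.util.Locale;

// Hypothetical helper, not part of the AWS SDK.
final class CrawlModeParsing {
    // Local stand-in for the three documented mode names.
    enum Mode { HOST_ONLY, SUBDOMAINS, EVERYTHING }

    // Maps a configuration string to a mode; a missing or blank value yields the documented default.
    static Mode parseOrDefault(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Mode.HOST_ONLY;
        }
        // valueOf throws IllegalArgumentException for names outside the three documented modes.
        return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

The name of the parsed value, for example `parseOrDefault("subdomains").name()`, could then be passed to `webCrawlerMode(String)`.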
See Also:
WebCrawlerMode
SeedUrlConfiguration.Builder webCrawlerMode(WebCrawlerMode webCrawlerMode)
The web crawler mode. You can choose one of the following modes:
- HOST_ONLY – crawl only the host name of the seed URL. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
- SUBDOMAINS – crawl the host name of the seed URL and its subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
- EVERYTHING – crawl the host name, its subdomains, and other domains that the web pages link to.
The default mode is HOST_ONLY.
Parameters:
webCrawlerMode - The mode to use: HOST_ONLY, SUBDOMAINS, or EVERYTHING, as described above. The default mode is HOST_ONLY.
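The three modes differ in which hosts count as in scope for a given seed URL. The sketch below restates the documented examples as a host comparison. It is an illustration of the described behavior only, not the crawler's actual implementation; `CrawlScopeSketch`, its `Mode` enum, and `inScope` are hypothetical names, and the sketch assumes URLs that include a scheme such as "https://".

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical illustration of the documented mode semantics, not the service's implementation.
final class CrawlScopeSketch {
    enum Mode { HOST_ONLY, SUBDOMAINS, EVERYTHING }

    // Returns true if candidateUrl would be in scope for seedUrl under the given mode.
    static boolean inScope(String seedUrl, String candidateUrl, Mode mode) {
        String seedHost = host(seedUrl);
        String candidateHost = host(candidateUrl);
        switch (mode) {
            case HOST_ONLY:
                // Only the exact host name of the seed URL.
                return candidateHost.equals(seedHost);
            case SUBDOMAINS:
                // The seed host plus any host nested under it, such as a.abc.example.com.
                return candidateHost.equals(seedHost) || candidateHost.endsWith("." + seedHost);
            case EVERYTHING:
                // Any domain that a crawled page links to.
                return true;
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    // Extracts the lower-cased host; assumes the URL carries a scheme.
    private static String host(String url) {
        String h = URI.create(url).getHost();
        if (h == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        return h.toLowerCase(Locale.ROOT);
    }
}
```

For a seed of "https://abc.example.com", a link to "https://a.abc.example.com/" is out of scope under HOST_ONLY and in scope under SUBDOMAINS, which matches the examples in the description.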
See Also:
WebCrawlerMode
Copyright © 2023. All rights reserved.