GET /crawl/v1/
Use this interface to crawl a specific website using our crawlers.
This endpoint requires a ToS;DR-issued API key.
Authentication is done through the apikey parameter.
Endpoints
Europe
https://api.tosdr.org/crawl/v1/eu/
United States
https://api.tosdr.org/crawl/v1/us/
Specific Regions
Europe West Cluster
https://api.tosdr.org/crawl/v1/eu-west/
Europe Central Cluster
https://api.tosdr.org/crawl/v1/eu-central/
US East Cluster
https://api.tosdr.org/crawl/v1/us-east/
US West Cluster
https://api.tosdr.org/crawl/v1/us-west/
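The endpoint list above can be captured in a small lookup table so that client code selects a region by name. This is a minimal sketch: the base URLs are copied from the list, while the dictionary keys and the `endpoint_for` helper are hypothetical names chosen for this example and are not part of the API.

```python
# Base URLs copied from the endpoint list; the short keys are an
# assumption of this example, not identifiers defined by the API.
CRAWL_ENDPOINTS = {
    "eu": "https://api.tosdr.org/crawl/v1/eu/",
    "us": "https://api.tosdr.org/crawl/v1/us/",
    "eu-west": "https://api.tosdr.org/crawl/v1/eu-west/",
    "eu-central": "https://api.tosdr.org/crawl/v1/eu-central/",
    "us-east": "https://api.tosdr.org/crawl/v1/us-east/",
    "us-west": "https://api.tosdr.org/crawl/v1/us-west/",
}


def endpoint_for(region: str) -> str:
    """Return the base URL for a region key, or raise ValueError."""
    try:
        return CRAWL_ENDPOINTS[region]
    except KeyError:
        known = ", ".join(sorted(CRAWL_ENDPOINTS))
        raise ValueError(f"unknown region {region!r}; expected one of: {known}") from None
```

Failing loudly on an unknown key keeps a typo in the region name from silently falling back to a different cluster.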
URL Parameters
Parameter | Type | Description |
---|---|---|
apikey | string | ToS;DR-issued API key |
url | string | The URL to crawl |
xpath | string | The XPath expression to use; defaults to //body |
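As an illustration of the parameters in the table, the sketch below assembles a request URL. It assumes the parameters are sent as a percent-encoded query string, which is the usual reading of "URL parameters" but is not spelled out here. The `build_crawl_url` helper and the values in the usage note are made up for this example.

```python
from typing import Optional
from urllib.parse import urlencode


def build_crawl_url(base: str, apikey: str, url: str, xpath: Optional[str] = None) -> str:
    """Append apikey, url and (optionally) xpath to a regional base URL.

    Assumption: the API reads these values from the query string.
    """
    params = {"apikey": apikey, "url": url}
    if xpath is not None:
        # Omitting xpath leaves the documented default (//body) in effect.
        params["xpath"] = xpath
    # urlencode percent-encodes reserved characters such as ":" and "/".
    return base + "?" + urlencode(params)
```

For example, `build_crawl_url("https://api.tosdr.org/crawl/v1/eu/", "YOUR_KEY", "https://example.com/terms", "//main")` yields a URL that can be passed to `urllib.request.urlopen` or any other HTTP client for the GET request.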
Repository
https://github.com/tosdr/crawler.tosdr.org
Error JSON Schema
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$ref": "#/definitions/Welcome",
  "definitions": {
    "Welcome": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "error": {
          "type": "boolean"
        },
        "message": {
          "$ref": "#/definitions/Message"
        }
      },
      "required": [
        "error",
        "message"
      ],
      "title": "Welcome"
    },
    "Message": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "name": {
          "type": "string"
        },
        "crawler": {
          "type": "string"
        },
        "remoteStacktrace": {
          "type": "string"
        }
      },
      "required": [
        "crawler",
        "name",
        "remoteStacktrace"
      ],
      "title": "Message"
    }
  }
}
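The error schema can be checked by hand without a JSON Schema library. The sketch below parses a response body and returns the three message fields; the `CrawlError` class and `parse_error` function are hypothetical names for this example, and it mirrors only the constraints stated in the schema (required keys, no additional properties, string types).

```python
import json
from dataclasses import dataclass


@dataclass
class CrawlError:
    name: str
    crawler: str
    remote_stacktrace: str


def parse_error(body: str) -> CrawlError:
    """Parse a JSON body that should match the error schema, else raise ValueError."""
    payload = json.loads(body)
    # "additionalProperties": false means the key sets must match exactly.
    if not isinstance(payload, dict) or set(payload) != {"error", "message"}:
        raise ValueError("unexpected top-level keys for an error response")
    if not isinstance(payload["error"], bool):
        raise ValueError("'error' must be a boolean")
    message = payload["message"]
    fields = ("name", "crawler", "remoteStacktrace")
    if not isinstance(message, dict) or set(message) != set(fields):
        raise ValueError("unexpected keys in 'message'")
    if not all(isinstance(message[field], str) for field in fields):
        raise ValueError("all 'message' fields must be strings in an error response")
    return CrawlError(message["name"], message["crawler"], message["remoteStacktrace"])
```

Returning a small dataclass keeps the camelCase wire name `remoteStacktrace` out of the calling code.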
Success JSON Schema
{
  "$schema": "http://json-schema.org/draft-06/schema#",
  "$ref": "#/definitions/Welcome",
  "definitions": {
    "Welcome": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "error": {
          "type": "boolean"
        },
        "message": {
          "$ref": "#/definitions/Message"
        },
        "raw_html": {
          "type": "string"
        },
        "imagedata": {
          "type": "string"
        }
      },
      "required": [
        "error",
        "imagedata",
        "message",
        "raw_html"
      ],
      "title": "Welcome"
    },
    "Message": {
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "name": {
          "type": "null"
        },
        "crawler": {
          "type": "string"
        },
        "remoteStacktrace": {
          "type": "null"
        }
      },
      "required": [
        "crawler",
        "name",
        "remoteStacktrace"
      ],
      "title": "Message"
    }
  }
}
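Both schemas type `error` as a boolean without stating which value marks success, so a cautious client can tell the two responses apart by shape instead. The sketch below accepts a decoded payload only if it matches the success schema: `raw_html` and `imagedata` are strings, and `name` and `remoteStacktrace` are null. The function name `is_success_payload` is hypothetical.

```python
from typing import Any


def is_success_payload(payload: Any) -> bool:
    """Return True if a decoded JSON value matches the success schema's shape."""
    if not isinstance(payload, dict):
        return False
    # "additionalProperties": false means the key sets must match exactly.
    if set(payload) != {"error", "message", "raw_html", "imagedata"}:
        return False
    message = payload["message"]
    if not isinstance(message, dict):
        return False
    if set(message) != {"name", "crawler", "remoteStacktrace"}:
        return False
    return (
        isinstance(payload["error"], bool)
        and isinstance(payload["raw_html"], str)
        and isinstance(payload["imagedata"], str)
        and isinstance(message["crawler"], str)
        # In a success response these two fields are typed as null.
        and message["name"] is None
        and message["remoteStacktrace"] is None
    )
```

The schema says only that `imagedata` is a string, so the sketch checks its type and makes no assumption about its encoding.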