Scrape HTML elements

The /scrape endpoint extracts structured data from the elements on a webpage that match the CSS selectors you supply. For each matching element it returns the text, inner HTML, attributes, position, and dimensions.

Basic usage

Go to https://example.com and extract metadata from all h1 and a elements in the DOM.

curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/scrape' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
  "url": "https://example.com/",
  "elements": [{
    "selector": "h1"
  },
  {
    "selector": "a"
  }]
}'
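
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_scrape_request and scrape are hypothetical, and the account ID and API token are placeholders you would supply yourself.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def build_scrape_request(account_id, api_token, url, selectors):
    """Build (but do not send) a POST request for the /scrape endpoint.

    Hypothetical helper: mirrors the curl example, one "elements" entry
    per CSS selector.
    """
    payload = {
        "url": url,
        "elements": [{"selector": s} for s in selectors],
    }
    return urllib.request.Request(
        f"{API_BASE}/{account_id}/browser-rendering/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(account_id, api_token, url, selectors, timeout=60):
    """Send the request and return the decoded JSON body."""
    request = build_scrape_request(account_id, api_token, url, selectors)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling scrape("<accountId>", "<apiToken>", "https://example.com/", ["h1", "a"]) with real credentials should send the same body as the curl command. Building the request separately from sending it keeps the payload easy to inspect without network access.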

JSON response

{
  "success": true,
  "result": [
    {
      "results": [
        {
          "attributes": [],
          "height": 39,
          "html": "Example Domain",
          "left": 100,
          "text": "Example Domain",
          "top": 133.4375,
          "width": 600
        }
      ],
      "selector": "h1"
    },
    {
      "results": [
        {
          "attributes": [
            { "name": "href", "value": "https://www.iana.org/domains/example" }
          ],
          "height": 20,
          "html": "More information...",
          "left": 100,
          "text": "More information...",
          "top": 249.875,
          "width": 142
        }
      ],
      "selector": "a"
    }
  ]
}
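
Because attributes arrive as a list of name/value pairs grouped by selector, a small helper is useful for pulling out one attribute, such as every href matched by the a selector. The sketch below assumes the response shape shown in the sample above; collect_attribute is a hypothetical name, and the sample dictionary is a trimmed copy of that response.

```python
def collect_attribute(response, selector, attribute):
    """Return the values of `attribute` for every element matched by `selector`."""
    values = []
    for group in response.get("result", []):
        if group.get("selector") != selector:
            continue
        for element in group.get("results", []):
            for attr in element.get("attributes", []):
                if attr.get("name") == attribute:
                    values.append(attr.get("value"))
    return values


# Trimmed copy of the sample response (dimension fields omitted).
sample = {
    "success": True,
    "result": [
        {
            "selector": "h1",
            "results": [{"attributes": [], "text": "Example Domain"}],
        },
        {
            "selector": "a",
            "results": [
                {
                    "attributes": [
                        {
                            "name": "href",
                            "value": "https://www.iana.org/domains/example",
                        }
                    ],
                    "text": "More information...",
                }
            ],
        },
    ],
}

print(collect_attribute(sample, "a", "href"))
# prints ['https://www.iana.org/domains/example']
```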

Additional options include setting HTTP credentials with authenticate, setting cookies, and using gotoOptions to control page load behaviour. See the endpoint reference for all available parameters.
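
As a rough illustration of how these options might be combined with the selectors in one request body, the sketch below assembles a payload in Python. Only the top-level names authenticate, cookies, and gotoOptions come from the text above. The nested keys (username, password, waitUntil, and the cookie fields) are assumptions modelled on Puppeteer-style options, and build_payload is a hypothetical helper, so confirm the exact shapes in the endpoint reference before relying on them.

```python
def build_payload(url, selectors, username=None, password=None,
                  cookies=None, wait_until=None):
    """Assemble a /scrape request body, adding optional settings only when given.

    Nested key names are assumptions (Puppeteer-style), not confirmed by this page.
    """
    payload = {"url": url, "elements": [{"selector": s} for s in selectors]}
    if username is not None and password is not None:
        # Assumed shape for HTTP credentials.
        payload["authenticate"] = {"username": username, "password": password}
    if cookies:
        # Assumed to be a list of cookie objects, e.g. {"name": ..., "value": ...}.
        payload["cookies"] = list(cookies)
    if wait_until is not None:
        # Assumed page-load option, e.g. "networkidle0".
        payload["gotoOptions"] = {"waitUntil": wait_until}
    return payload


payload = build_payload(
    "https://example.com/",
    ["h1"],
    username="user",
    password="secret",
    wait_until="networkidle0",
)
print(sorted(payload))
# prints ['authenticate', 'elements', 'gotoOptions', 'url']
```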

Response fields

result (array of objects) - Contains extracted data for each selector, one entry per selector in the request.
- selector (string) - The CSS selector used.
- results (array of objects) - List of extracted elements matching the selector.
  - text (string) - Inner text of the element.
  - html (string) - Inner HTML of the element.
  - attributes (array of objects) - List of the element's attributes, each with a name and a value, such as href for links.
  - height, width, top, left (number) - Position and dimensions of the element.
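
The four numeric fields can be combined into a bounding box when you need an element's right and bottom edges. The sketch below assumes that top and left are offsets from a common origin and share units with width and height, which this page does not state explicitly. Box and bounding_box are hypothetical names.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def bounding_box(element):
    """Derive the right and bottom edges from left/top plus width/height."""
    return Box(
        left=element["left"],
        top=element["top"],
        right=element["left"] + element["width"],
        bottom=element["top"] + element["height"],
    )


# Numeric fields of the h1 element from the sample response.
h1 = {"height": 39, "left": 100, "top": 133.4375, "width": 600}
print(bounding_box(h1))
# prints Box(left=100, top=133.4375, right=700, bottom=172.4375)
```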
