Scrape HTML elements
The /scrape endpoint extracts structured data from specific elements on a webpage, returning details such as element dimensions and inner HTML.
Go to https://example.com and extract metadata from all h1 and a elements in the DOM.
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/<accountId>/browser-rendering/scrape' \
  -H 'Authorization: Bearer <apiToken>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "elements": [{ "selector": "h1" }, { "selector": "a" }]
  }'
{
  "success": true,
  "result": [
    {
      "results": [
        {
          "attributes": [],
          "height": 39,
          "html": "Example Domain",
          "left": 100,
          "text": "Example Domain",
          "top": 133.4375,
          "width": 600
        }
      ],
      "selector": "h1"
    },
    {
      "results": [
        {
          "attributes": [
            { "name": "href", "value": "https://www.iana.org/domains/example" }
          ],
          "height": 20,
          "html": "More information...",
          "left": 100,
          "text": "More information...",
          "top": 249.875,
          "width": 142
        }
      ],
      "selector": "a"
    }
  ]
}
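For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The helper names `build_scrape_request` and `scrape` are illustrative assumptions, not part of the API; only the URL, headers, and body shape are taken from the curl example above.

```python
import json
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def build_scrape_request(account_id: str, api_token: str, url: str,
                         selectors: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {"url": url, "elements": [{"selector": s} for s in selectors]}
    return urllib.request.Request(
        f"{API_BASE}/{account_id}/browser-rendering/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(account_id: str, api_token: str, url: str, selectors: list[str]) -> dict:
    """Send the request and return the decoded JSON response (hypothetical helper)."""
    request = build_scrape_request(account_id, api_token, url, selectors)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `scrape("<accountId>", "<apiToken>", "https://example.com/", ["h1", "a"])` with real credentials should return a dictionary shaped like the response above. Keeping request construction separate from sending makes the payload easy to inspect without touching the network.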
Many more options exist, like setting HTTP credentials using authenticate, setting cookies, and using gotoOptions to control page load behaviour. Check the endpoint reference for all available parameters.
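As a rough illustration of how these options might sit alongside `url` and `elements` in the request body, the sketch below assembles a payload in Python. The nested shapes (`username`/`password` for `authenticate`, `name`/`value`/`domain` for each cookie, `waitUntil` for `gotoOptions`) are assumptions borrowed from Puppeteer conventions and should be checked against the endpoint reference. The credential and cookie values are placeholders, and `build_payload` is a hypothetical helper.

```python
import json


def build_payload(url, selectors, *, authenticate=None, cookies=None, goto_options=None):
    """Assemble a /scrape request body, including only the options that are set."""
    payload = {"url": url, "elements": [{"selector": s} for s in selectors]}
    if authenticate is not None:
        payload["authenticate"] = authenticate
    if cookies:
        payload["cookies"] = cookies
    if goto_options is not None:
        payload["gotoOptions"] = goto_options
    return payload


payload = build_payload(
    "https://example.com/",
    ["h1", "a"],
    authenticate={"username": "user", "password": "pass"},  # assumed shape
    cookies=[{"name": "session", "value": "abc123", "domain": "example.com"}],  # assumed shape
    goto_options={"waitUntil": "networkidle0"},  # assumed key and value
)
print(json.dumps(payload, indent=2))
```

Omitting unset options keeps the body identical to the basic example when no extras are needed.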
results (array of objects) - Contains extracted data for each selector.
  selector (string) - The CSS selector used.
  results (array of objects) - List of extracted elements matching the selector.
    text (string) - Inner text of the element.
    html (string) - Inner HTML of the element.
    attributes (array of objects) - List of extracted attributes such as href for links.
    height, width, top, left (number) - Position and dimensions of the element.
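To show how these fields fit together, here is a small Python sketch that flattens a response into one record per matched element. The sample data is the response from the example above, read through its top-level `result` key as it appears there. `flatten_results` is a hypothetical helper name, and deriving `right` and `bottom` edges from the position fields is an illustrative choice, not something the API returns.

```python
import json

# Sample copied from the example response above.
SAMPLE_RESPONSE = json.loads("""
{
  "success": true,
  "result": [
    {"results": [{"attributes": [], "height": 39, "html": "Example Domain",
                  "left": 100, "text": "Example Domain", "top": 133.4375,
                  "width": 600}],
     "selector": "h1"},
    {"results": [{"attributes": [{"name": "href",
                                  "value": "https://www.iana.org/domains/example"}],
                  "height": 20, "html": "More information...", "left": 100,
                  "text": "More information...", "top": 249.875, "width": 142}],
     "selector": "a"}
  ]
}
""")


def flatten_results(response):
    """Return one flat dict per matched element across all selectors."""
    rows = []
    for group in response["result"]:
        for element in group["results"]:
            # Turn the list of {name, value} pairs into a plain dict.
            attrs = {a["name"]: a["value"] for a in element["attributes"]}
            rows.append({
                "selector": group["selector"],
                "text": element["text"],
                "href": attrs.get("href"),
                "right": element["left"] + element["width"],
                "bottom": element["top"] + element["height"],
            })
    return rows


for row in flatten_results(SAMPLE_RESPONSE):
    print(row)
```

Elements without an `href` attribute, such as the `h1` here, get `None` in that column.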