Skip to main content
POST
/
v1
/
contents
curl --request POST \ --url https://api.valyu.ai/v1/contents \ --header 'Content-Type: application/json' \ --header 'X-API-Key: <api-key>' \ --data ' { "urls": [ "https://en.wikipedia.org/wiki/Quantum_computing" ] } '
{ "success": true, "error": null, "tx_id": "tx_c1d2e3f4-a5b6-7890-cdef-123456789abc", "urls_requested": 2, "urls_processed": 2, "urls_failed": 0, "results": [ { "url": "https://en.wikipedia.org/wiki/Quantum_computing", "status": "success", "title": "Quantum computing - Wikipedia", "content": "# Quantum computing\\n\\nQuantum computing is a type of computation that harnesses quantum mechanical phenomena...", "description": "Quantum computing is a type of computation that harnesses quantum mechanical phenomena.", "source": "web", "price": 0.001, "length": 24500, "data_type": "unstructured", "source_type": "website", "image_url": { "0": "https://upload.wikimedia.org/quantum.png" }, "screenshot_url": null } ], "total_cost_dollars": 0.002, "total_characters": 49000 }

Authorizations

X-API-Key
string
header
required

API key for authentication. Get yours at platform.valyu.ai.

Body

application/json

Content extraction request parameters.

urls
string<uri>[]
required

URLs to extract content from. Must be valid HTTP/HTTPS URLs. Duplicates are silently removed.

Required array length: 1 - 50 elements
Example:
[
  "https://en.wikipedia.org/wiki/Quantum_computing",
  "https://arxiv.org/abs/2401.00001"
]
response_length
default:short

Maximum character length of extracted content per URL.

  • "short" - 25,000 characters (default)
  • "medium" - 50,000 characters
  • "large" - 100,000 characters
  • "max" - No limit
  • Any positive integer for a custom character limit
Available options:
short,
medium,
large,
max
max_price_dollars
number

Maximum budget in USD for this request. If estimated cost exceeds this, the request is rejected. Defaults to 2x the maximum possible cost.

Required range: x > 0
Example:

0.05

summary
default:false

Controls AI-powered content processing. Adds $0.001 per URL when enabled.

  • false - No AI processing, raw extracted content only (default)
  • true - Generate a text summary of each page
  • "string" - Generate a summary guided by custom instructions (max 500 characters)
  • {json_schema} - Extract structured data matching the provided JSON schema
extract_effort
enum<string>
default:auto

Controls how pages are rendered for extraction.

  • "auto" - Automatically selects the best method per URL (default)
  • "normal" - HTTP-only extraction, fastest
  • "high" - Full Chrome rendering, best for JavaScript-heavy pages
Available options:
auto,
normal,
high
screenshot
boolean
default:false

Capture a screenshot of each page. Screenshot URL returned in screenshot_url field of each result.

async
boolean
default:false

Process URLs asynchronously. Required when submitting more than 10 URLs. Returns a job_id for polling.

webhook_url
string<uri>

HTTPS URL to receive a webhook POST when async processing completes. Only used when async is true. The webhook payload is signed with HMAC-SHA256.

Example:

"https://your-app.com/webhooks/valyu"

Response

All URLs processed successfully (synchronous).

Synchronous content extraction response.

success
boolean
required

Whether the extraction completed successfully.

Example:

true

tx_id
string
required

Transaction ID for tracking.

Example:

"tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890"

urls_requested
integer
required

Number of deduplicated URLs submitted.

Example:

3

urls_processed
integer
required

Number of URLs successfully processed.

Example:

3

urls_failed
integer
required

Number of URLs that failed processing.

Example:

0

results
object[]
required

Extraction results for each URL.

total_cost_dollars
number
required

Total cost in USD (only successful URLs are charged).

Example:

0.003

total_characters
integer
required

Sum of all result content lengths.

Example:

45230

error
string | null

Error message if some URLs failed. null on complete success.

Example:

null