Search across web and proprietary data sources
Search the web, academic journals, financial databases, and 40+ proprietary data sources. Returns ranked results with extracted content, ready for RAG pipelines and AI agents.
Results are ranked by relevance using semantic reranking (unless fast_mode is enabled). You only pay for results returned - pricing is CPM-based (cost per mille tokens).
Integrating search into an agent? Consider DeepResearch - a cost-effective autonomous agent built on top of this search engine, purpose-built for knowledge work.
Authorizations
API key for authentication. Get yours at platform.valyu.ai.
Body
Search request parameters.
The search query to execute. Supports natural language queries.
"latest developments in quantum computing"
Maximum number of results to return. Higher limits (up to 100) available on request.
1 <= x <= 2010
Controls which data sources are searched.
all- Web and proprietary sources (default). An LLM router selects the best sources for your query.web- Web search only. Best for current events and general topics.proprietary- Academic, financial, and premium sources only. Best for research and technical analysis.news- News articles only. Best for recent news and current events.
all, web, proprietary, news "all"
Maximum budget in CPM (cost per mille/thousand tokens). When not set, automatically calculated based on max_num_results and search_type with a minimum floor of 20 CPM.
x > 020
Minimum relevance score (0.0-1.0) for results after reranking. Results below this threshold are filtered out.
0 <= x <= 10.5
Sources to include in the search. Accepts:
- Domain names or URLs for web filtering (e.g.
"arxiv.org","https://arxiv.org") - Dataset identifiers for proprietary sources (e.g.
"valyu/valyu-arxiv") - Preset names that expand to curated source groups:
"academic","finance","patent","transportation","politics","legal","health","genomics","chemistry","physics" "web"keyword to explicitly include web search alongside proprietary sources"collection:NAME"to reference an org-scoped saved source collection
Note: most specialised and proprietary sources require a subscription. All plans include web search and open academic sources (arXiv, PubMed).
["valyu/valyu-arxiv", "valyu/valyu-pubmed"]Sources to exclude from the search. Same format as included_sources except presets are not supported.
["reddit.com"]Bias values for specific sources to influence ranking without hard filtering. Keys are domains (e.g., 'nasa.gov') or URL paths (e.g., 'nih.gov/research'). Values are integers from -5 (strong demotion) to +5 (strong boost). Most specific path match wins.
{
"nasa.gov": 5,
"noaa.gov": 3,
"nih.gov": 1,
"example.com": -4
}Natural language instructions to help rank results by relevance to user intent. Acts as a system prompt for the search. Max 500 characters. Ignored when fast_mode is true.
500"Focus on oncology clinical trials from 2023 onwards"
Deprecated. Use instructions instead. Falls back to this value if instructions is not set.
500Indicates whether this request originates from an AI tool call. Affects query rewriting behavior.
Controls the maximum character length of content per result.
"short"- 25,000 characters (default)"medium"- 50,000 characters"large"- 100,000 characters"max"- No limit- Any positive integer for a custom character limit
short, medium, large, max "short"
Filter results published on or after this date. Format: YYYY-MM-DD. If set without end_date, defaults to today.
"2024-01-01"
Filter results published on or before this date. Format: YYYY-MM-DD. If set without start_date, defaults to 1900-01-01.
"2024-12-31"
ISO 3166-1 alpha-2 country code for geo-targeted web search results.
ALL, AR, AU, AT, BE, BR, CA, CL, DK, FI, FR, DE, HK, IN, ID, IT, JP, KR, MY, MX, NL, NZ, NO, CN, PL, PT, PH, RU, SA, ZA, ES, SE, CH, TW, TR, GB, US "US"
Bypass LLM query rewriting and reranking for lower latency. Forces web-only search. Cannot be used with search_type: "proprietary".
Return only URLs without full content extraction. Only available when search_type is "web" or "news". Skips reranking.
Allow results to be served from the historical cache when available. Useful for reproducible queries and lower latency.
Response
Search completed successfully with results.
Search response containing results and metadata.
Whether the search completed successfully.
true
Transaction ID for tracking and support.
"tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890"
The original query as submitted.
"latest developments in quantum computing"
Search results ordered by relevance.
Count of results broken down by source type.
{ "web": 5, "proprietary": 3 }Total cost charged for this search in USD.
0.0075
Sum of all result content lengths in characters.
45230
Error or warning message. May be non-empty even on successful responses if some sources had issues.
null
Warning messages, if any (e.g. collection resolution warnings). Only present when warnings exist.

