POST /v1/search
curl --request POST \
  --url https://api.valyu.ai/v1/search \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "query": "latest developments in quantum computing",
  "max_num_results": 5
}
'
{
  "success": true,
  "error": null,
  "tx_id": "tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "query": "latest developments in quantum computing",
  "results": [
    {
      "id": "https://arxiv.org/abs/2024.12345",
      "title": "Quantum Computing Breakthrough: New Error Correction Method",
      "url": "https://arxiv.org/abs/2024.12345",
      "content": "Researchers at MIT have developed a revolutionary quantum error correction method that reduces qubit errors by 99.9%...",
      "description": "Major breakthrough in quantum error correction methodology",
      "source": "web",
      "price": 0.0015,
      "length": 15420,
      "image_url": null,
      "relevance_score": 0.95,
      "data_type": "unstructured",
      "source_type": "website",
      "publication_date": "2024-06-15"
    },
    {
      "id": "valyu/valyu-arxiv/2401.12345",
      "title": "Advances in Topological Quantum Computing",
      "url": "https://arxiv.org/abs/2401.12345",
      "content": "We present a novel approach to topological quantum error correction...",
      "description": "Topological approaches to fault-tolerant quantum computation",
      "source": "valyu/valyu-arxiv",
      "price": 0.0005,
      "length": 28000,
      "image_url": null,
      "relevance_score": 0.89,
      "data_type": "unstructured",
      "source_type": "paper",
      "publication_date": "2024-01-15",
      "doi": "10.48550/arXiv.2401.12345",
      "authors": [
        "J. Smith",
        "A. Chen",
        "M. Kumar"
      ]
    }
  ],
  "results_by_source": {
    "web": 3,
    "proprietary": 2
  },
  "total_deduction_dollars": 0.0075,
  "total_characters": 45230
}
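The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl example above; the helper names build_search_request and search, and the 30-second timeout, are illustrative choices rather than part of the API.

```python
import json
import urllib.request

API_URL = "https://api.valyu.ai/v1/search"  # endpoint taken from the curl example


def build_search_request(api_key, query, max_num_results=5):
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps({"query": query, "max_num_results": max_num_results}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def search(api_key, query, max_num_results=5, timeout=30.0):
    """Send the request and decode the JSON response body into a dict."""
    request = build_search_request(api_key, query, max_num_results)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending keeps the payload easy to inspect or log before any network call is made.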

Authorizations

X-API-Key
string
header
required

API key for authentication. Get yours at platform.valyu.ai.

Body

application/json

Search request parameters.

query
string
required

The search query to execute. Supports natural language queries.

Example:

"latest developments in quantum computing"

max_num_results
integer
default:5

Maximum number of results to return. Limits above 20 (up to 100) are available on request.

Required range: 1 <= x <= 20
Example:

10

search_type
enum<string>
default:all

Controls which data sources are searched.

  • all - Web and proprietary sources (default). An LLM router selects the best sources for your query.
  • web - Web search only. Best for current events and general topics.
  • proprietary - Academic, financial, and premium sources only. Best for research and technical analysis.
  • news - News articles only. Best for recent news and current events.

Available options: all, web, proprietary, news
Example:

"all"

max_price
number

Maximum budget in CPM (cost per mille/thousand tokens). When not set, automatically calculated based on max_num_results and search_type with a minimum floor of 20 CPM.

Required range: x > 0
Example:

20

relevance_threshold
number
default:0.5

Minimum relevance score (0.0-1.0) for results after reranking. Results below this threshold are filtered out.

Required range: 0 <= x <= 1
Example:

0.5
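
Conceptually, the threshold is a cutoff on each result's relevance_score. The filtering itself happens on the server; the following is only a client-side illustration of the same rule, and filter_by_relevance is a hypothetical helper name. It assumes a score exactly equal to the threshold is kept, since the description only says that results below it are removed.

```python
def filter_by_relevance(results, threshold=0.5):
    """Keep results whose relevance_score is at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("relevance_threshold must be between 0 and 1")
    return [r for r in results if r["relevance_score"] >= threshold]
```

Applied to the two sample results above (scores 0.95 and 0.89), a threshold of 0.9 would keep only the first.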

included_sources
string[]

Sources to include in the search. Accepts:

  • Domain names or URLs for web filtering (e.g. "arxiv.org", "https://arxiv.org")
  • Dataset identifiers for proprietary sources (e.g. "valyu/valyu-arxiv")
  • Preset names that expand to curated source groups: "academic", "finance", "patent", "transportation", "politics", "legal", "health", "genomics", "chemistry", "physics"
  • "web" keyword to explicitly include web search alongside proprietary sources
  • "collection:NAME" to reference an org-scoped saved source collection

Note: most specialised and proprietary sources require a subscription. All plans include web search and open academic sources (arXiv, PubMed).

Example:

["valyu/valyu-arxiv", "valyu/valyu-pubmed"]

excluded_sources
string[]

Sources to exclude from the search. Same format as included_sources except presets are not supported.

Example:

["reddit.com"]

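
As a rough sketch of how a client might assemble these two fields, the helper below builds the relevant part of the request body and rejects preset names in excluded_sources, which the description says are not supported there. The name source_filters is hypothetical, and the check is a client-side convenience; how the API itself responds to a preset in excluded_sources is not stated here.

```python
# Preset names listed under included_sources.
SOURCE_PRESETS = {
    "academic", "finance", "patent", "transportation", "politics",
    "legal", "health", "genomics", "chemistry", "physics",
}


def source_filters(included=(), excluded=()):
    """Return the included_sources / excluded_sources portion of a request body."""
    bad = sorted(s for s in excluded if s in SOURCE_PRESETS)
    if bad:
        raise ValueError(f"presets are not supported in excluded_sources: {bad}")
    payload = {}
    if included:
        payload["included_sources"] = list(included)
    if excluded:
        payload["excluded_sources"] = list(excluded)
    return payload
```

For example, source_filters(["academic", "web"], ["reddit.com"]) combines a preset with explicit web search while excluding one domain.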
source_biases
object

Bias values for specific sources, used to influence ranking without hard filtering. Keys are domains (e.g. "nasa.gov") or URL paths (e.g. "nih.gov/research"). Values are integers from -5 (strong demotion) to +5 (strong boost). When several keys match a result, the most specific path wins.

Example:

{
  "nasa.gov": 5,
  "noaa.gov": 3,
  "nih.gov": 1,
  "example.com": -4
}

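
To illustrate what "most specific path wins" could mean in practice, here is a minimal sketch that picks the longest matching key for a result URL. It is an interpretation, not the service's actual matching logic: the helper name resolve_bias, the stripping of a leading "www.", the exact-host comparison (subdomains are not matched), and the default bias of 0 for unmatched URLs are all assumptions.

```python
from urllib.parse import urlparse


def resolve_bias(url, source_biases):
    """Return the bias of the longest key that matches the URL's host and path."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    target = host + parsed.path.rstrip("/")

    best_key, best_bias = "", 0  # assumed default: no bias when nothing matches
    for key, bias in source_biases.items():
        if not -5 <= bias <= 5:
            raise ValueError(f"bias for {key!r} must be between -5 and 5")
        candidate = key.lower().rstrip("/")
        # Match the key exactly or as a whole path prefix of the target.
        matches = target == candidate or target.startswith(candidate + "/")
        if matches and len(candidate) > len(best_key):
            best_key, best_bias = candidate, bias
    return best_bias
```

With {"nih.gov": 1, "nih.gov/research": 4}, a URL under nih.gov/research resolves to 4, while any other nih.gov page resolves to 1.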
instructions
string

Natural language instructions to help rank results by relevance to user intent. Acts as a system prompt for the search. Max 500 characters. Ignored when fast_mode is true.

Maximum string length: 500
Example:

"Focus on oncology clinical trials from 2023 onwards"

category
string
deprecated

Deprecated: use instructions instead. The API falls back to this value only when instructions is not set.

Maximum string length: 500

is_tool_call
boolean
default:true

Indicates whether this request originates from an AI tool call. Affects query rewriting behavior.

response_length
default:short

Controls the maximum character length of content per result.

  • "short" - 25,000 characters (default)
  • "medium" - 50,000 characters
  • "large" - 100,000 characters
  • "max" - No limit
  • Any positive integer for a custom character limit

Available options: short, medium, large, max
Example:

"short"

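The mapping from response_length to a per-result character cap can be written down directly. The sketch below is a client-side restatement of the list above, with resolve_char_limit as a hypothetical helper; it returns None to represent "no limit".

```python
LENGTH_PRESETS = {"short": 25_000, "medium": 50_000, "large": 100_000, "max": None}


def resolve_char_limit(response_length="short"):
    """Translate a response_length value into a character cap (None = unlimited)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(response_length, bool):
        raise ValueError("response_length must be a preset name or a positive integer")
    if isinstance(response_length, int):
        if response_length <= 0:
            raise ValueError("custom character limit must be a positive integer")
        return response_length
    if response_length in LENGTH_PRESETS:
        return LENGTH_PRESETS[response_length]
    raise ValueError(f"unknown response_length: {response_length!r}")
```
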
start_date
string<date>

Filter results published on or after this date. Format: YYYY-MM-DD. If start_date is set without end_date, end_date defaults to today.

Example:

"2024-01-01"

end_date
string<date>

Filter results published on or before this date. Format: YYYY-MM-DD. If end_date is set without start_date, start_date defaults to 1900-01-01.

Example:

"2024-12-31"

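The defaulting behaviour of the two date fields can be sketched as follows. This only restates the documented defaults on the client side; resolve_date_range is a hypothetical helper, the today parameter exists purely to make the function testable, and rejecting a start date later than the end date is an assumption about sensible input rather than documented API behaviour.

```python
from datetime import date


def resolve_date_range(start_date=None, end_date=None, today=None):
    """Return the effective (start, end) dates, or None when no filter is set."""
    if start_date is None and end_date is None:
        return None
    # Missing bounds fall back to the documented defaults.
    start = date.fromisoformat(start_date) if start_date else date(1900, 1, 1)
    end = date.fromisoformat(end_date) if end_date else (today or date.today())
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return start, end
```
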
country_code
enum<string>

ISO 3166-1 alpha-2 country code for geo-targeted web search results.

Available options: ALL, AR, AU, AT, BE, BR, CA, CL, DK, FI, FR, DE, HK, IN, ID, IT, JP, KR, MY, MX, NL, NZ, NO, CN, PL, PT, PH, RU, SA, ZA, ES, SE, CH, TW, TR, GB, US
Example:

"US"

fast_mode
boolean
default:false

Bypasses LLM query rewriting and reranking for lower latency. Forces web-only search. Cannot be used with search_type: "proprietary".

url_only
boolean
default:false

Return only URLs without full content extraction. Only available when search_type is "web" or "news". Skips reranking.

historical_cache
boolean
default:false

Allow results to be served from the historical cache when available. Useful for reproducible queries and lower latency.
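
Several of the body parameters carry ranges or mutual constraints. A client can check them before sending a request; the sketch below collects the constraints stated in this section into one hypothetical helper, validate_search_params. It is not an official validator, and the API may apply further checks that are not listed here.

```python
SEARCH_TYPES = {"all", "web", "proprietary", "news"}


def validate_search_params(params):
    """Return a list of violated constraints; an empty list means none were found."""
    problems = []
    if not isinstance(params.get("query"), str):
        problems.append("query is required and must be a string")
    if not 1 <= params.get("max_num_results", 5) <= 20:
        problems.append("max_num_results must be between 1 and 20")
    search_type = params.get("search_type", "all")
    if search_type not in SEARCH_TYPES:
        problems.append("search_type must be one of all, web, proprietary, news")
    if not 0 <= params.get("relevance_threshold", 0.5) <= 1:
        problems.append("relevance_threshold must be between 0 and 1")
    if "max_price" in params and params["max_price"] <= 0:
        problems.append("max_price must be greater than 0")
    if len(params.get("instructions", "")) > 500:
        problems.append("instructions must be at most 500 characters")
    if params.get("fast_mode", False) and search_type == "proprietary":
        problems.append("fast_mode cannot be used with search_type 'proprietary'")
    if params.get("url_only", False) and search_type not in {"web", "news"}:
        problems.append("url_only requires search_type 'web' or 'news'")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.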

Response

Search completed successfully with results.

Search response containing results and metadata.

success
boolean
required

Whether the search completed successfully.

Example:

true

tx_id
string
required

Transaction ID for tracking and support.

Example:

"tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890"

query
string
required

The original query as submitted.

Example:

"latest developments in quantum computing"

results
object[]
required

Search results ordered by relevance.

results_by_source
object
required

Count of results broken down by source type.

Example:

{ "web": 5, "proprietary": 3 }

total_deduction_dollars
number
required

Total cost charged for this search in USD.

Example:

0.0075

total_characters
integer
required

Sum of all result content lengths in characters.

Example:

45230

error
string | null

Error or warning message. May be non-null even on successful responses if some sources had issues.

Example:

null

warnings
string[]

Warning messages, if any (e.g. collection resolution warnings). Only present when warnings exist.
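
Putting the response fields together, a caller might handle a decoded response roughly as follows. This is a sketch under stated assumptions: summarize_response is a hypothetical helper, raising RuntimeError on success being false is a design choice rather than prescribed behaviour, and warnings is treated as optional because it is only present when warnings exist.

```python
def summarize_response(response):
    """Extract the commonly needed fields from a decoded search response."""
    if not response.get("success"):
        raise RuntimeError(response.get("error") or "search failed")

    # error may be non-null on a successful response; keep it alongside warnings.
    notes = []
    if response.get("error"):
        notes.append(response["error"])
    notes.extend(response.get("warnings", []))

    return {
        "tx_id": response["tx_id"],
        "urls": [result["url"] for result in response["results"]],
        "cost_usd": response["total_deduction_dollars"],
        "notes": notes,
    }
```

Keeping tx_id in the summary makes it straightforward to quote the transaction ID when contacting support.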