Skip to main content
POST
/
v1
/
search
curl --request POST \
  --url https://api.valyu.ai/v1/search \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "query": "latest developments in quantum computing",
  "max_num_results": 5
}
'
{
  "success": true,
  "error": null,
  "tx_id": "tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "query": "latest developments in quantum computing",
  "results": [
    {
      "id": "https://arxiv.org/abs/2024.12345",
      "title": "Quantum Computing Breakthrough: New Error Correction Method",
      "url": "https://arxiv.org/abs/2024.12345",
      "content": "Researchers at MIT have developed a revolutionary quantum error correction method that reduces qubit errors by 99.9%...",
      "description": "Major breakthrough in quantum error correction methodology",
      "source": "web",
      "price": 0.0015,
      "length": 15420,
      "image_url": null,
      "relevance_score": 0.95,
      "data_type": "unstructured",
      "source_type": "website",
      "publication_date": "2024-06-15"
    },
    {
      "id": "valyu/valyu-arxiv/2401.12345",
      "title": "Advances in Topological Quantum Computing",
      "url": "https://arxiv.org/abs/2401.12345",
      "content": "We present a novel approach to topological quantum error correction...",
      "description": "Topological approaches to fault-tolerant quantum computation",
      "source": "valyu/valyu-arxiv",
      "price": 0.0005,
      "length": 28000,
      "image_url": null,
      "relevance_score": 0.89,
      "data_type": "unstructured",
      "source_type": "paper",
      "publication_date": "2024-01-15",
      "doi": "10.48550/arXiv.2401.12345",
      "authors": [
        "J. Smith",
        "A. Chen",
        "M. Kumar"
      ]
    }
  ],
  "results_by_source": {
    "web": 3,
    "proprietary": 2
  },
  "total_deduction_dollars": 0.0075,
  "total_characters": 45230
}

Authorizations

X-API-Key
string
header
required

API key for authentication. Get yours at platform.valyu.ai.

Body

application/json

Search request parameters.

query
string
required

The search query to execute. Supports natural language queries.

Example:

"latest developments in quantum computing"

max_num_results
integer
default:5

Maximum number of results to return. Higher limits (up to 100) available on request.

Required range: 1 <= x <= 20
Example:

10

search_type
enum<string>
default:all

Controls which data sources are searched.

  • all - Web and proprietary sources (default). An LLM router selects the best sources for your query.
  • web - Web search only. Best for current events and general topics.
  • proprietary - Academic, financial, and premium sources only. Best for research and technical analysis.
  • news - News articles only. Best for recent news and current events.
Available options:
all,
web,
proprietary,
news
Example:

"all"

max_price
number

Maximum budget in CPM (cost per mille/thousand tokens). When not set, automatically calculated based on max_num_results and search_type with a minimum floor of 20 CPM.

Required range: x > 0
Example:

20

relevance_threshold
number
default:0.5

Minimum relevance score (0.0-1.0) for results after reranking. Results below this threshold are filtered out.

Required range: 0 <= x <= 1
Example:

0.5

included_sources
string[]

Sources to include in the search. Accepts:

  • Domain names or URLs for web filtering (e.g. "arxiv.org", "https://arxiv.org")
  • Dataset identifiers for proprietary sources (e.g. "valyu/valyu-arxiv")
  • Preset names that expand to source groups: "finance", "medical", "transportation", "politics", "legal", "health", "genomics", "chemistry", "physics"
  • "web" keyword to explicitly include web search alongside proprietary sources
  • "collection:NAME" to reference a saved source collection
Example:
["valyu/valyu-arxiv", "valyu/valyu-pubmed"]
excluded_sources
string[]

Sources to exclude from the search. Same format as included_sources except presets are not supported.

Example:
["reddit.com"]
source_biases
object

Bias values for specific sources to influence ranking without hard filtering. Keys are domains (e.g., 'nasa.gov') or URL paths (e.g., 'nih.gov/research'). Values are integers from -5 (strong demotion) to +5 (strong boost). Most specific path match wins.

Example:
{
"nasa.gov": 5,
"noaa.gov": 3,
"nih.gov": 1,
"example.com": -4
}
instructions
string

Custom instructions for query rewriting and result reranking. Max 500 characters.

Maximum string length: 500
Example:

"Focus on peer-reviewed research from the last 2 years"

category
string
deprecated

Deprecated. Use instructions instead. Falls back to this value if instructions is not set.

Maximum string length: 500
is_tool_call
boolean
default:true

Indicates whether this request originates from an AI tool call. Affects query rewriting behavior.

response_length
default:short

Controls the maximum character length of content per result.

  • "short" - 25,000 characters (default)
  • "medium" - 50,000 characters
  • "large" - 100,000 characters
  • "max" - No limit
  • Any positive integer for a custom character limit
Available options:
short,
medium,
large,
max
Example:

"short"

start_date
string<date>

Filter results published on or after this date. Format: YYYY-MM-DD. If set without end_date, defaults to today.

Example:

"2024-01-01"

end_date
string<date>

Filter results published on or before this date. Format: YYYY-MM-DD. If set without start_date, defaults to 1900-01-01.

Example:

"2024-12-31"

country_code
enum<string>

ISO 3166-1 alpha-2 country code for geo-targeted web search results.

Available options:
ALL,
AR,
AU,
AT,
BE,
BR,
CA,
CL,
DK,
FI,
FR,
DE,
HK,
IN,
ID,
IT,
JP,
KR,
MY,
MX,
NL,
NZ,
NO,
CN,
PL,
PT,
PH,
RU,
SA,
ZA,
ES,
SE,
CH,
TW,
TR,
GB,
US
Example:

"US"

fast_mode
boolean
default:false

Bypass LLM query rewriting and reranking for lower latency. Forces web-only search. Cannot be used with search_type: "proprietary".

url_only
boolean
default:false

Return only URLs without full content extraction. Only available when search_type is "web" or "news". Skips reranking.

Response

Search completed successfully with results.

Search response containing results and metadata.

success
boolean
required

Whether the search completed successfully.

Example:

true

tx_id
string
required

Transaction ID for tracking and support.

Example:

"tx_a1b2c3d4-e5f6-7890-abcd-ef1234567890"

query
string
required

The original query as submitted.

Example:

"latest developments in quantum computing"

results
object[]
required

Search results ordered by relevance.

results_by_source
object
required

Count of results broken down by source type.

Example:
{ "web": 5, "proprietary": 3 }
total_deduction_dollars
number
required

Total cost charged for this search in USD.

Example:

0.0075

total_characters
integer
required

Sum of all result content lengths in characters.

Example:

45230

error
string | null

Error or warning message. May be non-empty even on successful responses if some sources had issues.

Example:

null

warnings
string[]

Warning messages, if any (e.g. collection resolution warnings). Only present when warnings exist.