The Search API searches web and proprietary data sources and returns relevant content optimized for AI applications and RAG pipelines.
Basic Usage
from valyu import Valyu

valyu = Valyu()
response = valyu.search(
    "What are the latest developments in quantum computing?"
)
print(f"Found {len(response.results)} results")
for result in response.results:
    print(f"Title: {result.title}")
    print(f"URL: {result.url}")
    print(f"Content: {result.content[:200]}...")
Parameters
Query (Required)
| Parameter | Type | Description |
|---|---|---|
| query | str | The search query string (see Prompting Guide) |
Options (Optional)
| Parameter | Type | Description | Default |
|---|---|---|---|
| search_type | "web" \| "proprietary" \| "all" \| "news" | Search type: "web" (web only), "proprietary" (Valyu datasets), "news" (news only), "all" (web and proprietary) | "all" |
| max_num_results | int | Maximum results to return (1-20 standard, up to 100 with special API key) | 10 |
| max_price | float | Maximum cost per thousand retrievals (CPM). Not set by default. When omitted, all sources are searched regardless of cost. Setting this too low may exclude relevant higher-priced sources | None |
| is_tool_call | bool | Set to True for AI agents/tools, False for direct user queries | True |
| relevance_threshold | float | Minimum relevance score (0.0-1.0) | 0.5 |
| included_sources | List[str] | Sources to search within (URLs, domains, or dataset names) | None |
| excluded_sources | List[str] | Sources to exclude from results | None |
| source_biases | Dict[str, int] | Bias values for specific sources (-5 to +5) to influence ranking without hard filtering | None |
| instructions | str | Natural language instructions to help rank results by relevance to user intent. Ignored in fast mode | None |
| category | str | Deprecated. Use instructions instead. Falls back to this value if instructions is not set | None |
| start_date | str | Start date for filtering (YYYY-MM-DD) | None |
| end_date | str | End date for filtering (YYYY-MM-DD) | None |
| country_code | str | 2-letter ISO country code to bias results | None |
| response_length | str \| int | Content length per result in characters: "short" (25k), "medium" (50k), "large" (100k), "max" (full), or a custom character count | "short" |
| fast_mode | bool | Enable fast mode for reduced latency but shorter results | False |
| url_only | bool | Return URLs only (no content). Only applies when search_type is "web" or "news" | False |
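The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: `check_search_options` is a hypothetical helper, the limits are copied from the table above, and the API itself remains the source of truth for validation.

```python
from datetime import datetime
from typing import Any, Dict, List


def check_search_options(options: Dict[str, Any], max_allowed_results: int = 20) -> List[str]:
    """Return a list of problems found in search keyword arguments (empty if none).

    Hypothetical helper: limits mirror the parameter table, nothing more.
    """
    problems: List[str] = []

    # 1-20 on a standard key; pass max_allowed_results=100 for keys with the higher limit.
    n = options.get("max_num_results", 10)
    if not 1 <= n <= max_allowed_results:
        problems.append(f"max_num_results must be between 1 and {max_allowed_results}")

    threshold = options.get("relevance_threshold", 0.5)
    if not 0.0 <= threshold <= 1.0:
        problems.append("relevance_threshold must be between 0.0 and 1.0")

    # Dates are expected as YYYY-MM-DD strings.
    for key in ("start_date", "end_date"):
        value = options.get(key)
        if value is not None:
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                problems.append(f"{key} must be formatted as YYYY-MM-DD")

    country = options.get("country_code")
    if country is not None and not (len(country) == 2 and country.isalpha()):
        problems.append("country_code must be a 2-letter ISO code")

    return problems
```

A caller might run `check_search_options(kwargs)` and only call `valyu.search(query, **kwargs)` when the returned list is empty.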
Check out our other guides for more information on how to best use the Search API: Quick Start.
Response Format
class SearchResponse:
    success: bool
    error: Optional[str]
    tx_id: str
    query: str
    results: List[SearchResult]
    results_by_source: ResultsBySource
    total_deduction_dollars: float
    total_characters: int

class SearchResult:
    title: str
    url: str
    content: str
    description: Optional[str]
    source: str
    price: float
    length: int
    relevance_score: float
    data_type: Optional[str]  # "structured" | "unstructured"
    # Additional fields for academic/proprietary sources
    publication_date: Optional[str]
    authors: Optional[List[str]]
    citation: Optional[str]
    citation_count: Optional[int]
    doi: Optional[str]
    references: Optional[str]
    metadata: Optional[Dict[str, Any]]
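Because each result carries its own `source`, `price`, and `length`, a response can be summarized per source with a few lines of code. The sketch below is illustrative only: `FakeResult` is a stand-in with a subset of the `SearchResult` fields so the example runs without an API key, and `summarize_by_source` is a hypothetical helper, not an SDK function.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class FakeResult:
    """Stand-in mirroring a few SearchResult fields, for illustration only."""
    title: str
    source: str
    price: float
    length: int


def summarize_by_source(results: Iterable) -> Dict[str, Dict[str, float]]:
    """Aggregate result count, total price, and total length for each source."""
    summary: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"count": 0, "price": 0.0, "length": 0}
    )
    for r in results:
        entry = summary[r.source]
        entry["count"] += 1
        entry["price"] += r.price
        entry["length"] += r.length
    return dict(summary)


sample = [
    FakeResult("A", "web", 0.5, 1000),
    FakeResult("B", "valyu/valyu-arxiv", 0.25, 4000),
    FakeResult("C", "web", 0.25, 500),
]
summary = summarize_by_source(sample)
```

With a real response, `summarize_by_source(response.results)` should work the same way, since it only reads the `source`, `price`, and `length` attributes.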
Parameter Examples
Fast Mode
Enable fast mode for quicker results:
response = valyu.search(query, fast_mode=True)
Responses will be quicker but the content will be shorter.
Search Type Configuration
Control which data sources to search:
# Web search only
web_response = valyu.search(query, search_type="web", max_num_results=10)
# Proprietary datasets only
proprietary_response = valyu.search(query, search_type="proprietary", max_num_results=8)
# Both web and proprietary (default)
all_response = valyu.search(query, search_type="all", max_num_results=12)
Source Filtering
Control which specific sources to include or exclude:
# Search only within specific sources
response = valyu.search(
    "quantum computing applications",
    search_type="all",
    max_num_results=10,
    included_sources=["valyu/valyu-arxiv", "valyu/valyu-pubmed", "valyu/valyu-biorxiv"],
    response_length="medium"
)

# Search everything except specific sources
response = valyu.search(
    "quantum computing applications",
    search_type="all",
    max_num_results=10,
    excluded_sources=["example.com", "example.org"],
    response_length="medium"
)
You can either include or exclude sources, but not both.
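One way to enforce this rule in application code is to build the filter arguments through a small function that rejects the invalid combination. This is a sketch under that assumption; `source_filter_kwargs` is a hypothetical helper and not part of the SDK.

```python
from typing import Dict, List, Optional


def source_filter_kwargs(
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    """Build keyword arguments for source filtering, refusing include + exclude together."""
    if include and exclude:
        raise ValueError("Pass included_sources or excluded_sources, not both")
    if include:
        return {"included_sources": list(include)}
    if exclude:
        return {"excluded_sources": list(exclude)}
    return {}  # no filtering requested
```

The returned dictionary can then be unpacked into the call, for example `valyu.search(query, **source_filter_kwargs(include=["valyu/valyu-arxiv"]))`.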
Source Biases
Influence ranking without hard filtering. Values range from -5 (strong demotion) to +5 (strong boost):
response = valyu.search(
    "climate change research",
    search_type="web",
    source_biases={
        "nasa.gov": 5,
        "noaa.gov": 3,
        "epa.gov": 2,
        "nih.gov": 1,
        "example.com": -4
    }
)
Source biases can be combined with included_sources to control ranking within a filtered pool:
response = valyu.search(
    "climate data",
    included_sources=["nasa.gov", "noaa.gov", "epa.gov"],
    source_biases={
        "nasa.gov": 5,
        "epa.gov": -3
    }
)
URL path specificity is supported; the most specific match wins:
source_biases={
    "nih.gov": 2,            # General NIH content boosted
    "nih.gov/research": -3   # Specific section demoted
}
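To make the "most specific match wins" rule concrete, the sketch below resolves the bias that would apply to a given URL by picking the longest matching key. It illustrates the rule as described here and is not Valyu's actual implementation; `resolve_bias` is a hypothetical name, and treating subdomains such as `www.nih.gov` as matching `nih.gov` is an assumption.

```python
from typing import Dict
from urllib.parse import urlparse


def resolve_bias(url: str, source_biases: Dict[str, int]) -> int:
    """Return the bias of the most specific (longest) key matching the URL, or 0."""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    host = parsed.netloc.lower()
    url_path = parsed.path.rstrip("/")

    best_key, best_bias = "", 0
    for key, bias in source_biases.items():
        key = key.rstrip("/")
        domain, _, path = key.partition("/")
        domain = domain.lower()
        # Assumption: a key matches its own domain and any subdomain of it.
        host_ok = host == domain or host.endswith("." + domain)
        # A key with a path matches that path and anything beneath it.
        path_ok = (
            not path
            or url_path == "/" + path
            or url_path.startswith("/" + path + "/")
        )
        if host_ok and path_ok and len(key) > len(best_key):
            best_key, best_bias = key, bias
    return best_bias


biases = {"nih.gov": 2, "nih.gov/research": -3}
```

Under these assumptions, a page below `nih.gov/research` resolves to the demotion, while other NIH pages keep the general boost.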
Geographic and Date Filtering
Bias results by location and time range:
response = valyu.search(
    "renewable energy policies",
    country_code="US",
    start_date="2024-01-01",
    end_date="2024-12-31",
    max_num_results=7,
    instructions="government policy"  # replaces the deprecated `category` parameter
)
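Rolling windows such as "the last 30 days" need the two date strings computed at call time. A minimal sketch, assuming only the YYYY-MM-DD format from the parameter table; `date_window` is a hypothetical helper, not an SDK function.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def date_window(days_back: int, today: Optional[date] = None) -> Dict[str, str]:
    """Return start_date/end_date strings (YYYY-MM-DD) covering the last `days_back` days."""
    if days_back < 0:
        raise ValueError("days_back must be non-negative")
    end = today or date.today()
    start = end - timedelta(days=days_back)
    # date.isoformat() always yields YYYY-MM-DD.
    return {"start_date": start.isoformat(), "end_date": end.isoformat()}
```

The result can be unpacked directly into a search call, for example `valyu.search(query, **date_window(30))`.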
Response Length Control
Customize content length per result:
# Predefined lengths
short_response = valyu.search(query, response_length="short") # 25k characters
medium_response = valyu.search(query, response_length="medium") # 50k characters
large_response = valyu.search(query, response_length="large") # 100k characters
# Custom character limit
custom_response = valyu.search(query, response_length=15000) # Custom limit
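Since the limit applies per result, the worst-case amount of text coming back grows with `max_num_results`. The sketch below estimates that upper bound, which can help when sizing an LLM context window. `max_total_characters` is a hypothetical helper; the preset sizes are taken from the comments above, and actual results may be shorter.

```python
from typing import Optional, Union

# Preset sizes in characters, as documented above.
PRESET_CHARACTERS = {"short": 25_000, "medium": 50_000, "large": 100_000}


def max_total_characters(response_length: Union[str, int], num_results: int) -> Optional[int]:
    """Upper bound on total characters returned, or None when unbounded ("max")."""
    if isinstance(response_length, str):
        if response_length == "max":
            return None  # full content: no fixed bound
        if response_length not in PRESET_CHARACTERS:
            raise ValueError(f"unknown response_length: {response_length!r}")
        per_result = PRESET_CHARACTERS[response_length]
    else:
        if response_length <= 0:
            raise ValueError("custom response_length must be positive")
        per_result = response_length
    return per_result * num_results
```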
Use Case Examples
Academic Research Assistant
Build a comprehensive research tool that searches across academic databases:
def academic_research(query: str):
    response = valyu.search(
        query,
        search_type="proprietary",
        included_sources=["valyu/valyu-pubmed", "valyu/valyu-arxiv"],
        max_num_results=15,
        response_length="large",
        instructions="academic research"  # replaces the deprecated `category` parameter
    )
    if response.success:
        print("=== Academic Research Results ===")
        print(f"Found {len(response.results)} papers for: \"{query}\"")
        # Group by source
        arxiv_papers = [r for r in response.results if "arxiv" in r.source]
        pubmed_papers = [r for r in response.results if "pubmed" in r.source]
        print(f"\nArXiv Papers: {len(arxiv_papers)}")
        for i, paper in enumerate(arxiv_papers, 1):
            print(f"{i}. {paper.title}")
            print(f"   Relevance: {paper.relevance_score:.2f}")
            print(f"   URL: {paper.url}")
            if paper.publication_date:
                print(f"   Published: {paper.publication_date}")
            if paper.authors:
                print(f"   Authors: {', '.join(paper.authors)}")
        print(f"\nPubMed Articles: {len(pubmed_papers)}")
        for i, article in enumerate(pubmed_papers, 1):
            print(f"{i}. {article.title}")
            print(f"   Relevance: {article.relevance_score:.2f}")
            if article.citation:
                print(f"   Citation: {article.citation}")
        return {
            "arxiv": arxiv_papers,
            "pubmed": pubmed_papers,
            "query": query
        }
    return None

# Usage examples - dates are included in natural language
covid_research = academic_research(
    "COVID-19 vaccine efficacy studies published between 2023 and 2024"
)
ai_research = academic_research(
    "recent machine learning breakthroughs in the last 2 years"
)
climate_research = academic_research(
    "climate change mitigation strategies peer-reviewed research since 2020"
)
Financial Market Intelligence
Create a financial analysis tool that searches market data and news:
def financial_intelligence(query: str, analysis_type: str):
    """analysis_type: 'fundamental', 'technical', or 'news'"""
    sources = {
        "fundamental": ["valyu/valyu-stocks", "sec.gov"],
        "technical": ["valyu/valyu-stocks", "kaggle.com", "yahoo.com"],
        "news": ["treasury.gov", "federalreserve.gov", "sec.gov", "fred.stlouisfed.org"]
    }
    response = valyu.search(
        query,
        search_type="all" if analysis_type in ["fundamental", "technical"] else "web",
        included_sources=sources[analysis_type],
        max_num_results=10,
        response_length="medium",
        instructions="financial analysis"  # replaces the deprecated `category` parameter
    )
    if response.success:
        print(f"=== {analysis_type.upper()} Analysis ===")
        print(f"Query: \"{query}\"")
        for i, result in enumerate(response.results, 1):
            print(f"\n{i}. {result.title}")
            print(f"   Source: {result.source}")
            print(f"   Relevance: {result.relevance_score:.2f}")
            print(f"   URL: {result.url}")
            # Show excerpt for financial data
            if len(result.content) > 200:
                print(f"   Preview: {result.content[:200]}...")
            if result.publication_date:
                print(f"   Date: {result.publication_date}")
        return {
            "results": response.results,
            "analysis_type": analysis_type,
            "query": query
        }
    return None

# Usage examples - include timeframes in natural language
tesla_fundamentals = financial_intelligence(
    "Tesla financial performance Q3 2024 earnings revenue profit margins",
    "fundamental"
)
apple_news = financial_intelligence(
    "Apple latest news this week product announcements stock updates",
    "news"
)
bitcoin_technical = financial_intelligence(
    "Bitcoin price analysis technical indicators support resistance levels recent trends",
    "technical"
)
Real-time News Monitoring
Build a news monitoring system using news mode with date and country filtering. Supports up to 100 results with the increased_max_results permission.
from datetime import datetime, timedelta
from typing import List

def news_monitoring(queries: List[str], days_back: int = 7, country: str = "US"):
    """
    Monitor multiple news topics with date and country filtering.
    Requires API key with increased_max_results permission for >20 results.
    """
    end_date = datetime.now().strftime("%Y-%m-%d")
    start_date = (datetime.now() - timedelta(days=days_back)).strftime("%Y-%m-%d")
    all_results = []
    for query in queries:
        response = valyu.search(
            query,
            search_type="news",  # Use news mode
            max_num_results=50,  # Up to 100 with increased_max_results permission
            start_date=start_date,
            end_date=end_date,
            country_code=country,
            response_length="short"
        )
        if response.success:
            all_results.append({
                "query": query,
                "articles": response.results,
                "count": len(response.results)
            })
    # Generate monitoring report
    print("=== News Monitoring Report ===")
    print(f"Date range: {start_date} to {end_date}")
    print(f"Country: {country}")
    print(f"Monitoring {len(queries)} topics\n")
    for result in all_results:
        query, articles, count = result["query"], result["articles"], result["count"]
        print(f"📰 {query.upper()}: {count} articles")
        for i, article in enumerate(articles[:5], 1):
            print(f"   {i}. {article.title}")
            print(f"      Date: {article.publication_date or 'N/A'}")
            print(f"      URL: {article.url}")
        print("")
    return all_results

# Monitor tech news from the last 7 days in the US
tech_news = news_monitoring(
    queries=[
        "artificial intelligence breakthroughs",
        "quantum computing progress",
        "cryptocurrency regulation"
    ],
    days_back=7,
    country="US"
)

# Monitor business news from the last 30 days
business_news = news_monitoring(
    queries=[
        "Tesla earnings report",
        "Federal Reserve interest rate decisions",
        "tech layoffs announcements"
    ],
    days_back=30,
    country="US"
)
To use more than 20 results, request the increased_max_results permission at API Key Management.
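When several related topics are monitored, the same article may appear under more than one query. The sketch below removes such repeats by URL. It assumes the list-of-dicts structure returned by `news_monitoring` above; `unique_articles` is a hypothetical helper, and the sample data uses stand-in objects rather than real search results.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def unique_articles(all_results: List[Dict[str, Any]]) -> List[Any]:
    """Flatten the per-query results and keep the first occurrence of each URL."""
    seen = set()
    unique = []
    for entry in all_results:
        for article in entry["articles"]:
            if article.url not in seen:
                seen.add(article.url)
                unique.append(article)
    return unique


# Stand-in data shaped like the return value of news_monitoring.
a = SimpleNamespace(title="Chip news", url="https://example.com/chips")
b = SimpleNamespace(title="AI news", url="https://example.com/ai")
sample_results = [
    {"query": "ai", "articles": [a, b], "count": 2},
    {"query": "quantum", "articles": [a], "count": 1},
]
deduplicated = unique_articles(sample_results)
```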
Error Handling
Check response.success before using the results. On failure, response.error describes what went wrong:
response = valyu.search("test query", max_num_results=5)
if not response.success:
    print("Search failed:", response.error)
    # Handle specific error cases (error is Optional, so guard against None)
    error = response.error or ""
    if "insufficient credits" in error:
        print("Please add more credits to your account")
    elif "invalid" in error:
        print("Check your search parameters")
else:
    # Process successful results
    print(f"Transaction ID: {response.tx_id}")
    for index, result in enumerate(response.results):
        print(f"{index + 1}. {result.title}")
        print(f"   Relevance: {result.relevance_score}")
        print(f"   Source: {result.source}")
        print(f"   URL: {result.url}")
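If the same checks are needed in several places, the string matching can be moved into one function. This is only a sketch: `classify_error` is a hypothetical helper, the two substrings are the ones used in the example above, and the actual error messages returned by the API may differ.

```python
from typing import Optional


def classify_error(error: Optional[str]) -> str:
    """Map an error message to a coarse category for logging or user feedback."""
    message = (error or "").lower()
    if not message:
        return "unknown"
    if "insufficient credits" in message:
        return "billing"
    if "invalid" in message:
        return "bad_request"
    return "other"
```

A caller could then branch on `classify_error(response.error)` instead of repeating the substring checks.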
Async
AsyncValyu.search accepts the same arguments and returns the same SearchResponse as the synchronous search. Every parameter described above (search_type, included_sources, source_biases, instructions, date filters, and the rest) behaves identically; the only difference is that the call is awaited.
import asyncio
from valyu import AsyncValyu

async def main():
    async with AsyncValyu() as valyu:
        response = await valyu.search(
            "What are the latest developments in quantum computing?"
        )
        for result in response.results:
            print(result.title, "-", result.url)

asyncio.run(main())
For example, source filtering and result limits are passed the same way on the async client:
# Inside a coroutine, with `valyu` an open AsyncValyu client
response = await valyu.search(
    "Real options valuation binomial Monte Carlo natural resource R&D",
    included_sources=["wiley/wiley-finance-papers", "wiley/wiley-finance-books"],
    max_num_results=15,
)
Fan out multiple searches
The most common reason to use the async client for search is to run several queries in parallel, for example an agent that breaks a user question into sub-queries and combines the answers:
import asyncio
from valyu import AsyncValyu

queries = [
    "DCF valuation terminal value",
    "Black-Scholes options pricing",
    "GARCH volatility modelling",
    "Monte Carlo exotic derivatives",
]

async def main():
    async with AsyncValyu() as valyu:
        responses = await asyncio.gather(*[
            valyu.search(q, max_num_results=10) for q in queries
        ])
        for q, r in zip(queries, responses):
            print(f"{q}: {len(r.results)} results")

asyncio.run(main())
All four requests run concurrently, so wall time is roughly that of the slowest request, not the sum of all four. For larger fan-outs (hundreds of queries), cap concurrency with an asyncio.Semaphore; see the Async Usage section of the Python SDK overview for that pattern.
See the Python SDK overview for all AsyncValyu constructor options and lifecycle patterns.
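As a rough illustration of capping concurrency with asyncio.Semaphore, the sketch below limits how many searches are in flight at once. It is not the SDK's documented pattern: `bounded_gather` is a hypothetical helper, and `fake_search` stands in for `valyu.search` so the example runs without network access.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def bounded_gather(
    items: Iterable[Any],
    worker: Callable[[Any], Awaitable[Any]],
    limit: int = 5,
) -> List[Any]:
    """Run worker(item) for every item, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item: Any) -> Any:
        async with semaphore:  # waits here once `limit` calls are running
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(run(item) for item in items))


# Stand-in for valyu.search that also records peak concurrency.
in_flight = 0
peak = 0


async def fake_search(query: str) -> str:
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)  # simulate network latency
    in_flight -= 1
    return f"results for {query}"


results = asyncio.run(
    bounded_gather([f"query {i}" for i in range(20)], fake_search, limit=5)
)
```

With the real client, the worker could be `lambda q: valyu.search(q, max_num_results=10)`, called inside an `async with AsyncValyu() as valyu:` block.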
Source Types
Web Sources
- General websites and domains
- News sites and blogs
- Forums and community sites
- Documentation sites
Proprietary Sources
- valyu/valyu-arxiv - Academic papers from arXiv
- valyu/valyu-pubmed - Medical and life science literature
- valyu/valyu-stocks - Global stock market data
- And many more. Check out the Valyu Platform Datasets for more information.