Practical tips for getting better results from the Valyu DeepSearch API.
Multi-Step Search Workflows
For complex research tasks, break your search into multiple steps rather than relying on a single query. This works especially well for technical domains like research, finance, and medicine.
Example workflow:
# Multi-step search workflow
# Assumes `valyu` is an initialised client. decompose_query, adapt_strategy,
# identify_knowledge_gaps and synthesise_multi_source_findings are
# placeholders for helpers you supply.
async def research_agent(query: str):
    # Step 1: Break down the query into focused searches
    sub_queries = decompose_query(query)
    results = {}
    for i, sub_query in enumerate(sub_queries):
        # Step 2: Adjust strategy based on what you've found
        strategy = adapt_strategy(sub_query, results)
        search_result = valyu.search(
            query=sub_query,
            included_sources=strategy.sources,
            max_price=strategy.budget,
            relevance_threshold=0.65
        )
        results[f"step_{i}"] = search_result
        # Step 3: Fill in any gaps found in this step's results
        gaps = identify_knowledge_gaps(search_result, query)
        if gaps:
            gap_result = valyu.search(
                query=gaps[0].refined_query,
                included_sources=gaps[0].target_sources,
                max_price=50.0
            )
            results[f"gap_fill_{i}"] = gap_result
    # Step 4: Combine everything
    return synthesise_multi_source_findings(results)
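The workflow above calls helpers such as decompose_query and adapt_strategy without defining them. The sketch below shows one possible shape for those two, purely as an illustration: the SearchStrategy class, the keyword rules, and the budget increments are all assumptions rather than part of the Valyu SDK, and a real agent would more likely delegate decomposition to an LLM.

```python
from dataclasses import dataclass, field


@dataclass
class SearchStrategy:
    """Hypothetical container for the per-step settings the workflow reads."""
    sources: list[str] = field(default_factory=list)
    budget: float = 20.0


def decompose_query(query: str) -> list[str]:
    """Naive placeholder: split a compound question on ' and '."""
    parts = [part.strip() for part in query.split(" and ")]
    return [part for part in parts if part]


def adapt_strategy(sub_query: str, results: dict) -> SearchStrategy:
    """Pick sources by keyword and raise the budget as results accumulate."""
    text = sub_query.lower()
    if any(word in text for word in ("trial", "clinical", "drug")):
        sources = ["valyu/valyu-pubmed"]
    elif any(word in text for word in ("earnings", "revenue", "filing")):
        sources = ["valyu/valyu-US-sec-filings"]
    else:
        sources = ["valyu/valyu-arxiv"]
    # Illustrative rule: start cheap, spend more on later steps, cap at 100.
    budget = min(20.0 + 15.0 * len(results), 100.0)
    return SearchStrategy(sources=sources, budget=budget)
```

Keeping the strategy in a small data object means the search call itself never changes. Only the rules that fill in sources and budget need tuning for your domain.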
AI vs Human Searches
Valyu is optimised for AI agents by default. The tool_call_mode parameter controls this:
For AI agents (the default):
# Optimised for LLMs
response = valyu.search(
    "quantum error correction surface codes LDPC performance benchmarks",
    tool_call_mode=True,  # Default
)
For human-facing searches:
# Better for human readability
response = valyu.search(
    "quantum computing error correction methods",
    tool_call_mode=False,
)
Using Search Parameters
Combine good prompts with search parameters:
response = valyu.search(
    "GPT-4 vs GPT-3 architectural innovations: training efficiency, inference optimisation, and benchmark comparisons",
    search_type="proprietary",
    max_num_results=10,
    relevance_threshold=0.6,
    included_sources=["valyu/valyu-arxiv"],
    max_price=50.0,
    category="machine learning",
    start_date="2024-01-01",
    end_date="2024-12-31"
)
Use included_sources to search datasets that other APIs can’t access, such as valyu/valyu-arxiv for academic papers or specialised datasets for financial data.
Balancing Quality and Cost
Budget Tiers
Not getting enough results? Try increasing max_price:
search_configs = [
    {"max_price": 20.0, "use_case": "Quick fact-checking"},
    {"max_price": 50.0, "use_case": "Standard research"},
    {"max_price": 100.0, "use_case": "Comprehensive analysis"},
]
What each budget gets you:
- $20 CPM: Basic web + academic content
- $50 CPM: Full web + most research databases + financial data
- $100 CPM: Premium sources + financial data + specialised datasets
Higher budgets unlock exclusive sources like academic journals, financial data feeds, and curated research databases.
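One way to apply these tiers is to start at the cheapest budget and only move up when a search returns too few results. The sketch below assumes a search_fn wrapper that you write yourself: it should call valyu.search with the given max_price and return a plain list of result items. The wrapper, the min_results cut-off, and the function name are illustrative assumptions, not SDK features.

```python
from typing import Any, Callable, Sequence

BUDGET_TIERS = (20.0, 50.0, 100.0)  # max_price values from the tiers above


def search_with_escalation(
    search_fn: Callable[..., Sequence[Any]],
    query: str,
    min_results: int = 3,
    tiers: Sequence[float] = BUDGET_TIERS,
) -> tuple[float, Sequence[Any]]:
    """Try each budget tier in order until enough results come back.

    Returns the max_price that was used last and the results it produced.
    """
    if not tiers:
        raise ValueError("tiers must not be empty")
    hits: Sequence[Any] = []
    for max_price in tiers:
        hits = search_fn(query, max_price=max_price)
        if len(hits) >= min_results:
            break  # This tier returned enough; stop spending more.
    return max_price, hits
```

If even the highest tier falls short, the function still returns whatever that last attempt found, so the caller can decide whether to proceed or rephrase the query.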
Managing Context Size
Control how much data goes into your LLM’s context window:
# Smaller context
lightweight_search = valyu.search(
    "transformer architecture innovations",
    max_num_results=3,
    results_length="short",
    max_price=50.0
)
# Larger context
comprehensive_search = valyu.search(
    "transformer architecture innovations",
    max_num_results=15,
    results_length="max",
    max_price=100.0
)
Token estimates:
- Short: ~6k tokens per result (25k chars)
- Medium: ~12k tokens per result (50k chars)
- Long: ~25k tokens per result (100k chars)
- Rule of thumb: 4 characters ≈ 1 token
Start with max_num_results=10 and results_length="short", then adjust based on your needs.
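The figures above make it straightforward to estimate a worst-case context size before sending a search. The following is a rough sketch using the per-result character caps and the 4-characters-per-token rule of thumb from the list. The function names are illustrative, and the dictionary keys are labels for the tiers listed above, which may not match the exact values the results_length parameter accepts.

```python
CHARS_PER_TOKEN = 4  # Rule of thumb from the list above

# Per-result character caps quoted above, keyed by tier label (assumed names).
LENGTH_CAPS = {"short": 25_000, "medium": 50_000, "long": 100_000}


def estimate_max_tokens(max_num_results: int, length: str) -> int:
    """Upper-bound estimate of the tokens a search could add to the context."""
    if max_num_results < 0:
        raise ValueError("max_num_results must be non-negative")
    return max_num_results * LENGTH_CAPS[length] // CHARS_PER_TOKEN


def fits_in_context(max_num_results: int, length: str, token_budget: int) -> bool:
    """Check the worst case against the tokens you can spare for search results."""
    return estimate_max_tokens(max_num_results, length) <= token_budget
```

These are upper bounds: real results are often shorter than the cap, so treat the estimate as a safety check rather than a prediction.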
Specialised Datasets
Valyu offers datasets beyond standard web search. Browse them at platform.valyu.ai/data-sources.
Categories include:
- Academic: ArXiv, PubMed, academic publishers
- Financial: SEC filings, earnings reports, market data
- Medical: Clinical trials, FDA drug labels, medical literature
- Technical: Patents, specifications, implementation guides
- Books & Literature: Digitised texts, reference materials
Targeting specific datasets:
# Academic search
academic_search = valyu.search(
    "CRISPR gene editing clinical trials safety outcomes",
    included_sources=["valyu/valyu-pubmed", "valyu/valyu-US-clinical-trials"],
    max_price=30.0
)
# Financial search
financial_search = valyu.search(
    "Tesla Q3 2024 earnings revenue breakdown",
    included_sources=["valyu/valyu-US-sec-filings", "valyu/valyu-US-earnings"],
    max_price=60.0
)
Data advantage: Proprietary datasets behind the DeepSearch API often contain information unavailable through standard web APIs, giving your AI system access to authoritative, structured knowledge that improves factual accuracy.
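If your application handles several domains, it may help to keep the dataset choice in one lookup table instead of scattering source lists across calls. The sketch below uses only the dataset identifiers that appear in this guide. The grouping under domain names and the search_kwargs helper are illustrative assumptions, so check the data sources page for the identifiers available to your account.

```python
from typing import Any

# Dataset identifiers taken from the examples in this guide; the grouping
# under each domain name is an assumption made for illustration.
DOMAIN_SOURCES: dict[str, list[str]] = {
    "academic": ["valyu/valyu-arxiv", "valyu/valyu-pubmed"],
    "medical": ["valyu/valyu-pubmed", "valyu/valyu-US-clinical-trials"],
    "financial": ["valyu/valyu-US-sec-filings", "valyu/valyu-US-earnings"],
}


def search_kwargs(domain: str) -> dict[str, Any]:
    """Build keyword arguments for a search restricted to one domain.

    Unknown domains return an empty dict, so the search is left unrestricted.
    """
    sources = DOMAIN_SOURCES.get(domain.lower())
    if sources is None:
        return {}
    return {"included_sources": list(sources)}
```

A call would then look like valyu.search(query, **search_kwargs("financial")), which keeps the routing decision separate from the query text.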
Common Mistakes to Avoid
- Wasting tokens: Use max_num_results and results_length to control context size
- Skipping filters: Set relevance thresholds and source controls
- Ignoring costs: Balance max_price with your quality needs
- Wrong sources: Pick datasets that match your domain—academic, financial, medical, or web
- Single-shot queries: For complex research, use multi-step workflows