Process multiple research tasks efficiently with shared configuration, unified monitoring, and aggregated cost tracking. The Batch API is ideal for bulk research operations where you need to process many queries simultaneously.
When to Use Batching
Use batch processing when you need to:
- Process multiple queries - Run 1-100 research tasks in parallel
- Share configuration - Apply the same mode, output formats, and search settings to all tasks
- Unified monitoring - Track progress and costs across all tasks in one place
- Efficient bulk operations - Reduce API calls and simplify task management
For individual tasks with unique configurations or advanced features (files, deliverables, MCP servers), use the standard DeepResearch API instead.
Key Concepts
Batch Lifecycle
A batch progresses through the following statuses:
| Status | Description |
|---|---|
| open | Batch is created but no tasks are running yet |
| processing | At least one task is queued, running, or completed |
| completed | All tasks have finished successfully |
| completed_with_errors | All tasks finished, but some failed |
| cancelled | Batch was cancelled before completion |
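Client code often needs to know whether a batch can still change state. The sketch below encodes the three terminal statuses that this page names (completed, completed_with_errors, cancelled); the helper name `is_terminal` is hypothetical and not part of the SDK.

```python
# Statuses after which a batch no longer changes, as listed on this page.
TERMINAL_BATCH_STATUSES = frozenset(
    {"completed", "completed_with_errors", "cancelled"}
)


def is_terminal(batch_status: str) -> bool:
    """Return True once the batch has reached a final status."""
    return batch_status in TERMINAL_BATCH_STATUSES
```

A polling loop or webhook handler could call `is_terminal(status.batch.status)` to decide when to stop waiting.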
Task States
Individual tasks within a batch can be in these states:
| Status | Description |
|---|---|
| queued | Task is waiting to start |
| running | Task is currently executing |
| completed | Task finished successfully |
| failed | Task encountered an error |
| cancelled | Task was cancelled |
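As a rough illustration of how the task states roll up into progress, the sketch below counts a list of state strings. It assumes that completed, failed, and cancelled all count as finished; the function name `summarize_task_states` is hypothetical.

```python
from collections import Counter

# Assumption: these three states mean a task will not run any further.
FINISHED_TASK_STATES = ("completed", "failed", "cancelled")


def summarize_task_states(states):
    """Count tasks per state and report how many have finished."""
    counts = Counter(states)
    finished = sum(counts[state] for state in FINISHED_TASK_STATES)
    return {
        "total": len(states),
        "finished": finished,
        "by_state": dict(counts),
    }
```

For example, `summarize_task_states([task.status for task in tasks_response.tasks])` would give a quick overview after listing the tasks in a batch.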
Shared Configuration
Tasks in a batch inherit these settings from the batch:
- mode - Research mode (standard, heavy, fast)
- output_formats - Output formats (markdown, pdf, toon, or JSON schema)
- search_params - Search configuration (type, sources, dates, category)
Tasks can override:
- strategy - Custom research instructions
- urls - URLs to analyze
- metadata - Custom metadata
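The split between inherited and per-task settings can be checked before tasks are submitted. This is a minimal client-side sketch, assuming each task is a plain dict and that `query` is required (as in the examples on this page); `check_task_fields` is a hypothetical helper, not an SDK function.

```python
# Settings that come from the batch and cannot be set on a task.
BATCH_LEVEL_FIELDS = {"mode", "output_formats", "search_params"}


def check_task_fields(task: dict) -> list:
    """Return a list of problems found in a task definition (empty if none)."""
    problems = []
    if "query" not in task:
        # Assumption: every task needs a query, as in the examples here.
        problems.append("missing field: query")
    for key in sorted(task):
        if key in BATCH_LEVEL_FIELDS:
            problems.append(f"{key} is inherited from the batch")
    return problems
```

Running this over a task list before calling `add_tasks` surfaces configuration mistakes without spending an API call.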
Basic Workflow
from valyu import Valyu

valyu = Valyu()

# 1. Create a batch with default settings
batch = valyu.batch.create(
    name="Market Research Q4 2024",
    mode="standard",
    output_formats=["markdown"],
    search={
        "search_type": "all",
        "included_sources": ["web", "academic"],
        "start_date": "2024-01-01",
        "end_date": "2024-12-31"
    },
    metadata={"project": "Q4-2024", "team": "research"}
)
if batch.success:
    batch_id = batch.batch_id
    print(f"Created batch: {batch_id}")

# 2. Add tasks to the batch
tasks = [
    {"query": "Analyze technology sector performance in Q4 2024"},
    {"query": "Research healthcare sector trends and key players"},
    {"query": "Review renewable energy market developments"}
]
add_result = valyu.batch.add_tasks(batch_id, tasks)
if add_result.success:
    print(f"Added {add_result.added} tasks")

# 3. Monitor progress
status = valyu.batch.status(batch_id)
if status.success and status.batch:
    print(f"Progress: {status.batch.counts.completed}/{status.batch.counts.total}")
    print(f"Total cost: ${status.batch.cost}")
Waiting for Completion
Use wait_for_completion() to automatically poll until the batch finishes:
def on_progress(status):
    if status.success and status.batch:
        counts = status.batch.counts
        print(
            f"Progress: {counts.completed + counts.failed}/{counts.total} "
            f"(Running: {counts.running}, Queued: {counts.queued})"
        )

try:
    final_status = valyu.batch.wait_for_completion(
        batch_id,
        poll_interval=10,
        max_wait_time=3600,  # 1 hour
        on_progress=on_progress
    )
    if final_status.success and final_status.batch:
        print("Batch completed!")
        print(f"Status: {final_status.batch.status}")
        print(f"Total cost: ${final_status.batch.cost}")
except TimeoutError as e:
    print(f"Timeout: {e}")
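To make the polling behavior concrete, here is a generic poll-until-terminal loop. It is a sketch of the general pattern, not the SDK's actual implementation of wait_for_completion(): `poll_until_terminal` and its `fetch_status` callable (assumed to return a status string) are hypothetical, and the `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.

```python
import time

TERMINAL = {"completed", "completed_with_errors", "cancelled"}


def poll_until_terminal(fetch_status, poll_interval=10.0, max_wait_time=3600.0,
                        on_progress=None, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a terminal status or time runs out."""
    deadline = clock() + max_wait_time
    while True:
        status = fetch_status()
        if on_progress is not None:
            on_progress(status)
        if status in TERMINAL:
            return status
        # Give up if the next poll would fall after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(
                f"batch still '{status}' after {max_wait_time} seconds"
            )
        sleep(poll_interval)
```

With the real client, `fetch_status` could be something like `lambda: valyu.batch.status(batch_id).batch.status`, although in practice wait_for_completion() already covers this case.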
Retrieving Task Results
List all tasks in a batch and retrieve individual results:
# List all tasks in the batch
tasks_response = valyu.batch.list_tasks(batch_id)
if tasks_response.success and tasks_response.tasks:
    for task in tasks_response.tasks:
        print(f"Task: {task.task_id or task.deepresearch_id}")
        print(f"Query: {task.query}")
        print(f"Status: {task.status}")
        # Get detailed results for completed tasks
        if task.status == "completed":
            result = valyu.deepresearch.status(task.deepresearch_id)
            print(f"Output: {result.output[:200]}...")
Parameters Reference
Mode Values
| Mode | Description | Cost per Task |
|---|---|---|
| standard | Standard research mode (default) | $0.50 |
| heavy | Comprehensive research mode | $1.50 |
| fast | Fast research mode (lower cost, faster completion) | Lower |
The lite mode has been replaced by fast.
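The per-task prices in the mode table allow a rough budget check before a batch is created. This sketch uses only the two prices stated on this page and returns None for fast, whose price is given only as "Lower". The helper name `estimate_batch_cost` is hypothetical, and the billed amount should always be read from the batch's cost field.

```python
from decimal import Decimal

# Per-task prices as listed in the mode table; fast has no stated figure.
COST_PER_TASK = {
    "standard": Decimal("0.50"),
    "heavy": Decimal("1.50"),
}


def estimate_batch_cost(mode, task_count):
    """Return an estimated cost in USD, or None if the mode has no listed price."""
    per_task = COST_PER_TASK.get(mode)
    if per_task is None:
        return None
    return per_task * task_count
```

For instance, 20 heavy tasks would be estimated as `estimate_batch_cost("heavy", 20)`, that is 20 times $1.50.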
Output Formats
- markdown (default) - Markdown text output
- pdf - PDF document output
- toon - TOON format (requires JSON schema)
- JSON Schema Object - Structured output matching the provided schema
A JSON schema cannot be combined with markdown or pdf; use one or the other. The toon format requires a JSON schema.
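The output format rules above can be checked client-side before a batch is created. The sketch below is one reading of those rules, under the assumptions that a JSON schema is passed as a dict inside the output_formats list and that toon is listed alongside its schema; `validate_output_formats` is a hypothetical helper.

```python
KNOWN_FORMATS = {"markdown", "pdf", "toon"}


def validate_output_formats(output_formats):
    """Return a list of rule violations (empty if the combination looks valid)."""
    names = [f for f in output_formats if isinstance(f, str)]
    schemas = [f for f in output_formats if isinstance(f, dict)]
    problems = []
    for name in names:
        if name not in KNOWN_FORMATS:
            problems.append(f"unknown format: {name}")
    if schemas and any(name in ("markdown", "pdf") for name in names):
        problems.append("a JSON schema cannot be combined with markdown or pdf")
    if "toon" in names and not schemas:
        problems.append("toon requires a JSON schema")
    return problems
```

An empty result means the list satisfies the stated constraints; the API remains the final authority on what it accepts.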
Search Parameters
Search parameters control which data sources are queried, what content is included/excluded, and how results are filtered by date or category. When set at the batch level, these parameters are applied to all tasks in the batch and cannot be overridden by individual tasks.
Search Type
Controls which backend search systems are queried for all tasks in the batch:
- "all" (default): Searches both web and proprietary data sources
- "web": Searches only web sources (general web search, news, articles)
- "proprietary": Searches only proprietary data sources (academic papers, finance data, patents, etc.)
When set at the batch level, this parameter cannot be overridden by individual tasks.
batch = valyu.batch.create(
    name="Academic Research Batch",
    search={"search_type": "proprietary"}
)
Included Sources
Restricts search to only the specified source types for all tasks in the batch. When specified, only these sources will be searched. Tasks inherit this setting and cannot override it.
Available source types:
- "web": General web search results (news, articles, websites)
- "academic": Academic papers and research databases (ArXiv, PubMed, BioRxiv/MedRxiv, Clinical trials, FDA drug labels, WHO health data, NIH grants, Wikipedia)
- "finance": Financial and economic data (Stock/crypto/FX prices, SEC filings, Company financial statements, Economic indicators, Prediction markets)
- "patent": Patent and intellectual property data (USPTO patent database, Patent abstracts, claims, descriptions)
- "transportation": Transit and transportation data (UK National Rail schedules, Maritime vessel tracking)
- "politics": Government and parliamentary data (UK Parliament members, bills, votes)
- "legal": Case law and legal data (UK court judgments, Legislation text)
batch = valyu.batch.create(
    name="Academic Research Batch",
    search={
        "search_type": "proprietary",
        "included_sources": ["academic", "web"]
    }
)
Excluded Sources
Excludes specific source types from search results for all tasks in the batch. Uses the same source type values as included_sources. Cannot be used simultaneously with included_sources (use one or the other).
batch = valyu.batch.create(
    name="Research Batch",
    search={
        "search_type": "proprietary",
        "excluded_sources": ["web", "patent"]
    }
)
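Because included_sources and excluded_sources are mutually exclusive, it can help to validate a search dict before sending it. This is a minimal sketch using the search types and source types listed on this page; `check_source_filters` is a hypothetical helper and not part of the SDK.

```python
SEARCH_TYPES = {"all", "web", "proprietary"}
SOURCE_TYPES = {
    "web", "academic", "finance", "patent",
    "transportation", "politics", "legal",
}


def check_source_filters(search: dict) -> list:
    """Return a list of problems with the source-related search settings."""
    problems = []
    search_type = search.get("search_type", "all")
    if search_type not in SEARCH_TYPES:
        problems.append(f"unknown search_type: {search_type!r}")
    if "included_sources" in search and "excluded_sources" in search:
        problems.append("use included_sources or excluded_sources, not both")
    for key in ("included_sources", "excluded_sources"):
        for source in search.get(key, []):
            if source not in SOURCE_TYPES:
                problems.append(f"{key}: unknown source type {source!r}")
    return problems
```

An empty list means the dict is consistent with the rules described in this section.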
Start Date
Format: ISO date format (YYYY-MM-DD)
Filters search results to only include content published or dated on or after this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
batch = valyu.batch.create(
    name="2024 Research",
    search={"start_date": "2024-01-01"}
)
End Date
Format: ISO date format (YYYY-MM-DD)
Filters search results to only include content published or dated on or before this date for all tasks in the batch. Applied to both publication dates and event dates when available. Works across all source types.
batch = valyu.batch.create(
    name="Q4 2024 Analysis",
    search={
        "start_date": "2024-10-01",
        "end_date": "2024-12-31"
    }
)
Category
Filters results by a specific category for all tasks in the batch. The exact categories available depend on the data source. Category values are source-dependent and may not be applicable to all source types.
batch = valyu.batch.create(
    name="Technology Research",
    search={"category": "technology"}
)
Important Notes
Parameter Enforcement
Batch-level parameters are enforced and cannot be overridden by individual tasks. This ensures consistent search behavior across all tasks in the batch. Tool-level source specifications are ignored if batch-level sources are specified.
Date Filtering
Dates are applied to both publication dates and event dates when available. ISO format (YYYY-MM-DD) is required. Date filtering works across all source types. If only start_date / startDate is provided, results include all content from that date forward. If only end_date / endDate is provided, results include all content up to that date. Both dates can be combined for a specific date range.
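The inclusive, open-ended date semantics described above can be illustrated with a small local function. This is only a model of the documented behavior, not how the service filters results; `in_date_range` is a hypothetical name.

```python
from datetime import date


def in_date_range(content_date, start_date=None, end_date=None):
    """Check a YYYY-MM-DD date against optional inclusive start and end bounds."""
    d = date.fromisoformat(content_date)  # raises ValueError on a malformed date
    if start_date is not None and d < date.fromisoformat(start_date):
        return False
    if end_date is not None and d > date.fromisoformat(end_date):
        return False
    return True
```

With only `start_date` set, every later date passes; with only `end_date` set, every earlier date passes; with both, the range is closed on both ends.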
Limitations
Not Yet Supported in Batch API
The following features are not yet supported in the batch API:
- deliverables - Cannot specify deliverables (CSV, XLSX, PPTX, DOCX) for batch tasks
- brand_collection_id - Cannot apply branding to batch tasks
- files - Cannot attach files to batch tasks
- mcp_servers - Cannot configure MCP servers for batch tasks
- code_execution - Always enabled (cannot disable per batch)
- previous_reports - Cannot reference previous reports in batch tasks
- alert_email - Cannot set email alerts for batch tasks
Workaround: Use individual task creation (POST /v1/deepresearch/tasks) if you need these features.
Task Constraints
- Maximum tasks per request: 100
- Minimum tasks per request: 1
- Batch status: Batch must be in "open" or "processing" status to add tasks
- Inherited settings: Tasks cannot override mode, output_formats, or search_params from the batch
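Given the limit of 100 tasks per request, a longer task list has to be split before it is submitted. The sketch below shows one way to chunk it; `chunk_tasks` is a hypothetical helper, and this page does not state whether a single batch accepts more than 100 tasks in total, so treat multi-request submission as an assumption to verify.

```python
def chunk_tasks(tasks, max_per_request=100):
    """Yield consecutive slices of tasks, each at most max_per_request long."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(tasks), max_per_request):
        yield tasks[start:start + max_per_request]
```

Each chunk could then be passed to `valyu.batch.add_tasks(batch_id, chunk)`, checking the `success` field after every call.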
Best Practices
When to Use Batches vs Individual Tasks
| Use Batches When | Use Individual Tasks When |
|---|---|
| Processing 10+ queries with shared config | Each task needs unique configuration |
| Need unified cost tracking | Need advanced features (files, deliverables, MCP) |
| Bulk research operations | Single or few research queries |
| Shared search parameters | Different search settings per task |
Batch Size Recommendations
- Small batches (1-10 tasks): Good for testing and quick research
- Medium batches (10-50 tasks): Ideal for most production use cases
- Large batches (50-100 tasks): Use for bulk operations, monitor closely
Cost Tracking
Monitor batch costs through the cost field:
status = valyu.batch.status(batch_id)
if status.success and status.batch:
    cost = status.batch.cost
    print(f"Total cost: ${cost}")
Error Handling
Always check success fields and handle errors appropriately:
response = valyu.batch.create(...)
if not response.success:
    print(f"Error: {response.error}")
else:
    # Proceed with successful response
    batch_id = response.batch_id
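When several calls are chained, repeating the success check gets verbose. One option is a small wrapper that raises on failure. This is a sketch that relies only on the `success` and `error` fields shown above; `BatchAPIError` and `require_success` are hypothetical names, not part of the SDK.

```python
class BatchAPIError(RuntimeError):
    """Raised when a batch API response reports failure."""


def require_success(response, action):
    """Return the response unchanged, or raise if its success flag is falsy."""
    if not getattr(response, "success", False):
        error = getattr(response, "error", None) or "unknown error"
        raise BatchAPIError(f"{action} failed: {error}")
    return response
```

Usage might look like `batch = require_success(valyu.batch.create(name="Research Batch"), "create batch")`, which lets a single try/except around the workflow handle every failure.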
Webhooks
Set up webhooks for production use to avoid polling:
batch = valyu.batch.create(
    name="Research Batch",
    mode="standard",
    webhook_url="https://your-domain.com/webhook"
)
# IMPORTANT: Save the webhook_secret immediately - it's only returned once
webhook_secret = batch.webhook_secret
The webhook will receive a POST request when the batch reaches a terminal state (completed, completed_with_errors, or cancelled).