
Overview

Valyu integrates seamlessly with LangChain as a search tool, allowing you to enhance your AI agents and RAG applications with real-time web search and proprietary data sources. The integration provides LLM-ready context from multiple sources including web pages, academic journals, financial data, and more. The package includes two main tools:
  • ValyuSearchTool: Deep search operations with comprehensive parameter control
  • ValyuContentsTool: Extract clean content from specific URLs
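
As a quick orientation, both tools can be imported and instantiated directly. This is a minimal sketch; installation and credential setup are covered in the sections below:
from langchain_valyu import ValyuSearchTool, ValyuContentsTool

# Both tools pick up the VALYU_API_KEY environment variable by default
search_tool = ValyuSearchTool()
contents_tool = ValyuContentsTool()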

Installation

Install the official LangChain Valyu package:
pip install -U langchain-valyu
Configure credentials by setting the following environment variable:
export VALYU_API_KEY="your-valyu-api-key-here"
Or set it programmatically:
import os
os.environ["VALYU_API_KEY"] = "your-valyu-api-key-here"
For agent examples, you’ll also need:
export ANTHROPIC_API_KEY="your-anthropic-api-key"  # For Claude examples
export OPENAI_API_KEY="your-openai-api-key"        # For OpenAI examples

Free Credits

Get your API key with $10 credit from the Valyu Platform.

Basic Usage

import os
from langchain_valyu import ValyuSearchTool

# Set your API key
os.environ["VALYU_API_KEY"] = "your-api-key-here"

# Initialize the search tool
tool = ValyuSearchTool()

# Perform a search
search_results = tool._run(
    query="What are agentic search-enhanced large reasoning models?",
    search_type="all",  # "all", "web", or "proprietary"
    max_num_results=5,
    relevance_threshold=0.5,
    max_price=30.0
)

print("Search Results:", search_results.results)

Using ValyuContentsTool for Content Extraction

Extract clean, structured content from specific URLs:
import os
from langchain_valyu import ValyuContentsTool

# Set your API key
os.environ["VALYU_API_KEY"] = "your-api-key-here"

# Initialize the contents tool
contents_tool = ValyuContentsTool()

# Extract content from URLs
urls = [
    "https://arxiv.org/abs/2301.00001",
    "https://example.com/article",
]

extracted_content = contents_tool._run(urls=urls)
print("Extracted Content:", extracted_content.results)

# Print individual results
for result in extracted_content.results:
    print(f"URL: {result['url']}")
    print(f"Title: {result['title']}")
    print(f"Content: {result['content'][:200]}...")
    print(f"Status: {result['status']}")
    print("---")

Using with LangChain Agents

The most powerful way to use Valyu is within LangChain agents, where the AI can dynamically decide when and how to search:
pip install langchain-anthropic langgraph
import os
from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

# Set API keys
os.environ["VALYU_API_KEY"] = "your-valyu-api-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-api-key"

# Initialize components
llm = ChatAnthropic(model="claude-sonnet-4-20250514")
valyu_search_tool = ValyuSearchTool()

# Create agent with Valyu search capability
agent = create_react_agent(llm, [valyu_search_tool])

# Use the agent
user_input = "What are the key factors driving recent stock market volatility, and how do macroeconomic indicators influence equity prices across different sectors?"

for step in agent.stream(
    {"messages": [HumanMessage(content=user_input)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Advanced Configuration

Search Parameters

The ValyuSearchTool supports a comprehensive set of search parameters for fine-grained control:
from langchain_valyu import ValyuSearchTool

tool = ValyuSearchTool()

# Advanced search with all available parameters
results = tool._run(
    query="quantum computing breakthroughs 2024",
    search_type="proprietary",  # "all", "web", or "proprietary"
    max_num_results=10,  # 1-20 results
    relevance_threshold=0.6,  # 0.0-1.0 relevance score
    max_price=30.0,  # Maximum cost in dollars
    is_tool_call=True,  # Optimized for LLM consumption
    start_date="2024-01-01",  # Time filtering (YYYY-MM-DD)
    end_date="2024-12-31",
    included_sources=["arxiv.org", "nature.com"],  # Include specific sources
    excluded_sources=["reddit.com"],  # Exclude sources
    response_length="medium",  # "short", "medium", "large", "max", or int
    country_code="US",  # 2-letter ISO country code
    fast_mode=False,  # Enable for faster but shorter results
)

Source Filtering

Control which sources are included or excluded from your search:
# Include only academic sources
academic_results = tool._run(
    query="machine learning research 2024",
    search_type="proprietary",
    included_sources=["arxiv.org", "pubmed.ncbi.nlm.nih.gov", "ieee.org"],
    max_num_results=8
)

# Exclude social media and forum content
filtered_results = tool._run(
    query="AI policy developments",
    search_type="web",
    excluded_sources=["reddit.com", "twitter.com", "facebook.com"],
    max_num_results=10
)

Multi-Agent Workflows

Use Valyu in complex multi-agent systems:
from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage

# Create specialized research agent
research_llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.1)
research_tool = ValyuSearchTool()

research_agent = create_react_agent(
    research_llm,
    [research_tool]
)

# Create analysis agent
analysis_llm = ChatOpenAI(model="gpt-5", temperature=0.3)
analysis_agent = create_react_agent(
    analysis_llm,
    [research_tool]
)

# Coordinate agents for complex queries
research_query = "Find recent papers on transformer architecture improvements"
analysis_query = "Analyze market trends in AI chip demand"

# Execute research agent
for step in research_agent.stream(
    {"messages": [HumanMessage(content=research_query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

# Execute analysis agent
for step in analysis_agent.stream(
    {"messages": [HumanMessage(content=analysis_query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Example Applications

Financial Research Assistant

from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import HumanMessage, SystemMessage

# Create financial research agent
financial_llm = ChatAnthropic(model="claude-sonnet-4-20250514")
valyu_tool = ValyuSearchTool()

financial_agent = create_react_agent(financial_llm, [valyu_tool])

# Query financial markets with system context
query = "What are the latest developments in cryptocurrency regulation and their impact on institutional adoption?"

system_context = SystemMessage(content="""You are a financial research assistant. Use Valyu to search for:
- Real-time market data and news
- Academic research on financial models
- Economic indicators and analysis

Always cite your sources and provide context about data recency.""")

for step in financial_agent.stream(
    {"messages": [system_context, HumanMessage(content=query)]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

Academic Research Agent

from langchain_valyu import ValyuSearchTool

# Configure for academic research
academic_tool = ValyuSearchTool()

# Search academic sources specifically
academic_results = academic_tool._run(
    query="CRISPR gene editing safety protocols",
    search_type="proprietary",  # Focus on academic datasets
    max_num_results=8,
    relevance_threshold=0.6,
)

print("Academic Sources Found:", len(academic_results.results))
for result in academic_results.results:
    print(f"Title: {result['title']}")
    print(f"Source: {result['source']}")
    print(f"Relevance: {result['relevance_score']}")
    print("---")

Best Practices

1. Cost Optimization

# Set appropriate price limits based on use case
tool = ValyuSearchTool()

# For quick lookups
quick_search = tool._run(
    query="current bitcoin price",
    max_price=30.0,  # Lower cost for simple queries
    max_num_results=3
)

# For comprehensive research
detailed_search = tool._run(
    query="comprehensive analysis of renewable energy trends",
    max_price=50.0,  # Higher budget for complex queries
    max_num_results=15,
    search_type="all"
)

2. Search Type Selection

# Web search for current events
web_results = tool._run(
    query="latest AI policy developments",
    search_type="web",
    max_num_results=5
)

# Proprietary search for academic research
academic_results = tool._run(
    query="machine learning interpretability methods",
    search_type="proprietary",
    max_num_results=8
)

# Combined search for comprehensive coverage
all_results = tool._run(
    query="climate change economic impact",
    search_type="all",
    max_num_results=10
)

3. Error Handling and Fallbacks

from langchain_valyu import ValyuSearchTool

def robust_search(query: str, fallback_query: str | None = None):
    tool = ValyuSearchTool()

    try:
        # Primary search
        results = tool._run(
            query=query,
            max_price=30.0,
            max_num_results=5
        )
        return results
    except Exception as e:
        print(f"Primary search failed: {e}")

        if fallback_query:
            try:
                # Fallback with simpler query
                results = tool._run(
                    query=fallback_query,
                    max_price=30.0,
                    max_num_results=3,
                    search_type="web"
                )
                return results
            except Exception as e2:
                print(f"Fallback search also failed: {e2}")
                return "Search unavailable"

        return "Search failed"

# Usage
results = robust_search(
    "complex quantum entanglement applications",
    "quantum entanglement basics"
)

4. Agent System Messages

from langchain_valyu import ValyuSearchTool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import SystemMessage, HumanMessage

# Optimize agent behavior with good system messages
system_message = SystemMessage(content="""You are an AI research assistant with access to Valyu search.

SEARCH GUIDELINES:
- Use search_type="proprietary" for academic/scientific queries
- Use search_type="web" for current events and news
- Use search_type="all" for comprehensive research
- Set higher relevance_threshold (0.6+) for precise results
- Use category parameter to guide search context
- Always cite sources from search results

RESPONSE FORMAT:
- Provide direct answers based on search results
- Include source citations with URLs when available
- Mention publication dates for time-sensitive information
- Indicate if information might be outdated""")

llm = ChatAnthropic(model="claude-sonnet-4-20250514")
valyu_tool = ValyuSearchTool()
agent = create_react_agent(llm, [valyu_tool])

# Use the agent with system context
for step in agent.stream(
    {"messages": [system_message, HumanMessage(content="Your query here")]},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()

API Reference

For complete parameter documentation, see the Valyu API Reference.

ValyuSearchTool Parameters

  • query (required): Natural language search query
  • search_type: "all", "web", or "proprietary" (default: "all")
  • max_num_results: 1-20 results (default: 5)
  • relevance_threshold: 0.0-1.0 relevance score (default: 0.5)
  • max_price: Maximum cost in dollars (default: 20.0)
  • is_tool_call: Optimize for LLM consumption (default: True)
  • start_date/end_date: Time filtering in YYYY-MM-DD format (optional)
  • included_sources: List of URLs/domains to include (optional)
  • excluded_sources: List of URLs/domains to exclude (optional)
  • response_length: Content length - an int or "short", "medium", "large", "max" (optional)
  • country_code: 2-letter ISO country code for geo-bias (optional)
  • fast_mode: Enable for faster but shorter results (default: False)
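
Since query is the only required parameter, a minimal call can rely entirely on the defaults listed above. A quick sketch, reusing the same tool setup as the earlier examples:
from langchain_valyu import ValyuSearchTool

tool = ValyuSearchTool()

# Only `query` is supplied; every other parameter falls back to its documented default
results = tool._run(query="recent advances in solid-state batteries")
print(results.results)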

ValyuContentsTool Parameters

  • urls (required): List of URLs to extract content from (max 10 per request)
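
Because each request accepts at most 10 URLs, longer URL lists need to be split client-side. The sketch below (assuming the same _run(urls=...) call pattern shown earlier) batches a list and collects all extracted results:
from langchain_valyu import ValyuContentsTool

contents_tool = ValyuContentsTool()

def extract_in_batches(urls: list[str], batch_size: int = 10) -> list:
    """Extract content from an arbitrary number of URLs in batches of at most 10."""
    all_results = []
    for i in range(0, len(urls), batch_size):
        batch = urls[i : i + batch_size]
        response = contents_tool._run(urls=batch)  # same call pattern as the earlier example
        all_results.extend(response.results)
    return all_results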

Additional Resources
