voralabs/pharma_data_agent

Pharma Insights Agent (MVP)

A local, terminal-based Deep Agent for pharma marketing and sales analysis.

This MVP is intentionally simple:

  • single-user local workflow,
  • CSV dummy data generated with realistic patterns,
  • SQLite backend,
  • orchestrator + subagents (Dexavir, Nabinix),
  • shared datasets only (no brand-unique datasets in MVP),
  • SQL tools with practical error handling.

What It Can Answer

  • KPI trend and change analysis (TRx, NBRx, calls, impressions, engagement).
  • Brand-specific and cross-period questions.
  • Schema exploration questions (table/column discovery).
  • Call plan metadata questions (for example, the latest cycle by brand).
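
To make the "change analysis" idea concrete, here is a minimal sketch of a quarter-over-quarter comparison in SQLite. The table name `digital_engagement`, its columns, and the sample rows are all assumptions made up for illustration; the real schema is described by the dataset skills and may look quite different.

```python
import sqlite3

# Hypothetical table and column names; not taken from the actual datasets.
def quarter_change(conn, brand, tier, year, q_from, q_to):
    """Return {metric: (from_value, to_value, pct_change)} for two quarters."""
    rows = conn.execute(
        """
        SELECT quarter, SUM(impressions), SUM(engagements)
        FROM digital_engagement
        WHERE brand = ? AND hcp_tier = ? AND year = ? AND quarter IN (?, ?)
        GROUP BY quarter
        """,
        (brand, tier, year, q_from, q_to),
    ).fetchall()
    totals = {q: (imp, eng) for q, imp, eng in rows}
    result = {}
    for idx, metric in enumerate(("impressions", "engagements")):
        before = totals.get(q_from, (0, 0))[idx]
        after = totals.get(q_to, (0, 0))[idx]
        # Percent change is undefined when the baseline is zero.
        pct = None if before == 0 else round(100.0 * (after - before) / before, 1)
        result[metric] = (before, after, pct)
    return result

# Tiny in-memory fixture so the sketch runs without the generated data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE digital_engagement (brand TEXT, hcp_tier TEXT, year INTEGER, "
    "quarter TEXT, impressions INTEGER, engagements INTEGER)"
)
conn.executemany(
    "INSERT INTO digital_engagement VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Dexavir", "Tier 1", 2025, "Q2", 1000, 100),
        ("Dexavir", "Tier 1", 2025, "Q2", 1000, 100),
        ("Dexavir", "Tier 1", 2025, "Q3", 2500, 150),
        ("Dexavir", "Tier 2", 2025, "Q3", 9999, 999),
        ("Nabinix", "Tier 1", 2025, "Q3", 7777, 777),
    ],
)
change = quarter_change(conn, "Dexavir", "Tier 1", 2025, "Q2", "Q3")
print(change)
```

In the agent itself, a brand analyst would presumably write SQL like this after loading the matching dataset skill; the helper above only illustrates the shape of the calculation.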

Project Structure

pharma-insights-agent/
  AGENTS.md
  agent.py
  subagents/brands.yaml
  tools/sql_tools.py
  skills/
    rx-dataset/SKILL.md
    veeva-call-dataset/SKILL.md
    digital-engagement-dataset/SKILL.md
    call-plan-dataset/SKILL.md
    finance-dataset/SKILL.md
    inventory-dataset/SKILL.md
    sales-analysis/SKILL.md
    veeva-crm-analysis/SKILL.md
    digital-engagement-analysis/SKILL.md
  scripts/
    generate_dummy_data.py
    build_sqlite.py
  data/
    raw/*.csv
    pharma_mvp.db
  eval/questions.md

Setup

  1. Create environment and install:
# replace the path below with the location of your local clone
cd /Users/shyamvora/Documents/GitHub\ Repos/pharma_data_agent
uv venv --python 3.11
source .venv/bin/activate
uv pip install -e .
  2. Configure environment:
cp .env.example .env
# add your API key(s)
  3. Generate dummy CSV data:
python scripts/generate_dummy_data.py
  4. Build SQLite DB from CSV files:
python scripts/build_sqlite.py
  5. Ask a question:
python agent.py "What is the change in engagement and impressions for Tier 1 HCPs between Q3 and Q2 of 2025 for Dexavir?"
  6. For full multi-turn conversation testing:
python agent.py --interactive
  7. Optional terminal visibility levels:
# default
python agent.py --interactive --visibility standard

# final answers only
python agent.py --interactive --visibility quiet

# full tool-call detail (includes full SQL in tool call logs)
python agent.py --interactive --visibility debug
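
The CSV-to-SQLite build step can be sketched with the standard library alone. This is not the contents of scripts/build_sqlite.py; the function name `load_csv_dir`, the one-table-per-file convention, and the untyped columns are assumptions chosen to keep the example short.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

def load_csv_dir(raw_dir, db_path):
    """Load every *.csv in raw_dir into a same-named SQLite table; return row counts."""
    counts = {}
    conn = sqlite3.connect(db_path)
    try:
        for csv_path in sorted(Path(raw_dir).glob("*.csv")):
            table = csv_path.stem
            with csv_path.open(newline="") as fh:
                reader = csv.reader(fh)
                header = next(reader)  # first row holds the column names
                rows = list(reader)
            cols = ", ".join(f'"{c}"' for c in header)
            marks = ", ".join("?" for _ in header)
            # Rebuild the table from scratch so reruns are idempotent.
            conn.execute(f'DROP TABLE IF EXISTS "{table}"')
            conn.execute(f'CREATE TABLE "{table}" ({cols})')
            conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
            counts[table] = len(rows)
        conn.commit()
    finally:
        conn.close()
    return counts

# Demo against a throwaway directory instead of data/raw.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    (raw / "rx.csv").write_text("brand,trx\nDexavir,10\nNabinix,7\n")
    demo_counts = load_csv_dir(raw, Path(tmp) / "demo.db")
print(demo_counts)
```

Because the columns are declared without types, every value is stored as text here; a real build script would likely declare column types or cast numeric fields.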

Terminal Visibility

In standard and debug modes, each turn prints:

  • [TURN] user query
  • [ORCH] planning/delegation/synthesis
  • [SUBAGENT:<name>] start/completion
  • [TOOL] and [TOOL:OK|ERR] execution events
  • [SKILL] skill load notices (best-effort when skill files are read)
  • [ANSWER] final response
  • [AUDIT] per-turn summary (subagents, skills, datasets touched, tools, SQL count, retries, errors, duration)
  • [AUDIT:WARN] validation warnings (for example SQL executed without matching dataset skill loads, or discover_schema overuse without fallback need)
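
One way to picture how the visibility levels gate these tagged lines is the sketch below. The level ordering, the `format_event` helper, and the tool name `run_sql` are assumptions for illustration, not the app's actual logging code.

```python
# Assumed ordering: quiet shows the least, debug the most.
LEVELS = {"quiet": 0, "standard": 1, "debug": 2}

# Assumed minimum level per tag family; only the final answer survives in quiet mode.
MIN_LEVEL = {
    "ANSWER": 0,
    "TURN": 1,
    "ORCH": 1,
    "SUBAGENT": 1,
    "TOOL": 1,
    "SKILL": 1,
    "AUDIT": 1,
}

def format_event(tag, message, visibility="standard", detail=None):
    """Return the line to print for an event, or None if it is hidden."""
    family = tag.split(":", 1)[0]  # "TOOL:OK" -> "TOOL", "AUDIT:WARN" -> "AUDIT"
    if LEVELS[visibility] < MIN_LEVEL.get(family, 1):
        return None
    line = f"[{tag}] {message}"
    # Extra detail such as full SQL text is appended only in debug mode.
    if detail and visibility == "debug":
        line += f" | {detail}"
    return line

for level in ("quiet", "standard", "debug"):
    print(level, "->", format_event("TOOL:OK", "run_sql", level, detail="SELECT 1"))
```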

Delegation Architecture

  • The orchestrator does not have direct SQL tools.
  • The orchestrator does not load dataset or analysis skills.
  • SQL/data work is delegated to subagents via task.
  • Brand-specific queries should route to dexavir-analyst or nabinix-analyst.
  • Cross-brand analytics should be split across both brand analysts and synthesized by the orchestrator.
  • Brand-agnostic analytics, schema/metadata questions, and data-access questions should also be delegated to both brand analysts and synthesized by the orchestrator.
  • Dataset skills are the default schema source for analysis queries; schema tools are fallback-only.
  • Subagents are responsible for:
    1. dataset selection
    2. loading matching dataset skills
    3. SQL execution
    4. validated response assembly
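
The routing rules above can be approximated with a small keyword-based function. In the app the orchestrator is an LLM that presumably makes this decision from its instructions, so the `route` helper below is only a hypothetical stand-in that encodes the same rules.

```python
# Analyst names come from the rules above; the keyword matching is an assumption.
BRAND_ANALYSTS = {
    "dexavir": "dexavir-analyst",
    "nabinix": "nabinix-analyst",
}

def route(question):
    """Return the subagents a question should be delegated to."""
    text = question.lower()
    mentioned = [analyst for brand, analyst in BRAND_ANALYSTS.items() if brand in text]
    # Brand-agnostic, schema, and metadata questions go to both analysts.
    return mentioned or list(BRAND_ANALYSTS.values())

print(route("TRx trend for Dexavir in 2025"))
print(route("Compare Dexavir and Nabinix call volume"))
print(route("Which tables contain HCP tier?"))
```

When more than one analyst is returned, the orchestrator's job is to merge the individual answers into a single response.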

LangSmith Tracing

This app supports LangSmith tracing for Deep Agents.

  1. Set environment variables in .env:
LANGSMITH_API_KEY=...
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=pharma-insights-mvp
  2. Run the app in interactive mode for full conversation tracing:
python agent.py --interactive
  3. In LangSmith, verify:
  • runs appear under project pharma-insights-mvp,
  • one session groups multiple turns,
  • nested spans include tool calls and subagent calls,
  • SQL tool calls are visible end-to-end.
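
If runs do not show up, a quick pre-flight check of the three variables can rule out configuration mistakes. The `tracing_problems` helper below is a hypothetical sketch. It assumes the values from .env have already been loaded into the process environment, which this README does not describe.

```python
import os

REQUIRED = ("LANGSMITH_API_KEY", "LANGSMITH_TRACING", "LANGSMITH_PROJECT")

def tracing_problems(env):
    """Return a list of human-readable problems with the tracing settings."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]
    flag = env.get("LANGSMITH_TRACING")
    if flag and flag.lower() != "true":
        problems.append("LANGSMITH_TRACING is set but not 'true'")
    return problems

# Check the current process environment.
for problem in tracing_problems(os.environ):
    print("tracing config:", problem)
```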

Tracing Privacy Note

  • Prompts, tool inputs, and tool outputs can be traced.
  • Do not include secrets or sensitive personal data in prompts.

Optional Tracing Controls

Optional programmatic controls (for example custom tracing_context wrapping and custom env knobs for tags/metadata) are intentionally not implemented in this MVP.

Notes

  • Default DB path is data/pharma_mvp.db.
  • Default model is anthropic:claude-sonnet-4-5-20250929 unless PHARMA_AGENT_MODEL is set.
  • SQL execution is read-only (SELECT, WITH, PRAGMA, EXPLAIN).
  • Responses are configured to include both SQL query text and evidence summary.
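
The read-only rule and the model default can be sketched as follows. The helper names are hypothetical and this is not the code in tools/sql_tools.py. A leading-keyword check alone is not sufficient, because some PRAGMA statements and WITH-prefixed writes can still modify data. For that reason the sketch also opens the database through SQLite's read-only URI mode.

```python
import re
import sqlite3
import tempfile
from pathlib import Path

READ_ONLY_PREFIXES = ("SELECT", "WITH", "PRAGMA", "EXPLAIN")
DEFAULT_MODEL = "anthropic:claude-sonnet-4-5-20250929"

def is_read_only(sql):
    """Cheap first-keyword check; statements starting with a comment are rejected."""
    match = re.match(r"[A-Za-z]+", sql.lstrip())
    return bool(match) and match.group(0).upper() in READ_ONLY_PREFIXES

def open_read_only(db_path):
    """Open the database so SQLite itself refuses writes."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)

def resolve_model(env):
    """Use PHARMA_AGENT_MODEL when set, otherwise the documented default."""
    return env.get("PHARMA_AGENT_MODEL") or DEFAULT_MODEL

# Demo on a throwaway database rather than data/pharma_mvp.db.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "demo.db"
    writer = sqlite3.connect(db)
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.commit()
    writer.close()

    reader = open_read_only(db)
    try:
        reader.execute("INSERT INTO t VALUES (1)")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True  # SQLite rejects writes on a mode=ro connection
    finally:
        reader.close()

print(is_read_only("  select 1"), is_read_only("DROP TABLE t"), write_blocked)
```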
