- Introduction
- Key Features
- Environment Setup Guide
- Configuration
- Usage Guide
- RAG Capabilities
- Architecture & Deep Dive
- Diagnostics
- Testing
- Benchmarking
This tool exports your Perplexity.ai conversation history into organized, semantically searchable Markdown files. The result is a personal knowledge base, powered by local AI, that turns one-off questions into structured, reusable knowledge.
- Parallelized Extraction: Uses Playwright to extract multiple conversation threads simultaneously, which speeds up retrieval.
- Architectural Resilience: Automatically restores browser contexts and retries operations, so a run can continue after browser or network failures.
- Advanced RAG (Retrieval-Augmented Generation): Ask questions about your own history. The system uses intent analysis to decide whether to synthesize a broad summary or pinpoint a specific technical insight.
- HyDE (Hypothetical Document Embeddings): Before searching, the planner generates a hypothetical answer passage and uses it as an additional search vector, improving recall when your question wording differs from how you originally wrote things.
- Cross-Encoder Reranking: After initial retrieval, a local ONNX cross-encoder (ms-marco-MiniLM-L-6-v2) rescores the top candidates by jointly reasoning over query and passage, surfacing the most relevant results before synthesis.
- Semantic Vector Search: Move beyond keyword matching. Locate information based on conceptual depth and semantic relevance.
- Persistent State Tracking: Frequent checkpoints allow the system to resume progress after any interruption.
- Interactive Synthesis (REPL): A streamlined command-line interface for running the scraper, searching your history, and managing the vector index.
- Smart Content Hashing: The scraper now computes a SHA-256 hash of thread content. Subsequent runs will skip unchanged threads, significantly reducing execution time and API overhead while ensuring your local history stays up to date when new messages are added.
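
The content-hashing feature above can be illustrated with a minimal sketch. It assumes the hash is taken over the thread's text and compared against a value stored from the previous run; the `HashIndex` type and the `needsExport` helper are hypothetical names for illustration, not the project's actual API.

```typescript
import { createHash } from "node:crypto";

// Hypothetical store: maps a thread ID to the hash recorded on the last run.
type HashIndex = Map<string, string>;

// Compute a hex-encoded SHA-256 digest of a thread's content.
function hashContent(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Return true when the thread is new or its content has changed,
// and record the new hash so that the next run can skip it.
function needsExport(index: HashIndex, threadId: string, content: string): boolean {
  const digest = hashContent(content);
  if (index.get(threadId) === digest) {
    return false; // unchanged since the last run
  }
  index.set(threadId, digest);
  return true;
}

const index: HashIndex = new Map();
console.log(needsExport(index, "thread-1", "first message"));          // true (new thread)
console.log(needsExport(index, "thread-1", "first message"));          // false (unchanged)
console.log(needsExport(index, "thread-1", "first message\nfollow-up")); // true (new message added)
```

Hashing the full content, rather than comparing timestamps, means a thread is re-exported only when its text actually differs.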
If you are new to development or don't have the necessary tools installed, follow these steps to set up your environment.
We recommend using a version manager to install Node.js. This allows you to easily switch versions and avoids permission issues.
- Windows:
  - Download and run the latest installer from nvm-windows.
  - Open a new Command Prompt or PowerShell and run:
    nvm install 20
    nvm use 20
- macOS / Linux:
  - Install nvm by following the instructions at nvm.sh.
  - Run:
    nvm install 20
    nvm use 20
Ollama is optional. It is only required if you want to use the Semantic Search or RAG (Retrieval-Augmented Generation) features. Basic extraction and keyword search work without it.
- Download and install Ollama from ollama.ai.
- Open your terminal and pull the required models:
ollama pull nomic-embed-text
ollama pull deepseek-r1
If you don't have the git command installed, you can simply download this project as a ZIP file from GitHub and extract it.
Once extracted, open your terminal in the project folder and run:
npm install
npx playwright install chromium
Create your environment file by copying the template:
cp .env.example .env
- HEADLESS: Set to false in your .env file. Note: Headless mode (true) is currently non-functional due to Cloudflare Turnstile protection on Perplexity.ai. Using headful mode allows you to complete any challenges manually if they appear.
- OLLAMA_URL: Address of your local Ollama instance (default: http://localhost:11434).
- OLLAMA_MODEL: Language model used for RAG synthesis (e.g., deepseek-r1).
- OLLAMA_EMBED_MODEL: Model for generating vector representations (e.g., nomic-embed-text).
- ENABLE_VECTOR_SEARCH: Set to true to activate semantic and RAG layers.
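
As a rough illustration of how these variables might be consumed, the sketch below reads them into a typed object. The `AppConfig` interface and `loadConfig` function are hypothetical names, and the model defaults are assumptions taken from the examples above; only the `OLLAMA_URL` default is stated in this document.

```typescript
// Hypothetical config shape; the fields mirror the .env keys listed above.
interface AppConfig {
  headless: boolean;
  ollamaUrl: string;
  ollamaModel: string;
  ollamaEmbedModel: string;
  enableVectorSearch: boolean;
}

// Build a config from an environment-like object (for example, process.env).
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  return {
    // Only the literal string "true" enables a boolean flag.
    headless: env["HEADLESS"] === "true",
    ollamaUrl: env["OLLAMA_URL"] ?? "http://localhost:11434",
    // Assumed defaults, taken from the "e.g." values in the list above.
    ollamaModel: env["OLLAMA_MODEL"] ?? "deepseek-r1",
    ollamaEmbedModel: env["OLLAMA_EMBED_MODEL"] ?? "nomic-embed-text",
    enableVectorSearch: env["ENABLE_VECTOR_SEARCH"] === "true",
  };
}

const config = loadConfig({ HEADLESS: "false", ENABLE_VECTOR_SEARCH: "true" });
console.log(config.headless);           // false
console.log(config.ollamaUrl);          // http://localhost:11434
console.log(config.enableVectorSearch); // true
```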
Launch the system:
# Start the development environment
npm run dev
- Start scraper (Library): Initiates extraction. Authenticate manually if required.
  - Note: Due to the complexity of Perplexity's API and potential network fluctuations, it may be necessary to run the scraper multiple times to ensure all conversations are fully gathered. The system uses checkpoints to resume where it left off.
- Search conversations: Interface with your history using various modes:
  - Auto: Heuristic selection between semantic and exact search.
  - Semantic: Fuzzy matching via high-dimensional vector space.
  - RAG: Direct inquiry, such as "What did I learn about emergent intelligence?"
  - Exact: Rapid string matching via ripgrep (bundled).
- Build vector index: Processes Markdown exports into a local vector store.
- Reset all data: Purges checkpoints, authentication data, and the vector index.
RAG mode handles several kinds of questions:
- Broad Synthesis: "Summarize all threads regarding distributed systems."
- Granular Retrieval: "Locate the specific TypeScript pattern I used for the worker pool."
- Cross-Thread Integration: "How has my conceptual understanding of React hooks shifted?"
The pipeline runs three enhancement stages automatically:
- HyDE: The planner writes a hypothetical answer passage and uses it as an extra search vector alongside your query variations, improving recall when question wording diverges from stored content.
- Expanded pool: Precise mode retrieves 35 candidates (up from 20), exhaustive mode retrieves 60.
- Cross-encoder reranking: A local ONNX model (Xenova/ms-marco-MiniLM-L-6-v2) jointly scores each (query, passage) pair and reorders the candidates before synthesis. It activates automatically after npm install. The first run downloads a ~85 MB model, which is cached thereafter.
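
The pool-and-rerank step can be sketched as follows. This is a simplified illustration, not the project's implementation: the real pipeline scores pairs with the ONNX cross-encoder, whereas the `tokenOverlap` scorer here is a stand-in so the example runs without a model. The `Candidate`, `PairScorer`, and `rerank` names are hypothetical; the pool sizes come from the list above.

```typescript
interface Candidate {
  id: string;
  text: string;
}

type Mode = "precise" | "exhaustive";

// Pool sizes as described above: 35 candidates in precise mode, 60 in exhaustive mode.
const POOL_SIZE: Record<Mode, number> = { precise: 35, exhaustive: 60 };

// A pair scorer looks at query and passage together and returns a relevance score.
type PairScorer = (query: string, passage: string) => number;

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Stand-in scorer (an assumption for this sketch): fraction of query tokens found in the passage.
const tokenOverlap: PairScorer = (query, passage) => {
  const queryTokens = tokenize(query);
  const passageTokens = new Set(tokenize(passage));
  if (queryTokens.length === 0) return 0;
  return queryTokens.filter((t) => passageTokens.has(t)).length / queryTokens.length;
};

// Trim the retrieved candidates to the pool size, score each pair, and sort best-first.
function rerank(query: string, candidates: Candidate[], mode: Mode, score: PairScorer): Candidate[] {
  return candidates
    .slice(0, POOL_SIZE[mode])
    .map((candidate) => ({ candidate, score: score(query, candidate.text) }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.candidate);
}

const retrieved: Candidate[] = [
  { id: "a", text: "React hooks notes" },
  { id: "b", text: "A TypeScript worker pool pattern" },
  { id: "c", text: "worker threads in Node" },
];
const ranked = rerank("typescript worker pool", retrieved, "precise", tokenOverlap);
console.log(ranked.map((c) => c.id)); // [ 'b', 'c', 'a' ]
```

Passing the scorer as a parameter keeps the ordering logic independent of the model, so a cross-encoder can be swapped in without changing `rerank`.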
For a detailed look at our RAG implementation, hybrid search strategy, and theoretical foundations, please refer to:
ARCH.md
- src/ai/: Ollama interaction and advanced RAG orchestration layers.
- src/scraper/: Playwright-based extraction logic and parallel worker pool management.
- src/search/: Vector storage (Vectra) and ripgrep search implementation.
- src/repl/: Interactive CLI components.
- src/utils/: Shared utility functions for data chunking, logging, and API diagnostics.
If the scraper encounters unexpected API response formats or empty conversation entries, it logs detailed (but non-sensitive) diagnostic information to debug/api-diagnostics.jsonl. This file helps maintain architectural resilience by providing insights into Perplexity's evolving API without compromising user privacy.
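
One way such a non-sensitive log entry could be built is sketched below. It assumes that "non-sensitive" means recording only the structure of an unexpected response (key names and value types) rather than its values. The `DiagnosticEntry` shape and all function names are hypothetical; only the debug/api-diagnostics.jsonl path comes from this document.

```typescript
import { appendFileSync, mkdirSync } from "node:fs";
import { dirname } from "node:path";

// Hypothetical entry format: one JSON object per line (JSONL).
interface DiagnosticEntry {
  timestamp: string;
  reason: string;
  shape: Record<string, string>;
}

// Record the top-level keys and value types of a payload, never the values themselves.
function describeShape(payload: unknown): Record<string, string> {
  const shape: Record<string, string> = {};
  if (payload !== null && typeof payload === "object") {
    for (const key of Object.keys(payload)) {
      const value = (payload as Record<string, unknown>)[key];
      shape[key] = Array.isArray(value) ? "array" : value === null ? "null" : typeof value;
    }
  }
  return shape;
}

// Serialize an entry as a single newline-terminated JSON line.
function toJsonLine(entry: DiagnosticEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Append an entry to the log file, creating the directory if needed.
function appendDiagnostic(path: string, entry: DiagnosticEntry): void {
  mkdirSync(dirname(path), { recursive: true });
  appendFileSync(path, toJsonLine(entry), "utf8");
}

const shape = describeShape({ title: "private question", entries: [], answer: null });
console.log(shape); // { title: 'string', entries: 'array', answer: 'null' }
```

A call such as `appendDiagnostic("debug/api-diagnostics.jsonl", { timestamp: new Date().toISOString(), reason: "empty entries", shape })` would then add one line to the log.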
We prioritize a "Testing Trophy" architecture, emphasizing integration tests.
# Execute unit-level verifications
npm run test:unit
# Execute integration-level verifications
npm run test:integration
Measure RAG pipeline latency and validate the full retrieval stack against your actual export data.
npm run benchmark
Requires a built vector index and a running Ollama instance. The benchmark runs a set of predefined queries end-to-end through the full pipeline (HyDE → hybrid search → cross-encoder reranking → MapReduce → synthesis) and reports per-query latency and success rate. Edit BENCHMARK_QUERIES in src/benchmark.ts to tailor queries to your history.
BENCHMARKS.md: Full details on each benchmark, why the metrics were chosen, how to interpret results, and how to write effective custom queries.
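
For readers who want a feel for the two reported metrics, here is a minimal sketch of measuring per-query latency and success rate. It is not the code in src/benchmark.ts: the `runBenchmark` and `successRate` functions, the `QueryResult` shape, and the stub pipeline are all assumptions made for illustration.

```typescript
interface QueryResult {
  query: string;
  ok: boolean;
  latencyMs: number;
}

// Run each query in sequence, timing it and recording whether it threw.
async function runBenchmark(
  queries: string[],
  runQuery: (query: string) => Promise<unknown>,
): Promise<QueryResult[]> {
  const results: QueryResult[] = [];
  for (const query of queries) {
    const start = Date.now();
    let ok = true;
    try {
      await runQuery(query);
    } catch {
      ok = false; // a thrown error counts as a failed query
    }
    results.push({ query, ok, latencyMs: Date.now() - start });
  }
  return results;
}

// Fraction of queries that completed without error (0 for an empty run).
function successRate(results: QueryResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.ok).length / results.length;
}

// Stub pipeline standing in for the real one: fails on an empty query.
async function stubPipeline(query: string): Promise<string> {
  if (query.trim() === "") throw new Error("empty query");
  return "answer for: " + query;
}

runBenchmark(["What did I learn about React hooks?", ""], stubPipeline).then((results) => {
  for (const r of results) {
    console.log(`${r.ok ? "ok  " : "fail"} ${r.latencyMs} ms  ${JSON.stringify(r.query)}`);
  }
  console.log("success rate:", successRate(results)); // success rate: 0.5
});
```

Queries run sequentially in this sketch so that one query's latency is not distorted by another competing for the same local model.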