RAG PDF System (Queue + Worker + Qdrant + Gemini)

This is a Next.js project bootstrapped with create-next-app.

Getting Started

First, run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev

Open http://localhost:3000 with your browser to see the result.

You can start editing the page by modifying app/page.tsx. The page auto-updates as you edit the file.

This project uses next/font to automatically optimize and load Geist, a new font family for Vercel.

RAG PDF System (Queue + Worker + Qdrant + Gemini)

A scalable Retrieval-Augmented Generation (RAG) system for processing and querying PDF documents using asynchronous workers, vector search, and LLMs.

🧠 Overview

This system enables users to:

Upload PDF documents
Process them asynchronously (non-blocking)
Generate embeddings and store them in a vector database
Query documents using semantic search + LLM reasoning

The architecture is designed to be modular, scalable, and production-ready.

🏗️ Architecture

Client (Next.js)
      ↓
Express API (Upload / Query)
      ↓
BullMQ Queue (Redis)
      ↓
Worker (PDF → chunk → embedding)
      ↓
Qdrant (Vector Database)
      ↓
Retriever → LLM (Gemini)
      ↓
Response

⚙️ Tech Stack

Layer	Technology
Backend	Node.js, Express
Queue	BullMQ + Redis
Vector DB	Qdrant (Cloud / Local)
Embeddings	Google Generative AI
LLM	Gemini Flash
Frontend	Next.js

🔑 Core Design Decisions

1. Asynchronous Processing via Queue

PDF ingestion is CPU and I/O intensive. Offloading it to a worker:

prevents API blocking
enables horizontal scaling
allows retry/failure handling

2. Worker Isolation

The worker runs in a separate process:

avoids memory contention
isolates failures
allows independent scaling

3. Vector Database (Qdrant)

A dedicated vector database is used instead of in-memory storage:

persistent storage
multi-process accessibility
efficient similarity search (HNSW indexing)

4. Collection-per-Document Strategy

Each uploaded document is stored in its own collection:

const collectionName = job.id;

Benefits:

document isolation
simplified retrieval
easier lifecycle management

📡 API Endpoints

Upload PDF

POST /upload

Response

{
  "jobId": "123"
}

Check Job Status

GET /status/:id

Response

{
  "status": "completed",
  "progress": 100,
  "result": {
    "collectionName": "pdf_123"
  }
}

Query Document

POST /api/query/:id

Body

{
  "query": "Explain the document"
}

Response

{
  "answer": "Generated response...",
  "chunksUsed": 3
}

⚙️ Setup

1. Clone repository

git clone https://github.com/AB-stack-cmd/Rag-Application.git
cd Rag-Application

2. Install dependencies

npm install

3. Configure environment

Create a .env file:

GOOGLE_API_KEY=your_google_api_key
QDRANT_URL=https://your-cluster.qdrant.tech
QDRANT_API_KEY=your_qdrant_api_key
REDIS_HOST=127.0.0.1
REDIS_PORT=6379

4. Start Redis and QdrantDB

docker run -p 6379:6379 redis
docker run -p 6333:6333 qdrant/qdrant

5. Start services

# API server
node app/server/server.js

# Worker (separate terminal)
node app/server/worker.js

npm run dev
---

## 🧪 Testing

### Upload

```bash
curl -X POST http://localhost:4000/upload \
  -F "pdf=@file.pdf"

Query

curl -X POST http://localhost:4000/api/query/1 \
  -H "Content-Type: application/json" \
  -d '{"query":"Summarize the document"}'

⚠️ Common Failure Modes

Embedding dimension mismatch

Ensure correct model:

model: "models/embedding-001"

Empty retrieval results

Check:

worker completed successfully
collection exists in Qdrant
embeddings are non-empty

JSON parsing errors (Windows)

Use proper escaping in PowerShell when using curl.

🔐 Security Considerations

Validate file type and size during upload
Sanitize file paths
Do not expose API keys to client
Add rate limiting for query endpoints
Consider authentication for multi-user systems

🚀 Future Improvements

Streaming responses (Server-Sent Events)
Multi-document querying
User-based document isolation
Hybrid search (vector + keyword)
Embedding caching to reduce cost
UI support for source highlighting

🧠 Key Takeaways

This system demonstrates:

asynchronous job processing
vector-based information retrieval
LLM integration in a production pattern

It is a solid foundation for building AI-powered document systems at scale.

📄 License

No License

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
.vscode		.vscode
__pycache__		__pycache__
app		app
public		public
test		test
.dockerignore		.dockerignore
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
compose.debug.yaml		compose.debug.yaml
compose.yaml		compose.yaml
docker-compose.yml		docker-compose.yml
eslint.config.mjs		eslint.config.mjs
next.config.ts		next.config.ts
package-lock.json		package-lock.json
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
postcss.config.mjs		postcss.config.mjs
tailwind.config.js		tailwind.config.js
tsconfig.json		tsconfig.json

Folders and files

Latest commit

History

Repository files navigation

Getting Started

RAG PDF System (Queue + Worker + Qdrant + Gemini)

🧠 Overview

🏗️ Architecture

⚙️ Tech Stack

🔑 Core Design Decisions

1. Asynchronous Processing via Queue

2. Worker Isolation

3. Vector Database (Qdrant)

4. Collection-per-Document Strategy

📡 API Endpoints

Upload PDF

Check Job Status

Query Document

⚙️ Setup

1. Clone repository

2. Install dependencies

3. Configure environment

4. Start Redis and QdrantDB

5. Start services

Query

⚠️ Common Failure Modes

Embedding dimension mismatch

Empty retrieval results

JSON parsing errors (Windows)

🔐 Security Considerations

🚀 Future Improvements

🧠 Key Takeaways

📄 License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages