This is a Next.js project bootstrapped with `create-next-app`.

First, run the development server:

    npm run dev
    # or
    yarn dev
    # or
    pnpm dev
    # or
    bun dev

Open http://localhost:3000 with your browser to see the result.

You can start editing the page by modifying `app/page.tsx`. The page auto-updates as you edit the file.

This project uses `next/font` to automatically optimize and load Geist, a new font family for Vercel.

A scalable Retrieval-Augmented Generation (RAG) system for processing and querying PDF documents using asynchronous workers, vector search, and LLMs.
This system enables users to:
- Upload PDF documents
- Process them asynchronously (non-blocking)
- Generate embeddings and store them in a vector database
- Query documents using semantic search + LLM reasoning

The architecture is designed to be modular, scalable, and production-ready.

    Client (Next.js)
        ↓
    Express API (Upload / Query)
        ↓
    BullMQ Queue (Redis)
        ↓
    Worker (PDF → chunk → embedding)
        ↓
    Qdrant (Vector Database)
        ↓
    Retriever → LLM (Gemini)
        ↓
    Response

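
The worker's chunking step determines what each embedding represents. This README does not specify a chunking strategy, so the following is only a sketch of one common approach: fixed-size character windows with overlap. `chunkText`, `chunkSize`, and `overlap` are hypothetical names and values, not taken from the project.

```typescript
// Hypothetical helper: split extracted PDF text into overlapping chunks.
// chunkSize and overlap are illustrative defaults, not project settings.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Example: 10 characters, window of 4, overlap of 1.
console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of embedding some text twice.
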
| Layer | Technology |
|---|---|
| Backend | Node.js, Express |
| Queue | BullMQ + Redis |
| Vector DB | Qdrant (Cloud / Local) |
| Embeddings | Google Generative AI |
| LLM | Gemini Flash |
| Frontend | Next.js |

PDF ingestion is CPU- and I/O-intensive. Offloading it to a worker:
- prevents API blocking
- enables horizontal scaling
- allows retry/failure handling
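
In the real system, retries are handled by the queue (BullMQ). To illustrate the idea without depending on Redis, here is a minimal stand-in. `runWithRetry`, `backoffDelay`, `maxAttempts`, and `baseDelayMs` are hypothetical names, and the exponential backoff schedule is an assumption rather than the project's configuration.

```typescript
// Hypothetical stand-in for queue-level retries; not the project's worker code.
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before the retry that follows failed attempt `attempt` (1-based):
// base, 2 * base, 4 * base, ...
function backoffDelay(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Run `job` up to maxAttempts times, sleeping between failed attempts.
// Rethrows the last error if every attempt fails.
async function runWithRetry<T>(
  job: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
  sleep: Sleep = defaultSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await sleep(backoffDelay(attempt, baseDelayMs));
      }
    }
  }
  throw lastError;
}
```

With BullMQ itself, the equivalent behaviour is normally requested through the `attempts` and `backoff` job options when a job is added, rather than written by hand.
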

The worker runs in a separate process, which:
- avoids memory contention
- isolates failures
- allows independent scaling

A dedicated vector database is used instead of in-memory storage. It provides:
- persistent storage
- multi-process accessibility
- efficient similarity search (HNSW indexing)
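
To make "similarity search" concrete, the sketch below ranks stored vectors by cosine similarity using an exhaustive scan. This is not how Qdrant works internally: an HNSW index exists precisely to avoid comparing the query against every vector. `StoredChunk`, `cosineSimilarity`, and `topK` are hypothetical names, and cosine distance is an assumed metric.

```typescript
// Hypothetical in-memory record; the real system stores vectors in Qdrant.
interface StoredChunk {
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as dissimilar
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every chunk, sort descending, keep the first k.
function topK(query: number[], chunks: StoredChunk[], k: number): StoredChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

const demoStore: StoredChunk[] = [
  { text: "invoice totals", vector: [1, 0] },
  { text: "shipping policy", vector: [0, 1] },
  { text: "payment terms", vector: [0.9, 0.1] },
];
console.log(topK([1, 0], demoStore, 2).map((c) => c.text));
// [ 'invoice totals', 'payment terms' ]
```
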

Each uploaded document is stored in its own collection:

    const collectionName = job.id;

Benefits:
- document isolation
- simplified retrieval
- easier lifecycle management
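
The example status response elsewhere in this README shows a collection named `pdf_123`, which suggests a prefix may be applied to the job ID in practice. The helper below is a sketch of such a naming scheme. The `pdf_` prefix, the function name `collectionNameForJob`, and the allowed-character rule are all assumptions, not confirmed project behaviour.

```typescript
// Hypothetical helper: derive a per-document collection name from a job ID.
// The "pdf_" prefix is inferred from the example status response, not confirmed.
function collectionNameForJob(jobId: string | number): string {
  const id = String(jobId);
  // Reject IDs containing characters that could be unsafe in a collection name.
  if (!/^[A-Za-z0-9_-]+$/.test(id)) {
    throw new Error(`unsafe job id: ${id}`);
  }
  return `pdf_${id}`;
}

console.log(collectionNameForJob(123)); // pdf_123
```

Keeping the mapping in one function means the upload, worker, and query paths cannot drift apart on how a document's collection is named.
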

`POST /upload`

Response

    {
      "jobId": "123"
    }

`GET /status/:id`

Response

    {
      "status": "completed",
      "progress": 100,
      "result": {
        "collectionName": "pdf_123"
      }
    }

`POST /api/query/:id`

Body

    {
      "query": "Explain the document"
    }

Response

    {
      "answer": "Generated response...",
      "chunksUsed": 3
    }

Clone the repository:

    git clone https://github.com/AB-stack-cmd/Rag-Application.git
    cd Rag-Application

Install dependencies:

    npm install

Create a `.env` file:

    GOOGLE_API_KEY=your_google_api_key
    QDRANT_URL=https://your-cluster.qdrant.tech
    QDRANT_API_KEY=your_qdrant_api_key
    REDIS_HOST=127.0.0.1
    REDIS_PORT=6379

Start Redis and Qdrant:

    docker run -p 6379:6379 redis
    docker run -p 6333:6333 qdrant/qdrant

Start the backend:

    # API server
    node app/server/server.js
    # Worker (separate terminal)
    node app/server/worker.js

Start the frontend:

    npm run dev

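
Once the services are running, the status and query endpoints can be exercised from a script. The following is a sketch only: `waitForJob` and `askAboutDocument` are hypothetical helpers, the base URL assumes the API listens on port 4000 as in the curl examples in this README, and the `"failed"` status value is an assumption (only `"completed"` appears in the example response). It also assumes a runtime with a global `fetch`, such as Node 18 or later.

```typescript
// Shape of the status payload, based on the example response in this README.
interface JobStatus {
  status: string;
  progress?: number;
  result?: { collectionName: string };
}

const realSleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the job reports "completed"; give up on "failed" or after maxPolls.
async function waitForJob(
  getStatus: () => Promise<JobStatus>,
  intervalMs = 1000,
  maxPolls = 60,
  sleep: (ms: number) => Promise<void> = realSleep,
): Promise<JobStatus> {
  for (let i = 0; i < maxPolls; i++) {
    const status = await getStatus();
    if (status.status === "completed") return status;
    if (status.status === "failed") throw new Error("job failed");
    await sleep(intervalMs);
  }
  throw new Error("timed out waiting for job");
}

// Wait for processing to finish, then ask a question about the document.
async function askAboutDocument(jobId: string, query: string): Promise<string> {
  const base = "http://localhost:4000"; // assumed API address
  await waitForJob(async () => {
    const res = await fetch(`${base}/status/${jobId}`);
    return (await res.json()) as JobStatus;
  });
  const res = await fetch(`${base}/api/query/${jobId}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  const data = (await res.json()) as { answer: string; chunksUsed: number };
  return data.answer;
}
```

Passing the status fetcher and the sleep function as parameters keeps the polling loop testable without a running server.
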
---
## 🧪 Testing
### Upload
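```bash
curl -X POST http://localhost:4000/upload \
  -F "pdf=@file.pdf"
```

### Query

```bash
curl -X POST http://localhost:4000/api/query/1 \
  -H "Content-Type: application/json" \
  -d '{"query":"Summarize the document"}'
```

If embedding requests fail, ensure the correct model is configured: `model: "models/embedding-001"`.

If a query returns no useful results, check that:
- the worker completed successfully
- the collection exists in Qdrant
- the embeddings are non-empty

In PowerShell, adapt the quoting when using curl: the `\` line continuations above are Bash syntax (PowerShell uses a backtick), and the double quotes inside the JSON body may need escaping.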

Security considerations:
- Validate file type and size during upload
- Sanitize file paths
- Do not expose API keys to the client
- Add rate limiting for query endpoints
- Consider authentication for multi-user systems
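
The first two items can be sketched as plain functions that run before a file is accepted. Everything here is illustrative: `validateUpload`, `sanitizeFileName`, the `UploadMeta` shape, and the 10 MiB limit are assumptions, since this README does not state the project's actual limits or upload handling.

```typescript
// Hypothetical limit; the README does not state the project's actual value.
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // 10 MiB, an assumed cap

interface UploadMeta {
  originalName: string;
  mimeType: string;
  sizeBytes: number;
}

// Returns an error message, or null when the upload looks acceptable.
function validateUpload(file: UploadMeta): string | null {
  if (file.mimeType !== "application/pdf") return "only PDF files are accepted";
  if (!file.originalName.toLowerCase().endsWith(".pdf")) return "file name must end in .pdf";
  if (file.sizeBytes <= 0) return "file is empty";
  if (file.sizeBytes > MAX_UPLOAD_BYTES) return "file is too large";
  return null;
}

// Reduce a client-supplied name to a safe base name: drop any directory part,
// replace characters outside a conservative allow-list, strip leading dots.
function sanitizeFileName(name: string): string {
  const base = name.split(/[\\/]/).pop() ?? "";
  const cleaned = base.replace(/[^A-Za-z0-9._-]/g, "_").replace(/^\.+/, "");
  return cleaned.length > 0 ? cleaned : "upload.pdf";
}

console.log(sanitizeFileName("..\\secret report.pdf")); // secret_report.pdf
```

The MIME type and file name are both supplied by the client, so a stricter check could also confirm that the file content begins with the `%PDF-` signature.
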

Possible future improvements:
- Streaming responses (Server-Sent Events)
- Multi-document querying
- User-based document isolation
- Hybrid search (vector + keyword)
- Embedding caching to reduce cost
- UI support for source highlighting

This system demonstrates:
- asynchronous job processing
- vector-based information retrieval
- LLM integration in a production pattern

It is a solid foundation for building AI-powered document systems at scale.

No License