Grabit Engine

A simple, plugin-based engine for scraping media streams and subtitles. Load provider plugins from GitHub, local files, or directly in code — with health tracking, auto-updates, caching, and more built right in. Works in Node.js, browsers, React and React Native.

📑 Table of Contents

Features
Installation
Quick Start
Provider Sources
Creating a Provider Plugin
Bundling Providers
Testing Providers
API Reference
Configuration
Metrics & Health Monitoring
Examples
React Hook (useSources)
Testing
License

✨ Features

Core

🔌 Plugin system — add or remove providers anytime
🌍 Runs anywhere — Node.js, browsers, React Native
🎯 Pick a provider — scrape from one specific provider by its scheme
⚡ Run in parallel — scrape from multiple providers at the same time
🏁 Stop early — quit as soon as enough providers have responded
⏱️ Timeouts — never wait forever for a slow provider

Reliability

📊 Health tracking — see how each provider is doing (errors, successes)
🔴 Auto-disable — bad providers get turned off on their own
🔄 Auto-update — remote providers refresh themselves on a timer
💾 Built-in cache — save results in memory so you don't repeat work
🔁 Retries — automatically retry failed providers
✅ Validation — checks that plugins are set up correctly before loading

📦 Installation

npm install grabit-engine

Optional: Puppeteer support (Node.js only)

npm install puppeteer-real-browser

Puppeteer is an optional peer dependency for providers that need headless browser automation.

Optional: base64 polyfill (React Native)

npm install base-64

React Native versions below 0.74 do not expose atob / btoa as globals. This library automatically polyfills them when it detects they are missing, using the base-64 package as an optional peer dependency.

If you are targeting React Native, install base-64 alongside this package. On Node.js and modern browsers the built-in atob / btoa are used and no extra package is needed.
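The detection step described above can be pictured roughly as follows. This is a sketch under assumptions, not the library's actual code: `installBase64`, `Base64Impl`, and `Base64Globals` are hypothetical names, and the target object is passed in explicitly so the example runs in any runtime.

```typescript
// Hypothetical shape of an encode/decode pair, such as the one a base64 package exports.
type Base64Impl = {
	encode: (input: string) => string;
	decode: (input: string) => string;
};

// The two globals that may or may not exist in the current runtime.
type Base64Globals = {
	atob?: (data: string) => string;
	btoa?: (data: string) => string;
};

// Install atob / btoa on `target` only when they are missing.
// Returns the names that were actually polyfilled.
function installBase64(target: Base64Globals, impl: Base64Impl): string[] {
	const installed: string[] = [];
	if (typeof target.atob !== "function") {
		target.atob = impl.decode;
		installed.push("atob");
	}
	if (typeof target.btoa !== "function") {
		target.btoa = impl.encode;
		installed.push("btoa");
	}
	return installed;
}
```

A runtime that already provides both functions is left untouched, which matches the behaviour described for Node.js and modern browsers.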

🚀 Quick Start

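import { ScrapePluginManager } from "grabit-engine";
import myProviderModule from "./providers/my-provider"; // your own provider module (path is illustrative)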

// Create the manager with a registry source (simplest approach)
const manager = await ScrapePluginManager.create({
	source: {
		type: "registry",
		name: "my-providers",
		providers: {
			"my-provider": myProviderModule
		}
	},
	tmdbApiKeys: ["your-tmdb-api-key"]
});

// Scrape streams for a movie — minimal: only tmdbId is required!
// TMDB service auto-fills title, year, duration, imdbId, etc.
// Or provide full media data — TMDB only fills what's missing
const streams = await manager.getStreams({
	media: {
		type: "movie",
		tmdbId: "27205"
		// imdbId: "tt1375666"
		// title: "Inception",
		// duration: 148,
		// releaseYear: 2010,
	},
	targetLanguageISO: "en"
});

// Scrape from a specific provider by scheme.
// `request` is a request object with the same shape as the getStreams() argument above.
const targeted = await manager.getStreamsByScheme("my-provider", request);
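The "TMDB only fills what's missing" rule from the comments above can be illustrated with a small merge function. This is a sketch, not the engine's implementation: `MovieFields` and `fillMissing` are hypothetical names, and the field list is limited to the ones shown in the Quick Start.

```typescript
// Hypothetical subset of the media fields shown in the Quick Start.
type MovieFields = {
	tmdbId: string;
	imdbId?: string;
	title?: string;
	duration?: number;
	releaseYear?: number;
};

// Caller-supplied values win; looked-up values only fill fields left undefined.
function fillMissing(provided: MovieFields, lookedUp: Partial<MovieFields>): MovieFields {
	const merged: MovieFields = { ...provided };
	for (const key of Object.keys(lookedUp) as (keyof MovieFields)[]) {
		if (merged[key] === undefined && lookedUp[key] !== undefined) {
			(merged as Record<string, unknown>)[key] = lookedUp[key];
		}
	}
	return merged;
}

const merged = fillMissing(
	{ tmdbId: "27205", title: "Inception" },
	{ title: "Title from lookup", releaseYear: 2010, duration: 148 }
);
console.log(merged.title); // "Inception" (the caller's value is kept)
console.log(merged.releaseYear); // 2010 (filled from the lookup)
```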

🔗 Provider Sources

The manager can load plugins from three places:

Source	Runtime	Description	Auto-Update
`github`	All	Download providers from a GitHub repo	✅
`local`	All	Load providers from files on your machine	❌
`registry`	All	Pass provider modules directly in code — no file I/O needed	❌

GitHub Source

const manager = await ScrapePluginManager.create({
	source: {
		type: "github",
		url: "https://github.com/your-org/your-providers",
		branch: "main",
		rootDir: "dist", // optional, subdirectory containing manifest.json and providers (default: repo root)
		token: process.env.GITHUB_TOKEN, // optional, for private repos
		// Required in browser / React Native:
		moduleResolver: async (scheme, sourceCode) => {
			const exports = {};
			const module = { exports };
			new Function("module", "exports", sourceCode)(module, exports);
			return (module.exports as any).default ?? module.exports;
		}
	},
	tmdbApiKeys: ["your-tmdb-api-key"] // required, see Configuration
});
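To see what the moduleResolver above does in isolation, here is a minimal, self-contained sketch. It assumes the fetched source text is CommonJS-style (it assigns to `module.exports`); `resolveFromSource` and the sample source string are hypothetical and only serve the illustration.

```typescript
// Evaluate CommonJS-style source text and return its default export,
// falling back to module.exports when there is no `default` property.
function resolveFromSource(sourceCode: string): unknown {
	const exports: Record<string, unknown> = {};
	const module = { exports };
	// The source runs with `module` and `exports` as its only injected bindings.
	new Function("module", "exports", sourceCode)(module, exports);
	return (module.exports as { default?: unknown }).default ?? module.exports;
}

// Hypothetical bundle text; a real one would be downloaded from the repository.
const sample = `module.exports = { default: { scheme: "demo-provider" } };`;
const resolved = resolveFromSource(sample) as { scheme: string };
console.log(resolved.scheme); // "demo-provider"
```

Because this relies on the `Function` constructor, it will not work in environments that forbid dynamic code evaluation, for example pages served with a Content Security Policy that lacks `unsafe-eval`.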

Repository structure

Your GitHub repo must contain a manifest.json. By default it's expected at the repo root, but you can set rootDir to point to a subdirectory:

your-providers/              # rootDir not set (default: repo root)
├── manifest.json
└── providers/
    ├── example-provider/
    │   └── index.js
    └── another-provider/
        └── index.js

your-providers/              # rootDir: "dist"
├── dist/
│   ├── manifest.json
│   └── providers/
│       ├── example-provider/
│       │   └── index.js
│       └── another-provider/
│           └── index.js
└── src/
    └── ...

manifest.json

{
	"name": "my-providers",
	"author": "your-name",
	"providers": {
		"example-provider": {
			"name": "ExampleProvider",
			"version": "1.0.0",
			"active": true,
			"language": "en",
			"type": "media",
			"env": "universal",
			"supportedMediaTypes": ["movie", "serie"],
			"priority": 10,
			"dir": "providers"
		}
	}
}
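The manager validates plugins before loading them. As a rough illustration of what such a check could look like for one manifest entry, here is a sketch. `checkManifestEntry` is a hypothetical helper; treating every field of the example above as required is an assumption, and the allowed values for `type` and `env` are deliberately not checked because they are not listed here.

```typescript
// Hypothetical checker for one manifest entry; field names come from the example above.
function checkManifestEntry(scheme: string, entry: Record<string, unknown>): string[] {
	const problems: string[] = [];

	for (const key of ["name", "version", "type", "env", "dir"]) {
		const value = entry[key];
		if (typeof value !== "string" || value.length === 0) {
			problems.push(`${scheme}: "${key}" must be a non-empty string`);
		}
	}
	if (typeof entry["active"] !== "boolean") problems.push(`${scheme}: "active" must be a boolean`);
	if (typeof entry["priority"] !== "number") problems.push(`${scheme}: "priority" must be a number`);

	// "language" may be a single string or a non-empty array of strings.
	const language = entry["language"];
	const languageOk =
		typeof language === "string" ||
		(Array.isArray(language) && language.length > 0 && language.every((l) => typeof l === "string"));
	if (!languageOk) problems.push(`${scheme}: "language" must be a string or an array of strings`);

	const mediaTypes = entry["supportedMediaTypes"];
	if (!Array.isArray(mediaTypes) || mediaTypes.length === 0) {
		problems.push(`${scheme}: "supportedMediaTypes" must be a non-empty array`);
	}
	return problems;
}
```

Returning a list of problems instead of throwing lets a caller decide whether to skip the plugin or fail hard, similar in spirit to the `strict` option.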

Local Source

const manager = await ScrapePluginManager.create({
	source: {
		type: "local",
		manifest: require("./manifest.json"),
		rootDir: "./providers",
		resolve: (path) => require(path)
	},
	tmdbApiKeys: ["your-tmdb-api-key"] // required, see Configuration
});

Registry Source

import exampleProvider from "./providers/example-provider";

const manager = await ScrapePluginManager.create({
	source: {
		type: "registry",
		name: "my-providers",
		providers: {
			"example-provider": exampleProvider
		}
	},
	tmdbApiKeys: ["your-tmdb-api-key"] // required, see Configuration
});

🔧 Creating a Provider Plugin

The fastest way to create a new provider is with the built-in CLI:

npx create-provider my-cool-provider

You can specify the language(s) upfront with --lang. Pass a comma-separated list for multiple languages:

# Single language (default: "en")
npx create-provider my-cool-provider --lang fr

# Multiple languages
npx create-provider my-cool-provider --lang en,fr,es

If no scheme is provided, the CLI enters interactive mode and prompts you for it:

npx create-provider

Once your provider is ready, bundle it for distribution with npx bundle-provider — see Bundling Providers for all available flags (--src, --out, --dry-run, --clean).

The create-provider command generates a ready-to-edit folder:

providers/
└── my-cool-provider/
    ├── index.ts      ← entry point (exports the module)
    ├── config.ts     ← provider settings (URL, endpoints, etc.)
    ├── stream.ts     ← stream scraping logic
    └── subtitle.ts   ← subtitle scraping logic (optional)

You can also create the files by hand. Here's what each file looks like:

`config.ts` — Provider Configuration

import { ProviderConfig } from "grabit-engine";

export const config: ProviderConfig = {
	scheme: "example-provider",
	name: "ExampleProvider",
	language: "en", // or ["en", "fr"] for multi-language providers
	baseUrl: "https://example-streams.com",
	entries: {
		movie: { endpoint: "/embed/movie?tmdb={id:string}" },
		serie: { endpoint: "/embed/tv?tmdb={id:string}&season={season:1}&episode={episode:1}" }
	},
	mediaIds: ["tmdb", "imdb"]
};

`stream.ts` — Stream Handler

import { ScrapeRequester, InternalMediaSource, ProviderContext } from "grabit-engine";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";

export async function getStreams(requester: ScrapeRequester, ctx: ProviderContext): Promise<InternalMediaSource[]> {
	const provider = Provider.create(config);
	const url = provider.createResourceURL(requester);

	ctx.log.info(`Fetching streams from ${url.href}`);

	const { $, response } = await ctx.cheerio.load(url, requester, ctx.xhr);
	const src = $("video > source").attr("src");

	if (!src) return [];

	return [
		{
			fileName: "video.mp4",
			format: "mp4",
			language: "en",
			playlist: src,
			xhr: { haveCorsPolicy: false, headers: {} }
		}
	];
}

`subtitle.ts` — Subtitle Handler

import { ScrapeRequester, InternalSubtitleSource, ProviderContext } from "grabit-engine";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";

export async function getSubtitles(requester: ScrapeRequester, ctx: ProviderContext): Promise<InternalSubtitleSource[]> {
	const provider = Provider.create(config);
	const url = provider.createResourceURL(requester);

	ctx.log.info(`Fetching subtitles from ${url.href}`);

	const apiUrl = new URL(`/api/subtitles?id=${url.searchParams.get("tmdb")}`, url.origin);
	const response = await ctx.xhr.fetch(apiUrl, {}, requester);
	const data = await response.json();

	return data.map((sub: any) => ({
		fileName: "subtitles.srt",
		format: "srt" as const,
		language: sub.language,
		languageName: sub.languageName,
		url: sub.url,
		xhr: { haveCorsPolicy: false, headers: {} }
	}));
}

`index.ts` — Entry Point

import { defineProviderModule } from "grabit-engine/controllers/provider";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";
import { getStreams } from "./stream";
import { getSubtitles } from "./subtitle";

const provider = Provider.create(config);

export default defineProviderModule(
	provider,
	{
		name: config.name,
		version: "1.0.0",
		active: true,
		env: "universal",
		type: "media",
		supportedMediaTypes: ["movie", "serie"],
		priority: 10,
		dir: "providers"
	},
	{ getStreams, getSubtitles }
);

Multi-Language Providers

The language field on both ProviderConfig and ProviderModuleManifest accepts a single string or an array of strings. This lets you declare that a provider serves content in multiple languages.

CLI

# Single language (default)
npx create-provider my-provider --lang en

# Multiple languages
npx create-provider my-provider --lang en,fr,es

Config

// Single language
export const config: ProviderConfig = {
	scheme: "single-lang",
	name: "SingleLang",
	language: "en"
	// ...
};

// Multi-language
export const config: ProviderConfig = {
	scheme: "multi-lang",
	name: "MultiLang",
	language: ["en", "fr", "es"]
	// ...
};

Manifest (`manifest.json`)

{
	"providers": {
		"my-provider": {
			"name": "MyProvider",
			"version": "1.0.0",
			"active": true,
			"language": ["en", "fr", "es"]
			// ...
		}
	}
}

When the manager sorts providers for a request, providers whose language field includes the requester's targetLanguageISO are prioritized higher.
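A minimal sketch of that ordering, assuming language matches come first and ties are broken by the manifest `priority` with higher numbers first. The tie-break direction and the names `ProviderInfo`, `matchesLanguage`, and `sortForRequest` are assumptions made for the illustration, not the engine's actual code.

```typescript
type ProviderInfo = { scheme: string; language: string | string[]; priority: number };

// True when the provider declares the requested language, either as a string or inside an array.
function matchesLanguage(provider: ProviderInfo, iso: string): boolean {
	return Array.isArray(provider.language)
		? provider.language.indexOf(iso) !== -1
		: provider.language === iso;
}

// Language matches first; ties broken by priority (assumed here: higher number first).
function sortForRequest(providers: ProviderInfo[], targetLanguageISO: string): ProviderInfo[] {
	return [...providers].sort((a, b) => {
		const byLanguage =
			Number(matchesLanguage(b, targetLanguageISO)) - Number(matchesLanguage(a, targetLanguageISO));
		return byLanguage !== 0 ? byLanguage : b.priority - a.priority;
	});
}

const ordered = sortForRequest(
	[
		{ scheme: "single-lang", language: "en", priority: 5 },
		{ scheme: "other-lang", language: ["fr", "es"], priority: 20 },
		{ scheme: "multi-lang", language: ["en", "fr"], priority: 10 }
	],
	"en"
);
console.log(ordered.map((p) => p.scheme)); // multi-lang, single-lang, other-lang
```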

Bundling Providers

When providers are loaded from GitHub (via GithubService), each provider is fetched as a single index.js file and loaded via dynamic import() in an isolated temp directory. That directory has no node_modules and no sibling files — so relative imports (./config) and package imports (grabit-engine) would fail.

The bundler solves this by compiling each provider into a standalone, self-contained ES module with zero external imports.

Install esbuild

npm install --save-dev esbuild

Bundle all providers

npx bundle-provider

Bundle a specific provider

npx bundle-provider my-cool-provider

Folder structure

Providers can be organized flat or grouped inside subdirectories:

providers/
├── english/                    ← group folder (no index.ts)
│   ├── vidsrc/                 ← provider → scheme "english/vidsrc"
│   │   ├── index.ts
│   │   ├── config.ts
│   │   ├── stream.ts
│   │   └── subtitle.ts
│   └── another/                ← provider → scheme "english/another"
│       └── index.ts ...
├── loodvidrsc/                 ← provider → scheme "loodvidrsc"
│   ├── index.ts
│   └── ...
└── manifest.json

The bundler recursively walks the source directory. Folders with index.ts are providers; folders without are group organizers.

For grouped providers, pass the full relative path:

npx bundle-provider english/vidsrc
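The provider/group rule can be sketched with an in-memory folder tree instead of real file-system calls. This is an illustration only: `Folder` and `findProviderSchemes` are hypothetical, and the sketch assumes the walk does not descend into a folder once it has been identified as a provider.

```typescript
// A folder maps entry names to sub-folders; plain files are represented by null.
type Folder = { [name: string]: Folder | null };

// Collect scheme paths: a folder with index.ts is a provider, any other folder is a group.
function findProviderSchemes(dir: Folder, prefix = ""): string[] {
	const schemes: string[] = [];
	for (const name of Object.keys(dir).sort()) {
		const child = dir[name];
		if (child === null) continue; // plain file, nothing to walk
		const path = prefix ? `${prefix}/${name}` : name;
		if ("index.ts" in child) {
			schemes.push(path); // provider folder
		} else {
			schemes.push(...findProviderSchemes(child, path)); // group folder
		}
	}
	return schemes;
}

const tree: Folder = {
	english: {
		vidsrc: { "index.ts": null, "config.ts": null, "stream.ts": null, "subtitle.ts": null },
		another: { "index.ts": null }
	},
	loodvidrsc: { "index.ts": null },
	"manifest.json": null
};
console.log(findProviderSchemes(tree)); // ["english/another", "english/vidsrc", "loodvidrsc"]
```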

Custom source & output directories

By default, providers are read from providers/ and bundles are written next to the source. You can change both:

# Custom source directory
npx bundle-provider --src ./my-providers

# Custom output directory (mirrors the folder structure)
npx bundle-provider --out ./dist/providers

# Both
npx bundle-provider --src ./my-providers --out ./dist/providers

With --out ./dist/providers, the output becomes:

dist/providers/
├── english/vidsrc/index.js     ← standalone bundle
├── loodvidrsc/index.js         ← standalone bundle
└── ...

What the bundle contains

Each bundled index.js inlines everything it needs:

Your provider's config, stream, and subtitle logic
Runtime code from grabit-engine (Provider, defineProviderModule, etc.)
Manifest data from manifest.json

Tree-shaking keeps bundles small (~5–15 KB). The output has zero import statements.

CLI reference

Command	Description
`npx bundle-provider`	Bundle all providers
`npx bundle-provider <scheme>`	Bundle one provider (e.g. `vidsrc` or `english/vidsrc`)
`npx bundle-provider --src <dir>`	Custom source directory
`npx bundle-provider --out <dir>`	Custom output directory
`npx bundle-provider --dry-run`	Preview without writing
`npx bundle-provider --clean`	Remove all generated bundles

Tip: After editing any provider source files, always re-bundle before pushing to GitHub.

See scripts/BUNDLING.md for the full bundling guide.

🧪 Testing Providers

Once you have written a provider, use the test-provider CLI tool to verify it scrapes correctly against real media data — without writing any test files or setting up a manager.

# Test a movie — minimal (TMDB fills title, year, duration, etc.)
npx test-provider --scheme my-provider --type movie --tmdb 27205

# Test a movie — full (all data provided, TMDB only fills gaps)
npx test-provider --scheme my-provider --type movie \
  --title "Inception" --year 2010 --tmdb 27205 --duration 148

# Test a series episode — minimal
npx test-provider --scheme my-provider --type serie \
  --tmdb 1396 --season 1 --episode 1

# Test a series episode — full
npx test-provider --scheme my-provider --type serie \
  --title "Breaking Bad" --year 2008 --tmdb 1396 \
  --season 1 --episode 1 --ep-tmdb 349232

# Test both streams and subtitles
npx test-provider --scheme my-provider --mode both --type movie --tmdb 27205

# Load media from a JSON file
npx test-provider --scheme my-provider --media-file ./test-media.json

The tool auto-bundles TypeScript source via esbuild if no pre-built index.js is present, fetches missing media data from TMDB automatically, runs the scrape with a configurable timeout, and prints a formatted report with a PASS / EMPTY / FAIL verdict.

See /TESTING.md for the full guide — all flags, output format, media file examples, and tips.

📖 API Reference

Full API documentation has been moved to API_REFERENCE.md for better readability.

It covers: ScrapePluginManager, ScrapeRequester, ProviderModuleManifest, ProviderMetrics & ProviderHealthReport, ProviderContext, ProviderFetchOptions, Media Input Types, Output Types, Provider Configuration, the Provider class, Error Classes, Utility Functions, and Services.

⚙️ Configuration

Option	Type	Default	Description
`source`	`GithubSource | LocalSource | RegistrySource`	—	Required. Where to load your plugins from.
`debug`	`boolean`	`false`	Turn on detailed logging.
`strict`	`boolean`	`false`	Throw errors for bad plugins instead of just skipping them.
`autoUpdateIntervalMinutes`	`number`	`15`	How often to refresh remote providers (min: 5).
`cache.enabled`	`boolean`	`false`	Turn on result caching.
`cache.TTL`	`number`	`0`	How long to keep cached results (in ms).
`cache.MODULE_TTL`	`number`	`900000`	How long to keep loaded provider modules in cache (15 min).
`cache.TMDB_TTL`	`number`	`0`	How long to cache TMDB API responses (in ms). Helps avoid hitting the TMDB API too hard. Set to e.g. `3600000` (1 hour) to cache responses.
`cache.maxEntries`	`number`	`10000`	Maximum number of entries in the in-memory cache. Oldest entries are evicted when the limit is reached (LRU).
`tmdbApiKeys`	`string[]`	—	Required. Array of TMDB API keys. A random key is selected for each request to distribute load.
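To make the interplay of a TTL and a `maxEntries` limit concrete, here is a small in-memory cache sketch with least-recently-used eviction. It is not the engine's cache: `TtlLruCache` is a hypothetical class, and the injectable clock exists only so the behaviour can be demonstrated deterministically.

```typescript
class TtlLruCache<V> {
	private entries = new Map<string, { value: V; expiresAt: number }>();
	private ttlMs: number;
	private maxEntries: number;
	private now: () => number;

	constructor(ttlMs: number, maxEntries: number, now: () => number = () => Date.now()) {
		this.ttlMs = ttlMs;
		this.maxEntries = maxEntries;
		this.now = now;
	}

	get(key: string): V | undefined {
		const hit = this.entries.get(key);
		if (!hit) return undefined;
		if (hit.expiresAt <= this.now()) {
			this.entries.delete(key); // expired
			return undefined;
		}
		// Re-insert so this key becomes the most recently used.
		this.entries.delete(key);
		this.entries.set(key, hit);
		return hit.value;
	}

	set(key: string, value: V): void {
		this.entries.delete(key);
		if (this.entries.size >= this.maxEntries) {
			// A Map iterates in insertion order, so the first key is the least recently used.
			const oldest = this.entries.keys().next().value;
			if (oldest !== undefined) this.entries.delete(oldest);
		}
		this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
	}

	get size(): number {
		return this.entries.size;
	}
}
```

With `new TtlLruCache<string>(300_000, 5_000)`, an entry would live for five minutes unless 5,000 newer or more recently read entries push it out first.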

Scrape Configuration

Option	Type	Default	Description
`scrapeConfig.concurrentOperations`	`number`	`5`	How many providers can run at the same time.
`scrapeConfig.maxAttempts`	`number`	`1`	How many times to retry a failing provider.
`scrapeConfig.operationTimeout`	`number`	`15000`	Max time before giving up on a scrape (15 sec).
`scrapeConfig.successQuorum`	`number`	`undefined`	Stop once this many providers have succeeded.
`scrapeConfig.errorThresholdRate`	`number`	`0.7`	Error rate that triggers auto-disable (70%).
`scrapeConfig.minOperationsForEvaluation`	`number`	`10`	How many scrapes before checking if a provider is healthy.

📊 Metrics & Health Monitoring

The manager keeps track of how each provider is doing and can automatically turn off unhealthy ones:

// Raw metrics map
const metrics = manager.getMetrics();
for (const [scheme, m] of metrics) {
	console.log(`${scheme}: ${m.successes} ok, ${m.errors} err`);
}

// Detailed health report
const report = manager.getMetricsReport();
report.forEach((r) => {
	console.log(`${r.moduleName}: ${r.totalOperations} ops, ` + `${(r.errorRate * 100).toFixed(1)}% errors, ` + `active=${r.active}`);
});

A provider is turned off once its error rate exceeds errorThresholdRate (default 0.7) after at least minOperationsForEvaluation scrapes (default 10). A disabled provider is not used again until the manager is reloaded.
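The auto-disable rule can be written down as a small predicate. This is a sketch using the `successes` and `errors` counters from the metrics example above; `shouldDisable` and `HealthConfig` are hypothetical names, and the exact comparison operators (strictly greater than the threshold, at least the minimum number of operations) are assumptions.

```typescript
type Metrics = { successes: number; errors: number };
type HealthConfig = { errorThresholdRate: number; minOperationsForEvaluation: number };

// Decide whether a provider should be disabled based on its counters.
function shouldDisable(metrics: Metrics, config: HealthConfig): boolean {
	const total = metrics.successes + metrics.errors;
	if (total < config.minOperationsForEvaluation) return false; // not enough data yet
	return metrics.errors / total > config.errorThresholdRate;
}

const defaults: HealthConfig = { errorThresholdRate: 0.7, minOperationsForEvaluation: 10 };
console.log(shouldDisable({ successes: 2, errors: 8 }, defaults)); // true (80% errors)
console.log(shouldDisable({ successes: 0, errors: 9 }, defaults)); // false (only 9 operations)
```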

📝 Examples

React Native with GitHub source

import { ScrapePluginManager } from "grabit-engine";

const manager = await ScrapePluginManager.create({
	source: {
		type: "github",
		url: "your-org/providers-repo",
		branch: "main",
		rootDir: "dist", // optional
		moduleResolver: async (_scheme, sourceCode) => {
			const exports: Record<string, unknown> = {};
			const module = { exports };
			new Function("module", "exports", sourceCode)(module, exports);
			return (module.exports as any).default ?? module.exports;
		}
	},
	tmdbApiKeys: ["your-tmdb-api-key"],
	scrapeConfig: {
		concurrentOperations: 3,
		successQuorum: 2,
		operationTimeout: 15000
	}
});

// Minimal request — just tmdbId, TMDB fills the rest
const streams = await manager.getStreams({
	media: { type: "movie", tmdbId: "27205" },
	targetLanguageISO: "en"
});

Node.js with local providers

import { ScrapePluginManager } from "grabit-engine";
import manifest from "./providers/manifest.json";

const manager = await ScrapePluginManager.create({
	source: {
		type: "local",
		manifest,
		rootDir: "./providers",
		resolve: (path) => require(path)
	},
	tmdbApiKeys: ["your-tmdb-api-key"],
	debug: true,
	cache: {
		enabled: true,
		TTL: 300_000,
		TMDB_TTL: 3_600_000, // Cache TMDB responses for 1 hour
		maxEntries: 5_000
	},
	scrapeConfig: {
		maxAttempts: 3,
		errorThresholdRate: 0.5
	}
});

Targeted scraping by scheme

// Only scrape from a specific provider (`request` has the same shape as the getStreams() argument)
const streams = await manager.getStreamsByScheme("example-provider", request);
const subs = await manager.getSubtitlesByScheme("subtitle-provider", request);

⚛️ React Hook (`useSources`)

An optional React hook for declarative scraping inside React / React Native components. It requires react >= 17, which is declared as an optional peer dependency, so non-React consumers are unaffected.

npm install react   # if not already installed

Basic Usage

import { useSources } from "grabit-engine";

function StreamList() {
	const { mediaSources, subtitleSources, isLoading, isManagerReady, error, scrape, clearSources } = useSources({
		managerConfig: {
			source: {
				type: "registry",
				name: "my-providers",
				providers: {
					/* ... */
				}
			},
			tmdbApiKeys: ["your-tmdb-api-key"]
		},
		type: "both"
	});

	const handleScrape = () => {
		scrape({
			media: { type: "movie", tmdbId: "27205" },
			targetLanguageISO: "en"
		});
	};

	return (
		<div>
			<button onClick={handleScrape} disabled={!isManagerReady || isLoading}>
				{isLoading ? "Scraping…" : "Scrape"}
			</button>
			{error && <p>Error: {error.message}</p>}
			<h3>Media ({mediaSources.length})</h3>
			<ul>
				{mediaSources.map((s) => (
					<li key={`${s.scheme}-${s.providerName}-${s.fileName}`}>{s.fileName}</li>
				))}
			</ul>
			<h3>Subtitles ({subtitleSources.length})</h3>
			<ul>
				{subtitleSources.map((s) => (
					<li key={`${s.scheme}-${s.providerName}-${s.fileName}`}>{s.fileName}</li>
				))}
			</ul>
		</div>
	);
}

Continuous Mode

When continuous: true, calling scrape() ignores scrapeConfig.successQuorum and streams results per-provider as they arrive — the list grows live instead of waiting for all providers to finish.

const { mediaSources, isContinuousScraping, scrape, stopContinuousScraping } = useSources({
	managerConfig: {
		/* ... */
	},
	continuous: true,
	type: "media"
});

// Start scraping — results appear one by one
scrape({ media: { type: "serie", tmdbId: "1396", ep_tmdbId: "62085", season: 1, episode: 1 }, targetLanguageISO: "en" });

// Cancel early — already-collected sources are kept
stopContinuousScraping();

Config (`UseSourcesConfig`)

Property	Type	Default	Description
`managerConfig`	`ProviderManagerConfig`	—	Configuration for the `ScrapePluginManager` singleton.
`continuous`	`boolean`	`false`	Stream results per-provider as they arrive (ignores `successQuorum`).
`type`	`"media" | "subtitle" | "both"`	`"both"`	Which source category to fetch.

Return Value (`UseSourcesReturn`)

Property	Type	Description
`mediaSources`	`MediaSource[]`	Collected media sources (de-duplicated).
`subtitleSources`	`SubtitleSource[]`	Collected subtitle sources (de-duplicated).
`isLoading`	`boolean`	`true` while manager is initialising or a scrape is in-flight.
`isManagerReady`	`boolean`	`true` once the manager singleton is created.
`isContinuousScraping`	`boolean`	`true` while a continuous scrape is still resolving providers.
`error`	`ProcessError | HttpError | null`	The last error from init or scraping.
`scrape(requester)`	`(req: RawScrapeRequester) => Promise<void>`	Start a scrape. Clears previous sources.
`stopContinuousScraping()`	`() => void`	Cancel in-flight continuous scrape. Keeps collected sources.
`clearSources()`	`() => void`	Clear all collected sources.
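The de-duplication mentioned for `mediaSources` and `subtitleSources` could work along these lines when results arrive provider by provider. This is a sketch only: `mergeSources` is hypothetical, and using `scheme`, `providerName`, and `fileName` as the identity key is an assumption borrowed from the list keys in the Basic Usage example, not a documented rule.

```typescript
type Source = { scheme: string; providerName: string; fileName: string };

// Assumed identity key for a source.
const keyOf = (s: Source): string => `${s.scheme}-${s.providerName}-${s.fileName}`;

// Append only those incoming sources whose key has not been seen yet.
function mergeSources<T extends Source>(current: T[], incoming: T[]): T[] {
	const seen = new Set(current.map(keyOf));
	const merged = [...current];
	for (const source of incoming) {
		const key = keyOf(source);
		if (!seen.has(key)) {
			seen.add(key);
			merged.push(source);
		}
	}
	return merged;
}
```

Returning a new array rather than mutating `current` fits React state updates, where a changed reference is what triggers a re-render.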

Lifecycle

Mount — The manager singleton is created asynchronously.
scrape(requester) — Clears previous sources, then fetches. In continuous mode results stream in; in normal mode they arrive all at once.
New scrape() call — Cancels any in-flight operations, clears sources, starts fresh.
stopContinuousScraping() — Cancels remaining queued provider operations. Already-collected results are kept.
Unmount — All operations are cancelled and the manager is destroyed automatically.

🧪 Testing

# Run all tests
npm test

# Run specific test suites
npx jest tests/models/manager/ --verbose      # Manager unit tests
npx jest tests/models/sources/ --verbose      # Source integration tests

# With coverage
npx jest --coverage
