A simple, plugin-based engine for scraping media streams and subtitles. Load provider plugins from GitHub, local files, or directly in code — with health tracking, auto-updates, caching, and more built right in. Works in Node.js, browsers, React and React Native.
- Features
- Installation
- Quick Start
- Provider Sources
- Creating a Provider Plugin
- Bundling Providers
- Testing Providers
- API Reference
- Configuration
- Metrics & Health Monitoring
- Examples
- React Hook (useSources)
- Testing
- License
- 🔌 Plugin system — add or remove providers anytime
- 🌍 Runs anywhere — Node.js, browsers, React Native
- 🎯 Pick a provider — scrape from one specific provider by its scheme
- ⚡ Run in parallel — scrape from multiple providers at the same time
- 🏁 Stop early — quit as soon as enough providers have responded
- ⏱️ Timeouts — never wait forever for a slow provider
- 📊 Health tracking — see how each provider is doing (errors, successes)
- 🔴 Auto-disable — bad providers get turned off on their own
- 🔄 Auto-update — remote providers refresh themselves on a timer
- 💾 Built-in cache — save results in memory so you don't repeat work
- 🔁 Retries — automatically retry failed providers
- ✅ Validation — checks that plugins are set up correctly before loading
npm install grabit-engine
Optional: Puppeteer support (Node.js only)
npm install puppeteer-real-browser
Puppeteer is an optional peer dependency for providers that need headless browser automation.
Optional: base64 polyfill (React Native)
npm install base-64
React Native versions below 0.74 do not expose atob / btoa as globals. This library automatically polyfills them when it detects they are missing, using the base-64 package as an optional peer dependency.
If you are targeting React Native, install base-64 alongside this package. On Node.js and modern browsers the built-in atob / btoa are used and no extra package is needed.
import { ScrapePluginManager } from "grabit-engine";
// Create the manager with a registry source (simplest approach)
const manager = await ScrapePluginManager.create({
source: {
type: "registry",
name: "my-providers",
providers: {
"my-provider": myProviderModule
}
},
tmdbApiKeys: ["your-tmdb-api-key"]
});
// Scrape streams for a movie — minimal: only tmdbId is required!
// TMDB service auto-fills title, year, duration, imdbId, etc.
// Or provide full media data — TMDB only fills what's missing
const streams = await manager.getStreams({
media: {
type: "movie",
tmdbId: "27205"
// imdbId: "tt1375666"
// title: "Inception",
// duration: 148,
// releaseYear: 2010,
},
targetLanguageISO: "en"
});
// Scrape from a specific provider by scheme
const targeted = await manager.getStreamsByScheme("my-provider", request);
The manager can load plugins from three places:
| Source | Runtime | Description | Auto-Update |
|---|---|---|---|
| github | All | Download providers from a GitHub repo | ✅ |
| local | All | Load providers from files on your machine | ❌ |
| registry | All | Pass provider modules directly in code, no file I/O needed | ❌ |
const manager = await ScrapePluginManager.create({
source: {
type: "github",
url: "https://github.com/your-org/your-providers",
branch: "main",
rootDir: "dist", // optional, subdirectory containing manifest.json and providers (default: repo root)
token: process.env.GITHUB_TOKEN, // optional, for private repos
// Required in browser / React Native:
moduleResolver: async (scheme, sourceCode) => {
const exports = {};
const module = { exports };
new Function("module", "exports", sourceCode)(module, exports);
return (module.exports as any).default ?? module.exports;
}
}
});
Repository structure
Your GitHub repo must contain a manifest.json. By default it's expected at the repo root, but you can set rootDir to point to a subdirectory:
your-providers/              # rootDir not set (default: repo root)
├── manifest.json
└── providers/
    ├── example-provider/
    │   └── index.js
    └── another-provider/
        └── index.js

your-providers/              # rootDir: "dist"
├── dist/
│   ├── manifest.json
│   └── providers/
│       ├── example-provider/
│       │   └── index.js
│       └── another-provider/
│           └── index.js
└── src/
    └── ...
manifest.json
{
"name": "my-providers",
"author": "your-name",
"providers": {
"example-provider": {
"name": "ExampleProvider",
"version": "1.0.0",
"active": true,
"language": "en",
"type": "media",
"env": "universal",
"supportedMediaTypes": ["movie", "serie"],
"priority": 10,
"dir": "providers"
}
}
}
const manager = await ScrapePluginManager.create({
source: {
type: "local",
manifest: require("./manifest.json"),
rootDir: "./providers",
resolve: (path) => require(path)
}
});
import exampleProvider from "./providers/example-provider";
const manager = await ScrapePluginManager.create({
source: {
type: "registry",
name: "my-providers",
providers: {
"example-provider": exampleProvider
}
}
});
The fastest way to create a new provider is with the built-in CLI:
npx create-provider my-cool-provider
You can specify the language(s) upfront with --lang. Pass a comma-separated list for multiple languages:
# Single language (default: "en")
npx create-provider my-cool-provider --lang fr
# Multiple languages
npx create-provider my-cool-provider --lang en,fr,es
If no scheme is provided, the CLI enters interactive mode and prompts you for it:
npx create-provider
Once your provider is ready, bundle it for distribution with npx bundle-provider (see Bundling Providers for all available flags: --src, --out, --dry-run, --clean).
The create-provider command generates a ready-to-edit folder:
providers/
└── my-cool-provider/
    ├── index.ts ← entry point (exports the module)
    ├── config.ts ← provider settings (URL, endpoints, etc.)
    ├── stream.ts ← stream scraping logic
    └── subtitle.ts ← subtitle scraping logic (optional)
You can also create the files by hand. Here's what each file looks like:
import { ProviderConfig } from "grabit-engine";
export const config: ProviderConfig = {
scheme: "example-provider",
name: "ExampleProvider",
language: "en", // or ["en", "fr"] for multi-language providers
baseUrl: "https://example-streams.com",
entries: {
movie: { endpoint: "/embed/movie?tmdb={id:string}" },
serie: { endpoint: "/embed/tv?tmdb={id:string}&season={season:1}&episode={episode:1}" }
},
mediaIds: ["tmdb", "imdb"]
};
import { ScrapeRequester, InternalMediaSource, ProviderContext } from "grabit-engine";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";
export async function getStreams(requester: ScrapeRequester, ctx: ProviderContext): Promise<InternalMediaSource[]> {
const provider = Provider.create(config);
const url = provider.createResourceURL(requester);
ctx.log.info(`Fetching streams from ${url.href}`);
const { $, response } = await ctx.cheerio.load(url, requester, ctx.xhr);
const src = $("video > source").attr("src");
if (!src) return [];
return [
{
fileName: "video.mp4",
format: "mp4",
language: "en",
playlist: src,
xhr: { haveCorsPolicy: false, headers: {} }
}
];
}
import { ScrapeRequester, InternalSubtitleSource, ProviderContext } from "grabit-engine";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";
export async function getSubtitles(requester: ScrapeRequester, ctx: ProviderContext): Promise<InternalSubtitleSource[]> {
const provider = Provider.create(config);
const url = provider.createResourceURL(requester);
ctx.log.info(`Fetching subtitles from ${url.href}`);
const apiUrl = new URL(`/api/subtitles?id=${url.searchParams.get("tmdb")}`, url.origin);
const response = await ctx.xhr.fetch(apiUrl, {}, requester);
const data = await response.json();
return data.map((sub: any) => ({
fileName: "subtitles.srt",
format: "srt" as const,
language: sub.language,
languageName: sub.languageName,
url: sub.url,
xhr: { haveCorsPolicy: false, headers: {} }
}));
}
import { defineProviderModule } from "grabit-engine/controllers/provider";
import { Provider } from "grabit-engine/models/provider";
import { config } from "./config";
import { getStreams } from "./stream";
import { getSubtitles } from "./subtitle";
const provider = Provider.create(config);
export default defineProviderModule(
provider,
{
name: config.name,
version: "1.0.0",
active: true,
env: "universal",
type: "media",
supportedMediaTypes: ["movie", "serie"],
priority: 10,
dir: "providers"
},
{ getStreams, getSubtitles }
);
The language field on both ProviderConfig and ProviderModuleManifest accepts a single string or an array of strings. This lets you declare that a provider serves content in multiple languages.
# Single language (default)
npx create-provider my-provider --lang en
# Multiple languages
npx create-provider my-provider --lang en,fr,es
// Single language
export const config: ProviderConfig = {
scheme: "single-lang",
name: "SingleLang",
language: "en"
// ...
};
// Multi-language
export const config: ProviderConfig = {
scheme: "multi-lang",
name: "MultiLang",
language: ["en", "fr", "es"]
// ...
};
When the manager sorts providers for a request, providers whose language field includes the requester's targetLanguageISO are given higher priority.
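The language rule can be illustrated with a small sort. This is a minimal sketch, not the manager's actual ordering logic: ProviderInfo and sortByLanguage are hypothetical names, and the sketch assumes providers that tie on language keep their original order.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface ProviderInfo {
  scheme: string;
  language: string | string[];
}

// True when the provider declares the requested language.
function servesLanguage(provider: ProviderInfo, targetLanguageISO: string): boolean {
  const languages = Array.isArray(provider.language) ? provider.language : [provider.language];
  return languages.includes(targetLanguageISO);
}

// Providers matching the target language come first. Array.prototype.sort is
// stable, so providers that tie keep their original relative order (assumption).
function sortByLanguage(providers: ProviderInfo[], targetLanguageISO: string): ProviderInfo[] {
  return [...providers].sort((a, b) => {
    const aMatch = servesLanguage(a, targetLanguageISO) ? 1 : 0;
    const bMatch = servesLanguage(b, targetLanguageISO) ? 1 : 0;
    return bMatch - aMatch;
  });
}

const ranked = sortByLanguage(
  [
    { scheme: "single-lang-fr", language: "fr" },
    { scheme: "multi-lang", language: ["en", "fr", "es"] },
    { scheme: "single-lang", language: "en" }
  ],
  "en"
);
console.log(ranked.map((p) => p.scheme));
// → ["multi-lang", "single-lang", "single-lang-fr"]
```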
When providers are loaded from GitHub (via GithubService), each provider is fetched as a single index.js file and loaded via dynamic import() in an isolated temp directory. That directory has no node_modules and no sibling files — so relative imports (./config) and package imports (grabit-engine) would fail.
The bundler solves this by compiling each provider into a standalone, self-contained ES module with zero external imports.
npm install --save-dev esbuild
# Bundle all providers
npx bundle-provider
# Bundle one provider
npx bundle-provider my-cool-provider
Providers can be organized flat or grouped inside subdirectories:
providers/
├── english/ ← group folder (no index.ts)
│   ├── vidsrc/ ← provider → scheme "english/vidsrc"
│   │   ├── index.ts
│   │   ├── config.ts
│   │   ├── stream.ts
│   │   └── subtitle.ts
│   └── another/ ← provider → scheme "english/another"
│       └── index.ts ...
├── loodvidrsc/ ← provider → scheme "loodvidrsc"
│   ├── index.ts
│   └── ...
└── manifest.json
The bundler recursively walks the source directory. Folders with index.ts are providers; folders without are group organizers.
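The discovery rule can be sketched as a recursive walk. To stay runnable without touching the file system, this sketch models the directory as an in-memory tree; DirTree and findProviderSchemes are hypothetical names, and it assumes the walk does not descend into a folder once it has been identified as a provider.

```typescript
// In-memory stand-in for a directory: files map to null, folders to nested trees.
type DirTree = { [name: string]: DirTree | null };

// Returns the scheme of every provider folder, relative to the root.
function findProviderSchemes(dir: DirTree, prefix = ""): string[] {
  const schemes: string[] = [];
  for (const [name, child] of Object.entries(dir)) {
    if (child === null) continue; // plain file, nothing to walk
    const scheme = prefix ? `${prefix}/${name}` : name;
    if ("index.ts" in child) {
      schemes.push(scheme); // folder with index.ts: a provider
    } else {
      schemes.push(...findProviderSchemes(child, scheme)); // group folder: keep walking
    }
  }
  return schemes;
}

// Same layout as the tree above.
const providersDir: DirTree = {
  english: {
    vidsrc: { "index.ts": null, "config.ts": null, "stream.ts": null, "subtitle.ts": null },
    another: { "index.ts": null }
  },
  loodvidrsc: { "index.ts": null },
  "manifest.json": null
};

console.log(findProviderSchemes(providersDir));
// → ["english/vidsrc", "english/another", "loodvidrsc"]
```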
For grouped providers, pass the full relative path:
npx bundle-provider english/vidsrc
By default, providers are read from providers/ and bundles are written next to the source. You can change both:
# Custom source directory
npx bundle-provider --src ./my-providers
# Custom output directory (mirrors the folder structure)
npx bundle-provider --out ./dist/providers
# Both
npx bundle-provider --src ./my-providers --out ./dist/providers
With --out ./dist/providers, the output becomes:
dist/providers/
├── english/vidsrc/index.js ← standalone bundle
├── loodvidrsc/index.js ← standalone bundle
└── ...
Each bundled index.js inlines everything it needs:
- Your provider's config, stream, and subtitle logic
- Runtime code from grabit-engine (Provider, defineProviderModule, etc.)
- Manifest data from manifest.json
Tree-shaking keeps bundles small (~5–15 KB). The output has zero import statements.
| Command | Description |
|---|---|
| npx bundle-provider | Bundle all providers |
| npx bundle-provider <scheme> | Bundle one provider (e.g. vidsrc or english/vidsrc) |
| npx bundle-provider --src <dir> | Custom source directory |
| npx bundle-provider --out <dir> | Custom output directory |
| npx bundle-provider --dry-run | Preview without writing |
| npx bundle-provider --clean | Remove all generated bundles |
Tip: After editing any provider source files, always re-bundle before pushing to GitHub.
See scripts/BUNDLING.md for the full bundling guide.
Once you have written a provider, use the test-provider CLI tool to verify it scrapes correctly against real media data — without writing any test files or setting up a manager.
# Test a movie — minimal (TMDB fills title, year, duration, etc.)
npx test-provider --scheme my-provider --type movie --tmdb 27205
# Test a movie — full (all data provided, TMDB only fills gaps)
npx test-provider --scheme my-provider --type movie \
--title "Inception" --year 2010 --tmdb 27205 --duration 148
# Test a series episode — minimal
npx test-provider --scheme my-provider --type serie \
--tmdb 1396 --season 1 --episode 1
# Test a series episode — full
npx test-provider --scheme my-provider --type serie \
--title "Breaking Bad" --year 2008 --tmdb 1396 \
--season 1 --episode 1 --ep-tmdb 349232
# Test both streams and subtitles
npx test-provider --scheme my-provider --mode both --type movie --tmdb 27205
# Load media from a JSON file
npx test-provider --scheme my-provider --media-file ./test-media.json
The tool auto-bundles TypeScript source via esbuild if no pre-built index.js is present, fetches missing media data from TMDB automatically, runs the scrape with a configurable timeout, and prints a formatted report with a PASS / EMPTY / FAIL verdict.
See /TESTING.md for the full guide: all flags, output format, media file examples, and tips.
Full API documentation has been moved to API_REFERENCE.md for better readability.
It covers:
ScrapePluginManager, ScrapeRequester, ProviderModuleManifest, ProviderMetrics & ProviderHealthReport, ProviderContext, ProviderFetchOptions, Media Input Types, Output Types, Provider Configuration, the Provider class, Error Classes, Utility Functions, and Services.
| Option | Type | Default | Description |
|---|---|---|---|
| source | GithubSource \| LocalSource \| RegistrySource | - | Required. Where to load your plugins from. |
| debug | boolean | false | Turn on detailed logging. |
| strict | boolean | false | Throw errors for bad plugins instead of just skipping them. |
| autoUpdateIntervalMinutes | number | 15 | How often to refresh remote providers, in minutes (min: 5). |
| cache.enabled | boolean | false | Turn on result caching. |
| cache.TTL | number | 0 | How long to keep cached results (in ms). |
| cache.MODULE_TTL | number | 900000 | How long to keep loaded provider modules in cache, in ms (15 min). |
| cache.TMDB_TTL | number | 0 | How long to cache TMDB API responses (in ms). Helps avoid hitting the TMDB API too hard. Set to e.g. 3600000 (1 hour) to cache responses. |
| cache.maxEntries | number | 10000 | Maximum number of entries in the in-memory cache. Least recently used entries are evicted when the limit is reached (LRU). |
| tmdbApiKeys | string[] | - | Required. Array of TMDB API keys. A random key is selected for each request to distribute load. |
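To make the interplay of a TTL and maxEntries concrete, here is a minimal sketch of an in-memory cache with LRU eviction. It is an illustration only, not the engine's cache: LruTtlCache and its constructor parameters are hypothetical, and the injectable clock exists just to make expiry easy to demonstrate.

```typescript
// Hypothetical LRU cache with per-entry expiry; not the library's implementation.
class LruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries: number, ttlMs: number, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the least recently used.
      const leastRecent = this.entries.keys().next().value;
      if (leastRecent !== undefined) this.entries.delete(leastRecent);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

With maxEntries set to 2, writing a third key evicts whichever of the first two was read or written least recently.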
| Option | Type | Default | Description |
|---|---|---|---|
| scrapeConfig.concurrentOperations | number | 5 | How many providers can run at the same time. |
| scrapeConfig.maxAttempts | number | 1 | How many times to retry a failing provider. |
| scrapeConfig.operationTimeout | number | 15000 | Max time before giving up on a scrape, in ms (15 sec). |
| scrapeConfig.successQuorum | number | undefined | Stop once this many providers have succeeded. |
| scrapeConfig.errorThresholdRate | number | 0.7 | Error rate that triggers auto-disable (70%). |
| scrapeConfig.minOperationsForEvaluation | number | 10 | How many scrapes before checking if a provider is healthy. |
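How concurrentOperations, operationTimeout and successQuorum might interact can be sketched as a small worker pool. This is an illustration only, not the engine's scheduler: Task, withTimeout and runWithQuorum are hypothetical names, and letting in-flight tasks finish after the quorum is met is an assumption of the sketch.

```typescript
type Task<T> = () => Promise<T>;

interface RunOptions {
  concurrentOperations: number;
  operationTimeout: number; // milliseconds
  successQuorum?: number;
}

// Rejects if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Runs tasks with bounded concurrency and stops starting new ones
// once `successQuorum` tasks have succeeded.
async function runWithQuorum<T>(tasks: Task<T>[], opts: RunOptions): Promise<T[]> {
  const results: T[] = [];
  let next = 0;
  const quorumReached = (): boolean =>
    opts.successQuorum !== undefined && results.length >= opts.successQuorum;

  async function worker(): Promise<void> {
    while (next < tasks.length && !quorumReached()) {
      const task = tasks[next++];
      try {
        results.push(await withTimeout(task(), opts.operationTimeout));
      } catch {
        // A failed or timed-out task is skipped; the worker moves on.
      }
    }
  }

  const workerCount = Math.min(opts.concurrentOperations, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Each worker pulls the next task only while the quorum is unmet, so a low successQuorum shortens the run without cancelling work that has already started.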
The manager keeps track of how each provider is doing and can automatically turn off unhealthy ones:
// Raw metrics map
const metrics = manager.getMetrics();
for (const [scheme, m] of metrics) {
console.log(`${scheme}: ${m.successes} ok, ${m.errors} err`);
}
// Detailed health report
const report = manager.getMetricsReport();
report.forEach((r) => {
console.log(`${r.moduleName}: ${r.totalOperations} ops, ` + `${(r.errorRate * 100).toFixed(1)}% errors, ` + `active=${r.active}`);
});
Providers that fail too often (more than errorThresholdRate after minOperationsForEvaluation scrapes) get turned off and won't be used again until the manager is reloaded.
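The auto-disable rule can be written as a small predicate. This is a sketch of the rule as described here, not the manager's code: shouldDisable, Counters and HealthPolicy are hypothetical names, and the exact boundary handling (strictly greater than the rate, at least the minimum number of operations) is an assumption.

```typescript
// Hypothetical counters, mirroring the successes / errors fields of the metrics.
interface Counters {
  successes: number;
  errors: number;
}

interface HealthPolicy {
  errorThresholdRate: number;
  minOperationsForEvaluation: number;
}

function shouldDisable(metrics: Counters, policy: HealthPolicy): boolean {
  const total = metrics.successes + metrics.errors;
  // Too few operations to judge the provider yet.
  if (total < policy.minOperationsForEvaluation) return false;
  return metrics.errors / total > policy.errorThresholdRate;
}

// Using the default values from the configuration table.
const defaults: HealthPolicy = { errorThresholdRate: 0.7, minOperationsForEvaluation: 10 };
console.log(shouldDisable({ successes: 2, errors: 8 }, defaults)); // → true (80% errors over 10 operations)
console.log(shouldDisable({ successes: 0, errors: 9 }, defaults)); // → false (only 9 operations so far)
```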
React Native with GitHub source
import { ScrapePluginManager } from "grabit-engine";
const manager = await ScrapePluginManager.create({
source: {
type: "github",
url: "your-org/providers-repo",
branch: "main",
rootDir: "dist", // optional
moduleResolver: async (_scheme, sourceCode) => {
const exports: Record<string, unknown> = {};
const module = { exports };
new Function("module", "exports", sourceCode)(module, exports);
return (module.exports as any).default ?? module.exports;
}
},
tmdbApiKeys: ["your-tmdb-api-key"],
scrapeConfig: {
concurrentOperations: 3,
successQuorum: 2,
operationTimeout: 15000
}
});
// Minimal request — just tmdbId, TMDB fills the rest
const streams = await manager.getStreams({
media: { type: "movie", tmdbId: "27205" },
targetLanguageISO: "en"
});
Node.js with local providers
import { ScrapePluginManager } from "grabit-engine";
import manifest from "./providers/manifest.json";
const manager = await ScrapePluginManager.create({
source: {
type: "local",
manifest,
rootDir: "./providers",
resolve: (path) => require(path)
},
tmdbApiKeys: ["your-tmdb-api-key"],
debug: true,
cache: {
enabled: true,
TTL: 300_000,
TMDB_TTL: 3_600_000, // Cache TMDB responses for 1 hour
maxEntries: 5_000
},
scrapeConfig: {
maxAttempts: 3,
errorThresholdRate: 0.5
}
});
Targeted scraping by scheme
// Only scrape from a specific provider
const streams = await manager.getStreamsByScheme("example-provider", request);
const subs = await manager.getSubtitlesByScheme("subtitle-provider", request);
An optional React hook for declarative scraping inside React / React Native components. It requires react >= 17 as a peer dependency; the dependency is optional, so non-React consumers are unaffected.
npm install react # if not already installed
import { useSources } from "grabit-engine";
function StreamList() {
const { mediaSources, subtitleSources, isLoading, isManagerReady, error, scrape, clearSources } = useSources({
managerConfig: {
source: {
type: "registry",
name: "my-providers",
providers: {
/* ... */
}
},
tmdbApiKeys: ["your-tmdb-api-key"]
},
type: "both"
});
const handleScrape = () => {
scrape({
media: { type: "movie", tmdbId: "27205" },
targetLanguageISO: "en"
});
};
return (
<div>
<button onClick={handleScrape} disabled={!isManagerReady || isLoading}>
{isLoading ? "Scraping…" : "Scrape"}
</button>
{error && <p>Error: {error.message}</p>}
<h3>Media ({mediaSources.length})</h3>
<ul>
{mediaSources.map((s) => (
<li key={`${s.scheme}-${s.providerName}-${s.fileName}`}>{s.fileName}</li>
))}
</ul>
<h3>Subtitles ({subtitleSources.length})</h3>
<ul>
{subtitleSources.map((s) => (
<li key={`${s.scheme}-${s.providerName}-${s.fileName}`}>{s.fileName}</li>
))}
</ul>
</div>
);
}
When continuous is true, calling scrape() ignores scrapeConfig.successQuorum and streams results per provider as they arrive, so the list grows live instead of waiting for all providers to finish.
const { mediaSources, isContinuousScraping, scrape, stopContinuousScraping } = useSources({
managerConfig: {
/* ... */
},
continuous: true,
type: "media"
});
// Start scraping — results appear one by one
scrape({ media: { type: "serie", tmdbId: "1396", ep_tmdbId: "62085", season: 1, episode: 1 }, targetLanguageISO: "en" });
// Cancel early — already-collected sources are kept
stopContinuousScraping();
| Property | Type | Default | Description |
|---|---|---|---|
| managerConfig | ProviderManagerConfig | - | Configuration for the ScrapePluginManager singleton. |
| continuous | boolean | false | Stream results per-provider as they arrive (ignores successQuorum). |
| type | "media" \| "subtitle" \| "both" | "both" | Which source category to fetch. |
| Property | Type | Description |
|---|---|---|
| mediaSources | MediaSource[] | Collected media sources (de-duplicated). |
| subtitleSources | SubtitleSource[] | Collected subtitle sources (de-duplicated). |
| isLoading | boolean | true while manager is initialising or a scrape is in-flight. |
| isManagerReady | boolean | true once the manager singleton is created. |
| isContinuousScraping | boolean | true while a continuous scrape is still resolving providers. |
| error | ProcessError \| HttpError \| null | The last error from init or scraping. |
| scrape(requester) | (req: RawScrapeRequester) => Promise<void> | Start a scrape. Clears previous sources. |
| stopContinuousScraping() | () => void | Cancel in-flight continuous scrape. Keeps collected sources. |
| clearSources() | () => void | Clear all collected sources. |
- Mount: The manager singleton is created asynchronously.
- scrape(requester): Clears previous sources, then fetches. In continuous mode results stream in; in normal mode they arrive all at once.
- New scrape() call: Cancels any in-flight operations, clears sources, and starts fresh.
- stopContinuousScraping(): Cancels remaining queued provider operations. Already-collected results are kept.
- Unmount: All operations are cancelled and the manager is destroyed automatically.
# Run all tests
npm test
# Run specific test suites
npx jest tests/models/manager/ --verbose # Manager unit tests
npx jest tests/models/sources/ --verbose # Source integration tests
# With coverage
npx jest --coverage
ISC © grabit-engine
{
  "providers": {
    "my-provider": {
      "name": "MyProvider",
      "version": "1.0.0",
      "active": true,
      "language": ["en", "fr", "es"]
      // ...
    }
  }
}