chemotools is a Python library that brings chemometric preprocessing tools into the scikit-learn ecosystem.
It provides modular transformers for spectral data, designed to plug seamlessly into your ML workflows.
- Preprocessing for spectral data (baseline correction, smoothing, scaling, derivatization, scatter correction).
- Physical unit conversions (absorbance, transmittance, reflectance, Kubelka-Munk, pseudoabsorbance).
- Adaptation methods for calibration transfer between instruments (DS, PDS, x-axis interpolation).
- Fully compatible with scikit-learn pipelines and transformers.
- Simple, modular API for flexible workflows.
- Open-source, actively maintained, and published on PyPI and Conda.
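The unit conversions listed above follow standard spectroscopic relations. The sketch below writes them out in plain Python to show the arithmetic only. The function names are illustrative and are not the chemotools API, and inputs are assumed to be fractional values in (0, 1].

```python
import math


def transmittance_to_absorbance(t: float) -> float:
    """Absorbance A = -log10(T), with T a fractional transmittance in (0, 1]."""
    return -math.log10(t)


def absorbance_to_transmittance(a: float) -> float:
    """Inverse relation: T = 10 ** (-A)."""
    return 10.0 ** (-a)


def reflectance_to_pseudoabsorbance(r: float) -> float:
    """Pseudoabsorbance log10(1 / R), with R a fractional reflectance in (0, 1]."""
    return math.log10(1.0 / r)


def kubelka_munk(r: float) -> float:
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2 R) for diffuse reflectance R."""
    return (1.0 - r) ** 2 / (2.0 * r)
```

In practice the library's transformers would apply such conversions to whole spectra inside a pipeline; see the documentation for the actual class names.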
Install from PyPI:
pip install chemotools
Install from Conda:
conda install -c conda-forge chemotools
Example: preprocessing pipeline with scikit-learn:
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from chemotools.baseline import AirPls
from chemotools.scatter import MultiplicativeScatterCorrection
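preprocessing = make_pipeline(
    AirPls(),  # baseline correction
    MultiplicativeScatterCorrection(),  # scatter correction
    StandardScaler(with_std=False),  # mean-center only, no variance scaling
)
# spectra: 2D array of shape (n_samples, n_features)
spectra_transformed = preprocessing.fit_transform(spectra)
See the documentation for full details.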
This project uses uv for dependency management and Task to simplify common development workflows. You can get started quickly by using the predefined Taskfile, which provides handy shortcuts for common tasks:
task install # install all dependencies
task check # run formatting, linting, typing, and tests
task test # quick test run in the current environment
task coverage # run tests with coverage reporting
Run tests in your current environment:
task test # quick test
task test:quick # same as task test
Run tests across the compatibility matrix using nox:
task test:nox:list # list available nox sessions
task test:nox:core # core tests (no plotting/inspector)
task test:nox:full # full Python 3.10-3.14 matrix
task test:nox:min-sklearn # minimum scikit-learn compatibility tests
task test:nox:all # run all test matrices
Profile performance of estimators:
task benchmark:list # list available benchmarks
task benchmark:list:all # show detailed benchmark registry
task benchmark:run -- --estimator adaptation.direct_standardization
task benchmark:run -- --estimator baseline.air_pls --profile regular
task build # build the package
task docs:html # build English documentation
task docs:html-all # build all language variants
For more control, use nox directly:
uv run nox --list # show all available sessions
uv run nox -s tests-3.12 # run tests on a specific Python version
uv run nox -s tests-min-sklearn-3.12 # test minimum scikit-learn version
Contributions are welcome! Check out the contributing guide and the project board.
Released under the MIT License.
This project embraces software supply chain transparency by generating an SBOM (Software Bill of Materials) for all dependencies. SBOMs help organizations, including those in regulated industries, track open-source components, ensure compliance, and manage security risks.
The SBOM file is published as an asset attached to every release. It is generated with the CycloneDX SBOM generator for Python and can be visualized in tools like CycloneDX Sunshine.
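If you prefer to inspect an SBOM programmatically, a few lines of standard-library Python are enough. The sketch below assumes the JSON flavor of CycloneDX, which stores dependencies under a top-level `components` array. The helper name `list_components` and the inline example document (including its package versions) are hypothetical and do not reflect the contents of an actual release asset.

```python
import json


def list_components(sbom: dict) -> list[tuple[str, str]]:
    """Return (name, version) pairs from a CycloneDX JSON document."""
    if sbom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    return [
        (component["name"], component.get("version", "unknown"))
        for component in sbom.get("components", [])
    ]


# Hypothetical minimal SBOM for illustration; a real one would be read with
# json.load() from the file downloaded from the release page.
example = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "components": [
    {"type": "library", "name": "numpy", "version": "2.0.0"},
    {"type": "library", "name": "scikit-learn", "version": "1.5.0"}
  ]
}
""")

for name, version in list_components(example):
    print(f"{name}=={version}")
```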
