
Cross-subject transfer learning via CrossSubjectEvaluation (calibration_size) #1093

Open
bruAristimunha wants to merge 3 commits into NeuroTechX:develop from bruAristimunha:cross-subject-transfer-split


@bruAristimunha bruAristimunha commented Jun 19, 2026

Collaborator

Addresses #1077. The split-based alternative to the dedicated evaluation engine in #1091, following the direction discussed in #1077: target-aware transfer learning runs through the existing CrossSubjectEvaluation with a few lines of change — no new evaluation class, no separate transfer module.

Usage

CrossSubjectEvaluation(paradigm=..., datasets=..., calibration_size=0.2).process(pipelines)

What it adds

  • TransferSplitter(base_splitter, calibration_size) (splitters.py) — a generic wrapper that carves the first calibration_size fraction off the held-out group of any leave-one-group-out splitter, yielding (train, calibration, test). Gives subject-, session-, and dataset-transfer from one mechanism.
  • CrossSubjectEvaluation gains calibration_size (+ calibration_labeled); when > 0, _create_splitter wraps its CrossSubjectSplitter in TransferSplitter.
  • base.py consumes the split with train, *cal, test (parallel + serial paths). The held-out calibration slice is routed raw to the pipeline steps that request it via scikit-learn metadata routing (set_fit_request): subjects, X_target_unlabeled / X_target_labeled. Plain pipelines request nothing, so the routed metadata is an empty dict ({}) and their fit is unchanged.
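
The split geometry described in this list can be sketched in plain Python. This is an illustration only, not the PR's TransferSplitter: the helper names `leave_one_group_out` and `transfer_split` are hypothetical, flooring the calibration count is an assumption, and the sketch only accepts fractions below 1 so that the calibration and test slices stay disjoint.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_group_out(groups: Sequence[int]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, held_out) index lists, holding out one group per fold."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield train, test


def transfer_split(groups: Sequence[int], calibration_size: float = 0.0):
    """Hypothetical stand-in: carve the first fraction off each held-out group."""
    if not 0.0 <= calibration_size < 1.0:
        raise ValueError("calibration_size must be in [0, 1)")
    for train, held_out in leave_one_group_out(groups):
        if calibration_size == 0:
            # No calibration requested: the usual two-way split.
            yield train, held_out
        else:
            # Assumed rounding rule: floor of fraction * group size.
            n_calib = int(len(held_out) * calibration_size)
            yield train, held_out[:n_calib], held_out[n_calib:]


groups = [1] * 5 + [2] * 5  # two subjects, five trials each
folds = []
for split in transfer_split(groups, calibration_size=0.2):
    # Star-unpacking handles both (train, test) and (train, calib, test).
    train, *cal, test = split
    calib = cal[0] if cal else []
    folds.append((train, calib, test))
```

The star-unpacking is what lets one consumer loop accept both the two-way and the three-way split: `cal` is an empty list when no calibration slice is produced.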

Design notes

  • The calibration slice is passed raw; base.py does not transform it through the pipeline (no transform-through-steps). The transfer estimator owns the target representation. This is what keeps the change minimal.
  • Labeled vs unlabeled is just which kwarg carries the slice — a fit-time concern, not split geometry.
  • Metadata routing (SLEP006) replaces signature inspection for passing subjects/target data to estimators.
  • No trialwise predict wrapper (a no-op for inductive estimators); no new dependency.

Verification

Full evaluation + splitter test suites pass. Integration test (test_cross_subject_calibration_*): a target-aware step receives the routed subjects + non-empty X_target_unlabeled on every fold, and a plain pipeline runs untouched at calibration_size=0.5.

cc @toncho11 — concrete, minimal counter-proposal to #1091; an existing target-aware estimator (e.g. RPA) drops in by declaring set_fit_request(subjects=True, X_target_unlabeled=True, ...).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da2b30015c


Comment thread moabb/evaluations/transfer.py Outdated
Comment on lines +105 to +107
md["X_target_labeled" if labeled else "X_target_unlabeled"] = X[calib]
if labeled:
    md["y_target_labeled"] = y[calib]


P2 Badge Reject labeled 100% calibration to avoid leaking test labels

When calibration_size=1.0 is combined with labeled=True, TransferSplitter makes calib identical to test, so this block routes y[test] as y_target_labeled before the same trials are scored. Any target-aware estimator that requests y_target_labeled can train on the labels of every evaluated sample, making the reported score invalid; please disallow this combination or avoid passing labels when the calibration slice overlaps the test slice.
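
One way to express the guard suggested in this review comment is sketched below. The function names `validate_calibration` and `labels_are_safe` are hypothetical and do not come from the PR. The first rejects the labeled 100% combination up front, the second checks a concrete fold for overlap between the calibration and test indices.

```python
from typing import Sequence


def validate_calibration(calibration_size: float, labeled: bool) -> None:
    """Reject settings where labeled calibration would cover every test trial."""
    if not 0.0 <= calibration_size <= 1.0:
        raise ValueError("calibration_size must be in [0, 1]")
    if labeled and calibration_size >= 1.0:
        raise ValueError(
            "labeled calibration with calibration_size=1.0 would expose test labels"
        )


def labels_are_safe(calib: Sequence[int], test: Sequence[int]) -> bool:
    """Return True when no calibration index is also scored as a test index."""
    return not set(calib) & set(test)


validate_calibration(0.2, labeled=True)   # accepted
validate_calibration(1.0, labeled=False)  # accepted: no labels are routed

try:
    validate_calibration(1.0, labeled=True)
    rejected = False
except ValueError:
    rejected = True
```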

…aluation

Run target-aware transfer protocols through the existing CrossSubjectEvaluation,
instead of a dedicated evaluation engine or a separate transfer module.

- CrossSubjectEvaluation gains calibration_size (+ calibration_labeled): when
  > 0 it wraps its CrossSubjectSplitter in TransferSplitter, so each fold yields
  (train, calibration, test).
- base.py consumes the split with `train, *cal, test` (parallel + serial paths);
  the held-out calibration slice is routed RAW to the pipeline steps that
  request it via sklearn metadata routing (subjects, X_target_unlabeled /
  X_target_labeled). Plain pipelines request nothing and are unaffected -- the
  estimator owns the target representation, so there is no transform-through-steps.
- TransferSplitter is the single generic transfer splitter (subject / session /
  dataset).

Removes moabb/evaluations/transfer.py and CrossSubjectTransferSplitter.

Usage:
    CrossSubjectEvaluation(..., calibration_size=0.2).process(pipelines)

@bruAristimunha bruAristimunha force-pushed the cross-subject-transfer-split branch from da2b300 to 2b06658 June 19, 2026 20:24
@bruAristimunha bruAristimunha changed the title from "Transfer-learning cross-subject splits via TransferSplitter + metadata routing" to "Cross-subject transfer learning via CrossSubjectEvaluation (calibration_size)" Jun 19, 2026

- calibration_size / calibration_labeled now ride cv_kwargs instead of bespoke
  CrossSubjectEvaluation __init__ params: read from self.cv_kwargs and stripped
  before the inner CV. No __init__ override.
- Expose cv_class as a documented option (like WithinSessionEvaluation); it
  composes with calibration.
- base.py / serial evaluate() read calibration_labeled from cv_kwargs.

Numerically identical to CrossSubjectTargetAwareEvaluation.process() on
BNCI2014_004 (max abs score diff 0.0) when the target-aware estimator computes
covariances from the raw target itself and declares matching fit/transform
metadata requests.

Usage:
    CrossSubjectEvaluation(..., cv_kwargs={"calibration_size": 0.2}).process(pipelines)

Move the transfer calibration into the splitter so CrossSubjectEvaluation's
_create_splitter is the plain original (no .get / .pop / wrapper).

- CrossSubjectSplitter gains a calibration_size param: yields (train, calib,
  test) when > 0, otherwise the usual (train, test). Removes TransferSplitter.
- _resolve_cv now always merges self.cv_kwargs over the defaults (a latent fix),
  so calibration_size flows via cv_kwargs with the default cv_class too.
- Drop calibration_labeled: _evaluate_fold offers all transfer kwargs
  (subjects, X_target_unlabeled, X_target_labeled, y_target_labeled) and
  metadata routing (consumes) keeps only what the estimator requested, so the
  estimator's set_fit_request decides labeled vs unlabeled.

Numerically identical to CrossSubjectTargetAwareEvaluation.process() on
BNCI2014_004 (max abs score diff 0.0).

toncho11 commented Jun 20, 2026

Collaborator

I will need more time to analyze your code on Monday. Here are my initial comments:

  • Trialwise / one-shot cross-subject evaluation is the protocol I currently need for an important publication I am preparing. In fact, the additional modes were added to make the evaluation more general, but HOS_SOURCE_ONLY_TRIALWISE is the one I really need. Everyone who works on trialwise / one-shot evaluation will benefit from it. I would like to keep HOS_SOURCE_ONLY_TRIALWISE in the loop, because I would like to be able to say that my results were obtained with the latest MOABB version. I also want to make sure that when people use HOS_SOURCE_ONLY_TRIALWISE, they can only access a single trial per evaluation, with no doubt about that.

  • I think explicit modes are useful for standardizing comparisons in the community. If the calibration/adaptation fraction is only a free float, different papers may use different values such as 0.2, 0.3, 0.5, or 0.7, which makes results harder to compare. Having predefined modes such as 20%, 50% would make the evaluated protocol clearer and easier to benchmark consistently.

  • I have to study your set_fit_request(subjects=True, X_target_unlabeled=True, ...) more closely, but I have a concern about the representation of X_target_unlabeled. If the calibration slice is passed raw and is not transformed through the previous pipeline steps, then an existing RPA step placed after Covariances() would not directly drop in. In that pipeline, RPA would receive source covariance matrices as X, but raw target epochs as X_target_unlabeled.

For example:

Covariances()
RPA()
TangentSpace()
Classifier()

would result in the RPA transformer receiving:

X                  = source covariance matrices
X_target_unlabeled = raw target epochs

I am not sure how this can be handled.

As a reminder, we can also set up a Teams meeting. Some of these details will be easier to discuss in a call.
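
The representation mismatch raised in this comment can be made concrete with array shapes. The commit notes in this PR state that results match when the target-aware estimator computes covariances from the raw target itself, and the sketch below illustrates that one option. It is an assumption-laden illustration, not RPA: `TargetAwareStep` and `trial_covariances` are hypothetical names, and plain sample covariance via NumPy stands in for a dedicated covariance estimator.

```python
import numpy as np


def trial_covariances(epochs: np.ndarray) -> np.ndarray:
    """Map raw epochs (n_trials, n_channels, n_times) to per-trial covariances."""
    # np.cov treats each row (channel) as a variable, giving (n_channels, n_channels).
    return np.stack([np.cov(trial) for trial in epochs])


class TargetAwareStep:
    """Hypothetical stand-in for a target-aware step placed after a covariance step."""

    def fit(self, X, y=None, X_target_unlabeled=None):
        # X arrives already transformed into covariance matrices, while the
        # target slice arrives raw, so this step converts it itself.
        self.source_shape_ = X.shape
        self.target_covs_ = trial_covariances(X_target_unlabeled)
        return self


rng = np.random.default_rng(0)
source_epochs = rng.normal(size=(20, 4, 50))  # 20 source trials, 4 channels
target_epochs = rng.normal(size=(6, 4, 50))   # 6 raw calibration trials

source_covs = trial_covariances(source_epochs)  # what an upstream step would emit
step = TargetAwareStep().fit(source_covs, X_target_unlabeled=target_epochs)
```

After fitting, both representations live in the same space: the source has shape (20, 4, 4) and the converted target has shape (6, 4, 4). The cost of this design is that the step must duplicate whatever the upstream steps do, which is the drop-in concern described in the comment.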


2 participants