Sim-vs-obs plots for the 3 BU SIPNET ensembles (#238) by AritraDey-Dev · Pull Request #31 · ccmmf/workflows

AritraDey-Dev · 2026-06-10T14:42:36Z

Sim-vs-obs plots for the 3 BU SIPNET ensembles (Salinas SOCS, Modesto Nichols, Russell Ranch).

Closes ccmmf/organization#238.

Salinas SOCS — TotSoilCarb

Modesto Nichols — N₂O flux

Russell Ranch — TotSoilCarb (derived)

Signed-off-by: Aritra Dey <adey01027@gmail.com>

divine7022

Thanks for the plots. High level:
I'm not totally sure where this belongs at the moment, but for sure not in workflows.
For now I would suggest putting it under a benchmarking/ dir or similar in the uncertainty repo.

Dropped some inline comments, would be nice if you had a look.

divine7022 · 2026-06-24T23:21:21Z

+# ----------------------------------------------------------------------------
+WORKBOOK_DIR <- Sys.getenv(
+  "WORKBOOK_DIR",
+  unset = "/projectnb2/dietzelab/ccmmf/usr/adey2/workflows"


Small thing, nothing blocking. Right now WORKBOOK_DIR points at your SCC folder and the script grabs the raw xlsx straight from there.

I just got the cal-val-data repo stood up, where the cleaned cal/val data lives as CSV. Once that settles it would be nicer to read from those CSVs instead of the workbook so anybody can run this. No need to touch anything now, just leaving a note so we circle back and hook it up later.

divine7022 · 2026-06-24T23:23:20Z

+# Read a single PEcAn-style NetCDF (one year file) and return a tibble with
+# posix + the requested variable. Time is encoded in the standard CF way
+# (units = "days since YYYY-01-01" or similar).
+read_one_nc <- function(nc_path, variable) {


read_one_nc is hand-rolling the netcdf read here and parsing CF time units by chopping the string apart.

Any reason not to lean on the read.output / ncdf4 helpers we already have for pulling the time origin and doing the unit conversion? Feels like a lot less to get wrong than re-rolling it ourselves.

came here to say this - PEcAn.utils::read.output()

more generally, there needs to be a strategy that 1) uses / improves existing PEcAn functionality and 2) writes new functions as if they are going to be merged into PEcAn at some point (see e.g. how I've organized the R folder in the downscaling repository).

We don't want to be held up by PEcAn review process, but we also don't want to re-invent the wheel and create one-off functions that aren't as robust or PEcAn-compatible as the ones that already exist.

divine7022 · 2026-06-24T23:30:08Z

These plot PNGs are checked into the repo, and since they get regenerated they will keep churning in the diffs and go stale fast. It might be cleaner to drop the plots folder and just let folks rebuild them from the script, so the repo only carries code and not the output.

This is totally fine for now; getting it documented for the record is the higher priority.

Agreed, please remove these plots from the PR

dlebauer

Please make sure to coordinate with the work that @ayushman1210 is doing in his GSOC project to make sure there isn't unnecessary duplication of work.

Other points

Observations appear to be stock measurements at a single time point, while the model states are calculated as annual means.
Separate dataset-specific from generalized calculations, and separate calculations from plotting functions, e.g. read data --> compute metrics --> plot. Again, let's align with @ayushman1210.
Separate workflow documentation from dataset-specific assumptions.
Add documentation (per ccmmf/organization#238).
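
A minimal sketch of the read --> compute metrics --> plot separation suggested above, in base R. Every function and column name here (`summarise_ensemble`, `match_obs_to_sim`, `compute_metrics`, `ens_num`, `date`, `value`, `obs`) is hypothetical and not taken from the PR, and the read step is stubbed with made-up in-memory data. It compares each observation with the ensemble summary on its sampling date rather than with an annual mean, which is only one possible way to address the first point.

```r
# Hypothetical sketch: summarising, matching, and metrics kept apart from plotting.

# Summarise a long-format ensemble (ens_num, date, value) to a per-date
# median and 95% interval.
summarise_ensemble <- function(ens) {
  by_date <- split(ens$value, ens$date)
  q <- t(vapply(by_date, quantile, numeric(3),
                probs = c(0.025, 0.5, 0.975), names = FALSE))
  data.frame(date = as.Date(names(by_date)),
             lower = q[, 1], med = q[, 2], upper = q[, 3],
             row.names = NULL)
}

# Pair each observation with the ensemble summary for the same date.
match_obs_to_sim <- function(obs, sim_summary) {
  merge(obs, sim_summary, by = "date")
}

# Simple skill metrics on the matched pairs (ensemble median vs. observation).
compute_metrics <- function(matched) {
  resid <- matched$med - matched$obs
  c(bias = mean(resid), rmse = sqrt(mean(resid^2)), n = nrow(matched))
}

# Made-up data standing in for the "read" step.
dates <- as.Date(c("2020-06-01", "2020-06-02"))
ens <- data.frame(
  ens_num = rep(1:3, each = 2),
  date    = rep(dates, times = 3),
  value   = c(10, 12, 11, 13, 12, 14)
)
obs <- data.frame(date = as.Date("2020-06-02"), obs = 12)

sim_summary <- summarise_ensemble(ens)
matched     <- match_obs_to_sim(obs, sim_summary)
metrics     <- compute_metrics(matched)
```

A plotting function would then take only `sim_summary` and `matched` as inputs, so dataset-specific loading never leaks into it.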

Consider making suggested changes and then submitting as a PR to the uncertainty and cal-val-data repositories.

dlebauer · 2026-07-03T18:40:48Z

Agreed, please remove these plots from the PR

dlebauer · 2026-07-03T18:41:49Z

+MAGIC_MAIN    <- file.path(WORKBOOK_DIR, "MAGiC Calibration Validation Dataset(1).xlsx")
+MAGIC_RUSSELL <- file.path(WORKBOOK_DIR, "MAGiC Calibration Validation Data_ Russell Ranch.xlsx")


update to use data in cal-val-data repo

dlebauer · 2026-07-03T18:53:57Z

+
+# Load ensemble output across all ENS-* dirs for one site and one variable.
+# Returns a tibble: ens_num, posix, year, <variable>.
+read_ensemble <- function(run_dir, variable, start_year, end_year) {


this functionality has been implemented multiple times; there is a PEcAn function PEcAn.utils::nc_merge_all_sites_by_year()

Before that function existed, I wrote similar functionality in ccmmf/downscaling scripts/030_extract_sipnet_output.R, though I am not sure whether @divine7022 has refactored that in one of the open PRs. It can output a single netcdf or long CSV.

dlebauer · 2026-07-03T19:00:04Z

+    dplyr::filter(!is.na(value))
+}
+
+# Derive Russell Ranch SOC stock (Mg C ha-1) from per-layer C% and BD.


datasets and dataset-specific harmonization scripts should live together in a single repository (e.g. cal-val-data).

We don't want to prematurely optimize, but keep in mind:

Writing a dataset-specific function like this is appropriate for the first pass, and it can be part of a dataset-specific harmonization script as mentioned above. But general functionality should be extracted so that it doesn't have to be rewritten.

when refactoring, consider breaking out specific conversion functions into utility functions in data.land (e.g. https://github.com/PecanProject/pecan/blob/develop/modules/data.land/R/soil_utils.R)
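
As an illustration of what such an extracted conversion utility could look like, here is a minimal base R sketch of the C% x BD x depth calculation. The function name `soc_stock_mg_ha`, its argument names, and the layer values are hypothetical; nothing here is taken from the PR's script or from soil_utils.R. The coarse fraction is an explicit argument so that assuming zero is a visible choice.

```r
# Hypothetical utility: SOC stock (Mg C ha-1) for one or more soil layers.
#   c_pct        carbon concentration (% by mass)
#   bd_g_cm3     bulk density (g cm-3)
#   thickness_cm layer thickness (cm)
#   coarse_frac  coarse fragment fraction (0-1); the default of zero is an assumption
soc_stock_mg_ha <- function(c_pct, bd_g_cm3, thickness_cm, coarse_frac = 0) {
  stopifnot(all(c_pct >= 0), all(bd_g_cm3 > 0), all(thickness_cm > 0),
            all(coarse_frac >= 0 & coarse_frac <= 1))
  # (C% / 100) * BD [g cm-3] * thickness [cm] gives g C cm-2, and
  # 1 g cm-2 = 100 Mg ha-1, so the two factors of 100 cancel.
  c_pct * bd_g_cm3 * thickness_cm * (1 - coarse_frac)
}

# Made-up layers (0-15 cm and 15-30 cm), not values from the workbook.
layers <- data.frame(
  c_pct        = c(1.0, 0.8),
  bd_g_cm3     = c(1.4, 1.5),
  thickness_cm = c(15, 15)
)
stock_0_30 <- sum(soc_stock_mg_ha(layers$c_pct, layers$bd_g_cm3,
                                  layers$thickness_cm))
# 21 + 18 = 39 Mg C ha-1 for these illustrative layers
```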

dlebauer · 2026-07-03T19:00:14Z

+}
+
+# Derive Russell Ranch SOC stock (Mg C ha-1) from per-layer C% and BD.
+# Per David's guidance (Slack):


Suggested change

# Per David's guidance (Slack):

dlebauer · 2026-07-03T19:19:27Z

+        title    = sprintf("%s — %s (%d–%d)",
+                           site$name, site$treatment,
+                           site$start_year, site$end_year),
+        subtitle = "Total Soil Carbon — derived obs (0-30 cm: C% x BD x depth, summed)\nAssumptions: total C = SOC (pH < 7); coarse fraction ignored (sandy loam)",


coarse fraction is not ignored; it is assumed to be zero based on soil type.

dlebauer · 2026-07-03T19:20:05Z

+for (site in SITES) {
+  if (site$short == "russell") {
+    # Russell Ranch workbook stores per-layer C% + BD, not SOC stock directly.
+    # Derive the 0-30 cm SOC stock per David's guidance (Slack), then plot.


don't need to include 'per David's guidance (Slack)' in the source code; this type of information goes in the PR description

dlebauer · 2026-07-03T19:20:39Z

+#   2. Compute SOC stock per layer = C% * BD * depth, then sum across layers.
+#   3. Assumptions:
+#      - total C = SOC (pH < 7 at this site)
+#      - coarse fraction negligible (sandy loam, not mentioned in source text)


Please cite a source that supports this assumption.

dlebauer · 2026-07-03T19:22:48Z

+    posix <- as.POSIXct(origin_date) + t_vals * 3600
+  } else {
+    posix <- as.POSIXct(origin_date) + t_vals * 86400


use named constants, or, e.g.

year2s <- PEcAn.utils::ud_convert(1, 'y', 's')
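
A minimal sketch of the named-constants option in base R. The constant names and the helper `cf_offset_to_posix` are hypothetical, and the helper assumes the unit name and origin date have already been extracted from the file; it is not the PR's implementation. Using `PEcAn.utils::ud_convert()` as shown above would replace the constants entirely.

```r
# Named constants instead of bare 3600 / 86400.
SECONDS_PER_HOUR <- 3600
SECONDS_PER_DAY  <- 24 * SECONDS_PER_HOUR

# Hypothetical helper: convert time offsets in a given unit to POSIXct (UTC).
cf_offset_to_posix <- function(t_vals, time_unit, origin_date) {
  seconds_per_unit <- switch(
    time_unit,
    seconds = 1,
    hours   = SECONDS_PER_HOUR,
    days    = SECONDS_PER_DAY,
    stop("unsupported time unit: ", time_unit)  # fail loudly on anything else
  )
  as.POSIXct(origin_date, tz = "UTC") + t_vals * seconds_per_unit
}

posix <- cf_offset_to_posix(c(0, 0.5, 1), "days", "2020-01-01")
```

Stopping on an unrecognised unit avoids silently treating every non-hour unit as days.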

dlebauer · 2026-07-03T19:29:43Z

+}
+
+# Plot ensemble band + observed points.
+plot_sim_vs_obs <- function(site, variable, label, obs_var, units_conv = 1,


lines 319 ff duplicate the core plotting block already implemented in plot_sim_vs_obs(). The special case seems to be how the observation dataframe is constructed.

Consider refactoring plot_sim_vs_obs() so it accepts an optional precomputed obs dataframe.
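
One way this refactor could look, as a base R sketch with a simplified signature. The `obs` and `obs_loader` arguments, the stand-in `load_obs_from_workbook`, the column names, and all data values are hypothetical; the actual plotting code is omitted and the function just returns the observations it resolved.

```r
# Stand-in for whatever observation loader the script already has (hypothetical).
load_obs_from_workbook <- function(site, obs_var) {
  data.frame(date = as.Date("2020-06-01"), value = 40)  # placeholder data
}

# `obs` is optional: when NULL, observations are loaded with `obs_loader`;
# otherwise the precomputed data frame is used as-is.
plot_sim_vs_obs <- function(site, variable, obs_var, obs = NULL,
                            obs_loader = load_obs_from_workbook) {
  if (is.null(obs)) {
    obs <- obs_loader(site, obs_var)
  }
  stopifnot(is.data.frame(obs), all(c("date", "value") %in% names(obs)))
  # The shared ensemble band + observed points plotting code would go here, once.
  invisible(obs)
}

# Default path: observations come from the loader.
obs_default <- plot_sim_vs_obs(list(short = "salinas"), "TotSoilCarb", "soc_stock")

# Special case (e.g. a derived SOC stock): pass a precomputed data frame.
derived     <- data.frame(date = as.Date("2021-03-15"), value = 39)
obs_derived <- plot_sim_vs_obs(list(short = "russell"), "TotSoilCarb", "soc_stock",
                               obs = derived)
```

With this shape the site-specific branch in the main loop only has to build `derived` and then call the same function as every other site.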

AritraDey-Dev closed this Jun 10, 2026

AritraDey-Dev deleted the cal-val-plots-238 branch June 10, 2026 14:48

AritraDey-Dev restored the cal-val-plots-238 branch June 10, 2026 14:51

AritraDey-Dev reopened this Jun 10, 2026

AritraDey-Dev force-pushed the cal-val-plots-238 branch from 7816e40 to f58370b Compare June 10, 2026 14:53

AritraDey-Dev requested a review from dlebauer June 10, 2026 14:55

AritraDey-Dev force-pushed the cal-val-plots-238 branch from f58370b to ce2a791 Compare June 10, 2026 15:02

AritraDey-Dev added 7 commits June 10, 2026 20:32

Add cal_val_plots scaffold

501036d

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add site configs

b045d20

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add ncdf ensemble readers

e717c4e

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add workbook obs loader

bd0ae37

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add Russell SOC derivation

8158f1b

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add plot function and main loop

5476a2c

Signed-off-by: Aritra Dey <adey01027@gmail.com>

Add ensemble plots

3749a56

Signed-off-by: Aritra Dey <adey01027@gmail.com>

AritraDey-Dev force-pushed the cal-val-plots-238 branch from ce2a791 to 3749a56 Compare June 10, 2026 15:03

divine7022 reviewed Jun 25, 2026

dlebauer reviewed Jul 3, 2026
