Skip to content

Result of halfwidth_ci_test may not correct. #413

@Tianjie-Wu

Description

@Tianjie-Wu

In document of xskillscore, result of halfwidth_ci_test is described as follow: difference in scores (score(f2) - score(f1))

However, I discovered that this is only partially right.
For example, when the metric is RMSE, this method returns the difference score(f2) - score(f1), but when the metric is ME, this function returns the difference score(f1) - score(f2).

Code Sample, a copy-pastable example if possible

length = 100
obs_1d = xarray.DataArray(
       numpy.random.rand(length),
       coords=[
           numpy.arange(length),
       ],
       dims=["time"],
       name='var'
   )
fct_1d = obs_1d.copy()
fct_1d.values = numpy.random.rand(length)

# create a worse forecast with high but different to perfect correlation and choose RMSE as the distance metric
fct_1d_worse = fct_1d.copy()
step = 3
fct_1d_worse[::step] = fct_1d[::step].values + 0.3

# half-with of the confidence interval at level alpha is larger than the RMSE differences,
# therefore not significant
metric = "rmse"
alpha = 0.05
significantly_different, diff, hwci = xskillscore.halfwidth_ci_test(
    fct_1d, fct_1d_worse, obs_1d, metric, time_dim="time", dim=[], alpha=alpha
)
print("RMSE DIFF:",diff.values)
print("RMSE HWCI:",hwci.values)
print(
    f"RMSEs significantly different at level {alpha} : {bool(significantly_different)}"
)

metric = "me"
alpha = 0.05
significantly_different, diff, hwci = xskillscore.halfwidth_ci_test(
    fct_1d, fct_1d_worse, obs_1d, metric, time_dim="time", dim=[], alpha=alpha
)
print("ME DIFF:",diff.values)
print("ME HWCI:",hwci.values)
print(
    f"MEs significantly different at level {alpha} : {bool(significantly_different)}"
)

print ( "RMSE 2,1: ",xskillscore.rmse(fct_1d_worse,obs_1d, dim=[]).values.mean(),xskillscore.rmse(fct_1d,obs_1d, dim=[]).values.mean() )
print ( "RMSE DIFF 2-1:", (xskillscore.rmse(fct_1d_worse,obs_1d, dim=[]).values-xskillscore.rmse(fct_1d,obs_1d, dim=[]).values).mean() )
print ( "ME 2,1: ",(fct_1d_worse-obs_1d).mean().values,(fct_1d-obs_1d).mean().values )
print ( "ME DIFF 2-1: ",((fct_1d_worse-obs_1d).values-(fct_1d-obs_1d).values).mean() )

print ("xskillscore version:",xskillscore.__version__)

the result is listed as follow:

RMSE DIFF: 0.002145380792967111
RMSE HWCI: 0.03084942672029623
RMSEs significantly different at level 0.05 : False
ME DIFF: -0.10199999999999998
ME HWCI: 0.027772820244038265
MEs significantly different at level 0.05 : True
RMSE 2,1:  0.35575854107274524 0.3536131602797781
RMSE DIFF 2-1: 0.0021453807929670793
ME 2,1:  0.02998835594288012 -0.07201164405711986
ME DIFF 2-1:  0.102
xskillscore version: 0.0.24

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions