ScmRun.convert_unit fails on pandas 3.x with 'assignment destination is read-only' #320

@benmsanderson

Summary

ScmRun.convert_unit raises ValueError: assignment destination is read-only on pandas 3.x. The error comes from apply_units inside convert_unit, which assigns into group._df.values[:]. Under pandas 3.x copy-on-write semantics, .values returns a read-only view and in-place assignment via the slice fails.

This is the second of two compat issues blocking convert_unit on the Python 3.12 / pandas 3.x / numpy 2.x stack. The first was #318 (np.issubdtype on StringDtype); fix is in #319. With #319's fix applied, convert_unit advances past RunGroupBy.__init__ and then trips on this one.

Repro

import scmdata
import pandas as pd

df = pd.DataFrame(
    [[1.0, 2.0]],
    index=pd.MultiIndex.from_tuples(
        [("m", "sA", "m", "World", "Emissions|CO2", "GtC/yr", 0)],
        names=[
            "climate_model", "scenario", "model", "region",
            "variable", "unit", "run_id",
        ],
    ),
    columns=[2020, 2021],
)
run = scmdata.ScmRun(df)
run.convert_unit("PgC/yr", variable="Emissions|CO2")

Traceback (with #319 applied; without #319 the call fails earlier in RunGroupBy.__init__):

File ".../scmdata/run.py", line 2180, in apply_units
    group._df.values[:] = uc.convert_from(group._df.values)
    ^^^^^^^^^^^^^^^^^^^
ValueError: assignment destination is read-only

Environment

Root cause

scmdata/run.py:2180:

def apply_units(group):
    orig_unit = group.get_unique_meta("unit", no_duplicates=True)
    uc = UnitConverter(orig_unit, unit, context=context)

    group._df.values[:] = uc.convert_from(group._df.values)
    group["unit"] = unit

    return group

Under pandas 3.x, where copy-on-write is always enabled, DataFrame.values returns a read-only array, so the [:] = ... slice assignment fails. The same pattern likely exists elsewhere in scmdata wherever .values[:] is used to write through to the underlying frame (worth grepping).
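The behaviour can be checked without scmdata. The following is a minimal pandas-only sketch, not code from scmdata: the frame and the variable names (`writable`, `in_place_ok`) are made up for illustration. It inspects the NumPy `writeable` flag on `.values` and then attempts the same slice assignment that `apply_units` performs. On a copy-on-write build the flag is expected to be False and the assignment to raise; on older pandas without copy-on-write both are expected to succeed.

```python
import pandas as pd

# Hypothetical single-row frame, loosely mirroring the repro above.
df = pd.DataFrame([[1.0, 2.0]], columns=[2020, 2021])

# NumPy records whether an array may be written to in its flags.
writable = bool(df.values.flags.writeable)

# Attempt the same write-through pattern used in apply_units.
try:
    df.values[:] = df.values * 2
    in_place_ok = True
except ValueError as exc:
    # With copy-on-write this is "assignment destination is read-only".
    in_place_ok = False
    print(f"in-place write failed: {exc}")

print(f"pandas {pd.__version__}: writable={writable}, in_place_ok={in_place_ok}")
```

If `writable` is False here, any code that relies on `.values[:] = ...` will hit the same error.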

Suggested fix

Replace the in-place pattern with a pandas-native assignment that goes through the proper write path:

def apply_units(group):
    orig_unit = group.get_unique_meta("unit", no_duplicates=True)
    uc = UnitConverter(orig_unit, unit, context=context)

    converted = uc.convert_from(group._df.values)
    group._df.iloc[:, :] = converted   # or: group._df.loc[:, :] = converted
    group["unit"] = unit

    return group

Assigning through iloc[:, :] = ... goes through pandas's normal setitem machinery, which respects copy-on-write instead of writing into the read-only array returned by .values. On pandas 2.x this should be equivalent to the old behaviour.
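As a quick sanity check of the proposed pattern outside scmdata, here is a pandas-only sketch. The frame and the scaling factor are hypothetical stand-ins for `group._df` and `uc.convert_from`; no real unit conversion is performed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for group._df.
df = pd.DataFrame([[1.0, 2.0]], columns=[2020, 2021])

# Stand-in for uc.convert_from(...): scale by an arbitrary, made-up factor.
factor = 1000.0
converted = df.to_numpy(copy=True) * factor

# Write back through pandas's setitem path rather than through .values.
df.iloc[:, :] = converted

print(df)
```

Because `iloc` is positional, the integer year labels on the columns do not affect the assignment.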

Happy to PR if you'd prefer. Filed from work at github.com/benmsanderson/openscm-runner (AR7 modernisation fork). See related #318 / #319.
