69 changes: 69 additions & 0 deletions docs/advanced/embedding.rst
@@ -345,6 +345,75 @@ Example:
}


Reusing a thread state across activations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, :class:`subinterpreter_scoped_activate` creates a fresh
``PyThreadState`` on entry and destroys it on exit. A thread that repeatedly
re-enters the same sub-interpreter therefore allocates and frees a thread state
every time, and does **not** preserve any per-thread interpreter state between
activations.

For workloads where a single OS thread re-enters one or more sub-interpreters
many times, pybind11 provides :class:`subinterpreter_thread_state` — an RAII
object that owns a ``PyThreadState`` and lets you swap it in for the duration
of each :class:`subinterpreter_scoped_activate` scope without destroying it
between activations:

.. code-block:: cpp

// Create the PyThreadState once. It is created in a "released" state:
// not current, no GIL acquired.
thread_local py::subinterpreter_thread_state ts(sub);

{
// Swap it in; the subinterpreter's GIL is acquired.
py::subinterpreter_scoped_activate guard(ts);
// ... use the sub-interpreter ...
}
// Swap-out only; the PyThreadState is kept alive in `ts`.

{
py::subinterpreter_scoped_activate guard(ts);
// The same PyThreadState is reused; its per-thread interpreter state
// is preserved across activations.
}
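The difference between the two modes can be illustrated without Python at all. The following is a minimal sketch in plain C++, assuming hypothetical stand-in types (``FakeThreadState``, ``TransientScope``, ``OwnedState``, ``BorrowingScope``, ``run_model``); none of them are pybind11 or CPython APIs, and the model ignores the GIL entirely. It only counts how often a state object is allocated and whether its contents survive between scopes.

```cpp
#include <cassert>

// Hypothetical stand-in for a PyThreadState; NOT a CPython type.
struct FakeThreadState {
    int activations = 0; // models per-thread state that may persist across scopes
};

static int g_created = 0; // number of FakeThreadState objects allocated so far

// Models the transient mode: allocate on entry, free on exit.
class TransientScope {
public:
    TransientScope() : state_(new FakeThreadState) {
        ++g_created;
        ++state_->activations;
    }
    ~TransientScope() { delete state_; }
    TransientScope(const TransientScope &) = delete;
    TransientScope &operator=(const TransientScope &) = delete;
    int activations() const { return state_->activations; }

private:
    FakeThreadState *state_;
};

// Models the long-lived owner: allocates once, frees once.
class OwnedState {
public:
    OwnedState() : state_(new FakeThreadState) { ++g_created; }
    ~OwnedState() { delete state_; }
    OwnedState(const OwnedState &) = delete;
    OwnedState &operator=(const OwnedState &) = delete;
    FakeThreadState *get() const { return state_; }

private:
    FakeThreadState *state_;
};

// Models the reuse mode: borrow the owner's state on entry, leave it alive on exit.
class BorrowingScope {
public:
    explicit BorrowingScope(OwnedState &owner) : state_(owner.get()) {
        assert(state_ != nullptr);
        ++state_->activations;
    }
    int activations() const { return state_->activations; }

private:
    FakeThreadState *state_; // borrowed, never deleted here
};

struct ModelResult {
    int transient_created = 0;
    int reused_created = 0;
    int last_transient_activations = 0;
    int last_reused_activations = 0;
};

// Enter n scopes in each mode and record allocations and surviving state.
ModelResult run_model(int n) {
    ModelResult r;
    g_created = 0;
    for (int i = 0; i < n; ++i) {
        TransientScope s;
        r.last_transient_activations = s.activations();
    }
    r.transient_created = g_created;

    g_created = 0;
    {
        OwnedState owner;
        for (int i = 0; i < n; ++i) {
            BorrowingScope s(owner);
            r.last_reused_activations = s.activations();
        }
    }
    r.reused_created = g_created;
    return r;
}
```

Calling ``run_model(n)`` in this model yields ``n`` allocations for the transient scopes and a single allocation for the borrowed ones, and only the borrowed state accumulates a count across scopes. How much this matters for a real ``PyThreadState`` depends on the workload.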

This composes naturally with multiple sub-interpreters on the same OS thread:
hold one :class:`subinterpreter_thread_state` per sub-interpreter and alternate
between them. Each ``PyThreadState`` is independent and is preserved across
activations.

.. code-block:: cpp

thread_local py::subinterpreter_thread_state ts_a(sub_a);
thread_local py::subinterpreter_thread_state ts_b(sub_b);

{ py::subinterpreter_scoped_activate guard(ts_a); /* in sub_a */ }
{ py::subinterpreter_scoped_activate guard(ts_b); /* in sub_b */ }
{ py::subinterpreter_scoped_activate guard(ts_a); /* same PyThreadState as before */ }

The default behavior is unchanged: the
``subinterpreter_scoped_activate(subinterpreter const&)`` overload still
creates and destroys a transient ``PyThreadState`` per scope, and it never
shares a thread state with any :class:`subinterpreter_thread_state` that may
also exist for the same sub-interpreter on the same thread.

.. warning::

Lifetime and threading requirements for :class:`subinterpreter_thread_state`:

Review comment (Collaborator): I believe that activating the thread state on another OS thread is also illegal; that should be added to this list.

- It must be constructed and destroyed on the **same OS thread**. A
``PyThreadState`` is bound to its creating thread; deleting it on another
thread is undefined behavior. Holding the object as a ``thread_local``
satisfies this automatically.
- It must only be activated (with :class:`subinterpreter_scoped_activate`)
on the **same OS thread** that constructed it. Activating it on a
different thread is illegal.
- It must be destroyed while its sub-interpreter is still alive.
- It must **not** be destroyed while a :class:`subinterpreter_scoped_activate`
referring to it is alive — the activator holds a reference into it.
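The same-thread requirements above can be checked defensively in debug builds. The sketch below is a plain C++ model, assuming a hypothetical helper ``ThreadAffinityGuard`` that is not part of pybind11. It compares ``std::thread::id`` values, whereas the header in this PR compares native thread IDs and only does so when ``PYBIND11_DETAILED_ERROR_MESSAGES`` is defined, so treat it as an illustration of the idea rather than the library's mechanism.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical helper, NOT a pybind11 class: remembers the constructing thread
// and rejects operations attempted from any other thread.
class ThreadAffinityGuard {
public:
    ThreadAffinityGuard() : owner_(std::this_thread::get_id()) {}

    // True when called on the thread that constructed this guard.
    bool on_owner_thread() const { return std::this_thread::get_id() == owner_; }

    // Throws if the calling thread is not the constructing thread.
    void require_owner_thread(const char *what) const {
        if (!on_owner_thread()) {
            throw std::runtime_error(std::string(what)
                                     + ": must run on the constructing OS thread");
        }
    }

private:
    std::thread::id owner_;
};

// Returns true if an operation attempted from a second thread is rejected.
bool rejected_on_other_thread(const ThreadAffinityGuard &guard) {
    bool rejected = false;
    std::thread worker([&] {
        try {
            guard.require_owner_thread("activate");
        } catch (const std::runtime_error &) {
            rejected = true;
        }
    });
    worker.join(); // join before reading `rejected`
    return rejected;
}
```

A wrapper that owns a thread-bound resource could embed such a guard and call ``require_owner_thread`` at the start of activation and destruction. Holding the wrapper as a ``thread_local`` makes the check pass by construction.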

GIL API for sub-interpreters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

175 changes: 173 additions & 2 deletions include/pybind11/subinterpreter.h
@@ -22,13 +22,27 @@
PYBIND11_NAMESPACE_BEGIN(PYBIND11_NAMESPACE)

class subinterpreter;
class subinterpreter_thread_state;

/// Activate the subinterpreter and acquire its GIL, while also releasing any GIL and interpreter
/// currently held. Upon exiting the scope, the previous subinterpreter (if any) and its
/// associated GIL are restored to their state as they were before the scope was entered.
///
/// Two construction modes are supported:
///
/// 1. `subinterpreter_scoped_activate(subinterpreter const &)`:
/// Transient mode (the default). A fresh PyThreadState is created on entry and destroyed on
/// exit. This is the established behavior; existing code is unaffected.
///
/// 2. `subinterpreter_scoped_activate(subinterpreter_thread_state &)`:
/// Reuse mode. The PyThreadState owned by the given subinterpreter_thread_state is swapped
/// in on entry and swapped out (but NOT destroyed) on exit, so repeated activations on the
/// same OS thread reuse the same PyThreadState and preserve its per-thread interpreter state.
/// Use this when a single OS thread re-enters one or more subinterpreters many times.
class subinterpreter_scoped_activate {
public:
explicit subinterpreter_scoped_activate(subinterpreter const &si);
explicit subinterpreter_scoped_activate(subinterpreter_thread_state &ts);
~subinterpreter_scoped_activate();

subinterpreter_scoped_activate(subinterpreter_scoped_activate &&) = delete;
@@ -41,6 +55,9 @@ class subinterpreter_scoped_activate {
PyThreadState *tstate_ = nullptr;
PyGILState_STATE gil_state_;
bool simple_gil_ = false;
// When true, tstate_ is owned by a subinterpreter_thread_state and must NOT be destroyed
// when this scope exits (only swapped out).
bool borrowed_ = false;
};

/// Holds a Python subinterpreter instance
@@ -228,10 +245,71 @@ class subinterpreter {

private:
friend class subinterpreter_scoped_activate;
friend class subinterpreter_thread_state;
PyInterpreterState *istate_ = nullptr;
PyThreadState *creation_tstate_ = nullptr;
};

/// RAII wrapper that owns a PyThreadState bound to a specific subinterpreter on the OS thread
/// that constructed it. Intended to be held long-lived (e.g. as a `thread_local`, or inside a
/// per-thread struct) so that many subinterpreter_scoped_activate scopes on the same OS thread
/// can reuse a single PyThreadState instead of creating and destroying one each time.
///
/// The PyThreadState is created on construction in a *released* state: it is NOT made current,
/// and no GIL is acquired. Activation is the job of subinterpreter_scoped_activate.
///
/// A single OS thread can hold one of these per subinterpreter and alternate between them via
/// subinterpreter_scoped_activate without churning PyThreadState objects.
///
/// Lifetime / threading requirements:
///
/// - Construction and destruction must happen on the SAME OS thread (a PyThreadState is bound
/// to the OS thread that created it; deleting it on a different thread is undefined behavior).
/// - Activation (via subinterpreter_scoped_activate) must also happen on that same OS thread.
/// - The owning subinterpreter must still be alive when this object is destroyed.
/// - This object must NOT be destroyed while a subinterpreter_scoped_activate referring to it is
/// still alive (the activator holds a reference into it).
///
/// Typical usage:
///
/// @code
/// thread_local py::subinterpreter_thread_state ts(sub);
/// {
/// py::subinterpreter_scoped_activate guard(ts); // swap-in only
/// // ... use the subinterpreter ...
/// } // swap-out, tstate kept alive
/// {
/// py::subinterpreter_scoped_activate guard(ts); // reuses the same PyThreadState
/// // ...
/// }
/// @endcode
class subinterpreter_thread_state {
public:
/// Create a PyThreadState for `si` on the calling OS thread. The new state is left in a
/// released state (not current, no GIL acquired).
explicit subinterpreter_thread_state(subinterpreter const &si);

/// Destroy the owned PyThreadState. Must run on the same OS thread that constructed this
/// object, while the owning subinterpreter is still alive, and while no
/// subinterpreter_scoped_activate referring to this object is alive.
~subinterpreter_thread_state();

subinterpreter_thread_state(subinterpreter_thread_state const &) = delete;
subinterpreter_thread_state(subinterpreter_thread_state &&) = delete;
subinterpreter_thread_state &operator=(subinterpreter_thread_state const &) = delete;
subinterpreter_thread_state &operator=(subinterpreter_thread_state &&) = delete;

/// The interpreter this thread state belongs to.
PyInterpreterState *interpreter_state() const { return istate_; }

/// The owned PyThreadState pointer; valid for the lifetime of this object.
PyThreadState *raw_thread_state() const { return tstate_; }

private:
friend class subinterpreter_scoped_activate;
PyThreadState *tstate_ = nullptr;
PyInterpreterState *istate_ = nullptr;
};

class scoped_subinterpreter {
public:
scoped_subinterpreter() : si_(subinterpreter::create()), scope_(si_) {}
@@ -244,6 +322,8 @@ class scoped_subinterpreter {
subinterpreter_scoped_activate scope_;
};

// --- subinterpreter_scoped_activate -----------------------------------------------------------

inline subinterpreter_scoped_activate::subinterpreter_scoped_activate(subinterpreter const &si) {
if (!si.istate_) {
pybind11_fail("null subinterpreter");
@@ -267,6 +347,47 @@ inline subinterpreter_scoped_activate::subinterpreter_scoped_activate(subinterpr
detail::get_internals().tstate = tstate_;
}

inline subinterpreter_scoped_activate::subinterpreter_scoped_activate(
subinterpreter_thread_state &ts) {
if (ts.tstate_ == nullptr) {
pybind11_fail("subinterpreter_scoped_activate: empty subinterpreter_thread_state");
}

if (detail::get_interpreter_state_unchecked() == ts.istate_) {
// We are already on this interpreter -- e.g. nested activation, or a different
// PyThreadState for the same interpreter is already current on this thread. Match the
// fast path of the (subinterpreter const&) overload: just ensure the GIL is held. The
// PyThreadState owned by `ts` is intentionally NOT swapped in here; the already-current
// tstate remains in use until the outer scope exits.
simple_gil_ = true;
gil_state_ = PyGILState_Ensure();
return;
}

#if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
{
// A PyThreadState is bound to its creating OS thread; it may only be activated there.
bool same_thread = true;
# ifdef PY_HAVE_THREAD_NATIVE_ID
same_thread = PyThread_get_thread_native_id() == ts.tstate_->native_thread_id;
# endif
if (!same_thread) {
pybind11_fail("subinterpreter_scoped_activate: a subinterpreter_thread_state must be "
"activated on the same OS thread that constructed it!");
}
}
#endif

tstate_ = ts.tstate_;
borrowed_ = true;

// make the interpreter active and acquire the GIL
old_tstate_ = PyThreadState_Swap(tstate_);

// save this in internals for scoped_gil calls (see also: PR #5870)
detail::get_internals().tstate = tstate_;
}

inline subinterpreter_scoped_activate::~subinterpreter_scoped_activate() {
if (simple_gil_) {
// We were on this interpreter already, so just make sure the GIL goes back as it was
@@ -279,13 +400,63 @@ inline subinterpreter_scoped_activate::~subinterpreter_scoped_activate() {
}
#endif
detail::get_internals().tstate.reset();
- PyThreadState_Clear(tstate_);
- PyThreadState_DeleteCurrent();
if (!borrowed_) {
PyThreadState_Clear(tstate_);
PyThreadState_DeleteCurrent();
}
// When borrowed_, tstate_ stays alive in its owning subinterpreter_thread_state for
// reuse; the PyThreadState_Swap below merely detaches it from this thread.
}

// Go back to the previous interpreter (if any) and acquire THAT GIL
PyThreadState_Swap(old_tstate_);
}
}

// --- subinterpreter_thread_state --------------------------------------------------------------

inline subinterpreter_thread_state::subinterpreter_thread_state(subinterpreter const &si) {
if (!si.istate_) {
pybind11_fail("subinterpreter_thread_state: null subinterpreter");
}
istate_ = si.istate_;
// PyThreadState_New does not require holding any GIL and does not make the new state current.
tstate_ = PyThreadState_New(istate_);
if (tstate_ == nullptr) {
pybind11_fail("subinterpreter_thread_state: PyThreadState_New returned null");
}
}

inline subinterpreter_thread_state::~subinterpreter_thread_state() {
if (tstate_ == nullptr) {
return;
}
#if defined(PYBIND11_DETAILED_ERROR_MESSAGES)
{
// A PyThreadState must be cleared and deleted on the OS thread that created it.
bool same_thread = true;
# ifdef PY_HAVE_THREAD_NATIVE_ID
same_thread = PyThread_get_thread_native_id() == tstate_->native_thread_id;
# endif
if (!same_thread) {
pybind11_fail("~subinterpreter_thread_state: must be destroyed on the same OS thread "
"that constructed it!");
}
}
#endif
// The PyThreadState must be made current to be cleared and deleted on the owning OS thread.
// Swap it in (which acquires the subinterpreter's GIL), clear+delete, then restore whatever
// was active before.
PyThreadState *prev = PyThreadState_Swap(tstate_);
PyThreadState_Clear(tstate_);
PyThreadState_DeleteCurrent();
// If `prev` is tstate_ itself, the user destroyed this object while it was active via a
// subinterpreter_scoped_activate -- a contract violation, but be defensive: do NOT swap back
// to a now-deleted pointer. Leaving the thread with no current interpreter is consistent
// with the cached state having just been destroyed.
if (prev != nullptr && prev != tstate_) {
PyThreadState_Swap(prev);
}
}

PYBIND11_NAMESPACE_END(PYBIND11_NAMESPACE)