De-duplicate callback log reading with the task-instance log reading path #68822

Description

@jason810496

Depends on #66610 getting merged first.

Summary

PR #66610 added airflow-core/src/airflow/utils/log/callback_log_reader.py to read
deadline-callback execution logs for the UI. Its remote/local read helpers
re-implement logic that already exists in FileTaskHandler
(airflow-core/src/airflow/utils/log/file_task_handler.py) and is fronted by
TaskLogReader (airflow-core/src/airflow/utils/log/log_reader.py). This issue
tracks consolidating the two paths onto shared helpers as a follow-up.

This is tech debt, not a bug — the duplication is intentional in #66610 to keep
that PR focused.

Duplicated logic

| callback_log_reader.py | Existing equivalent in file_task_handler.py |
| --- | --- |
| _read_callback_remote_logs | FileTaskHandler._read_remote_logs (:967) |
| _read_callback_local_logs (glob-by-prefix + os.path.commonpath containment) | FileTaskHandler._read_from_local (:862, :880) |
| remote-first-then-local fallback + _interleave_logs + "Log message source details" group header | FileTaskHandler.read (:737) |
| tuple[list[str], list[RawLogStream]] return annotation | existing StreamingLogResponse alias |
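
For readers unfamiliar with the glob-by-prefix plus os.path.commonpath containment pattern named above, here is a minimal sketch of the idea. It is not the code from either file: `find_contained_logs`, `base_dir`, and `relative_prefix` are hypothetical names chosen for illustration, and the real helpers may differ in how they resolve symlinks or order results.

```python
import glob
import os


def find_contained_logs(base_dir: str, relative_prefix: str) -> list[str]:
    """Return files under base_dir whose relative path starts with relative_prefix.

    Hypothetical helper for illustration only; not part of Airflow's API.
    """
    base = os.path.realpath(base_dir)
    # Escape the prefix so glob metacharacters inside IDs are matched literally.
    pattern = os.path.join(base, glob.escape(relative_prefix) + "*")
    matches: list[str] = []
    for candidate in sorted(glob.glob(pattern)):
        resolved = os.path.realpath(candidate)
        # Containment check: after resolving ".." and symlinks, the file must
        # still live under the base log directory.
        if os.path.commonpath([base, resolved]) != base:
            continue
        if os.path.isfile(resolved):
            matches.append(resolved)
    return matches
```

A prefix such as `../outside/secret` can still match files through the glob, but the resolved path no longer shares `base` as its common path, so it is dropped. Note that `os.path.commonpath` raises `ValueError` for paths on different Windows drives, which a production helper would likely need to handle.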

Why a shared abstraction needs design (why it was deferred)

FileTaskHandler's read path is TaskInstance-centric — it takes (ti, try_number, metadata) and renders paths from the TI. Callback logs have no TaskInstance: the
path is a fixed prefix (executor_callbacks/{dag_id}/{run_id}/{callback_id} or
triggerer_callbacks/...). Extracting the common core means separating the
path-resolution step from the storage-read step so both callers can supply their own
relative path(s).
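
One way to picture the separation described above is sketched below. Everything here is an assumption made for illustration: `CallbackRef`, `resolve_executor_callback_prefix`, `resolve_task_path`, and `read_logs` are hypothetical names, and the task path layout is a placeholder rather than the path FileTaskHandler actually renders from the TI. Only the executor callback prefix comes from this issue.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class CallbackRef:
    """Hypothetical identifier bundle for a callback; no TaskInstance involved."""

    dag_id: str
    run_id: str
    callback_id: str


def resolve_executor_callback_prefix(ref: CallbackRef) -> str:
    # Fixed prefix layout quoted in this issue.
    return f"executor_callbacks/{ref.dag_id}/{ref.run_id}/{ref.callback_id}"


def resolve_task_path(dag_id: str, run_id: str, task_id: str, try_number: int) -> str:
    # Placeholder layout for illustration; the real handler renders this from the TI.
    return f"{dag_id}/{run_id}/{task_id}/{try_number}.log"


def read_logs(relative_paths: list[str], read_one: Callable[[str], list[str]]) -> list[str]:
    """Storage-read step: knows nothing about TIs or callbacks, only paths."""
    lines: list[str] = []
    for path in relative_paths:
        lines.extend(read_one(path))
    return lines
```

With this shape, each caller owns its own resolver and passes the result to the same read step, for example `read_logs([resolve_executor_callback_prefix(ref)], read_one=storage.get)`, where `storage` is any mapping from relative path to lines.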

Follow-up work

  • Extract the storage-read core (remote-then-local fallback, _interleave_logs,
    source-header emission, os.path.commonpath containment) into a shared helper that
    takes already-resolved relative path(s) rather than a TI.
  • Have both FileTaskHandler.read and callback_log_reader.read_callback_log call it.
  • Annotate the callback reader helpers with the existing StreamingLogResponse alias.
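
As a starting point for discussion, the shared core from the first bullet might look roughly like the sketch below. This is a simplification under stated assumptions, not a proposed final API: `read_with_fallback` and the `Reader` alias are hypothetical, streams are plain lists of strings rather than RawLogStream, and `heapq.merge` on timestamp-prefixed lines stands in for `_interleave_logs`.

```python
from collections.abc import Callable
from heapq import merge

# Hypothetical reader shape: given relative paths, return
# (source descriptions, one list of log lines per stream).
Reader = Callable[[list[str]], tuple[list[str], list[list[str]]]]


def read_with_fallback(
    relative_paths: list[str],
    read_remote: Reader,
    read_local: Reader,
) -> tuple[list[str], list[str]]:
    """Try remote storage first; fall back to local files if nothing was found."""
    sources, streams = read_remote(relative_paths)
    if not streams:
        local_sources, streams = read_local(relative_paths)
        sources = sources + local_sources
    # Group header listing where the log content came from.
    header = ["Log message source details: " + ", ".join(sources)] if sources else []
    # Stand-in for _interleave_logs: merge streams that are each already sorted
    # by an ISO timestamp prefix, relying on lexicographic order.
    merged = list(merge(*streams))
    return header, merged
```

Because the helper takes already-resolved relative paths and two reader callables, neither FileTaskHandler.read nor callback_log_reader.read_callback_log would need to know how the other resolves its paths.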

Acceptance criteria

References
