Feature request: Add deferrable support for invoking/waiting on Google Cloud Functions / Cloud Run functions #68908

@BarBuccianti

Description

Add provider-maintained deferrable support for invoking and waiting on Google Cloud Functions / HTTP Cloud Run functions from apache-airflow-providers-google.

Today, the Google provider supports deferrable execution for several Google Cloud services, including BigQuery, GCS, Dataflow, Pub/Sub, and Cloud Run Jobs via CloudRunExecuteJobOperator. However, there does not appear to be an equivalent deferrable operator/sensor/trigger pattern for HTTP Cloud Functions or HTTP Cloud Run functions.

CloudFunctionInvokeFunctionOperator is synchronous and documented as intended for testing purposes with limited traffic. For production workflows that trigger a function and then need to wait for asynchronous completion, users currently need to either:

  • maintain custom Airflow trigger/sensor/operator code, including Google auth, polling, timeout, retry, and failure semantics; or
  • implement an indirect durable-status pattern, such as having the function write completion state to BigQuery/GCS and waiting on that state with an existing deferrable sensor.

It would be useful to have a first-class deferrable pattern in the Google provider for this use case, for example a deferrable Cloud Functions / HTTP Cloud Run function operator or sensor that handles invocation, authenticated HTTP requests, polling/completion checks, timeout handling, retries, and failure propagation.
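
To make the request more concrete, here is a rough sketch of the polling/completion-check part only, written with plain asyncio so it does not depend on any Airflow or Google API. Everything in it is an assumption for illustration: `wait_for_completion`, `check_status`, the `DONE`/`FAILED`/`RUNNING` values, and the shape of the returned dict are hypothetical and are not existing provider code. In a real trigger, something like this loop would presumably live inside the trigger's async `run()` method, and `check_status` would be an authenticated HTTP call to whatever status endpoint the function exposes.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict

# Hypothetical status contract; a real function would define its own.
DONE, FAILED, RUNNING = "done", "failed", "running"


async def wait_for_completion(
    check_status: Callable[[], Awaitable[str]],
    poll_interval: float = 10.0,
    timeout: float = 3600.0,
) -> Dict[str, str]:
    """Poll until the work reports a terminal state or the timeout expires.

    `check_status` is a hypothetical async callable returning one of the
    status strings above. The returned dict mimics the kind of payload a
    trigger could hand back to the operator for failure propagation.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = await check_status()
        if status == DONE:
            return {"status": "success"}
        if status == FAILED:
            return {"status": "error", "message": "function reported failure"}
        # Give up if the next poll would land past the deadline.
        if time.monotonic() + poll_interval > deadline:
            return {"status": "error", "message": "timed out waiting for completion"}
        # Sleeping asynchronously is what lets the wait run without a worker slot.
        await asyncio.sleep(poll_interval)
```

The sketch leaves out invocation, Google auth, and retries on transient HTTP errors, which are exactly the parts that would benefit most from being maintained in the provider rather than in each user's DAG repository.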

Use case/motivation

We have Airflow DAGs that trigger Google Cloud Functions / HTTP Cloud Run functions to perform asynchronous work. The function invocation itself is short, but the downstream processing can take longer, and Airflow needs to wait for the work to complete before continuing the DAG.

Because there is no provider-maintained deferrable Cloud Functions / HTTP Cloud Run function sensor/trigger today, using a synchronous task or regular sensor would occupy worker resources while waiting. To avoid that, we currently use a workaround where the function writes completion/status data to BigQuery, and Airflow waits on that status using an existing deferrable BigQuery sensor.

This works, but it adds extra infrastructure and indirection only to compensate for the missing deferrable Cloud Functions / HTTP function pattern. A provider-supported deferrable operator/sensor would reduce maintenance burden, avoid custom triggerer code, and make this pattern more consistent with other Google provider integrations such as BigQuery, GCS, Pub/Sub, Dataflow, and Cloud Run Jobs, as well as with dbt-style async workflows.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
