
A93: xDS ExtProc Support#484

Open
markdroth wants to merge 37 commits into grpc:master from markdroth:xds_ext_proc

Conversation

@markdroth
Member

No description provided.

@markdroth markdroth marked this pull request as ready for review September 18, 2025 22:50
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment on lines +667 to +668
The ext_proc filter will export metrics using the non-per-call metrics
architecture defined in [A79]. There will be a separate set of metrics
Member

Oh maybe the implication is that it will get all of those labels already? If so we should call that out, probably?

Member Author

Not sure what you mean by "all of those labels". A79 doesn't define any labels that are intended to be applied to all metrics.

Maybe you meant to refer to the per-call metric labels? If so, I've answered that below.

Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md
Comment thread A93-xds-ext-proc.md
Comment thread A93-xds-ext-proc.md
breaking their ext_proc servers.

gRPC implementations should structure their xDS HTTP filter APIs such that
the filter has access to the serialized bytes of each message rather than
Member

Correct me if I am wrong, but from what I know, a decision has been made to bear the double serialization/deserialization cost for now, at least in Go. Do you think it is worth mentioning that here?

Member Author

As far as this spec is concerned, implementations should not do redundant serialization/deserialization. Go may choose to initially implement it that way in order to get the functionality done faster, but that really isn't a good state to be in for the long term, and I expect that people will complain about the performance.

@markdroth markdroth requested a review from dfawley March 17, 2026 23:57
grnmeira pushed a commit to grnmeira/envoy that referenced this pull request Mar 20, 2026
Adds a new body send mode for gRPC traffic. Also
adds a safe way for the ext_proc server to return OK status without
losing data in FULL_DUPLEX_STREAMED and GRPC modes. See
grpc/proposal#484 for context.
Risk Level: Low
Testing: N/A
Docs Changes: Included in PR
Release Notes: N/A
Platform Specific Features: N/A

---------

Signed-off-by: Mark D. Roth <roth@google.com>
Co-authored-by: Adi (Suissa) Peleg <adip@google.com>
Signed-off-by: Gustavo <grnmeira@gmail.com>
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md

Events on the data plane RPC should be sent on the ext_proc stream as
they occur, even if the filter has not yet received a response from the
ext_proc server for a previous event. For example, if the filter is
Member

Curious about the example here. From what I understand, sending the headers and getting the modified headers back should be a blocking operation, and only after modification should we use them to create the data-plane stream. The Send and Receive functions should come after the stream is created. The example seems to suggest that Send can happen before the header modification comes back. I just want to understand the sequence of events.

Member Author

In the C-core API, the application does not need to wait for sending the headers to complete before it sends the first message on the stream, which is why this can happen.

I am guessing from your comment that the grpc-go API does not allow the application to send the first message until it gets the stream object, which is not returned until the headers have gone out on the wire? If so, then I think there are two possible ways to deal with it:

  1. Keep things the way they are, which means this specific case can't happen in Go.
  2. Tell the application that the stream has been created as soon as headers are sent to the ext_proc server, without waiting for the headers to be sent to the data plane server. This would increase the amount of memory used for buffering headers, but it would allow better pipelining.

I think either option is probably fine. @easwars and/or @dfawley may have thoughts on which one makes more sense for Go.

Note that even if this specific example can't happen in Go, there are other cases that will happen. For example, in the server-to-client direction, the ext_proc filter does not need to wait until it receives the response headers from the ext_proc server before it sends the first response message.

@ejona86 Not sure if this is an issue for Java.

Member

IIRC, in Go sending the client headers blocks until a stream is allocated. So it could make sense to wait for those to be processed by ext_proc before unblocking the client, so that we still wait for the stream to be allocated. That doesn't sound required, though, and it would add significant latency, especially for unary RPCs.

Java doesn't have flow control for unary RPCs, so it doesn't impact that. We do typically delay saying the RPC is ready for data until we allocate the stream, but we can trivially implement this either way. (I think it'd only impact a single if.)

Contributor

From what I can see, when an application attempts to create a stream:

  • the grpc layer creates an instance of the client stream interface and as part of doing this, it creates an attempt
  • the attempt retrieves the transport by doing a Pick and creates a stream on the transport
  • this results in an instance of the client stream interface being created at the transport layer
    • this allocates the stream ID and checks for write quota (which should exist, the default is 64K)
    • creates the header frame and queues it in the control buf
    • returns

At this point, the application should have a stream and should be able to send messages on it. They won't get sent out on the wire until the headers frame is sent out. I don't see the transport code even waiting for the headers frame to go out before sending data (as long as there is write quota).

@eshitachandwani : If you think otherwise, let's talk and go over the code together.

Member Author

@easwars The ext_proc filter would run between the first and second bullet in your list -- i.e., after the client stream instance is created but before the LB pick. This means that the workflow will go like this:

  1. gRPC layer creates client stream interface, which creates an attempt.
  2. Headers are sent to the ext_proc filter. The ext_proc filter sends them to the ext_proc server and waits for the response.
  3. The ext_proc filter receives the headers from the ext_proc server. Now the ext_proc filter sends the headers on.
  4. We do an LB pick, resulting in creating the client stream at the transport layer. This gets returned to the application, which can then send the first message.

The problem here is that there is a round-trip to the ext_proc server in steps 2 and 3, and the application will need to wait for that to finish before it can send the first message on the stream. This will hurt latency.

The alternative is to do something like this:

  1. gRPC layer creates client stream interface, which creates an attempt.
  2. Headers are sent to the ext_proc filter. The ext_proc filter sends them to the ext_proc server and waits for the response.
  3. While waiting for the ext_proc response, return the client stream interface to the application, so that it can start sending messages on the stream.
  4. The client application sends a message on a stream. This message gets down to the ext_proc filter, which sends the message to the ext_proc server.
  5. The ext_proc filter receives the headers from the ext_proc server. Now the ext_proc filter sends the headers on.
  6. We do an LB pick, resulting in creating the client stream at the transport layer.

This approach would improve latency by allowing the application to start sending messages on the stream without waiting to get the headers back from the ext_proc server.

Member

Understood. I think the alternative should be doable in Go too.

Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
Comment thread A93-xds-ext-proc.md Outdated
@markdroth
Member Author

@kannanjgithub @eshitachandwani @rishesh007 FYI, it looks like we don't actually have a use-case for the mode override feature, and the logic for that is a little ugly, so I've removed it from the design. We can consider adding it in the future if/when we encounter a use-case for it.

Comment thread A93-xds-ext-proc.md
request trailers.
- [request_attributes](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto#L188)
and
[response_attributes](https://github.com/envoyproxy/envoy/blob/cdd19052348f7f6d85910605d957ba4fe0538aec/api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto#L195):
Contributor

Should we add more details on how to determine the values of some of these response attributes? For example, the client stream tracer would help in getting response.size and response.backend_latency, but response.header_size poses a problem because Netty/OkHttp don't surface the raw on-the-wire header size to the tracer callback, only the parsed headers.

Contributor

Also, A106 doesn't talk about response attributes yet, so should we support them in the ext_proc filter or not?

Comment thread A93-xds-ext-proc.md
order as on the data plane. For client-to-server events, the order must
be headers, followed by zero or more messages, followed by a half-close.
For server-to-client events, the order must be headers (skipped for
Trailers-Only), followed by zero or more messages, followed by trailers.
Contributor

It is not skipped for trailers-only. It is sent as HttpHeaders with end_of_stream set to true.

Comment thread A93-xds-ext-proc.md
client sends a client headers event and a client message event and the
ext_proc server responds to the client message event first, that is
considered a protocol error. The filter will treat that as if the
ext_proc stream failed with a non-OK status.
Contributor

That implies we return status code UNAVAILABLE. Doesn't INTERNAL seem more appropriate here? That's what Envoy does (HTTP status code 500). Same question for an invalid header mutation.

Comment thread A93-xds-ext-proc.md
client sends a client headers event and a client message event and the
ext_proc server responds to the client message event first, that is
considered a protocol error. The filter will treat that as if the
ext_proc stream failed with a non-OK status.
Contributor

We are saying to use UNAVAILABLE everywhere except in the case of an immediate response that specifies the status code to use. For protocol errors like this and in other places, would it not be more appropriate to use the INTERNAL status? Envoy uses HTTP status code 500 in these cases.
