From 3f4f10704bd973038d43ce7a5f27a13f912215b0 Mon Sep 17 00:00:00 2001 From: nugaon Date: Thu, 30 Apr 2026 18:12:04 +0200 Subject: [PATCH 1/9] pubsub init --- SWIPs/swip-.md | 235 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 235 insertions(+) create mode 100644 SWIPs/swip-.md diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md new file mode 100644 index 0000000..839e991 --- /dev/null +++ b/SWIPs/swip-.md @@ -0,0 +1,235 @@ +--- +SWIP: +title: PubSub protocol +author: Viktor Tóth (@nugaon), Viktor Trón (@zelig) +discussions-to: +status: Draft +type: Standards Track (Networking) +created: 2026-04-30 +--- + +## Simple Summary + +A real-time messaging feature for dApps: WebSocket clients publish and subscribe to topic streams through Bee nodes, which act as the transport layer by leveraging their existing libp2p connections and bandwidth incentive system. + +## Abstract + +One designated node operates as a **Broker**: it accepts long-lived p2p streams and broadcasts them to all connected receivers. Other nodes connect as either a **Publisher** (send + receive) or a **Subscriber** (receive only). A WebSocket API on each Bee node serves as the bidirectional bridge between dApps and the p2p stream. Message format, validation and handshake logic are defined by a pluggable `Mode`; the initial mode `gsoc-ephemeral` uses SOC-style signing to authenticate pubsub messages in transit — these are not stored on the Swarm network as GSOC chunks. This SWIP also covers a decentralised broker discovery mechanism that locates a suitable broker for a topic based on Kademlia routing, with load balancing across multiple brokers deferred to a later milestone. + +## Motivation + +Swarm has two event-based primitives — GSOC and PSS — but both require full-node operation: the events arrive via Kademlia routing as part of pull/push syncing, which light clients do not participate in. 
For anyone not running a full node the only option is polling storage, which is slow and fundamentally not real-time. This leaves two unaddressed needs: real-time message exchange that does not require storing chunks on the network, and a way to channel network events that full nodes observe naturally out to light clients. + +A brokered pub/sub layer fills several gaps at once: + +- **Real-time applications** can exchange messages without long-term storage or polling. +- **Swarm network events** (e.g. incoming GSOC notifications) can be fanned out to light clients that would otherwise never see them. +- **Bandwidth incentives** — brokers are compensated for the data they transmit, creating a sustainable relay economy within Swarm. +- **Store-less uploads** — a publisher mode could let light clients push chunks to the network and pay by bandwidth rather than postage stamp. + +The mode system ensures the protocol is not locked to any single message format and can evolve to cover these use cases incrementally. + +## Specification + +### Roles + +``` +Subscriber ──► (p2p stream, read-only) ──►┐ + Broker ──► rebroadcast to all subscribers +Publisher ──► (p2p stream, read+write) ──►┘ +``` + +| Role | Description | +|---|---| +| **Broker** | Opt-in (`--pubsub-broker-mode`). Validates publisher identity; re-broadcasts to all subscribers. | +| **Subscriber** | Dials broker; receives all broadcasts. | +| **Publisher** | Upgraded subscriber; sends mode-specific messages to the broker; also receives broadcasts. 
| + +### Protocol + +- **libp2p**: `pubsub/1.0.0`, stream name `msg` +- Topic address and mode are negotiated via **libp2p stream headers** (not the stream name) + +#### Stream headers (client → broker) + +| Key | Value | +|---|---| +| `pubsub-topic-address` | 32-byte topic address | +| `pubsub-mode` | 1-byte mode ID | +| `pubsub-readwrite` | `0x01` publisher / `0x00` subscriber | +| `pubsub-gsoc-owner` | 20-byte ETH address _(GSOC-Ephemeral mode, publisher only)_ | +| `pubsub-gsoc-id` | 32-byte SOC ID _(GSOC-Ephemeral mode, publisher only)_ | + +#### Wire format + +All broker→subscriber frames share a common 1-byte type prefix. `0x01` is permanently reserved at the service level (ping, valid across all modes); the broker sends a ping every 30 s to keep the long-lived stream alive. +Mode-specific types start at `0x02`. + +``` +Broker → any subscriber: +[ 0x01 ] ping (service level, all modes — no further fields) +[ 0x02+ ] mode-specific frame +``` + +Publisher→Broker framing is mode-specific and carries **no message type prefix** — the broker knows the stream is a publisher stream from the `pubsub-readwrite` header set at connect time. + +#### GSOC Ephemeral mode (mode 1) + +Messages are SOC chunks. The topic address is `soc.CreateAddress(socID, ownerAddr)`, so only the holder of the topic private key can publish. The broker verifies the ECDSA signature on every message before broadcasting. + +``` +Publisher → Broker: +[ sig: 65 B ][ span: 8 B LE ][ payload: up to 4 KB ] + +Broker → Subscriber: +[ 0x02 ][ SOC ID: 32 B ][ owner: 20 B ][ sig: 65 B ][ span: 8 B ][ payload ] handshake (first msg) +[ 0x03 ][ sig: 65 B ][ span: 8 B ][ payload ] data (subsequent) +``` + +The handshake frame carries SOC identity once on first broadcast; subsequent messages are data-only. The subscriber verifies `soc.CreateAddress(id, owner) == topicAddress` on handshake receipt. 
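The broker→subscriber framing described above can be sketched as a small decoder. This is a minimal illustration, not the reference implementation: the names `Frame` and `DecodeFrame` are hypothetical, the span is assumed to be little-endian in both directions (the source only states this for the publisher→broker direction), and each frame is assumed to arrive already delimited by the stream layer.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Frame type bytes and field widths as listed in the wire-format section.
const (
	typePing      = 0x01
	typeHandshake = 0x02
	typeData      = 0x03

	idLen    = 32
	ownerLen = 20
	sigLen   = 65
	spanLen  = 8
)

// Frame is a hypothetical in-memory view of one broker→subscriber frame.
type Frame struct {
	Type    byte
	ID      []byte // handshake only
	Owner   []byte // handshake only
	Sig     []byte
	Span    uint64
	Payload []byte
}

// DecodeFrame splits one already-delimited frame into its fields.
// It does not verify the signature or the SOC address.
func DecodeFrame(b []byte) (Frame, error) {
	if len(b) == 0 {
		return Frame{}, errors.New("empty frame")
	}
	f := Frame{Type: b[0]}
	rest := b[1:]
	switch f.Type {
	case typePing:
		if len(rest) != 0 {
			return Frame{}, errors.New("ping carries no further fields")
		}
		return f, nil
	case typeHandshake:
		if len(rest) < idLen+ownerLen {
			return Frame{}, errors.New("handshake frame too short")
		}
		f.ID, f.Owner = rest[:idLen], rest[idLen:idLen+ownerLen]
		rest = rest[idLen+ownerLen:]
	case typeData:
		// no identity fields on data frames
	default:
		return Frame{}, fmt.Errorf("unknown frame type 0x%02x", f.Type)
	}
	if len(rest) < sigLen+spanLen {
		return Frame{}, errors.New("frame too short for sig and span")
	}
	f.Sig = rest[:sigLen]
	f.Span = binary.LittleEndian.Uint64(rest[sigLen : sigLen+spanLen]) // assumed LE
	f.Payload = rest[sigLen+spanLen:]
	return f, nil
}

func main() {
	// Build a data frame: type byte, zeroed signature, span, payload.
	span := make([]byte, spanLen)
	binary.LittleEndian.PutUint64(span, 5)
	raw := []byte{typeData}
	raw = append(raw, make([]byte, sigLen)...)
	raw = append(raw, span...)
	raw = append(raw, "hello"...)

	f, err := DecodeFrame(raw)
	fmt.Println(f.Type, f.Span, string(f.Payload), err)
}
```

A real subscriber would additionally check `soc.CreateAddress(id, owner) == topicAddress` on the handshake frame and verify the signature on every frame; both steps are left out here because they depend on Bee's `soc` package.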
+ +### WebSocket API + +``` +GET /pubsub/{topic} — WebSocket upgrade (subscriber or publisher) +GET /pubsub/ — list active topics +``` + +Connection parameters are accepted as HTTP headers or query params (query param fallback for browser WebSocket clients that cannot set custom headers): + +- `Swarm-Pubsub-Peer` (required): multiaddr of the broker +- `Swarm-Pubsub-Gsoc-Eth-Address` + `Swarm-Pubsub-Gsoc-Topic` (optional, GSOC Ephemeral mode): enable publisher role + +The WebSocket client sees the mode's raw payload; all p2p framing is transparent. For GSOC-Ephemeral mode: `[sig: 65 B][span: 8 B][payload]`. + +### Multi-session multiplexer + +Multiple WebSocket sessions on the same node and topic share one p2p stream: + +``` +WS session 1 ──┐ +WS session 2 ──┤ SubscriberConn (shared stream + runMux goroutine) ──► Broker +WS session N ──┘ +``` + +`runMux` reads from the stream and fans out to per-session channels. Ref-counting (`refs`) ensures `FullClose` is called exactly once when the last session exits. If the stream dies, the shared conn is cleared immediately so new sessions open a fresh stream. + +### Mode extensibility + +The `Mode` interface decouples the protocol machinery from message semantics: + +``` +type Mode interface { + Connect(...) // open stream with appropriate headers + HandleBroker(...) // broker-side stream handler + ReadBrokerMessage() // decode one broker→subscriber frame + FormatBroadcast() // encode one broker→subscriber frame + ValidatePublisher() // verify publisher identity + ... +} +``` + +New modes can be added by implementing `Mode` and registering a mode ID. Candidates include: unauthenticated broadcast, stake-gated publishing, Swarm-event fan-out, or bandwidth-incentivised chunk upload. + +## Roadmap + +### Milestone 1 — Direct messaging _(this SWIP)_ + +Two-directional messaging between a broker and its direct peers over a dedicated libp2p channel. Top-down message broadcast with per-message authentication. 
+ +Deliverables: pubsub protocol in Bee, WebSocket + topic-list API endpoints, pubsub JS library. + +### Milestone 2 — Bandwidth incentives + +The broker–subscriber stream is a metered channel: the subscriber pays the broker/forwarder per byte via chequebook cheques (incorporating Swarm's bandwidth incentive model). + +- Subscription connection query returns incentive params (price in PLUR/byte, cheque threshold). +- Bee gains a pubsub cashout option for accumulated cheques. +- Light clients require a funded chequebook and a blockchain connection. + +### Milestone 3 — Decentralised broker discovery + +Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers connection data from the topic's responsible neighbourhood using a two-step MIC-GSOC handshake (see MIC/MOC [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)). + +``` +Subscriber Chosen broker peer (P) Topic neighbourhood (E_a) + │ (from current connections) │ + │ │ │ + │ PubSub subscribe to │ │ + │ Sub Resp GSOC ─────►│ mined: PO(SubRes_a, P) = 16 │ + │ │ │ + │── Sub Request MIC ──┼──────────────────────────────►│ PO(Req_a, E_a) >= d+1 + │ payload: E_a, │ │ (routed by pull/push sync) + │ chequebook addr, │ │ + │ Sub Resp SOC params (ID + ephemeral key) │ + │ │ │ + │ │◄─ Sub Response GSOC(s) ───────│ brokers sign with ephemeral key + │ │ payload: overlay, underlay, │ (routed to P by pull/push sync) + │ │ incentive params, │ + │ │ HIVE connection list│ + │◄──── GSOC event ────│ │ + │ │ │ + │── libp2p connect ───┼──────────────────────────────►│ subscriber picks a pubsub network +``` + +The Sub Request signing key is derived from a well-known string, requiring no out-of-band coordination: + +``` +SubReqKey = keccak256("SUB_REQUEST") +``` + +The Sub Request is a MIC chunk (SOC signed by `SubReqKey`). Its ID is mined so the chunk address falls in the topic neighbourhood; pull/push sync routes it there naturally by proximity. 
The Sub Request identity must be mined until `PO(Req_a, E_a) >= storage_depth + 1` (or `= 16` if the current storage depth is unavailable). + +The Sub Response is a GSOC rather than a MIC deliberately: a MIC subscription listens by Ethereum address, so a well-known signing key would cause all concurrent discovery sessions on P to receive each other's responses. A GSOC subscription listens on a specific SOC address `soc.CreateAddress(randomID, ephemeralAddr)` — unique per subscriber — so responses are always isolated. + +The subscriber pre-mines a Sub Response SOC identifier and generates an ephemeral key, both included in the Sub Request payload. Broker nodes in the topic neighbourhood sign the Sub Response as a GSOC using the provided ephemeral key. The subscriber listens for GSOC events on the mined Sub Response address to collect broker replies. + +The Sub Response SOC address must be mined very close to P's overlay (`PO = 16`). This is required because the current GSOC implementation at the moment stores only one payload per address: if multiple brokers respond to the same address, the last writer wins and earlier responses are lost before pull syncing them. Full multi-response support would require GSOC to retain multiple payloads per address, which is left as a future improvement. + +Both sides require postage stamps for their uploads: the subscriber needs a mutable stamp for the Sub Request MIC, and each responding broker needs a mutable stamp for its Sub Response GSOC. Alternatively, once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, both stamp requirements can be lifted. + +New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns connection data from the topic's neighbourhood. + +### Milestone 4 — Load balancing and multi-level forwarding + +Balance subscriber load across multiple brokers. 
Introduce HIVE-like forwarder discovery and a multi-level forwarding tree so traffic is distributed across willing relay nodes rather than concentrated on a single broker. + +``` + Root (broker / neighbourhood node) + / | \ + Relay A Relay B Relay C + / \ | + Sub 1 Sub 2 Sub 3 ... +``` + +- Forwarders earn relay fees; they are incentivised to forward to more than one downstream client. +- Light-client-to-light-client connections (both behind NAT) use DCUtR with the broker as the relay, enabling direct p2p streams without a persistent intermediary. + +## Rationale + +- **Broker topology** keeps the subscriber implementation simple and connection count low; brokers can be specialised nodes. +- **GSOC Ephemeral mode** reuses existing SOC signing infrastructure and provides per-message authenticity without additional key exchange. It is the first mode, not the only one. +- **Shared p2p stream per topic per node** avoids redundant connections when multiple browser tabs open the same topic. +- **Type-byte framing** with a reserved service-level slot (`0x01` = ping) allows future modes to be added without breaking the keepalive mechanism. + +## Backwards Compatibility + +This is a new protocol (`pubsub/1.0.0`) with no overlap with existing Bee protocols. Broker mode is opt-in. No existing behaviour is affected. + +## Test Cases + +- Broker correctly re-broadcasts a valid publisher message to all connected subscribers. +- Broker rejects a message that fails mode validation (e.g. invalid SOC signature in GSOC-Ephemeral mode). +- Multiple WebSocket sessions on the same topic share one p2p stream (ref count increments/decrements correctly). +- Stream failure clears the shared conn; next session opens a fresh stream. +- Ping frames are consumed at service level and not forwarded to the WebSocket client. 
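The ref-counting behaviour exercised by the test cases above might look roughly like the following sketch. The type `sharedConn` and its methods are hypothetical stand-ins for the shared `SubscriberConn`; the only property illustrated is that the close callback (standing in for `FullClose`) runs exactly once, when the last session leaves, and that a closed connection is never handed out again.

```go
package main

import (
	"fmt"
	"sync"
)

// sharedConn is a hypothetical stand-in for a p2p stream shared by
// several WebSocket sessions on the same topic.
type sharedConn struct {
	mu      sync.Mutex
	refs    int
	closed  bool
	closeFn func() // stands in for FullClose on the underlying stream
}

// Acquire registers one more session. It returns false if the
// connection is already closed, so the caller must open a fresh stream.
func (c *sharedConn) Acquire() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed {
		return false
	}
	c.refs++
	return true
}

// Release drops one session and closes the stream when the count
// reaches zero. Extra calls after that point are ignored.
func (c *sharedConn) Release() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.closed || c.refs == 0 {
		return
	}
	c.refs--
	if c.refs == 0 {
		c.closed = true
		c.closeFn()
	}
}

func main() {
	closes := 0
	c := &sharedConn{closeFn: func() { closes++ }}
	c.Acquire()
	c.Acquire()
	c.Release()
	c.Release()
	fmt.Println("close calls:", closes)
}
```

Calling the close callback while holding the mutex keeps the sketch short; an implementation with a slow close path would probably release the lock first.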
+ +## Implementation + +Reference implementation (Milestone 1): +- Bee node: [ethersphere/bee#5435](https://github.com/ethersphere/bee/pull/5435) (`feat/pubsub` branch) +- bee-js client: [ethersphere/bee-js#1151](https://github.com/ethersphere/bee-js/pull/1151) + +## Copyright + +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). From f625e6684ee9435deef99380cebeeabe44035c47 Mon Sep 17 00:00:00 2001 From: nugaon Date: Tue, 19 May 2026 16:30:42 +0200 Subject: [PATCH 2/9] chunk payments instead of bytes --- SWIPs/swip-.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 839e991..af87ffe 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -141,9 +141,9 @@ Deliverables: pubsub protocol in Bee, WebSocket + topic-list API endpoints, pubs ### Milestone 2 — Bandwidth incentives -The broker–subscriber stream is a metered channel: the subscriber pays the broker/forwarder per byte via chequebook cheques (incorporating Swarm's bandwidth incentive model). +The broker–subscriber stream is a metered channel: the subscriber pays the broker/forwarder per chunk via chequebook cheques (incorporating Swarm's bandwidth incentive model). -- Subscription connection query returns incentive params (price in PLUR/byte, cheque threshold). +- Subscription connection query returns incentive params (price in PLUR/chunk, cheque threshold). - Bee gains a pubsub cashout option for accumulated cheques. - Light clients require a funded chequebook and a blockchain connection. 
From f71eda782645987e131b50ef530f6211f43eb0e9 Mon Sep 17 00:00:00 2001 From: nugaon Date: Tue, 19 May 2026 18:42:29 +0200 Subject: [PATCH 3/9] milestone 3 with MOC --- SWIPs/swip-.md | 157 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 130 insertions(+), 27 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index af87ffe..db9370f 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -149,46 +149,149 @@ The broker–subscriber stream is a metered channel: the subscriber pays the bro ### Milestone 3 — Decentralised broker discovery -Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers connection data from the topic's responsible neighbourhood using a two-step MIC-GSOC handshake (see MIC/MOC [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)). +Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers connection data from a specific broker node using a MOC-based dead-drop handshake (see MOC [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)). Unlike neighbourhood-broadcast approaches, the request targets a single broker: the subscriber mines a SOC owner key so the resulting chunk address is closest to the broker's overlay address. During push-sync the chunk is delivered directly to the closest node (analogous to [bee#5081](https://github.com/ethersphere/bee/pull/5081)), ensuring the broker receives it without neighbourhood-wide replication. The subscriber encrypts the payload to the broker's public key, and the broker responds by overwriting the same chunk address. + +#### On-chain broker registry + +The subscriber must know the target broker's overlay address and public key before initiating discovery. A smart contract — either extending the balanced neighbourhood registry ([SWIP-39](https://github.com/ethersphere/SWIPs/pull/74)) with a `pubkey` field or deployed as a standalone PubSub registry — maps topics to broker nodes. 
Each registry entry contains: + +| Field | Description | +|---|---| +| `overlay` | 32-byte Swarm overlay address | +| `pubkey` | 65-byte secp256k1 public key (Swarm key) | + +Brokers register on-chain when they opt into `--pubsub-broker-mode`. The subscriber queries the registry to obtain `(overlay_B, PK_B)` for a chosen broker. + +The broker's secp256k1 public key must be the same key whose private counterpart the node uses for Swarm chunk-level operations (its Swarm key), so that ECIES decryption in the detection step works without additional key management. + +#### Protocol constants + +``` +SOC_ID = keccak256("PUBSUB-REQUEST") // 32-byte fixed SOC identifier +``` + +All discovery requests across the network share this single SOC ID. Isolation between concurrent sessions is achieved by the uniqueness of the mined owner key, not by the ID. + +#### Workflow + +``` +Subscriber (S) Broker (B) + │ │ + │ 1. Registry lookup │ + │ (overlay_B, PK_B) ◄── on-chain registry │ + │ │ + │ 2. Mine secp256k1 key pair (k, K = k·G): │ + │ a = ethAddr(K) │ + │ SOC_a = keccak256(SOC_ID ‖ a) │ + │ PO(SOC_a, overlay_B) ≥ depth + 1 │ + │ │ + │ 3. Build request payload P: │ + │ { topic, k, chequebook_addr, ... } │ + │ C_req = ECIES_Encrypt(PK_B, P) │ + │ │ + │── 4. Upload MOC(id=SOC_ID, key=k, data=C_req) ────►│ + │ (pull/push sync routes to B's neighbourhood) │ + │ │ + │ 5. Detect incoming SOC: │ + │ id == SOC_ID ? │ + │ chunk in my neighbourhood ? │ + │ ECIES_Decrypt(sk_B, C_req) │ + │ → success: chunk is for me │ + │ → failure: ignore │ + │ │ + │ 6. Extract k from payload │ + │ Build response R: │ + │ { overlay, underlay, │ + │ incentive_params, │ + │ hive_conn_list } │ + │ sym_key = keccak256(k) │ + │ C_res = AES-256-GCM(sym_key, │ + │ nonce, R) │ + │ Sign new SOC with k │ + │ → same SOC_a (overwrites) │ + │ │ + │ ─── 7. Store response SOC │ + │ locally (same chunk addr) │ + │ │ + │◄── 8. 
Fetch SOC_a ─────────────────────────────────│ + │ Decrypt: AES-256-GCM(keccak256(k), nonce, │ + │ C_res) → R │ + │ Extract broker connection info │ + │ │ + │── 9. libp2p connect(underlay_B) ───────────────────►│ +``` + +#### Mining the request key + +The subscriber iterates secp256k1 private keys deterministically starting from a seed derived from its own public key until the resulting SOC address is closest to the target broker's overlay: ``` -Subscriber Chosen broker peer (P) Topic neighbourhood (E_a) - │ (from current connections) │ - │ │ │ - │ PubSub subscribe to │ │ - │ Sub Resp GSOC ─────►│ mined: PO(SubRes_a, P) = 16 │ - │ │ │ - │── Sub Request MIC ──┼──────────────────────────────►│ PO(Req_a, E_a) >= d+1 - │ payload: E_a, │ │ (routed by pull/push sync) - │ chequebook addr, │ │ - │ Sub Resp SOC params (ID + ephemeral key) │ - │ │ │ - │ │◄─ Sub Response GSOC(s) ───────│ brokers sign with ephemeral key - │ │ payload: overlay, underlay, │ (routed to P by pull/push sync) - │ │ incentive params, │ - │ │ HIVE connection list│ - │◄──── GSOC event ────│ │ - │ │ │ - │── libp2p connect ───┼──────────────────────────────►│ subscriber picks a pubsub network +seed ← keccak256(PK_subscriber) // deterministic starting point +i ← 0 +repeat: + k ← keccak256(seed ‖ i) // 32-byte candidate private key + K ← secp256k1_pubkey(k) + a ← keccak256(K) [12:] // 20-byte Ethereum address + sa ← keccak256(SOC_ID ‖ a) // 32-byte SOC chunk address + i ← i + 1 +until PO(sa, overlay_B) ≥ storage_depth + 1 ``` -The Sub Request signing key is derived from a well-known string, requiring no out-of-band coordination: +Each iteration requires one secp256k1 scalar multiplication plus three Keccak-256 hashes (one each for `k`, `a` and `sa`). The expected number of iterations is `2^d`, where `d` is the required proximity order (`storage_depth + 1` in the loop above). At `d = 12` this is ~4 096 iterations — well under a second on commodity hardware. 
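The shape of the mining loop and the `2^d` cost estimate can be illustrated with a small, self-contained sketch. It is deliberately not the real derivation: `standInAddress` uses SHA-256 from the standard library in place of the Keccak-256 and secp256k1 steps (neither is in Go's standard library), so the addresses it produces are not valid SOC addresses. The `proximity` helper counts shared leading bits and is written for this sketch rather than taken from Bee; all names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// proximity returns the number of leading bits shared by a and b.
func proximity(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if x := a[i] ^ b[i]; x != 0 {
			return i*8 + bits.LeadingZeros8(x)
		}
	}
	return n * 8
}

// standInAddress maps (seed, i) to a 32-byte pseudo-random address.
// ASSUMPTION: SHA-256 replaces the keccak256/secp256k1 chain of the
// pseudocode so that the example runs with the standard library only.
func standInAddress(seed []byte, i uint64) []byte {
	var ctr [8]byte
	binary.BigEndian.PutUint64(ctr[:], i)
	h := sha256.Sum256(append(append([]byte{}, seed...), ctr[:]...))
	return h[:]
}

// mine tries counters 0..maxIter-1 and returns the first one whose
// address shares at least minPO leading bits with target.
func mine(seed, target []byte, minPO int, maxIter uint64) (uint64, []byte, bool) {
	for i := uint64(0); i < maxIter; i++ {
		addr := standInAddress(seed, i)
		if proximity(addr, target) >= minPO {
			return i, addr, true
		}
	}
	return 0, nil, false
}

func main() {
	target := make([]byte, 32) // hypothetical broker overlay (all zeros)
	i, addr, ok := mine([]byte("subscriber-seed"), target, 8, 1<<20)
	fmt.Println(ok, i, proximity(addr, target))
}
```

Each candidate matches the first `d` bits of the target with probability `2^-d`, which is where the expected `2^d` iterations come from; the `maxIter` bound only exists so that the sketch always terminates.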
+ +#### Request encryption — ECIES on secp256k1 + +The request payload is encrypted with the Elliptic Curve Integrated Encryption Scheme (ECIES) — the same scheme and library used in Ethereum's devp2p/RLPx handshake (`go-ethereum/crypto/ecies`): + +1. Generate ephemeral key pair `(e, E = e·G)`. +2. Shared secret `S = ECDH(e, PK_B)`. +3. Key derivation: `(enc_key ‖ mac_key) = ConcatKDF-SHA256(S)` (the NIST SP 800-56 concatenation KDF, as implemented in `go-ethereum/crypto/ecies`). +4. `ciphertext = AES-128-CTR(enc_key, plaintext)`. +5. `tag = HMAC-SHA256(mac_key, ciphertext)`. +6. Output: `E ‖ ciphertext ‖ tag`. + +Only the holder of `sk_B` can derive the shared secret and decrypt. The ephemeral key `e` is discarded after encryption, so every discovery session uses a fresh shared secret; this does not provide forward secrecy against a later compromise of `sk_B`, which would expose all past requests addressed to that broker. Neighbourhood peers that store or forward the chunk cannot read its contents. + +#### Response encryption — AES-256-GCM (symmetric) + +The response is encrypted symmetrically using the mined private key `k` as key material. Both parties possess `k`: the subscriber mined it; the broker extracted it from the ECIES payload. ``` -SubReqKey = keccak256("SUB_REQUEST") +sym_key = keccak256(k) // 32 bytes → AES-256 key +nonce = keccak256(keccak256(k)) [:12] // 12 bytes, deterministic +C_res = AES-256-GCM_Encrypt(sym_key, nonce, response_payload) ``` -The Sub Request is a MIC chunk (SOC signed by `SubReqKey`). Its ID is mined so the chunk address falls in the topic neighbourhood; pull/push sync routes it there naturally by proximity. The Sub Request identity must be mined until `PO(Req_a, E_a) >= storage_depth + 1` (or `= 16` if the current storage depth is unavailable). +AES-256-GCM provides authenticated encryption: the response stays confidential, and on decryption the subscriber verifies its integrity and authenticity. Because `k` is unique per discovery session (freshly mined), the `(sym_key, nonce)` pair is not reused as long as the broker encrypts at most one response per `k`, which is what GCM's nonce-uniqueness requirement demands. + +No party other than the subscriber and the target broker can derive `sym_key`, since `k` was transmitted inside the ECIES envelope. 
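The symmetric response step can be sketched with Go's standard `crypto/aes` and `crypto/cipher` packages. This is an illustration under stated assumptions: `sealResponse` and `openResponse` are hypothetical names, and the 32-byte `symKey` and 12-byte `nonce` are taken as inputs because deriving them requires Keccak-256, which is not in the standard library. No additional authenticated data is used, since the source does not mention any.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(symKey []byte) (cipher.AEAD, error) {
	if len(symKey) != 32 {
		return nil, errors.New("sym_key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(symKey)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealResponse encrypts and authenticates the response payload.
func sealResponse(symKey, nonce, response []byte) ([]byte, error) {
	gcm, err := newGCM(symKey)
	if err != nil {
		return nil, err
	}
	if len(nonce) != gcm.NonceSize() {
		return nil, errors.New("nonce must be 12 bytes")
	}
	return gcm.Seal(nil, nonce, response, nil), nil
}

// openResponse decrypts the payload and fails if it was tampered with.
func openResponse(symKey, nonce, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(symKey)
	if err != nil {
		return nil, err
	}
	if len(nonce) != gcm.NonceSize() {
		return nil, errors.New("nonce must be 12 bytes")
	}
	return gcm.Open(nil, nonce, sealed, nil)
}

func main() {
	// Placeholder key material; in the protocol both values derive from k.
	symKey := make([]byte, 32)
	nonce := make([]byte, 12)

	sealed, err := sealResponse(symKey, nonce, []byte("overlay, underlay, ..."))
	if err != nil {
		panic(err)
	}
	plain, err := openResponse(symKey, nonce, sealed)
	fmt.Println(string(plain), err)
}
```

Because the nonce is derived deterministically from `k`, sealing two different responses under the same `k` would reuse a `(key, nonce)` pair; a caller of this sketch would need to guarantee one response per mined key.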
+ +#### Broker detection logic + +A node running in broker mode applies the following filter to every incoming SOC chunk synced to its neighbourhood: + +1. **ID check** — SOC ID equals `keccak256("PUBSUB-REQUEST")`? +2. **Neighbourhood check** — chunk address within this node's storage responsibility? +3. **Decryption attempt** — `ECIES_Decrypt(sk_self, payload)`. Failure means the chunk is addressed to a different broker; discard silently. +4. **Payload validation** — extract `(topic, k, ...)`. Verify that `keccak256(SOC_ID ‖ ethAddr(k·G))` matches the chunk address. +5. **Response** — construct, encrypt, and store the modified response MOC signed with `k`. + +Step 3 is the key isolation mechanism: even though all broker nodes in the neighbourhood may see the same constant SOC ID, only the broker whose public key was used for ECIES encryption can successfully decrypt and act on the request. -The Sub Response is a GSOC rather than a MIC deliberately: a MIC subscription listens by Ethereum address, so a well-known signing key would cause all concurrent discovery sessions on P to receive each other's responses. A GSOC subscription listens on a specific SOC address `soc.CreateAddress(randomID, ephemeralAddr)` — unique per subscriber — so responses are always isolated. +#### Postage stamps -The subscriber pre-mines a Sub Response SOC identifier and generates an ephemeral key, both included in the Sub Request payload. Broker nodes in the topic neighbourhood sign the Sub Response as a GSOC using the provided ephemeral key. The subscriber listens for GSOC events on the mined Sub Response address to collect broker replies. +The subscriber needs a postage stamp for the MOC request upload. The broker needs a postage stamp for the MOC response upload. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, both stamp requirements can be lifted. -The Sub Response SOC address must be mined very close to P's overlay (`PO = 16`). 
This is required because the current GSOC implementation at the moment stores only one payload per address: if multiple brokers respond to the same address, the last writer wins and earlier responses are lost before pull syncing them. Full multi-response support would require GSOC to retain multiple payloads per address, which is left as a future improvement. +#### Known limitations -Both sides require postage stamps for their uploads: the subscriber needs a mutable stamp for the Sub Request MIC, and each responding broker needs a mutable stamp for its Sub Response GSOC. Alternatively, once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, both stamp requirements can be lifted. +1. **Single-node targeting** — The request reaches exactly one broker. If that broker is offline or unresponsive, the subscriber must time out and retry with another registry entry, or initiate multiple parallel requests toward different brokers. +2. **On-chain dependency** — The subscriber must read the broker registry contract to learn `(overlay, pubkey)`. Light clients already require blockchain access for Swarm (postage stamp verification), so the marginal cost is low. The registry can be cached or mirrored off-chain. +3. **Concurrent requester collision** — If two subscribers targeting the same broker mine the same owner key (i.e. arrive at the same SOC address), the second request overwrites the first. Because key derivation is seeded from the subscriber's own public key, this can only happen if two nodes share the same Swarm key — an invalid network state. In practice the probability is negligible. +4. **ECIES decryption cost** — The SOC ID `keccak256("PUBSUB-REQUEST")` is a global constant shared by all discovery requests. Every broker node must attempt ECIES decryption (one ECDH scalar multiplication) on every incoming SOC with this ID that falls within its neighbourhood, even if the chunk is addressed to a different broker. 
The cost scales with the number of concurrent discovery requests across all brokers in a neighbourhood. Under normal load this is negligible, but the asymmetry is exploitable (see point 6). +5. **Replay attacks** — A neighbourhood peer that observes a discovery request chunk can re-upload the identical bytes, causing the broker to re-process the request and overwrite its previous response. The attacker cannot read the request (ECIES-encrypted) or the response (AES-256-GCM-encrypted with `k`), so the damage is limited to wasted broker computation and potential disruption if the legitimate subscriber has not yet fetched the response. Implementations should consider deduplicating detection by chunk address to suppress repeated processing. +6. **DoS via discovery flooding** — An attacker can cheaply mine many keys targeting a specific broker's neighbourhood and flood it with discovery request chunks. Each chunk forces an ECIES decryption attempt on the broker. The attacker's cost is key mining (~2^d iterations at depth d) plus a postage stamp per chunk; the broker's cost is one ECDH operation per chunk. Postage stamp economics provide a baseline rate limit, but brokers may additionally rate-limit detection processing or require a proof-of-work token inside the ECIES payload. -New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns connection data from the topic's neighbourhood. +New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic. 
### Milestone 4 — Load balancing and multi-level forwarding From 6511b99ea19847034f284af37897cbb38d486c46 Mon Sep 17 00:00:00 2001 From: nugaon Date: Thu, 21 May 2026 11:11:15 +0200 Subject: [PATCH 4/9] postage stamp requirement fix --- SWIPs/swip-.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index db9370f..1295f42 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -280,7 +280,7 @@ Step 3 is the key isolation mechanism: even though all broker nodes in the neigh #### Postage stamps -The subscriber needs a postage stamp for the MOC request upload. The broker needs a postage stamp for the MOC response upload. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, both stamp requirements can be lifted. +The subscriber needs a postage stamp for the MOC request upload. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, the stamp requirement can be lifted. #### Known limitations From a82dcb007a9969430130613b50b5e314acb98390 Mon Sep 17 00:00:00 2001 From: nugaon Date: Thu, 21 May 2026 19:16:18 +0200 Subject: [PATCH 5/9] caching problem --- SWIPs/swip-.md | 1 + 1 file changed, 1 insertion(+) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 1295f42..0b47e4b 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -290,6 +290,7 @@ The subscriber needs a postage stamp for the MOC request upload. Once [SWIP-36]( 4. **ECIES decryption cost** — The SOC ID `keccak256("PUBSUB-REQUEST")` is a global constant shared by all discovery requests. Every broker node must attempt ECIES decryption (one ECDH scalar multiplication) on every incoming SOC with this ID that falls within its neighbourhood, even if the chunk is addressed to a different broker. The cost scales with the number of concurrent discovery requests across all brokers in a neighbourhood. Under normal load this is negligible, but the asymmetry is exploitable (see point 6). 5. 
**Replay attacks** — A neighbourhood peer that observes a discovery request chunk can re-upload the identical bytes, causing the broker to re-process the request and overwrite its previous response. The attacker cannot read the request (ECIES-encrypted) or the response (AES-256-GCM-encrypted with `k`), so the damage is limited to wasted broker computation and potential disruption if the legitimate subscriber has not yet fetched the response. Implementations should consider deduplicating detection by chunk address to suppress repeated processing. 6. **DoS via discovery flooding** — An attacker can cheaply mine many keys targeting a specific broker's neighbourhood and flood it with discovery request chunks. Each chunk forces an ECIES decryption attempt on the broker. The attacker's cost is key mining (~2^d iterations at depth d) plus a postage stamp per chunk; the broker's cost is one ECDH operation per chunk. Postage stamp economics provide a baseline rate limit, but brokers may additionally rate-limit detection processing or require a proof-of-work token inside the ECIES payload. +7. **Caching problem** — The response retrieval query must use the negation of the previously uploaded payload (constrained SOC query, detailed in [SWIP-191](https://github.com/ethersphere/SWIPs/pull/90)). This ensures the subscriber retrieves the broker's overwritten response rather than a cached copy of its own request from intermediate nodes. New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic. 
From 23d66defbfcdc467747af94ca1409b98128d1206 Mon Sep 17 00:00:00 2001 From: nugaon Date: Thu, 28 May 2026 12:40:22 +0200 Subject: [PATCH 6/9] micmocgsoc --- SWIPs/swip-.md | 177 +++++++++++++++++++------------------------------ 1 file changed, 68 insertions(+), 109 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 0b47e4b..75941a8 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -149,99 +149,71 @@ The broker–subscriber stream is a metered channel: the subscriber pays the bro ### Milestone 3 — Decentralised broker discovery -Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers connection data from a specific broker node using a MOC-based dead-drop handshake (see MOC [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)). Unlike neighbourhood-broadcast approaches, the request targets a single broker: the subscriber mines a SOC owner key so the resulting chunk address is closest to the broker's overlay address. During push-sync the chunk is delivered directly to the closest node (analogous to [bee#5081](https://github.com/ethersphere/bee/pull/5081)), ensuring the broker receives it without neighbourhood-wide replication. The subscriber encrypts the payload to the broker's public key, and the broker responds by overwriting the same chunk address. - -#### On-chain broker registry - -The subscriber must know the target broker's overlay address and public key before initiating discovery. A smart contract — either extending the balanced neighbourhood registry ([SWIP-39](https://github.com/ethersphere/SWIPs/pull/74)) with pubkey field or deployed as a standalone PubSub registry — maps topics to broker nodes. Each registry entry contains: - -| Field | Description | -|---|---| -| `overlay` | 32-byte Swarm overlay address | -| `pubkey` | 65-byte secp256k1 public key (Swarm key) | - -Brokers register on-chain when they opt into `--pubsub-broker-mode`. 
The subscriber queries the registry to obtain `(overlay_B, PK_B)` for a chosen broker. - -The broker's secp256k1 public key must be the same key whose private counterpart the node uses for Swarm chunk-level operations (its Swarm key), so that ECIES decryption in the detection step works without additional key management. +Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers an eligible broker node through a two-phase handshake using MIC and MOC chunks (see [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)) targeting the topic's neighbourhood. No on-chain registry is required — broker public keys are discovered in-band via storage receipts. The protocol requires targeted chunk delivery and retrieval to/from the closest responsible node (see e.g. [bee#5081](https://github.com/ethersphere/bee/pull/5081)). #### Protocol constants ``` -SOC_ID = keccak256("PUBSUB-REQUEST") // 32-byte fixed SOC identifier +DISCOVERY_KEY = keccak256("PUBSUB-REQUEST") // well-known private key +DISCOVERY_OWNER = ethAddr(secp256k1_pubkey(DISCOVERY_KEY)) // derived ETH address ``` -All discovery requests across the network share this single SOC ID. Isolation between concurrent sessions is achieved by the uniqueness of the mined owner key, not by the ID. +Broker nodes continuously watch for incoming SOCs whose owner matches `DISCOVERY_OWNER`. This is a single, network-wide subscription filter. #### Workflow +```mermaid +sequenceDiagram + participant S as Subscriber + participant N as Topic Neighbourhood + participant B as Broker + + Note over S: 1. Mine MIC ID so
SOC addr ∈ topic neighbourhood (depth ≥ 16) + S->>N: 2. Upload MIC(key=DISCOVERY_KEY, id=mined_id, payload=msg_id) + N->>B: (sync delivers to closest broker) + Note over B: 3. Detect: owner == DISCOVERY_OWNER?
Store chunk, extract msg_id + B-->>S: 4. Storage receipt (contains PK_B) + Note over B: Subscribe to MOC messages with id=msg_id (timeout 30s) + + Note over S: 5. Mine response SOC params:
(resp_id, resp_key) so SOC addr
closest to broker overlay + S->>B: 6. Upload MOC(id=msg_id, payload=ECIES(PK_B, {topic, resp_key, resp_id, ...})) + Note over B: 7. Decrypt MOC payload
Build response R={overlay, underlay, ...}
Encrypt R with resp_key
Create SOC(id=resp_id, owner=ethAddr(resp_key)) + + Note over B: 8. Store response SOC locally + S->>B: 9. Fetch response SOC (Kademlia lookup) + B-->>S: Response chunk + Note over S: 10. Decrypt response, extract
broker connection info + S->>B: 11. libp2p connect(underlay_B) ``` -Subscriber (S) Broker (B) - │ │ - │ 1. Registry lookup │ - │ (overlay_B, PK_B) ◄── on-chain registry │ - │ │ - │ 2. Mine secp256k1 key pair (k, K = k·G): │ - │ a = ethAddr(K) │ - │ SOC_a = keccak256(SOC_ID ‖ a) │ - │ PO(SOC_a, overlay_B) ≥ depth + 1 │ - │ │ - │ 3. Build request payload P: │ - │ { topic, k, chequebook_addr, ... } │ - │ C_req = ECIES_Encrypt(PK_B, P) │ - │ │ - │── 4. Upload MOC(id=SOC_ID, key=k, data=C_req) ────►│ - │ (pull/push sync routes to B's neighbourhood) │ - │ │ - │ 5. Detect incoming SOC: │ - │ id == SOC_ID ? │ - │ chunk in my neighbourhood ? │ - │ ECIES_Decrypt(sk_B, C_req) │ - │ → success: chunk is for me │ - │ → failure: ignore │ - │ │ - │ 6. Extract k from payload │ - │ Build response R: │ - │ { overlay, underlay, │ - │ incentive_params, │ - │ hive_conn_list } │ - │ sym_key = keccak256(k) │ - │ C_res = AES-256-GCM(sym_key, │ - │ nonce, R) │ - │ Sign new SOC with k │ - │ → same SOC_a (overwrites) │ - │ │ - │ ─── 7. Store response SOC │ - │ locally (same chunk addr) │ - │ │ - │◄── 8. Fetch SOC_a ─────────────────────────────────│ - │ Decrypt: AES-256-GCM(keccak256(k), nonce, │ - │ C_res) → R │ - │ Extract broker connection info │ - │ │ - │── 9. libp2p connect(underlay_B) ───────────────────►│ -``` - -#### Mining the request key -The subscriber iterates secp256k1 private keys deterministically starting from a seed derived from its own public key until the resulting SOC address is closest to the target broker's overlay: - -``` -seed ← keccak256(PK_subscriber) // deterministic starting point -i ← 0 -repeat: - k ← keccak256(seed ‖ i) // 32-byte candidate private key - K ← secp256k1_pubkey(k) - a ← keccak256(K) [12:] // 20-byte Ethereum address - sa ← keccak256(SOC_ID ‖ a) // 32-byte SOC chunk address - i ← i + 1 -until PO(sa, overlay_B) ≥ storage_depth + 1 -``` - -Each iteration requires one secp256k1 scalar multiplication plus two Keccak-256 hashes. 
The expected number of iterations is `2^d` for a target depth `d`. At depth 12 this is ~4 096 iterations — well under a second on commodity hardware. +#### Phase 1 — Broker announcement (MIC) + +1. The subscriber generates a random 32-byte `msg_id`. +2. The subscriber mines a SOC ID such that `soc.CreateAddress(mined_id, DISCOVERY_OWNER)` falls within the topic's neighbourhood (PO ≥ 16 relative to topic address). +3. The subscriber uploads a MIC signed with `DISCOVERY_KEY`, with `mined_id` as SOC ID and `msg_id` as payload. Push-sync routes the chunk to the topic neighbourhood. +4. A broker node in the topic neighbourhood detects the incoming SOC (owner == `DISCOVERY_OWNER`), stores it, and returns a **storage receipt**. The receipt contains the broker's public key `PK_B`. +5. The broker subscribes (via the pubsub protocol internally) to incoming MOC messages with `id = msg_id`. This subscription times out after 30 seconds if no MOC arrives. + +#### Phase 2 — Encrypted handshake (MOC) + +6. The subscriber extracts `PK_B` and the broker's overlay address from the storage receipt. It then mines a response key pair `(resp_key, resp_pubkey)` and a `resp_id` such that `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` is closest to the broker's overlay. +7. The subscriber uploads a MOC with `id = msg_id`. The payload is ECIES-encrypted with `PK_B`: + ``` + ECIES_Encrypt(PK_B, { topic, resp_key, resp_id, chequebook_addr, ... }) + ``` +8. The broker (already subscribed to `msg_id`) receives the MOC, decrypts the payload, and extracts `resp_key` and `resp_id`. +9. The broker builds response `R = { overlay, underlay, incentive_params, hive_conn_list }`, encrypts it symmetrically: + ``` + sym_key = keccak256(resp_key) + nonce = keccak256(keccak256(resp_key)) [:12] + C_res = AES-256-GCM(sym_key, nonce, R) + ``` +10. The broker creates a new SOC signed with `resp_key` at address `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` and stores it locally. +11. 
The subscriber fetches the response SOC address via Kademlia lookup (the request is routed to the broker as the closest responsible node), decrypts with `sym_key`, and connects to the broker via libp2p. #### Request encryption — ECIES on secp256k1 -The request payload is encrypted with the Elliptic Curve Integrated Encryption Scheme (ECIES) — the same scheme and library used in Ethereum's devp2p/RLPx handshake (`go-ethereum/crypto/ecies`): +The MOC request payload is encrypted with the Elliptic Curve Integrated Encryption Scheme (ECIES) — the same scheme and library used in Ethereum's devp2p/RLPx handshake (`go-ethereum/crypto/ecies`): 1. Generate ephemeral key pair `(e, E = e·G)`. 2. Shared secret `S = ECDH(e, PK_B)`. @@ -250,47 +222,34 @@ The request payload is encrypted with the Elliptic Curve Integrated Encryption S 5. `tag = HMAC-SHA256(mac_key, ciphertext)`. 6. Output: `E ‖ ciphertext ‖ tag`. -Only the holder of `sk_B` can derive the shared secret and decrypt. The ephemeral key `e` is discarded after encryption, providing forward secrecy per discovery session. Neighbourhood peers that store or forward the chunk cannot read its contents. +Only the holder of `sk_B` can derive the shared secret and decrypt. The ephemeral key `e` is discarded after encryption, providing forward secrecy per discovery session. #### Response encryption — AES-256-GCM (symmetric) -The response is encrypted symmetrically using the mined private key `k` as key material. Both parties possess `k`: the subscriber mined it; the broker extracted it from the ECIES payload. - -``` -sym_key = keccak256(k) // 32 bytes → AES-256 key -nonce = keccak256(keccak256(k)) [:12] // 12 bytes, deterministic -C_res = AES-256-GCM_Encrypt(sym_key, nonce, response_payload) -``` +The response is encrypted symmetrically using the subscriber-mined `resp_key`. Both parties possess it: the subscriber mined it; the broker extracted it from the ECIES payload. 
-AES-256-GCM provides authenticated encryption: on decryption the subscriber verifies both confidentiality and integrity. Because `k` is unique per discovery session (freshly mined), the `(sym_key, nonce)` pair is never reused, satisfying GCM's uniqueness requirement. +AES-256-GCM provides authenticated encryption. Because `resp_key` is unique per discovery session (freshly mined), the `(sym_key, nonce)` pair is never reused, satisfying GCM's uniqueness requirement. -No party other than the subscriber and the target broker can derive `sym_key`, since `k` was transmitted inside the ECIES envelope. - -#### Broker detection logic - -A node running in broker mode applies the following filter to every incoming SOC chunk synced to its neighbourhood: +#### Postage stamps -1. **ID check** — SOC ID equals `keccak256("PUBSUB-REQUEST")`? -2. **Neighbourhood check** — chunk address within this node's storage responsibility? -3. **Decryption attempt** — `ECIES_Decrypt(sk_self, payload)`. Failure means the chunk is addressed to a different broker; discard silently. -4. **Payload validation** — extract `(topic, k, ...)`. Verify that `keccak256(SOC_ID ‖ ethAddr(k·G))` matches the chunk address. -5. **Response** — construct, encrypt, and store the modified response MOC signed with `k`. +The subscriber needs a postage stamp for the MIC and MOC uploads. The broker does not need a stamp for the response SOC — it is stored locally and served directly on fetch. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, the subscriber's stamp requirements can be lifted. -Step 3 is the key isolation mechanism: even though all broker nodes in the neighbourhood may see the same constant SOC ID, only the broker whose public key was used for ECIES encryption can successfully decrypt and act on the request. 
+#### Rationale -#### Postage stamps +The two-phase MIC/MOC handshake avoids several problems that a simpler single-round or registry-based discovery would face: -The subscriber needs a postage stamp for the MOC request upload. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, the stamp requirement can be lifted. +- **No on-chain registry** — the broker's public key and overlay are discovered in-band via the storage receipt, removing any blockchain dependency for discovery. +- **No concurrent requester collision** — the response is a separate SOC at a unique mined address per session; multiple subscribers never interfere with each other. +- **No caching problem** — the response SOC is a new chunk stored locally by the broker, not an overwrite of the request chunk, so stale cached copies are not an issue. +- **No single-node targeting** — any broker in the topic neighbourhood can respond to the MIC; if one is offline, another picks it up. +- **No blind ECIES decryption** — the broker only decrypts MOC payloads for `msg_id`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. -#### Known limitations +#### Security considerations -1. **Single-node targeting** — The request reaches exactly one broker. If that broker is offline or unresponsive, the subscriber must time out and retry with another registry entry or initiating multiple parallel requests toward different brokers. -2. **On-chain dependency** — The subscriber must read the broker registry contract to learn `(overlay, pubkey)`. Light clients already require blockchain access for Swarm (postage stamp verification), so the marginal cost is low. The registry can be cached or mirrored off-chain. -3. **Concurrent requester collision** — If two subscribers targeting the same broker mine the same owner key (i.e. arrive at the same SOC address), the second request overwrites the first. 
Because key derivation is seeded from the subscriber's own public key, this can only happen if two nodes share the same Swarm key — an invalid network state. In practice the probability is negligible -4. **ECIES decryption cost** — The SOC ID `keccak256("PUBSUB-REQUEST")` is a global constant shared by all discovery requests. Every broker node must attempt ECIES decryption (one ECDH scalar multiplication) on every incoming SOC with this ID that falls within its neighbourhood, even if the chunk is addressed to a different broker. The cost scales with the number of concurrent discovery requests across all brokers in a neighbourhood. Under normal load this is negligible, but the asymmetry is exploitable (see point 6). -5. **Replay attacks** — A neighbourhood peer that observes a discovery request chunk can re-upload the identical bytes, causing the broker to re-process the request and overwrite its previous response. The attacker cannot read the request (ECIES-encrypted) or the response (AES-256-GCM-encrypted with `k`), so the damage is limited to wasted broker computation and potential disruption if the legitimate subscriber has not yet fetched the response. Implementations should consider deduplicating detection by chunk address to suppress repeated processing. -6. **DoS via discovery flooding** — An attacker can cheaply mine many keys targeting a specific broker's neighbourhood and flood it with discovery request chunks. Each chunk forces an ECIES decryption attempt on the broker. The attacker's cost is key mining (~2^d iterations at depth d) plus a postage stamp per chunk; the broker's cost is one ECDH operation per chunk. Postage stamp economics provide a baseline rate limit, but brokers may additionally rate-limit detection processing or require a proof-of-work token inside the ECIES payload. -7. 
**Caching problem** — The response retrieval query must use the negation of the previously uploaded payload (constrained SOC query, detailed in [SWIP-191](https://github.com/ethersphere/SWIPs/pull/90)). This ensures the subscriber retrieves the broker's overwritten response rather than a cached copy of its own request from intermediate nodes. +1. **MIC flooding (DoS on Phase 1)** — An attacker can flood MIC chunks to a topic neighbourhood. Each MIC only causes the broker to create a lightweight subscription hook (msg_id → 30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. +2. **MOC flooding (DoS on Phase 2)** — Sending a MOC requires knowing a valid `msg_id` that a broker is subscribed to. An attacker observing the MIC payload learns `msg_id`, but the MOC still requires ECIES encryption with the broker's public key — a garbage MOC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `PK_B` (obtained only via storage receipt to the original requester). Additionally, the subscriber must mine a key for the MOC (computational cost). +3. **Response SOC mining cost** — The subscriber must mine `(resp_id, resp_key)` such that the response SOC address is close to the broker overlay. +4. **Timing window** — The broker's subscription to `msg_id` has a 30s timeout. The subscriber must complete Phase 2 (mine response key + upload MOC) within this window. Mining at depth 16 is fast, so this is not a practical concern. New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic. 
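Reviewer note on Phase 1, step 2 of the patch above: the subscriber mines a SOC ID until the resulting chunk address falls in the topic's neighbourhood (PO ≥ 16). The search loop can be sketched as below. This is a minimal illustration under stated assumptions, not the Bee implementation: `hashlib.sha256` stands in for keccak256 (which is not in the Python standard library), the SOC address is taken as `hash(id ‖ owner)`, and the names `proximity_order`, `soc_address`, and `mine_soc_id` are hypothetical.

```python
import hashlib
import os


def h(data: bytes) -> bytes:
    # ASSUMPTION: sha256 is a stand-in for keccak256, so the addresses
    # produced here do not match real Swarm addresses.
    return hashlib.sha256(data).digest()


def proximity_order(a: bytes, b: bytes) -> int:
    """Return the number of leading bits shared by two equal-length addresses."""
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            # Leading zero bits of the XOR are the bits that still match.
            return i * 8 + (8 - diff.bit_length())
    return len(a) * 8


def soc_address(soc_id: bytes, owner: bytes) -> bytes:
    # Hypothetical helper: address = hash(id ‖ owner).
    return h(soc_id + owner)


def mine_soc_id(owner: bytes, target: bytes, depth: int, seed: bytes):
    """Try candidate IDs until the SOC address has PO >= depth to target.

    Returns the mined 32-byte ID and the number of candidates tried.
    """
    i = 0
    while True:
        candidate = h(seed + i.to_bytes(8, "big"))
        if proximity_order(soc_address(candidate, owner), target) >= depth:
            return candidate, i + 1
        i += 1


if __name__ == "__main__":
    owner = bytes(20)              # placeholder 20-byte owner address
    topic = h(b"example-topic")    # placeholder topic address
    soc_id, tries = mine_soc_id(owner, topic, 16, os.urandom(32))
    print(soc_id.hex(), tries)
```

At depth `d` the loop needs about `2^d` candidates on average, consistent with the iteration estimate given for the earlier key-mining design, so depth 16 stays well within the 30-second subscription window on commodity hardware.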
From 0442b92c5a9964b747b24541327091eb55430a63 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Viktor=20Tr=C3=B3n?= Date: Thu, 28 May 2026 17:40:15 +0200 Subject: [PATCH 7/9] update MOC-MIC handshake --- SWIPs/swip-.md | 69 +++++++++++++++++++++++++++----------------------- 1 file changed, 38 insertions(+), 31 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 75941a8..4ec7bbf 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -154,11 +154,10 @@ Make the broker underlay address parameter optional. Instead of the client hardc #### Protocol constants ``` -DISCOVERY_KEY = keccak256("PUBSUB-REQUEST") // well-known private key -DISCOVERY_OWNER = ethAddr(secp256k1_pubkey(DISCOVERY_KEY)) // derived ETH address +DISCOVERY_ID = keccak256("PUBSUB-REQUEST") // well-known private key ``` -Broker nodes continuously watch for incoming SOCs whose owner matches `DISCOVERY_OWNER`. This is a single, network-wide subscription filter. +Broker nodes continuously watch for incoming SOCs whose ID matches `DISCOVERY_ID`. This is a single, network-wide subscription filter. #### Workflow @@ -168,40 +167,48 @@ sequenceDiagram participant N as Topic Neighbourhood participant B as Broker - Note over S: 1. Mine MIC ID so
SOC addr ∈ topic neighbourhood (depth ≥ 16) - S->>N: 2. Upload MIC(key=DISCOVERY_KEY, id=mined_id, payload=msg_id) + Note over S: Mine MOC OWNER keypair (sk_S, pk_S), so SOC addr
a1=SOC_ADDR(id=DISCOVERY_ID, owner=ETH(pk_S))
∈ topic neighbourhood (depth ≥ 16) + Note over S: Mine response MIC ID id_S:
so SOC addr
a1=SOC_ADDR(id_S, owner=eth_S)
∈ topic neighbourhood (depth ≥ 16) + S->>N: Upload MOC(id=DISCOVERY_ID, owner=ETH(pk_S), payload=id_S) N->>B: (sync delivers to closest broker) - Note over B: 3. Detect: owner == DISCOVERY_OWNER?
Store chunk, extract msg_id - B-->>S: 4. Storage receipt (contains PK_B) - Note over B: Subscribe to MOC messages with id=msg_id (timeout 30s) - - Note over S: 5. Mine response SOC params:
(resp_id, resp_key) so SOC addr
closest to broker overlay - S->>B: 6. Upload MOC(id=msg_id, payload=ECIES(PK_B, {topic, resp_key, resp_id, ...})) - Note over B: 7. Decrypt MOC payload
Build response R={overlay, underlay, ...}
Encrypt R with resp_key
Create SOC(id=resp_id, owner=ethAddr(resp_key)) - - Note over B: 8. Store response SOC locally - S->>B: 9. Fetch response SOC (Kademlia lookup) - B-->>S: Response chunk - Note over S: 10. Decrypt response, extract
broker connection info - S->>B: 11. libp2p connect(underlay_B) + Note over B: 4. Detect: id == DISCOVERY_ID?
Store chunk, extract id_S, associate id_S to pk_S + B-->>S: 5. Storage receipt (extract pk_B from signature) + Note over B: Subscribe to MIC messages with owner=eth_S (timeout 30s) + + Note over S: Mine response MIC ID id_B:
so SOC addr
a_2=SOC_ADDR(id_B,eth_B)
closest to broker overlay + Note over S: Subscribe to SOC a_2 + S->>B: Upload MOC(id=id_S, owner=eth_S, payload=ECIES(pk_B, topic | id_B })) + N->>B: (sync delivers to closest broker) + Note over B: Receive MIC with subscription to eth_S owner + Note over B: Decrypt MIC payload
extract topic and id_B + Note over B: Check MIC addr a_2=SOC_ADDR(id_B,eth_B)
∈ topic neighbourhood (depth ≥ 16) + + + Build response R=hive connection info
Encrypt R
Create SOC(id=id_B, owner=eth_B) + + Note over B: Store response SOC R locally + S->>B: Fetch SOC R by requesting (Kademlia lookup) a_2 + B-->>S: Retrieve Response chunk R + Note over S: Decrypt response, extract
broker connection info + S->>B: libp2p connect(underlay_B) ``` #### Phase 1 — Broker announcement (MIC) -1. The subscriber generates a random 32-byte `msg_id`. +1. The subscriber generates a random 32-byte `id_S`. 2. The subscriber mines a SOC ID such that `soc.CreateAddress(mined_id, DISCOVERY_OWNER)` falls within the topic's neighbourhood (PO ≥ 16 relative to topic address). -3. The subscriber uploads a MIC signed with `DISCOVERY_KEY`, with `mined_id` as SOC ID and `msg_id` as payload. Push-sync routes the chunk to the topic neighbourhood. -4. A broker node in the topic neighbourhood detects the incoming SOC (owner == `DISCOVERY_OWNER`), stores it, and returns a **storage receipt**. The receipt contains the broker's public key `PK_B`. -5. The broker subscribes (via the pubsub protocol internally) to incoming MOC messages with `id = msg_id`. This subscription times out after 30 seconds if no MOC arrives. +3. The subscriber uploads a MIC signed with `DISCOVERY_ID`, with `mined_id` as SOC ID and `id_S` as payload. Push-sync routes the chunk to the topic neighbourhood. +4. A broker node in the topic neighbourhood detects the incoming SOC (owner == `DISCOVERY_OWNER`), stores it, and returns a **storage receipt**. The receipt contains the broker's public key `pk_B`. +5. The broker subscribes (via the pubsub protocol internally) to incoming MOC messages with `id = id_S`. This subscription times out after 30 seconds if no MOC arrives. #### Phase 2 — Encrypted handshake (MOC) -6. The subscriber extracts `PK_B` and the broker's overlay address from the storage receipt. It then mines a response key pair `(resp_key, resp_pubkey)` and a `resp_id` such that `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` is closest to the broker's overlay. -7. The subscriber uploads a MOC with `id = msg_id`. The payload is ECIES-encrypted with `PK_B`: +6. The subscriber extracts `pk_B` and the broker's overlay address from the storage receipt. 
It then mines a response key pair `(resp_key, resp_pubkey)` and a `resp_id` such that `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` is closest to the broker's overlay. +7. The subscriber uploads a MOC with `id = id_S`. The payload is ECIES-encrypted with `pk_B`: ``` - ECIES_Encrypt(PK_B, { topic, resp_key, resp_id, chequebook_addr, ... }) + ECIES_Encrypt(pk_B, { topic, resp_key, resp_id, chequebook_addr, ... }) ``` -8. The broker (already subscribed to `msg_id`) receives the MOC, decrypts the payload, and extracts `resp_key` and `resp_id`. +8. The broker (already subscribed to `id_S`) receives the MOC, decrypts the payload, and extracts `resp_key` and `resp_id`. 9. The broker builds response `R = { overlay, underlay, incentive_params, hive_conn_list }`, encrypts it symmetrically: ``` sym_key = keccak256(resp_key) @@ -216,7 +223,7 @@ sequenceDiagram The MOC request payload is encrypted with the Elliptic Curve Integrated Encryption Scheme (ECIES) — the same scheme and library used in Ethereum's devp2p/RLPx handshake (`go-ethereum/crypto/ecies`): 1. Generate ephemeral key pair `(e, E = e·G)`. -2. Shared secret `S = ECDH(e, PK_B)`. +2. Shared secret `S = ECDH(e, pk_B)`. 3. Key derivation: `(enc_key ‖ mac_key) = HKDF-SHA256(S)`. 4. `ciphertext = AES-128-CTR(enc_key, plaintext)`. 5. `tag = HMAC-SHA256(mac_key, ciphertext)`. @@ -242,14 +249,14 @@ The two-phase MIC/MOC handshake avoids several problems that a simpler single-ro - **No concurrent requester collision** — the response is a separate SOC at a unique mined address per session; multiple subscribers never interfere with each other. - **No caching problem** — the response SOC is a new chunk stored locally by the broker, not an overwrite of the request chunk, so stale cached copies are not an issue. - **No single-node targeting** — any broker in the topic neighbourhood can respond to the MIC; if one is offline, another picks it up. 
-- **No blind ECIES decryption** — the broker only decrypts MOC payloads for `msg_id`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. +- **No blind ECIES decryption** — the broker only decrypts MOC payloads for `id_S`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. #### Security considerations -1. **MIC flooding (DoS on Phase 1)** — An attacker can flood MIC chunks to a topic neighbourhood. Each MIC only causes the broker to create a lightweight subscription hook (msg_id → 30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. -2. **MOC flooding (DoS on Phase 2)** — Sending a MOC requires knowing a valid `msg_id` that a broker is subscribed to. An attacker observing the MIC payload learns `msg_id`, but the MOC still requires ECIES encryption with the broker's public key — a garbage MOC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `PK_B` (obtained only via storage receipt to the original requester). Additionally, the subscriber must mine a key for the MOC (computational cost). +1. **MIC flooding (DoS on Phase 1)** — An attacker can flood MIC chunks to a topic neighbourhood. Each MIC only causes the broker to create a lightweight subscription hook (id_S → 30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. +2. **MOC flooding (DoS on Phase 2)** — Sending a MOC requires knowing a valid `id_S` that a broker is subscribed to. 
An attacker observing the MIC payload learns `id_S`, but the MOC still requires ECIES encryption with the broker's public key — a garbage MOC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `pk_B` (obtained only via storage receipt to the original requester). Additionally, the subscriber must mine a key for the MOC (computational cost). 3. **Response SOC mining cost** — The subscriber must mine `(resp_id, resp_key)` such that the response SOC address is close to the broker overlay. -4. **Timing window** — The broker's subscription to `msg_id` has a 30s timeout. The subscriber must complete Phase 2 (mine response key + upload MOC) within this window. Mining at depth 16 is fast, so this is not a practical concern. +4. **Timing window** — The broker's subscription to `id_S` has a 30s timeout. The subscriber must complete Phase 2 (mine response key + upload MOC) within this window. Mining at depth 16 is fast, so this is not a practical concern. New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic. From a4d631128a40a9e57f69232dc809e165873f964b Mon Sep 17 00:00:00 2001 From: nugaon Date: Fri, 29 May 2026 10:50:58 +0200 Subject: [PATCH 8/9] fix: milestone 3 corrections --- SWIPs/swip-.md | 113 +++++++++++++++++++++++-------------------------- 1 file changed, 54 insertions(+), 59 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 4ec7bbf..2ef9daa 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -154,7 +154,7 @@ Make the broker underlay address parameter optional. Instead of the client hardc #### Protocol constants ``` -DISCOVERY_ID = keccak256("PUBSUB-REQUEST") // well-known private key +DISCOVERY_ID = keccak256("PUBSUB-REQUEST") // MOC ID for discovery ``` Broker nodes continuously watch for incoming SOCs whose ID matches `DISCOVERY_ID`. This is a single, network-wide subscription filter. 
@@ -167,75 +167,70 @@ sequenceDiagram participant N as Topic Neighbourhood participant B as Broker - Note over S: Mine MOC OWNER keypair (sk_S, pk_S), so SOC addr
a1=SOC_ADDR(id=DISCOVERY_ID, owner=ETH(pk_S))
∈ topic neighbourhood (depth ≥ 16) - Note over S: Mine response MIC ID id_S:
so SOC addr
a1=SOC_ADDR(id_S, owner=eth_S)
∈ topic neighbourhood (depth ≥ 16) + Note over S: Mine MOC OWNER keypair (sk_S, pk_S), so that SOC addr
a_1=SOC_ADDR(id=DISCOVERY_ID, owner=ETH(pk_S))
∈ topic neighbourhood (depth ≥ 16) S->>N: Upload MOC(id=DISCOVERY_ID, owner=ETH(pk_S), payload=id_S) N->>B: (sync delivers to closest broker) - Note over B: 4. Detect: id == DISCOVERY_ID?
Store chunk, extract id_S, associate id_S to pk_S - B-->>S: 5. Storage receipt (extract pk_B from signature) - Note over B: Subscribe to MIC messages with owner=eth_S (timeout 30s) + Note over B: Detect: id == DISCOVERY_ID
extract id_S, associate pk_S with request + B-->>S: storage receipt (extract pk_B from signature, and overlay_B from the receipt payload) + Note over B: subscribe to MIC with owner=ETH(pk_S) (timeout 30s) - Note over S: Mine response MIC ID id_B:
so SOC addr
a_2=SOC_ADDR(id_B,eth_B)
closest to broker overlay - Note over S: Subscribe to SOC a_2 - S->>B: Upload MOC(id=id_S, owner=eth_S, payload=ECIES(pk_B, topic | id_B })) + Note over S: mine response MIC ID id_B
so that SOC_ADDR(id_B, ETH(pk_B))
is closest to overlay_B (depth ≥ 16) + S->>N: Upload MIC(id=id_S, owner=ETH(pk_S), payload=AES-GCM(req_key, {topic, id_B, ...})) N->>B: (sync delivers to closest broker) - Note over B: Receive MIC with subscription to eth_S owner - Note over B: Decrypt MIC payload
extract topic and id_B - Note over B: Check MIC addr a_2=SOC_ADDR(id_B,eth_B)
∈ topic neighbourhood (depth ≥ 16) - - - Build response R=hive connection info
Encrypt R
Create SOC(id=id_B, owner=eth_B) - - Note over B: Store response SOC R locally - S->>B: Fetch SOC R by requesting (Kademlia lookup) a_2 - B-->>S: Retrieve Response chunk R - Note over S: Decrypt response, extract
broker connection info + Note over B: MIC owner=ETH(pk_S) matches subscription
decrypt payload → extract topic, id_B
Check MIC addr a_2=SOC_ADDR(id_B,eth_B)
∈ topic neighbourhood (depth ≥ 16) + Note over B: build response R={overlay, underlay, ...}
encrypt with res_key
store SOC(id=id_B, owner=ETH(pk_B)) locally + S->>N: fetch SOC_ADDR(id_B, ETH(pk_B)) via Kademlia + N-->>B: lookup routed to broker (closest node) + B-->>S: response SOC R + Note over S: decrypt R with res_key
extract broker connection info S->>B: libp2p connect(underlay_B) ``` -#### Phase 1 — Broker announcement (MIC) +#### Phase 1 — Discovery request (MOC) 1. The subscriber generates a random 32-byte `id_S`. -2. The subscriber mines a SOC ID such that `soc.CreateAddress(mined_id, DISCOVERY_OWNER)` falls within the topic's neighbourhood (PO ≥ 16 relative to topic address). -3. The subscriber uploads a MIC signed with `DISCOVERY_ID`, with `mined_id` as SOC ID and `id_S` as payload. Push-sync routes the chunk to the topic neighbourhood. -4. A broker node in the topic neighbourhood detects the incoming SOC (owner == `DISCOVERY_OWNER`), stores it, and returns a **storage receipt**. The receipt contains the broker's public key `pk_B`. -5. The broker subscribes (via the pubsub protocol internally) to incoming MOC messages with `id = id_S`. This subscription times out after 30 seconds if no MOC arrives. +2. The subscriber mines a keypair `(sk_S, pk_S)` such that `soc.CreateAddress(DISCOVERY_ID, ETH(pk_S))` falls within the topic's neighbourhood (PO ≥ 16 relative to topic address). +3. The subscriber uploads a MOC with `id = DISCOVERY_ID`, `owner = ETH(pk_S)`, and `id_S` as payload. Push-sync routes the chunk to the topic neighbourhood. +4. A broker node in the topic neighbourhood detects the incoming SOC (`id == DISCOVERY_ID`), stores it, extracts `id_S`, and associates `pk_S` with the request. +5. The broker returns a **storage receipt**. The subscriber extracts `pk_B` and the broker's overlay address from the receipt signature. +6. The broker subscribes to incoming MIC messages with `owner = ETH(pk_S)`. This subscription times out after 30 seconds if no MIC arrives. -#### Phase 2 — Encrypted handshake (MOC) +#### Phase 2 — Encrypted handshake (MIC) -6. The subscriber extracts `pk_B` and the broker's overlay address from the storage receipt. 
It then mines a response key pair `(resp_key, resp_pubkey)` and a `resp_id` such that `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` is closest to the broker's overlay. -7. The subscriber uploads a MOC with `id = id_S`. The payload is ECIES-encrypted with `pk_B`: +7. The subscriber mines a SOC ID `id_B` such that `soc.CreateAddress(id_B, ETH(pk_B))` is closest to the broker's overlay (PO ≥ 16). +8. The subscriber uploads a MIC with `id = id_S`, `owner = ETH(pk_S)`. The payload is encrypted with the ECDH-derived key: ``` - ECIES_Encrypt(pk_B, { topic, resp_key, resp_id, chequebook_addr, ... }) + shared = ECDH(sk_S, pk_B) + req_key = keccak256(shared ‖ 0x00) + nonce = keccak256(req_key) [:12] + payload = AES-256-GCM(req_key, nonce, { topic, id_B, chequebook_addr, ... }) ``` -8. The broker (already subscribed to `id_S`) receives the MOC, decrypts the payload, and extracts `resp_key` and `resp_id`. -9. The broker builds response `R = { overlay, underlay, incentive_params, hive_conn_list }`, encrypts it symmetrically: - ``` - sym_key = keccak256(resp_key) - nonce = keccak256(keccak256(resp_key)) [:12] - C_res = AES-256-GCM(sym_key, nonce, R) - ``` -10. The broker creates a new SOC signed with `resp_key` at address `soc.CreateAddress(resp_id, ethAddr(resp_pubkey))` and stores it locally. -11. The subscriber fetches the response SOC address via Kademlia lookup (the request is routed to the broker as the closest responsible node), decrypts with `sym_key`, and connects to the broker via libp2p. - -#### Request encryption — ECIES on secp256k1 +9. The broker (subscribed to MIC with `owner = ETH(pk_S)`) receives the MIC, decrypts the payload, and extracts `topic` and `id_B`. +10. The broker verifies that `soc.CreateAddress(id_B, ETH(pk_B))` falls within the topic neighbourhood (PO ≥ 16). +11. 
The broker builds response `R = { overlay, underlay, incentive_params, hive_conn_list }`, encrypts it symmetrically: + ``` + shared = ECDH(sk_B, pk_S) + res_key = keccak256(shared ‖ 0x01) + nonce = keccak256(res_key) [:12] + C_res = AES-256-GCM(res_key, nonce, R) + ``` +12. The broker creates a SOC signed with `sk_B` at address `soc.CreateAddress(id_B, ETH(pk_B))` and stores it locally. +13. The subscriber fetches the response SOC via Kademlia lookup (routed to the broker as the closest responsible node), decrypts with `res_key` derived from the same ECDH shared secret, and connects to the broker via libp2p. -The MOC request payload is encrypted with the Elliptic Curve Integrated Encryption Scheme (ECIES) — the same scheme and library used in Ethereum's devp2p/RLPx handshake (`go-ethereum/crypto/ecies`): +#### Encryption — ECDH + AES-256-GCM -1. Generate ephemeral key pair `(e, E = e·G)`. -2. Shared secret `S = ECDH(e, pk_B)`. -3. Key derivation: `(enc_key ‖ mac_key) = HKDF-SHA256(S)`. -4. `ciphertext = AES-128-CTR(enc_key, plaintext)`. -5. `tag = HMAC-SHA256(mac_key, ciphertext)`. -6. Output: `E ‖ ciphertext ‖ tag`. +Both the MIC payload (Phase 2) and the response SOC payload use AES-256-GCM keyed by an ECDH shared secret. Both parties can compute the shared secret independently: `ECDH(sk_S, pk_B) = ECDH(sk_B, pk_S)` — the subscriber knows `sk_S` (mined in Phase 1) and `pk_B` (from the storage receipt); the broker knows `sk_B` and `pk_S` (from the Phase 1 MOC). -Only the holder of `sk_B` can derive the shared secret and decrypt. The ephemeral key `e` is discarded after encryption, providing forward secrecy per discovery session. +Request and response derive separate keys to avoid nonce reuse: -#### Response encryption — AES-256-GCM (symmetric) - -The response is encrypted symmetrically using the subscriber-mined `resp_key`. Both parties possess it: the subscriber mined it; the broker extracted it from the ECIES payload. 
+``` +shared = ECDH(sk_S, pk_B) // = ECDH(sk_B, pk_S) +req_key = keccak256(shared ‖ 0x00) // MIC payload encryption +res_key = keccak256(shared ‖ 0x01) // response SOC encryption +nonce_* = keccak256(key) [:12] // deterministic per key +``` -AES-256-GCM provides authenticated encryption. Because `resp_key` is unique per discovery session (freshly mined), the `(sym_key, nonce)` pair is never reused, satisfying GCM's uniqueness requirement. +AES-256-GCM provides authenticated encryption. Because `sk_S` is unique per discovery session (freshly mined), the derived keys and nonces are never reused, satisfying GCM's uniqueness requirement. Forward secrecy is provided by the ephemeral nature of `sk_S`. #### Postage stamps @@ -243,20 +238,20 @@ The subscriber needs a postage stamp for the MIC and MOC uploads. The broker doe #### Rationale -The two-phase MIC/MOC handshake avoids several problems that a simpler single-round or registry-based discovery would face: +The two-phase MOC/MIC handshake avoids several problems that a simpler single-round or registry-based discovery would face: - **No on-chain registry** — the broker's public key and overlay are discovered in-band via the storage receipt, removing any blockchain dependency for discovery. - **No concurrent requester collision** — the response is a separate SOC at a unique mined address per session; multiple subscribers never interfere with each other. - **No caching problem** — the response SOC is a new chunk stored locally by the broker, not an overwrite of the request chunk, so stale cached copies are not an issue. -- **No single-node targeting** — any broker in the topic neighbourhood can respond to the MIC; if one is offline, another picks it up. -- **No blind ECIES decryption** — the broker only decrypts MOC payloads for `id_S`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. 
+- **No single-node targeting** — any broker in the topic neighbourhood can respond to the MOC; if one is offline, another picks it up. +- **No blind ECIES decryption** — the broker only decrypts MIC payloads for `owner`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. #### Security considerations -1. **MIC flooding (DoS on Phase 1)** — An attacker can flood MIC chunks to a topic neighbourhood. Each MIC only causes the broker to create a lightweight subscription hook (id_S → 30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. -2. **MOC flooding (DoS on Phase 2)** — Sending a MOC requires knowing a valid `id_S` that a broker is subscribed to. An attacker observing the MIC payload learns `id_S`, but the MOC still requires ECIES encryption with the broker's public key — a garbage MOC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `pk_B` (obtained only via storage receipt to the original requester). Additionally, the subscriber must mine a key for the MOC (computational cost). -3. **Response SOC mining cost** — The subscriber must mine `(resp_id, resp_key)` such that the response SOC address is close to the broker overlay. -4. **Timing window** — The broker's subscription to `id_S` has a 30s timeout. The subscriber must complete Phase 2 (mine response key + upload MOC) within this window. Mining at depth 16 is fast, so this is not a practical concern. +1. **MOC flooding (DoS on Phase 1)** — An attacker can flood MOC chunks with `id = DISCOVERY_ID` to a topic neighbourhood. Each MOC only causes the broker to create a lightweight subscription hook (30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). 
The attacker must also mine a keypair per chunk targeting the topic neighbourhood. Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. +2. **MIC flooding (DoS on Phase 2)** — Sending a MIC requires knowing a valid `ETH(pk_S)` owner that a broker is subscribed to. An attacker observing the Phase 1 MOC learns `ETH(pk_S)`, but the MIC payload must be ECIES-encrypted with the broker's public key — a garbage MIC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `pk_B` (obtained only via storage receipt to the original requester). +3. **Response SOC mining cost** — The subscriber must mine `id_B` such that `soc.CreateAddress(id_B, ETH(pk_B))` is close to the broker overlay. +4. **Timing window** — The broker's subscription to MIC with `owner = ETH(pk_S)` has a 30s timeout. The subscriber must complete Phase 2 (mine `id_B` + upload MIC) within this window. Mining at depth 16 is fast, so this is not a practical concern. New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic. From 390c980be2d0d438257de8c0e089e1004f390429 Mon Sep 17 00:00:00 2001 From: nugaon Date: Thu, 4 Jun 2026 17:55:53 +0200 Subject: [PATCH 9/9] moc gsoc handshake --- SWIPs/swip-.md | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/SWIPs/swip-.md b/SWIPs/swip-.md index 2ef9daa..485f018 100644 --- a/SWIPs/swip-.md +++ b/SWIPs/swip-.md @@ -149,7 +149,7 @@ The broker–subscriber stream is a metered channel: the subscriber pays the bro ### Milestone 3 — Decentralised broker discovery -Make the broker underlay address parameter optional. 
Instead of the client hardcoding a broker, it discovers an eligible broker node through a two-phase handshake using MIC and MOC chunks (see [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)) targeting the topic's neighbourhood. No on-chain registry is required — broker public keys are discovered in-band via storage receipts. The protocol requires targeted chunk delivery and retrieval to/from the closest responsible node (see e.g. [bee#5081](https://github.com/ethersphere/bee/pull/5081)). +Make the broker underlay address parameter optional. Instead of the client hardcoding a broker, it discovers an eligible broker node through a two-phase handshake using MOC and GSOC chunks (see [SWIP-42](https://github.com/ethersphere/SWIPs/pull/80)) targeting the topic's neighbourhood. No on-chain registry is required — broker public keys are discovered in-band via storage receipts. The protocol requires targeted chunk delivery and retrieval to/from the closest responsible node (see e.g. [bee#5081](https://github.com/ethersphere/bee/pull/5081)). #### Protocol constants @@ -172,12 +172,12 @@ sequenceDiagram N->>B: (sync delivers to closest broker) Note over B: Detect: id == DISCOVERY_ID
extract id_S, associate pk_S with request B-->>S: storage receipt (extract pk_B from signature, and overlay_B from the receipt payload) - Note over B: subscribe to MIC with with owner=ETH(pk_S) (timeout 30s) + Note over B: subscribe to GSOC at SOC_ADDR(id_S, ETH(pk_S)) (timeout 30s) - Note over S: mine response MIC ID id_B
so that SOC_ADDR(id_B, ETH(pk_B))
is closest to overlay_B (depth ≥ 16) - S->>N: Upload MIC(id=id_S, owner=ETH(pk_S), payload=AES-GCM(req_key, {topic, id_B, ...})) + Note over S: mine response SOC ID id_B
so that SOC_ADDR(id_B, ETH(pk_B))
is closest to overlay_B (depth ≥ 16) + S->>N: Upload SOC(id=id_S, owner=ETH(pk_S), payload=AES-GCM(req_key, {topic, id_B, ...})) N->>B: (sync delivers to closest broker) - Note over B: MIC owner=ETH(pk_S) matches subscription
decrypt payload → extract topic, id_B
Check MIC addr a_2=SOC_ADDR(id_B,eth_B)
∈ topic neighbourhood (depth ≥ 16) + Note over B: SOC address matches GSOC subscription
decrypt payload → extract topic, id_B
Check response addr a_2=SOC_ADDR(id_B, ETH(pk_B))
∈ topic neighbourhood (depth ≥ 16) Note over B: build response R={overlay, underlay, ...}
encrypt with res_key
store SOC(id=id_B, owner=ETH(pk_B)) locally S->>N: fetch SOC_ADDR(id_B, ETH(pk_B)) via Kademlia N-->>B: lookup routed to broker (closest node) @@ -193,19 +193,19 @@ sequenceDiagram 3. The subscriber uploads a MOC with `id = DISCOVERY_ID`, `owner = ETH(pk_S)`, and `id_S` as payload. Push-sync routes the chunk to the topic neighbourhood. 4. A broker node in the topic neighbourhood detects the incoming SOC (`id == DISCOVERY_ID`), stores it, extracts `id_S`, and associates `pk_S` with the request. 5. The broker returns a **storage receipt**. The subscriber extracts `pk_B` and the broker's overlay address from the receipt signature. -6. The broker subscribes to incoming MIC messages with `owner = ETH(pk_S)`. This subscription times out after 30 seconds if no MIC arrives. +6. The broker subscribes to GSOC events on address `soc.CreateAddress(id_S, ETH(pk_S))`. This subscription times out after 30 seconds if no matching SOC arrives. -#### Phase 2 — Encrypted handshake (MIC) +#### Phase 2 — Encrypted handshake (GSOC) 7. The subscriber mines a SOC ID `id_B` such that `soc.CreateAddress(id_B, ETH(pk_B))` is closest to the broker's overlay (PO ≥ 16). -8. The subscriber uploads a MIC with `id = id_S`, `owner = ETH(pk_S)`. The payload is encrypted with the ECDH-derived key: +8. The subscriber uploads a SOC with `id = id_S`, `owner = ETH(pk_S)`. The payload is encrypted with the ECDH-derived key: ``` shared = ECDH(sk_S, pk_B) req_key = keccak256(shared ‖ 0x00) nonce = keccak256(req_key) [:12] payload = AES-256-GCM(req_key, nonce, { topic, id_B, chequebook_addr, ... }) ``` -9. The broker (subscribed to MIC with `owner = ETH(pk_S)`) receives the MIC, decrypts the payload, and extracts `topic` and `id_B`. +9. The broker (subscribed to GSOC at `soc.CreateAddress(id_S, ETH(pk_S))`) receives the SOC, decrypts the payload, and extracts `topic` and `id_B`. 10. The broker verifies that `soc.CreateAddress(id_B, ETH(pk_B))` falls within the topic neighbourhood (PO ≥ 16). 11. 
The broker builds response `R = { overlay, underlay, incentive_params, hive_conn_list }`, encrypts it symmetrically: ``` @@ -219,13 +219,13 @@ sequenceDiagram #### Encryption — ECDH + AES-256-GCM -Both the MIC payload (Phase 2) and the response SOC payload use AES-256-GCM keyed by an ECDH shared secret. Both parties can compute the shared secret independently: `ECDH(sk_S, pk_B) = ECDH(sk_B, pk_S)` — the subscriber knows `sk_S` (mined in Phase 1) and `pk_B` (from the storage receipt); the broker knows `sk_B` and `pk_S` (from the Phase 1 MOC). +Both the Phase 2 SOC payload and the response SOC payload use AES-256-GCM keyed by an ECDH shared secret. Both parties can compute the shared secret independently: `ECDH(sk_S, pk_B) = ECDH(sk_B, pk_S)` — the subscriber knows `sk_S` (mined in Phase 1) and `pk_B` (from the storage receipt); the broker knows `sk_B` and `pk_S` (from the Phase 1 MOC). Request and response derive separate keys to avoid nonce reuse: ``` shared = ECDH(sk_S, pk_B) // = ECDH(sk_B, pk_S) -req_key = keccak256(shared ‖ 0x00) // MIC payload encryption +req_key = keccak256(shared ‖ 0x00) // Phase 2 payload encryption res_key = keccak256(shared ‖ 0x01) // response SOC encryption nonce_* = keccak256(key) [:12] // deterministic per key ``` @@ -234,24 +234,24 @@ AES-256-GCM provides authenticated encryption. Because `sk_S` is unique per disc #### Postage stamps -The subscriber needs a postage stamp for the MIC and MOC uploads. The broker does not need a stamp for the response SOC — it is stored locally and served directly on fetch. Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, the subscriber's stamp requirements can be lifted. +The subscriber needs a postage stamp for the SOC uploads. The broker does not need a stamp for the response SOC — it is stored locally and served directly on fetch. 
Once [SWIP-36](https://github.com/ethersphere/SWIPs/pull/70) (free uploads) is adopted, the subscriber's stamp requirements can be lifted. #### Rationale -The two-phase MOC/MIC handshake avoids several problems that a simpler single-round or registry-based discovery would face: +The two-phase MOC/GSOC handshake avoids several problems that a simpler single-round or registry-based discovery would face: - **No on-chain registry** — the broker's public key and overlay are discovered in-band via the storage receipt, removing any blockchain dependency for discovery. - **No concurrent requester collision** — the response is a separate SOC at a unique mined address per session; multiple subscribers never interfere with each other. - **No caching problem** — the response SOC is a new chunk stored locally by the broker, not an overwrite of the request chunk, so stale cached copies are not an issue. - **No single-node targeting** — any broker in the topic neighbourhood can respond to the MOC; if one is offline, another picks it up. -- **No blind ECIES decryption** — the broker only decrypts MIC payloads for `owner`s it actively subscribed to, rather than attempting decryption on every incoming SOC with a global constant ID. +- **Address-level filtering over owner-level filtering** — GSOC subscription matches on the exact SOC address `soc.CreateAddress(id_S, ETH(pk_S))` rather than on `owner` alone (as MIC subscription would), so the broker processes only the specific chunk it expects and ignores any unrelated SOCs that happen to share the same owner key. #### Security considerations 1. **MOC flooding (DoS on Phase 1)** — An attacker can flood MOC chunks with `id = DISCOVERY_ID` to a topic neighbourhood. Each MOC only causes the broker to create a lightweight subscription hook (30s timeout), so the cost to the broker is minimal (memory for pending subscriptions). The attacker must also mine a keypair per chunk targeting the topic neighbourhood. 
Bandwidth incentives provide a baseline rate limit: the attacker pays per chunk forwarded. Brokers can cap the number of concurrent pending subscriptions. -2. **MIC flooding (DoS on Phase 2)** — Sending a MIC requires knowing a valid `ETH(pk_S)` owner that a broker is subscribed to. An attacker observing the Phase 1 MOC learns `ETH(pk_S)`, but the MIC payload must be ECIES-encrypted with the broker's public key — a garbage MIC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `pk_B` (obtained only via storage receipt to the original requester). +2. **Phase 2 flooding (DoS on GSOC)** — An attacker observing the Phase 1 MOC learns `ETH(pk_S)` and `id_S`, but the GSOC subscription matches on the full SOC address `soc.CreateAddress(id_S, ETH(pk_S))`, so the attacker must forge a SOC at that exact address. Even then, the payload must be encrypted with the ECDH-derived key — a garbage SOC will fail decryption and be discarded. The attacker cannot produce a valid encrypted payload without `pk_B` (obtained only via storage receipt to the original requester). 3. **Response SOC mining cost** — The subscriber must mine `id_B` such that `soc.CreateAddress(id_B, ETH(pk_B))` is close to the broker overlay. -4. **Timing window** — The broker's subscription to MIC with `owner = ETH(pk_S)` has a 30s timeout. The subscriber must complete Phase 2 (mine `id_B` + upload MIC) within this window. Mining at depth 16 is fast, so this is not a practical concern. +4. **Timing window** — The broker's GSOC subscription on `soc.CreateAddress(id_S, ETH(pk_S))` has a 30s timeout. The subscriber must complete Phase 2 (mine `id_B` + upload SOC) within this window. Mining at depth 16 is fast, so this is not a practical concern. New API endpoint: `GET /pubsub/discover/{topic}?mode=` — returns broker connection data for the given topic.
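As an illustration of the key schedule in the "Encryption" section (ECDH + AES-256-GCM), the following Go sketch derives `req_key` and `res_key` from one ECDH shared secret and round-trips a request and a response through AES-256-GCM. It is a sketch under stated assumptions, not the Bee implementation: it uses only the Go standard library, so P-256 stands in for secp256k1 and SHA-256 stands in for keccak256. The function names (`deriveKey`, `nonceFor`, `seal`, `open`, `roundTrip`) and the example payloads are hypothetical.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/ecdh"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
)

// deriveKey mirrors `key = keccak256(shared ‖ tag)`.
// Assumption: SHA-256 replaces keccak256 to stay within the standard library.
func deriveKey(shared []byte, tag byte) []byte {
	h := sha256.Sum256(append(append([]byte{}, shared...), tag))
	return h[:]
}

// nonceFor mirrors `nonce = keccak256(key)[:12]` (same hash stand-in as above).
func nonceFor(key []byte) []byte {
	h := sha256.Sum256(key)
	return h[:12]
}

// seal encrypts plaintext with AES-256-GCM using the key-derived nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // a 32-byte key selects AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return gcm.Seal(nil, nonceFor(key), plaintext, nil), nil
}

// open decrypts and authenticates a ciphertext produced by seal.
func open(key, ciphertext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, nonceFor(key), ciphertext, nil)
}

// roundTrip plays both roles: the subscriber encrypts the request with
// req_key (tag 0x00), the broker decrypts it and answers with res_key (tag 0x01).
func roundTrip(request, response string) (string, string, error) {
	curve := ecdh.P256() // assumption: stand-in for secp256k1
	skS, err := curve.GenerateKey(rand.Reader) // subscriber session key
	if err != nil {
		return "", "", err
	}
	skB, err := curve.GenerateKey(rand.Reader) // broker key
	if err != nil {
		return "", "", err
	}
	sharedS, err := skS.ECDH(skB.PublicKey()) // ECDH(sk_S, pk_B)
	if err != nil {
		return "", "", err
	}
	sharedB, err := skB.ECDH(skS.PublicKey()) // ECDH(sk_B, pk_S), same secret
	if err != nil {
		return "", "", err
	}
	cReq, err := seal(deriveKey(sharedS, 0x00), []byte(request))
	if err != nil {
		return "", "", err
	}
	req, err := open(deriveKey(sharedB, 0x00), cReq)
	if err != nil {
		return "", "", err
	}
	cRes, err := seal(deriveKey(sharedB, 0x01), []byte(response))
	if err != nil {
		return "", "", err
	}
	res, err := open(deriveKey(sharedS, 0x01), cRes)
	if err != nil {
		return "", "", err
	}
	return string(req), string(res), nil
}

func main() {
	req, res, err := roundTrip(`{"topic":"demo"}`, `{"overlay":"demo"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(req, res)
}
```

Because the two directions use different domain-separation bytes, the request and the response never share a `(key, nonce)` pair, even though each nonce is derived deterministically from its key.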