What & why
ClickHouse is a popular OLAP database, often the downstream analytics layer for
graph-computing results. You can technically connect today via the JDBC
connector, but it lacks ClickHouse-tuned batch writes (buffering + columnar
writes), so performance suffers.
The task
Create geaflow-dsl-connector-clickhouse, focusing the sink on efficient batched
writes (buffer + flush threshold); the source should support parallel partitioned reads.
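The issue does not say how the source should split its reads, so the following is only one possible sketch. It assumes the table has an unsigned integer key column and gives each parallel reader a disjoint slice through a modulo predicate. `PartitionedReadPlanner`, `planQueries`, and the table and column names are hypothetical and are not part of GeaFlow or the ClickHouse driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: one SELECT per parallel reader, each covering a disjoint slice. */
final class PartitionedReadPlanner {

    /**
     * Assumes keyColumn is an unsigned integer column, so "key % n" is always in [0, n).
     * Table and column names are treated as trusted identifiers, not user input.
     */
    static List<String> planQueries(String table, String keyColumn, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        List<String> queries = new ArrayList<>(parallelism);
        for (int slice = 0; slice < parallelism; slice++) {
            // "%%" is a literal percent sign in String.format.
            queries.add(String.format(
                "SELECT * FROM %s WHERE %s %% %d = %d",
                table, keyColumn, parallelism, slice));
        }
        return queries;
    }
}
```

Each reader would run exactly one of the returned queries; together the slices cover every row once, provided the unsigned-key assumption holds.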
Where to look / how to do it
- Use geaflow-dsl-connector-jdbc as a baseline and spot the differences between "generic JDBC" and "ClickHouse-specific".
- Sink batching: write() goes into a buffer; commit in bulk on threshold or flush().
- SPI registration + parent/aggregation pom + docs + tests.
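The sink batching item above can be sketched roughly as follows. This is a minimal illustration of the buffer plus flush-threshold idea, not the connector's actual design: `BufferedBatchWriter` and its `bulkCommit` callback are hypothetical names, and the callback stands in for whatever bulk insert the real sink would issue against ClickHouse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical buffer-and-flush helper; not a GeaFlow or ClickHouse API. */
final class BufferedBatchWriter<T> {
    private final int flushThreshold;
    private final Consumer<List<T>> bulkCommit; // placeholder for the real bulk insert
    private List<T> buffer;

    BufferedBatchWriter(int flushThreshold, Consumer<List<T>> bulkCommit) {
        if (flushThreshold <= 0) {
            throw new IllegalArgumentException("flushThreshold must be positive");
        }
        this.flushThreshold = flushThreshold;
        this.bulkCommit = bulkCommit;
        this.buffer = new ArrayList<>(flushThreshold);
    }

    /** Buffers one row and commits the whole batch once the threshold is reached. */
    void write(T row) {
        buffer.add(row);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    /** Commits whatever is buffered; does nothing when the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> batch = buffer;
        // If the commit throws, the rows stay in the buffer so the caller can retry.
        bulkCommit.accept(batch);
        buffer = new ArrayList<>(flushThreshold);
    }
}
```

With a threshold of 3, seven calls to write() trigger two bulk commits of three rows each, and a final flush() commits the remaining row. A real sink would also need to call flush() on checkpoint or close so that a partial batch is not lost.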
Done when