feat(datafusion): support SHOW CREATE TABLE via get_table_definition #444
shyjsarah wants to merge 7 commits into
DataFusion 53 rewrites `SHOW CREATE TABLE` into a query against `information_schema.views`, whose `definition` column is populated by `TableProvider::get_table_definition()`. PaimonTableProvider did not override this method, so the column came back empty for Paimon tables. Add a cached DDL string built from the table's identifier, schema (fields, primary keys, partition keys, options), and a recursive `data_type_to_sql` renderer covering all DataType variants. Override `get_table_definition()` to return it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Previously data_type_to_sql rendered every type as nullable, so a column declared `payload BLOB NOT NULL` came back as `payload BLOB`, and replaying the DDL would silently widen the schema. Append ` NOT NULL` when DataType::is_nullable() returns false. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
sqlparser's GenericDialect parses `MAP(k, v)` (ClickHouse style, parentheses) into SqlType::Map, not `MAP<k, v>` (angle brackets). The previous angle-bracket output was not parseable by paimon-rust's own CREATE TABLE path, so map columns could not round-trip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…LE output sqlparser's GenericDialect parses `STRUCT<name type, ...>` (BigQuery style, angle brackets with space-separated field name and type) into SqlType::Struct, which paimon-rust maps to Paimon Row. The previous `ROW<name: type, ...>` output was not parseable by paimon-rust's own CREATE TABLE path, so row/struct columns could not round-trip. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…TABLE `NOT NULL` is a column constraint, not a type modifier — it is only valid at the top of a column definition, not nested inside `MAP`, `ARRAY`, or `STRUCT` arguments. The previous rendering appended `NOT NULL` to every non-nullable type, producing output like `MAP(INT NOT NULL, VARCHAR)` that paimon-rust's own CREATE TABLE parser rejects (`Expected: ,, found: NOT`). Move the `NOT NULL` suffix out of `data_type_to_sql` (which is called recursively for nested type arguments) and into `build_table_definition` (which renders one column at a time). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The DDL returned by `SHOW CREATE TABLE` must be executable by paimon-rust's own `CREATE TABLE` parser and reproduce an equivalent schema (fields, primary keys, partition keys). This guards against regressions where the rendered DDL drifts away from what the parser accepts (e.g. `ROW<name: type>` vs `STRUCT<name type>`, `MAP<k: v>` vs `MAP(k, v)`, or `NOT NULL` leaking into nested type arguments). The test creates a table with NOT NULL, ARRAY, MAP, STRUCT, and nested composite types, drops it, re-executes the rendered DDL, and asserts schema equivalence via a new `assert_schema_equivalent` helper. Options are intentionally not compared because the CREATE TABLE path may inject catalog defaults (e.g. `bucket`) that the user did not specify. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
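The rendering rules described in the commits above (parenthesized `MAP(k, v)`, space-separated `STRUCT<name type>`, and `NOT NULL` only at the column level) can be sketched as follows. This is a minimal illustration, not the PR's code: `SketchType`, `type_to_sql`, and `column_to_sql` are hypothetical stand-ins for Paimon's `DataType`, `data_type_to_sql`, and the per-column part of `build_table_definition`, and the `ARRAY<...>` spelling is an assumption.

```rust
// Hypothetical, simplified stand-in for Paimon's DataType (the real enum has more variants).
#[allow(dead_code)]
enum SketchType {
    Int { nullable: bool },
    VarChar { length: u32, nullable: bool },
    Array { element: Box<SketchType>, nullable: bool },
    Map { key: Box<SketchType>, value: Box<SketchType>, nullable: bool },
    Struct { fields: Vec<(String, SketchType)>, nullable: bool },
}

impl SketchType {
    fn is_nullable(&self) -> bool {
        match self {
            SketchType::Int { nullable }
            | SketchType::VarChar { nullable, .. }
            | SketchType::Array { nullable, .. }
            | SketchType::Map { nullable, .. }
            | SketchType::Struct { nullable, .. } => *nullable,
        }
    }
}

// Renders only the type. It is called recursively for nested arguments,
// so it must never emit NOT NULL.
fn type_to_sql(t: &SketchType) -> String {
    match t {
        SketchType::Int { .. } => "INT".to_string(),
        SketchType::VarChar { length, .. } => format!("VARCHAR({length})"),
        // Assumption: angle-bracket ARRAY syntax; the source does not state the array spelling.
        SketchType::Array { element, .. } => format!("ARRAY<{}>", type_to_sql(element)),
        // Parentheses, not angle brackets, per the MAP commit.
        SketchType::Map { key, value, .. } => {
            format!("MAP({}, {})", type_to_sql(key), type_to_sql(value))
        }
        // STRUCT<name type, ...> rather than ROW<name: type, ...>.
        SketchType::Struct { fields, .. } => {
            let parts: Vec<String> = fields
                .iter()
                .map(|(name, ty)| format!("{} {}", name, type_to_sql(ty)))
                .collect();
            format!("STRUCT<{}>", parts.join(", "))
        }
    }
}

// Renders one column; NOT NULL is appended here, at the top of the column definition only.
fn column_to_sql(name: &str, t: &SketchType) -> String {
    let mut sql = format!("{} {}", name, type_to_sql(t));
    if !t.is_nullable() {
        sql.push_str(" NOT NULL");
    }
    sql
}
```

With this split, a non-nullable map whose key is also non-nullable renders the constraint once, after the closing parenthesis, and never inside the type arguments.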
let mut ddl = String::new();
let _ = write!(
    ddl,
    "CREATE TABLE {}.{} (",
`SHOW CREATE TABLE` should emit replayable SQL, but the table name is written without identifier quoting. A table created with quoted identifiers, such as `CREATE TABLE paimon.test_db."select" ("order" INT)`, would be rendered as `CREATE TABLE test_db.select (order INT)`, which is invalid or changes the identifier semantics when re-executed. The same issue applies to column names, primary/partition keys, and nested struct field names below. Please quote identifiers when required and escape embedded quotes, then add a round-trip case with a reserved-word or otherwise quoted identifier.
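One possible shape for the quoting requested above is sketched here. The helper names `quote_ident` and `qualified_name` are hypothetical, and the sketch quotes every identifier unconditionally rather than only "when required", which sidesteps maintaining a reserved-word list. Whether paimon-rust's parser treats quoted and unquoted names identically is an assumption that a round-trip test would need to confirm.

```rust
// Hypothetical helper: wrap an identifier in double quotes and double any
// embedded double quote, the standard SQL escape for delimited identifiers.
fn quote_ident(ident: &str) -> String {
    format!("\"{}\"", ident.replace('"', "\"\""))
}

// Hypothetical helper: render a database-qualified table name with each part quoted.
fn qualified_name(database: &str, table: &str) -> String {
    format!("{}.{}", quote_ident(database), quote_ident(table))
}
```

For the reviewer's example, `qualified_name("test_db", "select")` yields `"test_db"."select"`, so the reserved word survives re-execution.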
if i > 0 {
    ddl.push_str(", ");
}
let _ = write!(ddl, "'{}' = '{}'", k, v);
This needs SQL string escaping before interpolation. Option values are arbitrary table metadata, so a value containing a single quote, for example `WITH ('comment' = 'Bob's table')`, makes the returned definition invalid when users copy or re-execute `SHOW CREATE TABLE`. Please render string literals with SQL escaping (doubling `'` to `''`) for both keys and values, and add a round-trip test with an option value containing a quote.
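A minimal sketch of the escaping asked for above, assuming options arrive as key/value string pairs. `quote_literal` and `render_options` are hypothetical names, not functions from the PR.

```rust
// Hypothetical helper: render a SQL string literal, doubling embedded single quotes.
fn quote_literal(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}

// Hypothetical helper: render the body of a WITH (...) clause,
// escaping both keys and values.
fn render_options(options: &[(&str, &str)]) -> String {
    options
        .iter()
        .map(|(k, v)| format!("{} = {}", quote_literal(k), quote_literal(v)))
        .collect::<Vec<_>>()
        .join(", ")
}
```

With this in place, the option `comment` with value `Bob's table` renders as `'comment' = 'Bob''s table'`.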
DataType::Char(t) => format!("CHAR({})", t.length()),
DataType::VarChar(t) => format!("VARCHAR({})", t.length()),
DataType::Date(_) => "DATE".to_string(),
DataType::Time(t) => format!("TIME({})", t.precision()),
This arm (and the `TIMESTAMP_LTZ`, `MULTISET`, and `VECTOR` arms below) emits syntax that the current SQLContext cannot round-trip. `sql_data_type_to_paimon_type` has no `SqlType::Time` branch and falls through to `Unsupported SQL data type`; similarly `TIMESTAMP_LTZ`, `MULTISET`, and `VECTOR` are not accepted by that converter. A table loaded from existing Paimon metadata with any of these types will therefore return a `SHOW CREATE TABLE` definition that cannot be executed by paimon-rust. Please either add parser/converter support for the emitted syntax with round-trip tests, or avoid advertising these variants as replayable DDL.
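The second option in the comment above (not advertising unsupported variants as replayable DDL) could look roughly like the sketch below: the renderer returns `None` for any type the parser cannot accept, and the whole definition is withheld if any column contains one. `ReplayType`, `try_type_to_sql`, and `try_table_definition` are hypothetical names, the enum is a reduced stand-in for Paimon's `DataType`, and the `ARRAY<...>` spelling is an assumption.

```rust
// Hypothetical, reduced stand-in for Paimon's DataType.
#[allow(dead_code)]
enum ReplayType {
    Int,
    Date,
    Array(Box<ReplayType>),
    // Variants the review says the CREATE TABLE converter does not accept.
    Time,
    Multiset(Box<ReplayType>),
}

// Returns None when the type (or any nested type) has no replayable SQL form.
fn try_type_to_sql(t: &ReplayType) -> Option<String> {
    match t {
        ReplayType::Int => Some("INT".to_string()),
        ReplayType::Date => Some("DATE".to_string()),
        // `?` propagates an unsupported element type outward.
        ReplayType::Array(element) => Some(format!("ARRAY<{}>", try_type_to_sql(element)?)),
        ReplayType::Time | ReplayType::Multiset(_) => None,
    }
}

// Builds a definition only if every column is replayable; otherwise None,
// so the caller can report that no definition is available.
fn try_table_definition(table: &str, columns: &[(&str, ReplayType)]) -> Option<String> {
    let rendered: Option<Vec<String>> = columns
        .iter()
        .map(|(name, ty)| Some(format!("{} {}", name, try_type_to_sql(ty)?)))
        .collect();
    Some(format!("CREATE TABLE {} ({})", table, rendered?.join(", ")))
}
```

The trade-off is that such tables show no definition at all, which is arguably preferable to DDL that fails on replay; adding converter support for the missing types would avoid the trade-off entirely.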
Summary
- `SHOW CREATE TABLE` on a Paimon table returned no DDL because DataFusion 53 rewrites the statement into a query against `information_schema.views`, whose `definition` column is populated by `TableProvider::get_table_definition()`. `PaimonTableProvider` did not override this method, so the column came back empty.
- Add a cached `table_definition` string on `PaimonTableProvider`, built once in `try_new` from the table's identifier and schema (fields, primary keys, partition keys, options).
- Add `data_type_to_sql` covering all 22 `DataType` variants (recursive for `Array`/`Map`/`Multiset`/`Row`/`Vector`).
- Override `TableProvider::get_table_definition()` to return the cached DDL.

Test plan
- `cargo check -p paimon-datafusion`
- `cargo clippy -p paimon-datafusion --lib --tests` (zero warnings)
- `sql_context_tests.rs` (4 cases): simple table, table with primary key, table with partition + options, table with various data types
- `sql_context_tests` suite (39/39): no regression
- `SHOW CREATE TABLE <db>.<table>` returns the DDL string

🤖 Generated with Claude Code