Skip to content

Type Mapping

PGStyx uses type mappings when it creates a target table, compares an existing table to a DataFrame, and decides whether a schema change is compatible.

Spark → PostgreSQL

Spark type	PostgreSQL type
`ByteType`	`SMALLINT`
`ShortType`	`SMALLINT`
`IntegerType`	`INTEGER`
`LongType`	`BIGINT`
`FloatType`	`REAL`
`DoubleType`	`DOUBLE PRECISION`
`DecimalType(p, s)`	`NUMERIC(p, s)`
`StringType`	`TEXT`
`BooleanType`	`BOOLEAN`
`DateType`	`DATE`
`TimestampType`	`TIMESTAMP`
`TimestampNTZType`	`TIMESTAMP`
`BinaryType`	`BYTEA`
`ArrayType(e)`	`e[]`
`MapType(k, v)`	`JSONB`
`StructType`	`JSONB`
anything else	`TEXT`

Arrays and semi-structured values

Arrays land as PostgreSQL arrays.
MapType and StructType land as JSONB.
NUMERIC[] columns read back with a generic decimal element type during schema comparison because array-element precision is not available from metadata.

String-backed PostgreSQL types

When the target column expects types such as uuid, jsonb, inet, or enums, a StringType source can still be valid as long as PostgreSQL can parse the value.

PostgreSQL → Spark during comparison

PostgreSQL type	Spark type
`smallint`	`ShortType`
`integer`, `serial`	`IntegerType`
`bigint`, `bigserial`	`LongType`
`real`	`FloatType`
`double precision`	`DoubleType`
`numeric(p, s)`	`DecimalType(p, s)`
`char`, `varchar`, `text`	`StringType`
`date`	`DateType`
`timestamp`, `timestamptz`	`TimestampType`
`boolean`	`BooleanType`
`bytea`	`BinaryType`
`uuid`, `json`, `jsonb`, `inet`	`StringType`

Widenings accepted by `addColumns`

When schemaEvolution=addColumns, these widenings are accepted automatically:

Source	Target	Why
`ShortType`	`IntegerType` or `LongType`	Lossless widening
`IntegerType`	`LongType`	Lossless widening
`ByteType`	`ShortType`, `IntegerType`, `LongType`	Lossless widening
`FloatType`	`DoubleType`	Lossless widening
`IntegerType`	`DecimalType(p, s)` where p ≥ 10	Integer fits the target
`LongType`	`DecimalType(p, s)` where p ≥ 19	Long fits the target
`VarcharType(n)`	`StringType`	Unbounded text is wider
`CharType(n)`	`StringType`	Unbounded text is wider
Same type	Same type	No promotion needed

Some widenings still require schemaEvolutionAllowTableRewrite=true. See Schema Evolution.

Changes that still need manual intervention

Source	Target	Reason
`StringType`	any numeric	Requires parsing, can fail
`BooleanType`	`IntegerType`	Semantics change
`IntegerType`	`FloatType`	Lossy precision change
`TimestampType`	`DateType`	Loses time component
`DateType`	`TimestampType`	Midnight assumptions are easy to get wrong
`DecimalType(p1, s1)`	`DecimalType(p2, s2)` where p2 < p1	Precision loss

Any of these needs manual intervention in the target schema.

Planning tips

Be explicit about decimal precision and scale when the target table already enforces them.
Decide early whether nested data should land as JSONB or be flattened upstream.
Pair this page with Schema Evolution when schemas drift over time.