Schema Evolution

Before PGStyx writes rows, it compares the DataFrame schema with the target table and decides whether to fail, adapt the schema, or align the write to the existing shape.

Mode | Behavior
strict | Missing target columns or incompatible types fail the write.
addColumns | Missing columns are added and supported widenings are applied.
alignToTarget | Extra source columns are ignored; target-only columns are left to database defaults.
warn | Same behavior as alignToTarget, plus warning logs for dropped or filled columns.
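
The four modes can be read as a decision over two sets of names: columns only in the DataFrame and columns only in the target table. The sketch below illustrates that reading and is not PGStyx's implementation: `SchemaModeSketch`, `Plan`, and `resolve` are hypothetical names, type compatibility is ignored, and the handling of target-only columns under strict and addColumns is an assumption, since the table above does not specify it.

```scala
// Hypothetical sketch; none of these names come from PGStyx.
object SchemaModeSketch {
  sealed trait Mode
  case object Strict extends Mode
  case object AddColumns extends Mode
  case object AlignToTarget extends Mode
  case object Warn extends Mode

  // Outcome of comparing column names only (types are ignored here).
  final case class Plan(
      added: Seq[String],     // columns added to the target table
      ignored: Seq[String],   // source columns left out of the write
      defaulted: Seq[String], // target-only columns left to database defaults
      warnings: Seq[String]
  )

  def resolve(mode: Mode, source: Seq[String], target: Seq[String]): Either[String, Plan] = {
    val sourceOnly = source.filterNot(c => target.contains(c))
    val targetOnly = target.filterNot(c => source.contains(c))
    val missing = sourceOnly.mkString(", ")

    mode match {
      case Strict if sourceOnly.nonEmpty =>
        Left("strict: target table is missing columns: " + missing)
      case Strict =>
        // Assumption: target-only columns fall back to defaults in strict mode.
        Right(Plan(Nil, Nil, targetOnly, Nil))
      case AddColumns =>
        // Assumption: target-only columns fall back to defaults here as well.
        Right(Plan(sourceOnly, Nil, targetOnly, Nil))
      case AlignToTarget =>
        Right(Plan(Nil, sourceOnly, targetOnly, Nil))
      case Warn =>
        // Same plan as AlignToTarget, plus one warning per affected column.
        val warnings =
          sourceOnly.map(c => "dropped source column: " + c) ++
            targetOnly.map(c => "target column left to default: " + c)
        Right(Plan(Nil, sourceOnly, targetOnly, warnings))
    }
  }
}
```

For example, `SchemaModeSketch.resolve(SchemaModeSketch.Strict, Seq("id", "email"), Seq("id"))` returns a `Left`, while the same inputs under `AddColumns` return a plan whose `added` field holds `email`.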

addColumns accepts two kinds of widening.

Metadata-only widenings (instant, no table rewrite, no lock escalation)

Source → Target
varchar(n) → varchar(m) (longer)
varchar(n) → text
numeric(p, s) → numeric(p2, s) (higher precision, same scale)
numeric(p, s) → numeric

Rewrite-required widenings (ACCESS EXCLUSIVE lock, table rewrite, blocks reads and writes)

Source → Target
smallint → integer
smallint → bigint
integer → bigint
real → double precision

Rewrite-required widenings need schemaEvolutionAllowTableRewrite=true. They rewrite the table under an ACCESS EXCLUSIVE lock, so reads and writes are blocked until the rewrite finishes.

Without that opt-in, the job fails with:

Column "<col>" requires widening from INTEGER to BIGINT which rewrites the entire table.
Set option 'schemaEvolutionAllowTableRewrite' to 'true' to allow this.
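
The two widening lists and the opt-in gate can be summarized as a small classifier. This is an illustrative sketch, not PGStyx's code: `WideningSketch`, the `PgType` hierarchy, `classify`, and `check` are hypothetical names, and the error strings are shortened placeholders rather than the real messages. Any pair that appears in neither list is treated as unsupported, and the sketch assumes it is only called when the two types differ.

```scala
// Hypothetical sketch; names and messages are illustrative only.
object WideningSketch {
  sealed trait PgType
  final case class Varchar(length: Int) extends PgType
  case object Text extends PgType
  final case class Numeric(precision: Int, scale: Int) extends PgType
  case object UnconstrainedNumeric extends PgType
  case object PgSmallInt extends PgType
  case object PgInteger extends PgType
  case object PgBigInt extends PgType
  case object PgReal extends PgType
  case object PgDouble extends PgType

  sealed trait Widening
  case object MetadataOnly extends Widening
  case object RewriteRequired extends Widening
  case object Unsupported extends Widening

  // Mirrors the two tables above; everything else is Unsupported.
  def classify(from: PgType, to: PgType): Widening = (from, to) match {
    case (Varchar(n), Varchar(m)) if m > n                      => MetadataOnly
    case (Varchar(_), Text)                                     => MetadataOnly
    case (Numeric(p, s), Numeric(p2, s2)) if p2 > p && s2 == s  => MetadataOnly
    case (Numeric(_, _), UnconstrainedNumeric)                  => MetadataOnly
    case (PgSmallInt, PgInteger) | (PgSmallInt, PgBigInt) |
        (PgInteger, PgBigInt)                                   => RewriteRequired
    case (PgReal, PgDouble)                                     => RewriteRequired
    case _                                                      => Unsupported
  }

  // Applies the opt-in: rewrites pass only when explicitly allowed.
  def check(
      column: String,
      from: PgType,
      to: PgType,
      allowTableRewrite: Boolean
  ): Either[String, Widening] =
    classify(from, to) match {
      case RewriteRequired if !allowTableRewrite =>
        Left("column " + column + ": widening rewrites the table and is not allowed")
      case Unsupported =>
        Left("column " + column + ": widening is not supported")
      case allowed =>
        Right(allowed)
    }
}
```

With these definitions, `check("id", PgInteger, PgBigInt, allowTableRewrite = false)` yields a `Left`, and the same call with `allowTableRewrite = true` yields `Right(RewriteRequired)`.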

Changes such as integer → real, boolean → integer, or timestamp → date are not supported widenings and still need manual intervention. The job fails with:

Column "<col>": widening from <X> to <Y> is not supported. Manually ALTER the column or adjust your source schema.

An upsert write with addColumns enabled:

df.write
  .format("pgstyx")
  .option("url", "jdbc:postgresql://localhost:5432/warehouse")
  .option("dbtable", "users")
  .option("user", "postgres")
  .option("password", "secret")
  .option("writeMode", "upsert")
  .option("mergeKeys", "user_id")
  .option("schemaEvolution", "addColumns")
  .save()

A write that also opts in to rewrite-required widenings:

df.write
  .format("pgstyx")
  .option("url", "jdbc:postgresql://localhost:5432/warehouse")
  .option("dbtable", "users")
  .option("user", "postgres")
  .option("password", "secret")
  .option("schemaEvolution", "addColumns")
  .option("schemaEvolutionAllowTableRewrite", "true")
  .save()

columnCaseSensitive controls identifier matching between the DataFrame and PostgreSQL. It defaults to false, which means:

  • Names are compared case-insensitively.
  • New columns created by PGStyx fold to lowercase.
  • Duplicate detection catches names that differ only by case.

Set columnCaseSensitive=true only when the target table really depends on quoted identifiers.
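
The default matching rules can be sketched with a single comparison key. The code below is an illustration under stated assumptions, not PGStyx's implementation: `ColumnNameSketch`, `key`, `duplicates`, and `newColumns` are hypothetical names, and keeping the original spelling of new columns when matching is case-sensitive is an assumption.

```scala
import java.util.Locale

// Hypothetical sketch; none of these names come from PGStyx.
object ColumnNameSketch {
  // Comparison key: lowercase unless matching is case-sensitive.
  def key(name: String, caseSensitive: Boolean): String =
    if (caseSensitive) name else name.toLowerCase(Locale.ROOT)

  // Groups of names that collide under the chosen matching rule.
  def duplicates(columns: Seq[String], caseSensitive: Boolean): Seq[Seq[String]] =
    columns.groupBy(c => key(c, caseSensitive)).values.filter(_.size > 1).toSeq

  // Source columns with no match in the target, spelled as they would be created:
  // folded to lowercase by default, unchanged when case-sensitive (assumption).
  def newColumns(source: Seq[String], target: Seq[String], caseSensitive: Boolean): Seq[String] = {
    val targetKeys = target.map(c => key(c, caseSensitive)).toSet
    source
      .filterNot(c => targetKeys.contains(key(c, caseSensitive)))
      .map(c => key(c, caseSensitive))
  }
}
```

Under the default, `UserId` and `userid` in the same DataFrame are reported by `duplicates`, and a source column `SignupDate` that is absent from the target is listed by `newColumns` as `signupdate`.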

addColumns and alignToTarget are paid-plan schema-evolution modes. schemaEvolutionAllowTableRewrite is also paid. See Pricing for current tier boundaries.