Troubleshooting

Entries are the exact strings you will see in a stack trace or log. Grep for yours.

If the cause of a failure is not obvious, reduce the write to the minimum:

  1. required JDBC options
  2. writeMode=append
  3. no SSL
  4. no validation
  5. no schema evolution

If that works, layer options back one at a time until the failure returns. The option you just added is the cause.
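The layering procedure above can be sketched as a small helper. This is only an illustration: `find_failing_option` and the `try_write` callable are hypothetical names, not part of PGStyx, and `try_write` stands in for whatever code performs your actual write and raises on failure.

```python
from typing import Callable, Dict, Optional, Tuple


def find_failing_option(
    baseline: Dict[str, str],
    extras: Dict[str, str],
    try_write: Callable[[Dict[str, str]], None],
) -> Optional[Tuple[str, Exception]]:
    """Layer extras onto baseline one at a time.

    Returns (option_name, exception) for the first option whose addition
    makes try_write raise, or None if every option can be added safely.
    """
    options = dict(baseline)
    # The minimal write has to succeed first; if it raises, the problem
    # is in the baseline itself, so the exception is left to propagate.
    try_write(options)
    for key, value in extras.items():
        options[key] = value
        try:
            try_write(options)
        except Exception as exc:  # the option just added is the suspect
            return key, exc
    return None
```

Pass the minimal options as `baseline` and everything you removed as `extras`, in the order you want them restored. Options added earlier stay in place, so a failure caused by a combination of options is attributed to the last one added.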

The PGStyx JAR is not on the classpath. Check --packages, --jars, or the cluster library configuration, then run the verify snippet from Install.

IllegalArgumentException: url is required (or dbtable, user, password)

A required option is missing. These failures happen before Spark starts writing rows. Confirm that url, dbtable, user, and password are all present and that each key is spelled exactly as expected.
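A quick pre-flight check on the options dictionary can catch this before the job is submitted. The sketch below is not part of PGStyx; `check_required` is a hypothetical helper. It assumes exact, case-sensitive key names and treats an empty value as missing, which may be stricter than what the connector itself accepts.

```python
import difflib

# Required keys as listed in the error message above.
REQUIRED_OPTIONS = ("url", "dbtable", "user", "password")


def check_required(options):
    """Return {missing_key: closest_misspelled_key_or_None}."""
    # Only keys that are not themselves required can be misspellings.
    candidates = [k for k in options if k not in REQUIRED_OPTIONS]
    report = {}
    for key in REQUIRED_OPTIONS:
        if options.get(key):  # assumption: an empty value counts as missing
            continue
        close = difflib.get_close_matches(key, candidates, n=1, cutoff=0.6)
        report[key] = close[0] if close else None
    return report
```

An empty result means all four keys are present. A non-`None` value next to a missing key is a likely typo, for example `dbtabel` in place of `dbtable`.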

IllegalArgumentException: mergekeys required for upsert mode

writeMode=upsert was set without mergeKeys. Either set mergeKeys or change the mode.

PGStyxException: Table <name> not found or has no columns

PGStyx connected to PostgreSQL but could not resolve a usable target table. Common causes:

  • Wrong schema. Set currentSchema=<schema> in the JDBC URL.
  • Wrong case on the table name.
  • The target genuinely does not exist.

Verify with \dt in psql.

PGStyxException: Duplicate column names after case folding: <names>

The DataFrame has two columns that differ only in case (Id and ID), and columnCaseSensitive=false. Rename one of them, or set columnCaseSensitive=true if the target table really has both.
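To find the offending columns before writing, you can group the column names by their folded form. The helper below is a minimal sketch with a hypothetical name, `case_fold_collisions`. It assumes that folding means lower-casing, as PostgreSQL does for unquoted identifiers; the connector's exact rule may differ.

```python
from collections import defaultdict


def case_fold_collisions(columns):
    """Return groups of column names that are equal once lower-cased."""
    groups = defaultdict(list)
    for name in columns:
        # Assumption: case folding is plain lower-casing.
        groups[name.lower()].append(name)
    # Keep only the groups with more than one original spelling.
    return [names for names in groups.values() if len(names) > 1]
```

In a Spark job you would typically pass `df.columns`. An empty list means no two columns collide.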

PGStyxException: Target table has columns that differ only in casing: <names>

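The same problem on the target side: the PostgreSQL table has columns that differ only in case, and columnCaseSensitive=false. Either rename the PostgreSQL columns or set columnCaseSensitive=true.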

PGStyxException: Schema validation failed: Missing in target: ..., Mismatches: ...

schemaEvolution=strict saw a DataFrame column that does not exist in the target, or a type that does not widen safely. Three paths:

  • Add the column to the target manually.
  • Switch schemaEvolution to addColumns.
  • Adjust the source schema to match the target.

PGStyxException: Column "<col>" requires widening from <X> to <Y> which rewrites the entire table. Set option 'schemaEvolutionAllowTableRewrite' to 'true' to allow this.

A rewrite-kind widening (smallint → integer | bigint, integer → bigint, real → double precision) needs explicit opt-in. Either set the flag and accept the rewrite window, or alter the column manually during maintenance.
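If you want to flag these cases in a pre-flight check, the pairs named above can be put in a lookup table. This is a sketch only: `needs_table_rewrite` is a hypothetical helper, and the set contains just the pairs listed in this entry, so the connector may treat other pairs differently.

```python
# Widening pairs this entry names as rewriting the whole table.
REWRITE_WIDENINGS = {
    ("smallint", "integer"),
    ("smallint", "bigint"),
    ("integer", "bigint"),
    ("real", "double precision"),
}


def needs_table_rewrite(from_type, to_type):
    """True if widening from_type to to_type is one of the listed rewrite pairs."""
    pair = (from_type.strip().lower(), to_type.strip().lower())
    return pair in REWRITE_WIDENINGS
```

A `True` result means the write needs schemaEvolutionAllowTableRewrite=true or a manual ALTER during maintenance. A `False` result says nothing about whether the widening is supported at all.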

PGStyxException: Column "<col>": widening from <X> to <Y> is not supported.

The source-to-target type change is not in the accepted widening set. Manually alter the target column, or adjust the source data type.

PGStyxException: Target column '<name>' not found in source schema.

The target has a column that the DataFrame does not. In alignToTarget or warn mode, check the column name casing and the columnCaseSensitive setting first.

PGStyxException: Merge key '<name>' does not match any column in target schema [...]

mergeKeys references a column that is not on the target table. Check spelling and case against the list of column names in the error.
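The comparison against the column list in the error can be automated. The helper below is a hypothetical sketch, not PGStyx code. It assumes the target column names are available as a list of strings and that the match must be exact; whether the connector matches case-insensitively depends on columnCaseSensitive.

```python
def unmatched_merge_keys(merge_keys, target_columns):
    """Map each merge key with no exact match to a case-insensitive match, or None."""
    exact = set(target_columns)
    by_lower = {}
    for column in target_columns:
        # Keep the first column seen for each lower-cased spelling.
        by_lower.setdefault(column.lower(), column)
    report = {}
    for key in merge_keys:
        if key not in exact:
            report[key] = by_lower.get(key.lower())
    return report
```

A key mapped to a column name differs from it only in case. A key mapped to `None` is misspelled or missing from the target.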

PGStyxException: writeMode=overwrite is not supported for streaming writes; use append or upsert

Streaming rejects writeMode=overwrite at stream start. Switch to append or upsert, then use the stream’s output mode and keys to control how rows change over time.

PGStyxException: Operation failed after <N> retries: <original>

Retries were exhausted on a transient failure. The original exception is in the message, and its SQLSTATE class points to the cause:

  • Connection errors (08xxx) — check maxPoolSize, write parallelism, and PostgreSQL connection limits. See Tuning and Metrics.
  • Deadlocks (40P01) — reduce write parallelism or batch size.
  • Resource errors (53xxx) — server is out of disk, memory, or connections.

PGStyxException: Non-retriable error: <original>

The underlying exception is not on the retry allow-list. Constraint violations (23xxx), permission errors (42xxx), and type mismatches land here. Fix the root cause — retries will not help.
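When triaging logs, it can help to sort SQLSTATE codes into the classes named in this entry and in the retries entry. The function below is an illustrative sketch, not the connector's actual allow-list. `classify_sqlstate` is a hypothetical name, and codes outside the listed classes are reported as unknown rather than guessed.

```python
def classify_sqlstate(sqlstate):
    """Return (category, retriable) for a five-character SQLSTATE code.

    retriable is True or False for the classes this page names,
    and None when the page does not say.
    """
    code = sqlstate.strip().upper()
    if code == "40P01":
        return ("deadlock", True)
    by_class = {
        "08": ("connection", True),
        "53": ("resource", True),
        "23": ("constraint", False),
        # Class 42 also covers syntax errors in PostgreSQL.
        "42": ("permission", False),
    }
    return by_class.get(code[:2], ("unknown", None))
```

For example, `classify_sqlstate("23505")` lands in the constraint class, so retrying will not help. A result of `("unknown", None)` means you need to read the original exception.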

LicenseException: <feature> requires <tier>

A paid-plan capability is enabled without a matching license. Options that trigger this:

  • More than one mergeKeys value.
  • schemaEvolution=addColumns or schemaEvolution=alignToTarget.
  • validationMode other than skip.
  • ssl=true.
  • Any of sslCert, sslKey, sslRootCert with ssl=true.
  • .writeStream.format("pgstyx").
  • streamingExactlyOnce=true.

Either remove the option, or provide a licenseKey. See Pricing.
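To see which options in a configuration fall under the list above, a small scan over the options dictionary may be enough. This is a sketch under assumptions: `licensed_options` is a hypothetical helper, all values are assumed to be strings, mergeKeys is assumed to be comma-separated, and an absent validationMode is not counted because its default is not stated here. The streaming entry point (.writeStream) is not visible in the options and is not checked.

```python
def licensed_options(options):
    """Return the option names that this page lists as needing a license."""
    hits = []
    # Assumption: multiple merge keys are given as a comma-separated string.
    keys = [k for k in options.get("mergeKeys", "").split(",") if k.strip()]
    if len(keys) > 1:
        hits.append("mergeKeys")
    if options.get("schemaEvolution") in ("addColumns", "alignToTarget"):
        hits.append("schemaEvolution")
    # Only an explicit value other than skip is counted.
    if "validationMode" in options and options["validationMode"] != "skip":
        hits.append("validationMode")
    # sslCert, sslKey, and sslRootCert only matter together with ssl=true.
    if options.get("ssl") == "true":
        hits.append("ssl")
    if options.get("streamingExactlyOnce") == "true":
        hits.append("streamingExactlyOnce")
    return hits
```

An empty list means none of the listed batch options is set. Any name returned has to be removed or backed by a licenseKey.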

Connection succeeds locally but fails on cluster with FileNotFoundException on SSL cert files

The certificate files exist on the driver filesystem but not on the executors. Distribute them to every worker. See SSL and Security for details.

Insert-only writes appear to drop rows silently

validationMode=warnAndFilter can filter invalid rows instead of failing the job. If the target also ignores duplicate inserts, replayed rows can be skipped instead of written twice. Check Rows Filtered in Metrics.getReport() and review the target constraints before assuming the pipeline lost data.