Tuning and Metrics

Two concerns share this page because they share one vocabulary: batches, retries, pools, counters. Use this page once the job works and you need it to run faster or be easier to observe.

Batch size

batchSize (default 1000) controls how many rows PGStyx buffers before it flushes them to PostgreSQL.

Larger batches trade memory for fewer round trips. The typical tuning range is 500 to 5000 rows. Above that, throughput gains usually flatten while per-task memory pressure rises.

df.write
  .format("pgstyx")
  .option("batchSize", "5000")
  // ...
  .save()
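
To get a feel for the trade-off, the sketch below counts flushes per partition for a given batchSize. It assumes one flush per full batch plus one final flush for any remainder, and that each flush costs roughly one round trip; neither assumption is confirmed PGStyx behavior. BatchMath and flushes are hypothetical names, not part of the PGStyx API.

```scala
// Hypothetical helper, not part of PGStyx.
object BatchMath {
  // Number of flushes for one partition, assuming one flush per full
  // batch plus one final flush for a partial batch (ceiling division).
  def flushes(rowsInPartition: Long, batchSize: Int): Long = {
    require(batchSize > 0, "batchSize must be positive")
    require(rowsInPartition >= 0, "rowsInPartition must be non-negative")
    (rowsInPartition + batchSize - 1) / batchSize
  }
}
```

Under those assumptions, a partition of 1,000,000 rows needs 1000 flushes at the default batchSize and 200 at batchSize=5000, while each buffered batch holds five times as many rows.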

Connection pool

The pool settings determine how many database connections PGStyx can hold and how long idle connections stay around.

That matters for capacity planning. maxPoolSize defaults to spark.executor.cores when Spark provides it, or 2 otherwise, so executors with many cores can open many connections quickly.

Option	Default	Notes
`maxPoolSize`	`spark.executor.cores` (fallback: `2`)	Hard ceiling on concurrent connections per active Spark process
`minIdle`	`0`	Warm connections kept open
`connectionTimeout`	`30000` ms	How long a task waits for a connection before failing
`idleTimeout`	`30000` ms	When idle connections close
`maxLifetime`	`1800000` ms	When a connection is retired and replaced

All five pool options are available on every plan.

If the database is connection-bound, reduce the number of writer tasks before you change anything else:

df.coalesce(n)
  .write
  .format("pgstyx")
  .options(opts)
  .save()

Rule of thumb for n, the partition count passed to coalesce:

n ≤ (postgres_max_connections − headroom) / maxPoolSize

Prefer coalesce when reducing partitions. It avoids a full shuffle.
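
The rule of thumb above can be sketched as a small helper. PoolMath and maxWriterTasks are hypothetical names used for illustration, and the headroom value is whatever you reserve for other clients of the database.

```scala
// Hypothetical helper, not part of PGStyx.
object PoolMath {
  // Largest n satisfying n <= (postgres_max_connections - headroom) / maxPoolSize.
  // Integer division rounds down, which keeps the result on the safe side.
  def maxWriterTasks(pgMaxConnections: Int, headroom: Int, maxPoolSize: Int): Int = {
    require(maxPoolSize > 0, "maxPoolSize must be positive")
    math.max(0, (pgMaxConnections - headroom) / maxPoolSize)
  }
}
```

For example, with max_connections at 100, a headroom of 20, and maxPoolSize=4, the helper returns 20, so df.coalesce(20) would stay within the budget.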

Retries

By default, PGStyx makes three attempts in total (the first try plus two retries) with exponential backoff between them. With retryBackoffMs=1000 and retryBackoffMultiplier=2.0, the delays are 1 second, then 2 seconds.
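
The delay schedule can be reproduced with a short calculation. This sketch assumes the k-th delay is retryBackoffMs multiplied by retryBackoffMultiplier to the power k, with no jitter and no upper cap; that formula is inferred from the 1 second, 2 second example above and is not taken from the PGStyx source. BackoffSketch and delaysMs are hypothetical names.

```scala
// Hypothetical helper, not part of PGStyx.
object BackoffSketch {
  // Delays (in ms) between attempts, assuming plain geometric growth:
  // delay_k = retryBackoffMs * multiplier^k for k = 0, 1, ...
  // N total attempts imply N - 1 waits.
  def delaysMs(totalAttempts: Int, retryBackoffMs: Long, multiplier: Double): Seq[Long] =
    (0 until math.max(0, totalAttempts - 1)).map { k =>
      math.round(retryBackoffMs * math.pow(multiplier, k))
    }
}
```

Calling BackoffSketch.delaysMs(3, 1000L, 2.0) yields 1000 and 2000, matching the defaults described above.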

Only these failures retry:

SQLSTATE 08000, 08003, 08006 — connection errors.
SQLSTATE 40001, 40P01 — serialization failure, deadlock.
SQLSTATE 53000, 53100, 53200, 53300, 53400 — resource errors.

Everything else throws on the first failure. Constraint violations (23xxx), permission errors (42xxx), and type mismatches do not retry.
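
If you want to apply the same classification in your own error handling, the list above can be expressed as a lookup on the SQLSTATE carried by java.sql.SQLException. This is an illustrative sketch only: RetryableStates and isRetryable are hypothetical names, and PGStyx's internal check may be implemented differently.

```scala
import java.sql.SQLException

// Hypothetical helper, not part of PGStyx.
object RetryableStates {
  // SQLSTATE codes listed on this page as retryable.
  val codes: Set[String] = Set(
    "08000", "08003", "08006",                  // connection errors
    "40001", "40P01",                           // serialization failure, deadlock
    "53000", "53100", "53200", "53300", "53400" // resource errors
  )

  // getSQLState may return null, so wrap it in Option before the lookup.
  def isRetryable(e: SQLException): Boolean =
    Option(e.getSQLState).exists(codes.contains)
}
```

A unique-constraint violation such as SQLSTATE 23505 is not in the set, so isRetryable returns false for it.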

df.write
  .format("pgstyx")
  .option("maxRetries", "5")
  .option("retryBackoffMs", "2000")
  .option("retryBackoffMultiplier", "1.5")
  // ...
  .save()

After maxRetries attempts have failed, PGStyx wraps the original exception in PGStyxException. The message starts with "Operation failed after N retries:".

Metrics

When metricsEnabled=true (the default), PGStyx tracks four counters:

pgstyx.rows.written
pgstyx.rows.filtered
pgstyx.retries
pgstyx.errors

Read the latest batch report with Metrics.getReport():

import com.pgstyx.metrics.Metrics

df.write
  .format("pgstyx")
  .options(opts)
  .save()

println(Metrics.getReport())

The report is a multi-line string:

PGStyx Metrics Report:
  Rows Written: 1234567
  Rows Filtered: 42
  Retries: 0
  Errors: 0
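
Because the report is plain text, a job that wants the numbers (for example, to fail when Errors is non-zero) has to parse them. The sketch below assumes the report keeps the "Label: number" layout shown above; ReportParser is a hypothetical helper, not part of the PGStyx API.

```scala
// Hypothetical helper, not part of PGStyx.
object ReportParser {
  // Matches lines such as "  Rows Written: 1234567".
  // The header line has no number after the colon, so it is skipped.
  private val CounterLine = """\s*(.+?):\s*(\d+)\s*""".r

  def parse(report: String): Map[String, Long] =
    report.split("\\r?\\n").collect {
      case CounterLine(label, value) => label -> value.toLong
    }.toMap
}
```

Given the sample report above, ReportParser.parse returns a map with four entries, including "Rows Written" mapped to 1234567 and "Errors" mapped to 0.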

Metrics.getReport() must be called on the driver after the write completes. Calling it from an executor throws IllegalStateException.

Turning metrics off

Setting metricsEnabled=false stops counter updates, and Metrics.getReport() then reports zero for every counter. This is only worth doing when per-row overhead matters: the updates are cheap but not free.