Install
PGStyx publishes signed artifacts to Maven Central. Pick the install path that matches your build environment, then run the verify snippet at the bottom to confirm the datasource is registered.
Coordinates
| Spark | Scala | Artifact |
|---|---|---|
| 3.3.4 | 2.12 | com.pgstyx:pgstyx-spark3_2.12:1.0.0 |
| 3.5.8 | 2.13 | com.pgstyx:pgstyx-spark3_2.13:1.0.0 |
| 4.1.1 | 2.13 | com.pgstyx:pgstyx-spark4_2.13:1.0.0 |
The artifact pattern is pgstyx-spark{sparkMajor}_{scalaBinary}. Spark 3 artifacts target JDK 11+; Spark 4 artifacts target JDK 17+.
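The naming pattern can be sketched as a small helper. This is an illustration only: `PgStyxCoordinates` and `coordinate` are hypothetical names, not part of PGStyx, and the helper does not check that the resulting combination is actually published (for example, there is no Spark 4 artifact for Scala 2.12).

```scala
object PgStyxCoordinates {
  // PGStyx release listed in the table above.
  val PgStyxVersion = "1.0.0"

  /** Builds a Maven coordinate from full Spark and Scala version strings. */
  def coordinate(sparkVersion: String, scalaVersion: String): String = {
    // "3.5.8" -> "3"
    val sparkMajor = sparkVersion.takeWhile(_ != '.')
    // "2.13.17" -> "2.13"
    val scalaBinary = scalaVersion.split('.').take(2).mkString(".")
    s"com.pgstyx:pgstyx-spark${sparkMajor}_$scalaBinary:$PgStyxVersion"
  }
}
```

For instance, `PgStyxCoordinates.coordinate("4.1.1", "2.13.17")` yields `com.pgstyx:pgstyx-spark4_2.13:1.0.0`, which matches the last row of the table.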
spark-submit
```shell
spark-submit \
  --packages com.pgstyx:pgstyx-spark3_2.12:1.0.0 \
  my-job.jar
```
For a locally-built JAR, swap --packages for --jars target/scala-2.12/pgstyx-spark3_2.12-1.0.0.jar.
Build tools
```scala
libraryDependencies += "com.pgstyx" %% "pgstyx-spark3" % "1.0.0"
```
The %% operator appends the project's scalaBinaryVersion (for example _2.12) to the artifact name, so the project's Scala version must correspond to a row in the published matrix. For Spark 4, use pgstyx-spark4 and pin scalaVersion := "2.13.17".
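A fuller build.sbt for the Spark 4 case might look like the sketch below. It assumes Spark itself is supplied by the cluster and is therefore marked Provided; the project name `my-job` and the choice of the spark-sql module are illustrative assumptions, not requirements stated by PGStyx.

```scala
// build.sbt (sketch for the Spark 4.1.1 / Scala 2.13 row)
ThisBuild / scalaVersion := "2.13.17"

lazy val root = (project in file("."))
  .settings(
    name := "my-job", // hypothetical project name
    libraryDependencies ++= Seq(
      // Assumption: the cluster provides Spark at runtime.
      "org.apache.spark" %% "spark-sql" % "4.1.1" % Provided,
      // %% expands this to pgstyx-spark4_2.13.
      "com.pgstyx" %% "pgstyx-spark4" % "1.0.0"
    )
  )
```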
```xml
<dependency>
  <groupId>com.pgstyx</groupId>
  <artifactId>pgstyx-spark3_2.12</artifactId>
  <version>1.0.0</version>
</dependency>
```
Replace pgstyx-spark3_2.12 with pgstyx-spark3_2.13 or pgstyx-spark4_2.13 to match your Spark + Scala target.
```groovy
implementation 'com.pgstyx:pgstyx-spark3_2.12:1.0.0'
```
Managed runtimes
Databricks
Attach PGStyx as a cluster library so every notebook and job on that cluster resolves the datasource.
- Compute → your cluster → Libraries → Install new.
- Source: Maven.
- Coordinates: com.pgstyx:pgstyx-spark3_2.12:1.0.0. Adjust for the runtime’s Scala binary version.
- Install and restart the cluster.
Do not add the JAR to the driver application separately — the cluster library covers both driver and executors.
EMR and Dataproc
Pass --packages com.pgstyx:pgstyx-spark3_2.12:1.0.0 to your spark-submit step. On EMR step definitions, add it to Args. On Dataproc, use --properties spark.jars.packages=com.pgstyx:pgstyx-spark3_2.12:1.0.0.
Scala version pinning
The _2.12 / _2.13 suffix must match the Spark runtime’s Scala binary version. Mixing them fails at class-loading time, not at dependency resolution.
- Spark 3.x ships Scala 2.12 builds and, from 3.2 onward, Scala 2.13 builds as well.
- Spark 4.x is Scala 2.13 only.
If unsure, run spark.version and scala.util.Properties.versionNumberString on the target cluster and match exactly.
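To turn the reported Scala version into the suffix to look for, something like the following can be pasted into spark-shell or a notebook cell. It is a sketch: `ScalaSuffix`, `binaryVersion`, and `matches` are hypothetical helper names introduced here for illustration.

```scala
object ScalaSuffix {
  /** Reduces a full Scala version such as "2.12.18" to its binary version "2.12". */
  def binaryVersion(fullVersion: String): String =
    fullVersion.split('.').take(2).mkString(".")

  /** True when the artifact id ends with the suffix for the given Scala version. */
  def matches(artifactId: String, fullScalaVersion: String): Boolean =
    artifactId.endsWith("_" + binaryVersion(fullScalaVersion))
}

// Run on the target cluster (REPL or notebook cell).
val runtimeScala = scala.util.Properties.versionNumberString
println(s"Scala binary version: ${ScalaSuffix.binaryVersion(runtimeScala)}")
println(s"pgstyx-spark3_2.12 matches: ${ScalaSuffix.matches("pgstyx-spark3_2.12", runtimeScala)}")
```

If the second line prints false, pick the artifact whose suffix equals the printed binary version instead.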
Verify install
The smallest snippet that proves the datasource resolved:
```scala
val df = spark.createDataFrame(Seq((1, "ok"))).toDF("id", "label")

df.write
  .format("pgstyx")
  .option("url", "jdbc:postgresql://localhost:5432/warehouse")
  .option("dbtable", "pgstyx_verify")
  .option("user", "postgres")
  .option("password", "secret")
  .save()
```
A successful run creates pgstyx_verify with one row. "Failed to find data source: pgstyx" means the JAR is not on the classpath; recheck --packages or your cluster libraries.
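When no PostgreSQL instance is reachable, a classpath-only check may still be useful. The sketch below assumes PGStyx registers its short name through Spark's standard `DataSourceRegister` service-loader mechanism, which this page does not confirm; `registeredShortNames` is a hypothetical helper. It is meant to be pasted into spark-shell or a notebook cell.

```scala
import java.util.ServiceLoader
import org.apache.spark.sql.sources.DataSourceRegister

// Collects the short names of all data sources visible to the
// current thread's context class loader.
def registeredShortNames(): Set[String] = {
  val providers = ServiceLoader.load(classOf[DataSourceRegister]).iterator()
  val names = Set.newBuilder[String]
  while (providers.hasNext) names += providers.next().shortName()
  names.result()
}

// Assumption: "pgstyx" is the registered short name, as used in format("pgstyx").
println(s"pgstyx registered: ${registeredShortNames().contains("pgstyx")}")
```

A false result here points to the same cause as the error message above: the JAR is not on the classpath. A true result only shows that the class is loadable; the write snippet remains the end-to-end check.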