Replace a default library jar

Learn how to replace a default Java or Scala library jar with another version.

Written by ram.sankarasubramanian

Last published at: May 16th, 2022

Databricks includes a number of default Java and Scala libraries. You can replace any of these libraries with another version by using a cluster-scoped init script to remove the default library jar and then install the version you require.

Warning

Removing default libraries and installing new versions may cause instability or completely break your Databricks cluster. You should thoroughly test any new library version in your environment before running production jobs.

Identify the artifact ID

To identify the name of the jar file you want to remove:

  1. Click the Databricks Runtime version you are using from the list of supported releases (AWS | Azure | GCP).
  2. Navigate to the Java and Scala libraries section.
  3. Identify the Artifact ID for the library you want to remove.

Use the artifact ID to find the jar filename

Use the ls -l command in a notebook to find the jar file whose name contains the artifact ID. For example, to find the jar filename for the spark-snowflake_2.12 artifact ID in Databricks Runtime 7.0, use the following code:

%sh

ls -l /databricks/jars/*spark-snowflake_2.12*

This returns the jar filename:

`----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar`.
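
If you only need the bare filename for use in the init script, basename strips the directory prefix. This is a minimal sketch; it assumes the glob matches exactly one jar file.

%sh

basename /databricks/jars/*spark-snowflake_2.12*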

Upload the replacement jar file

Upload your replacement jar file to a DBFS path.
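
You can upload the file through the workspace UI, or from your local machine with the Databricks CLI. This is a minimal sketch; it assumes the CLI is installed and configured, and both paths are placeholders for your own values.

databricks fs cp /local/path/<replacement_jar_filename>.jar dbfs:/FileStore/jars/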

Create the init script

Use the following template to create a cluster-scoped init script.

#!/bin/bash
rm -rf /databricks/jars/<jar_filename_to_remove>.jar
cp /dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar /databricks/jars/

Using the spark-snowflake_2.12 example from the prior step results in an init script similar to the following:

#!/bin/bash
rm -rf /databricks/jars/----workspace_spark_3_0--maven-trees--hive-2.3__hadoop-2.7--net.snowflake--spark-snowflake_2.12--net.snowflake__spark-snowflake_2.12__2.5.9-spark_2.4.jar
cp /dbfs/FileStore/jars/e43fe9db_c48d_412b_b142_cdde10250800-spark_snowflake_2_11_2_7_1_spark_2_4-b2adc.jar /databricks/jars/
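
One way to save the init script to DBFS is from a %sh notebook cell, writing through the local /dbfs FUSE mount. This is a minimal sketch; it assumes the /dbfs mount is available on your cluster, and the script path /databricks/scripts/replace-spark-snowflake.sh is an example location, not a requirement.

%sh

# Write the init script to DBFS through the /dbfs FUSE mount (example path).
mkdir -p /dbfs/databricks/scripts
cat > /dbfs/databricks/scripts/replace-spark-snowflake.sh <<'EOF'
#!/bin/bash
rm -rf /databricks/jars/<jar_filename_to_remove>.jar
cp /dbfs/<path_to_replacement_jar>/<replacement_jar_filename>.jar /databricks/jars/
EOF

When you configure the cluster, reference the script by its DBFS path, for example dbfs:/databricks/scripts/replace-spark-snowflake.sh.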

Install the init script and restart

  1. Install the cluster-scoped init script on the cluster, following the instructions in Configure a cluster-scoped init script (AWS | Azure | GCP).
  2. Restart the cluster.
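
After the cluster restarts, you can confirm the replacement from a notebook by listing the jars again. This reuses the ls check from earlier; the glob matches the spark-snowflake_2.12 example and should be adapted to your library.

%sh

ls -l /databricks/jars/*spark-snowflake*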