Don't set experimental userClassPathFirst configuration #296


Closed
razvan opened this issue Oct 16, 2023 · 2 comments

razvan (Member) commented Oct 16, 2023

Affected version

No response

Current and expected behavior

We set the experimental spark.driver.userClassPathFirst and spark.executor.userClassPathFirst configs to true. This causes classpath issues once you pull in Java dependencies.

The problem is that there is no universally correct value: some jobs need child-first class loading enabled, others need it disabled. We might just want to document the current state.
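
For reference, the operator today behaves as if every job carried the sparkConf entries below. This is a sketch of the effective configuration, not the literal injection mechanism, and (per the original issue title) user attempts to override these values do not take effect:

      sparkConf:
        spark.driver.userClassPathFirst: "true"
        spark.executor.userClassPathFirst: "true"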

Possible solution

Don't set these experimental options by default; jobs that need child-first class loading can still opt in through their own sparkConf.

Additional context

When dynamically loading extensions like this:

      deps:
        packages:
          - org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.4.0
          - org.apache.spark:spark-sql-kafka-0-10_2.12:3.4.0
      sparkConf:
        spark.driver.userClassPathFirst: "false"
        spark.executor.userClassPathFirst: "false"

the following error occurs:

:: resolution report :: resolve 5610ms :: artifacts dl 1730ms
        :: modules in use:
        com.google.code.findbugs#jsr305;3.0.0 from central in [default]
        commons-logging#commons-logging;1.1.3 from central in [default]
        org.apache.commons#commons-pool2;2.11.1 from central in [default]
        org.apache.hadoop#hadoop-client-api;3.3.4 from central in [default]
        org.apache.hadoop#hadoop-client-runtime;3.3.4 from central in [default]
        org.apache.iceberg#iceberg-spark-runtime-3.4_2.12;1.4.0 from central in [default]
        org.apache.kafka#kafka-clients;3.3.2 from central in [default]
        org.apache.spark#spark-sql-kafka-0-10_2.12;3.4.0 from central in [default]
        org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.4.0 from central in [default]
        org.lz4#lz4-java;1.8.0 from central in [default]
        org.slf4j#slf4j-api;2.0.6 from central in [default]
        org.xerial.snappy#snappy-java;1.1.9.1 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   12  |   12  |   12  |   0   ||   12  |   12  |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-69012960-2916-4d39-9ea8-c688cb61be81
        confs: [default]
        12 artifacts copied, 0 already retrieved (84990kB/127ms)
SLF4J: A SLF4J service provider failed to instantiate:
org.slf4j.spi.SLF4JServiceProvider: org.apache.logging.slf4j.SLF4JServiceProvider not a subtype
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback not org.apache.hadoop.security.GroupMappingServiceProvider
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
        at org.apache.hadoop.security.Groups.<init>(Groups.java:107)
        at org.apache.hadoop.security.Groups.<init>(Groups.java:102)
        at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:451)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:338)
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:300)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:575)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3746)
        at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:3736)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3520)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
        at org.apache.spark.util.DependencyUtils$.resolveGlobPath(DependencyUtils.scala:317)
        at org.apache.spark.util.DependencyUtils$.$anonfun$resolveGlobPaths$2(DependencyUtils.scala:273)
        at org.apache.spark.util.DependencyUtils$.$anonfun$resolveGlobPaths$2$adapted(DependencyUtils.scala:271)
        at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
        at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
        at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
        at org.apache.spark.util.DependencyUtils$.resolveGlobPaths(DependencyUtils.scala:271)
        at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$4(SparkSubmit.scala:390)
        at scala.Option.map(Option.scala:230)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:390)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: class org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback not org.apache.hadoop.security.GroupMappingServiceProvider
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2714)
        ... 31 more
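
The "class X not Y" pattern in this trace is the classic symptom of the same class being defined by two different classloaders. A minimal, self-contained Java sketch (hypothetical, not taken from Spark or the operator) reproduces the mechanism:

    import java.net.URL;
    import java.net.URLClassLoader;

    public class ClassLoaderClash {
        public static void main(String[] args) throws Exception {
            // Where this class was loaded from (a classpath directory or jar).
            URL here = ClassLoaderClash.class.getProtectionDomain()
                    .getCodeSource().getLocation();
            // A loader with a null parent resolves classes itself instead of
            // delegating to the application classloader -- roughly what
            // userClassPathFirst=true does for user-supplied jars.
            try (URLClassLoader childFirst = new URLClassLoader(new URL[] { here }, null)) {
                Class<?> duplicate = childFirst.loadClass("ClassLoaderClash");
                // Same bytecode, different defining classloader: the JVM treats
                // the two as unrelated types, so subtype checks fail just like
                // "JniBasedUnixGroupsMappingWithFallback not GroupMappingServiceProvider".
                System.out.println(duplicate == ClassLoaderClash.class);                 // false
                System.out.println(ClassLoaderClash.class.isAssignableFrom(duplicate));  // false
            }
        }
    }

With userClassPathFirst=true, Hadoop classes pulled in transitively via deps.packages can end up defined by the child-first loader while Spark's own copies live in the parent, which plausibly triggers exactly this check in Configuration.getClass.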

Environment

No response

Would you like to work on fixing this bug?

None

sbernauer changed the title from "userClassPathFirst cannot be overwritten and it causes problems" to "Don't set userClassPathFirst experimental configuration" Oct 18, 2023
sbernauer changed the title from "Don't set userClassPathFirst experimental configuration" to "Don't set experimental userClassPathFirst configuration" Oct 18, 2023
lfrancke moved this from Next to Ideas Backlog in Stackable End-to-End Coordination Nov 8, 2023
sbernauer (Member) commented

@razvan I guess this is a duplicate of #354?

razvan (Member, Author) commented Feb 14, 2024

Yes. Forgot about it.

razvan closed this as completed Feb 14, 2024
lfrancke moved this from Ideas Backlog to Done in Stackable End-to-End Coordination Apr 8, 2024