Make volume mounts on the job pod configurable. #342

Closed
razvan opened this issue Jan 23, 2024 · 6 comments · Fixed by #359
Labels
release/24.3.0 · release-note · size/S

Comments

@razvan
Member

razvan commented Jan 23, 2024

Description

The CR allows users to mount volumes on the driver and executor pods, but not on the job pod.

In kerberized environments, however, users need to be able to mount keytabs on the pod that executes the spark-submit command. This is currently only possible with pod overrides, which is clumsy (a sketch of that workaround follows the list below).

  • Add a SparkApplication.spec.job.config.volumeMounts field
  • So that SparkApplication.spec.volumes can be mounted in Jobs just like in the Driver and Executor
  • This was an oversight in the initial design: we never needed to integrate with e.g. HDFS before, but now we do
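
For illustration, here is a rough sketch of the current podOverrides workaround (the Secret name, container name and mount path are hypothetical and only meant to show the shape of the override):

```yaml
spec:
  job:
    podOverrides:
      spec:
        volumes:
          - name: kerberos
            secret:
              secretName: spark-keytab        # hypothetical Secret holding the keytab
        containers:
          - name: spark-submit                # container name is an assumption
            volumeMounts:
              - name: kerberos
                mountPath: /stackable/kerberos  # hypothetical mount path
```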

Proposed change:

spec:
  job:
    config:
      volumeMounts:

The volumeMounts structure is exactly the same as for the driver and executor.
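
A more complete sketch of how this could look, assuming the driver and executor keep the same config.volumeMounts layout (the volume name, Secret name and mount path are hypothetical):

```yaml
spec:
  volumes:
    - name: kerberos
      secret:
        secretName: spark-keytab          # hypothetical Secret holding the keytab
  job:
    config:
      volumeMounts:
        - name: kerberos
          mountPath: /stackable/kerberos
  driver:
    config:
      volumeMounts:
        - name: kerberos
          mountPath: /stackable/kerberos
  executor:
    config:
      volumeMounts:
        - name: kerberos
          mountPath: /stackable/kerberos
```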

Acceptance criteria

- [x] Users can mount volumes on the job pod by specifying them directly in the `SparkApplication` resource without the need to create pod overrides.
- [x] Documentation: Check if we have places where podOverrides are used to achieve the same and update this
- [x] Documentation: Update CRD Reference page
@fhennig
Contributor

fhennig commented Feb 20, 2024

I just wanted to raise awareness that I've seen a bunch of different ways of mounting volumes across the platform, and we should keep them consistent. But I think Spark is the odd one out anyway. Link to the relevant ticket: stackabletech/issues#471

@lfrancke
Member

Good point, thanks!

@razvan et al., any opinions?
Can someone go through all operators and briefly check how consistent they are?

@razvan
Member Author

razvan commented Feb 20, 2024

TL;DR: IMO naming should be consistent but structure should be flexible.

I agree that the property names should be consistent. Druid's choice of clusterConfig.extraVolumes is off, considering there seems to be no clusterConfig.volumes to begin with (from the user's perspective).

But there are several constraints to be considered when thinking about the CRD structure in general:

  • Does the product even have the concept of a clusterConfig? Not for Spark applications (at least until we add support for the Spark connector).
  • Which volume mounts are supported by the different product roles? Can all volumes be mounted everywhere? Short answer: no. Example: Druid local cache volumes only make sense on one role.
  • operator-rs design: logging, secret volumes, listener volumes

@nightkr
Member

nightkr commented Feb 21, 2024

Do we expect users to mount arbitrary volumes or is this primarily for the Kerberos use case?

Secret-op is implemented using volumes, but I don't think they're the right abstraction to expose unless you actually want to store arbitrary data. For the Kerberos use case I'd assume you'd also need to tell spark-submit about where to find the Kerberos config/keytab, right?
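
For context, a minimal sketch of how the submit step could be pointed at a mounted keytab via Spark configuration, assuming it is passed through spec.sparkConf (the mount path and principal are hypothetical):

```yaml
spec:
  sparkConf:
    spark.kerberos.keytab: /stackable/kerberos/keytab              # hypothetical mount path
    spark.kerberos.principal: spark/node1.example.com@EXAMPLE.COM  # hypothetical principal
```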

@nightkr
Member

nightkr commented Feb 21, 2024

As for consistency, AFAIK we always treat secrets as separate from storage volumes elsewhere, even for products that use both.

@lfrancke
Member

> Do we expect users to mount arbitrary volumes or is this primarily for the Kerberos use case?

Yes, this came up due to Kerberos.
But it is also an inconsistency, as the other two (driver & executor) already allow it.
And it is a valid use case to mount arbitrary stuff you need in your job.

So, I think it's worth having this field in either case but we might also think about whether there is a better way to "support" Kerberos.

We had a chat about that yesterday after the daily and should have added it to the ticket: We decided to go with this simple solution for now, knowing that "Discovery 2.0" is a thing we want to tackle at some point.

@razvan razvan moved this from Next to Development: In Progress in Stackable Engineering Feb 21, 2024
@razvan razvan moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Feb 22, 2024
@adwk67 adwk67 moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Feb 22, 2024
@razvan razvan moved this from Development: In Review to Development: Done in Stackable Engineering Mar 5, 2024
@lfrancke lfrancke added the release/24.3.0 and release-note labels Mar 5, 2024
@lfrancke lfrancke moved this from Development: Done to Done in Stackable Engineering Mar 5, 2024