[Merged by Bors] - Add config maps #50


Closed
wants to merge 55 commits into from

Changes from 52 commits
55 commits
959c774
removed obsolete helm chart; added spark cluster role
Mar 29, 2022
24921ce
Added spark service account
Mar 29, 2022
60d6640
regenerated helm chart
Mar 29, 2022
979dba8
clippy fix
Mar 29, 2022
7329f44
clippy fix
Mar 29, 2022
3da2b05
fmt fix
Mar 29, 2022
8b9720d
clippy fix
Mar 29, 2022
cde716f
main merge
razvan Mar 30, 2022
d97d3cf
cargo fmt --all
razvan Mar 30, 2022
a64b8b9
Cluster Role
Mar 30, 2022
c176a80
added role permissions for PVCs
adwk67 Mar 30, 2022
0710e14
regenerate role yamls
adwk67 Mar 30, 2022
408f729
fixed indents
adwk67 Mar 30, 2022
c02ee47
fixed indents: manifests
adwk67 Mar 30, 2022
cc62802
Added recommended labels
Mar 30, 2022
7d37a0e
Merge branch 'rbac-setup' of github.com:stackabletech/spark-k8s-opera…
Mar 30, 2022
d2f56c9
Added a comment
Mar 30, 2022
00100f7
Changelog entry
Mar 30, 2022
b483452
reduce pvc permissions
adwk67 Mar 30, 2022
60e6002
initial missing standard docs
adwk67 Mar 30, 2022
9e4b602
added placeholders for pvc/rbac docs
adwk67 Mar 30, 2022
fd18d0d
added external storage comments
adwk67 Mar 31, 2022
a1b4ce8
Merge branch 'main' into documentation
adwk67 Mar 31, 2022
dd3cabb
added pvc example
adwk67 Mar 31, 2022
a754cc5
lint warning
adwk67 Mar 31, 2022
fc288b2
lint warning II
adwk67 Mar 31, 2022
26cd122
pv/pvc example
adwk67 Apr 1, 2022
c527606
initial RBAC page
adwk67 Apr 1, 2022
f1c3c0d
Merge branch 'main' into documentation
adwk67 Apr 1, 2022
e0f9307
rbac overview
adwk67 Apr 1, 2022
e3864c4
annotated some of the examples
adwk67 Apr 1, 2022
5b3971c
updated usage examples
adwk67 Apr 6, 2022
8206531
fixed yaml lint errors
adwk67 Apr 6, 2022
dd76aef
cleaned up examples/CRD defs
adwk67 Apr 6, 2022
4ae144a
initial impl of cmaps
adwk67 Apr 8, 2022
e2b25aa
Merge branch 'main' into add-config-maps
adwk67 Apr 8, 2022
11c1c91
wip: quote properties
adwk67 Apr 8, 2022
3a35501
fixed sorting
adwk67 Apr 11, 2022
17ceb14
regenerate charts
adwk67 Apr 11, 2022
876e726
resolved merge conflict
adwk67 Apr 11, 2022
b5deb99
resolved merge conflict II
adwk67 Apr 11, 2022
4528749
fixed test
adwk67 Apr 11, 2022
033c201
wip
adwk67 Apr 11, 2022
63ff6e7
working copy with arguments-via-config map
adwk67 Apr 12, 2022
f359ac2
regenerate charts
adwk67 Apr 12, 2022
e52d2bf
format corrections
adwk67 Apr 12, 2022
a498eba
move duplicated code to a function
adwk67 Apr 12, 2022
90b20fd
moved config map mounts to driver/executors
adwk67 Apr 12, 2022
8112601
removed redundant element
adwk67 Apr 12, 2022
11015b4
updated changelog and updated usage doc
adwk67 Apr 12, 2022
8b77c6a
lint fix
adwk67 Apr 12, 2022
f332efb
removed dead code
adwk67 Apr 12, 2022
919491f
add kuttl test
adwk67 Apr 19, 2022
8b36ff1
use VolumeMount rather than a new struct
adwk67 Apr 19, 2022
b48436f
use standard configmap-backed volume
adwk67 Apr 19, 2022
2 changes: 2 additions & 0 deletions CHANGELOG.md
Expand Up @@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file.
- Initial commit
- ServiceAccount, ClusterRole and RoleBinding for Spark driver ([#39])
- S3 credentials can be provided via a Secret ([#42])
- Job information can be passed via a configuration map ([#50])

[#39]: https://github.com/stackabletech/spark-k8s-operator/pull/39
[#42]: https://github.com/stackabletech/spark-k8s-operator/pull/42
[#50]: https://github.com/stackabletech/spark-k8s-operator/pull/50
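A minimal sketch of what the new feature looks like from the user's side (the ConfigMap and file names here are illustrative, not prescribed by the operator):

```yaml
# Sketch: passing job arguments to a SparkApplication via a ConfigMap.
# The driver (and/or executor) section gains a configMapMounts list;
# each entry names an existing ConfigMap and a mount path.
driver:
  configMapMounts:
    - configMapName: cm-job-arguments  # ConfigMap holding e.g. job-args.txt
      path: /arguments                 # job reads /arguments/job-args.txt
```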
26 changes: 26 additions & 0 deletions deploy/crd/sparkapplication.crd.yaml
Expand Up @@ -85,6 +85,19 @@ spec:
driver:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
coreLimit:
nullable: true
type: string
Expand Down Expand Up @@ -209,6 +222,19 @@ spec:
executor:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
cores:
format: uint
minimum: 0.0
Expand Down
26 changes: 26 additions & 0 deletions deploy/helm/spark-k8s-operator/crds/crds.yaml
Expand Up @@ -87,6 +87,19 @@ spec:
driver:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
coreLimit:
nullable: true
type: string
Expand Down Expand Up @@ -211,6 +224,19 @@ spec:
executor:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
cores:
format: uint
minimum: 0.0
Expand Down
26 changes: 26 additions & 0 deletions deploy/manifests/crds.yaml
Expand Up @@ -88,6 +88,19 @@ spec:
driver:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
coreLimit:
nullable: true
type: string
Expand Down Expand Up @@ -212,6 +225,19 @@ spec:
executor:
nullable: true
properties:
configMapMounts:
items:
properties:
configMapName:
type: string
path:
type: string
required:
- configMapName
- path
type: object
nullable: true
type: array
cores:
format: uint
minimum: 0.0
Expand Down
8 changes: 8 additions & 0 deletions docs/modules/ROOT/examples/example-configmap.yaml
@@ -0,0 +1,8 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: cm-job-arguments # <1>
data:
job-args.txt: |
s3a://nyc-tlc/trip data/yellow_tripdata_2021-07.csv # <2>
42 changes: 42 additions & 0 deletions docs/modules/ROOT/examples/example-sparkapp-configmap.yaml
@@ -0,0 +1,42 @@
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
name: ny-tlc-report-configmap
namespace: default
spec:
version: "1.0"
sparkImage: docker.stackable.tech/stackable/spark-k8s:3.2.1-hadoop3.2-stackable0.4.0
mode: cluster
mainApplicationFile: s3a://stackable-spark-k8s-jars/jobs/ny-tlc-report-1.1.0.jar # <3>
mainClass: tech.stackable.demo.spark.NYTLCReport
volumes:
- name: job-deps
persistentVolumeClaim:
claimName: pvc-ksv
args:
- "--input /arguments/job-args.txt" # <4>
sparkConf:
"spark.hadoop.fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
"spark.driver.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
"spark.executor.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
volumeMounts:
- name: job-deps
mountPath: /dependencies
configMapMounts:
- configMapName: cm-job-arguments # <5>
path: /arguments # <6>
executor:
cores: 1
instances: 3
memory: "512m"
volumeMounts:
- name: job-deps
mountPath: /dependencies
configMapMounts:
- configMapName: cm-job-arguments # <5>
path: /arguments # <6>
2 changes: 1 addition & 1 deletion docs/modules/ROOT/pages/rbac.adoc
Expand Up @@ -32,4 +32,4 @@ then the cluster role has to be created and assigned to the service account manually
[source,bash]
----
kubectl create clusterrolebinding spark-role --clusterrole=spark-driver-edit-role --serviceaccount=default:default
----
----
33 changes: 31 additions & 2 deletions docs/modules/ROOT/pages/usage.adoc
Expand Up @@ -92,15 +92,31 @@ include::example$example-sparkapp-pvc.yaml[]
include::example$example-sparkapp-s3-private.yaml[]
----

<1> Job python artifact (local)
<1> Job python artifact (located in S3)
<2> Artifact class
<3> S3 section, specifying the existing secret and S3 end-point ( in this case, Min-IO)
<3> S3 section, specifying the existing secret and S3 end-point (in this case, MinIO)
<4> Credentials secret
<5> Spark dependencies: the credentials provider (the user knows what is relevant here) plus dependencies needed to access external resources...
<6> ...in this case, in S3, accessed with the credentials defined in the secret
<7> the name of the volume mount backed by a `PersistentVolumeClaim` that must be pre-existing
<8> the path on the volume mount: this is referenced in the `sparkConf` section where the extra class path is defined for the driver and executors

=== JVM (Scala): externally located artifact accessed with job arguments provided via configuration map

[source,yaml]
----
include::example$example-configmap.yaml[]
----
[source,yaml]
----
include::example$example-sparkapp-configmap.yaml[]
----
<1> Name of the configuration map
<2> Argument required by the job
<3> Job scala artifact that requires an input argument
<4> The expected job argument, accessed via the mounted configuration map file
<5> The name of the configuration map that will be mounted to the driver/executor
<6> The mount location of the configuration map (this will contain a file `/arguments/job-args.txt`)

== CRD argument coverage

Expand Down Expand Up @@ -187,6 +203,12 @@ Below are listed the CRD fields that can be defined by the user:
|`spec.driver.volumeMounts.mountPath`
|Volume mount path

|`spec.driver.configMapMounts.configMapName`
|Name of configuration map to be mounted in the driver

|`spec.driver.configMapMounts.path`
|Mount path of the configuration map in the driver

|`spec.executor.cores`
|Number of cores for each executor

Expand All @@ -204,4 +226,11 @@ Below are listed the CRD fields that can be defined by the user:

|`spec.executor.volumeMounts.mountPath`
|Volume mount path

|`spec.executor.configMapMounts.configMapName`
|Name of configuration map to be mounted in the executor

|`spec.executor.configMapMounts.path`
|Mount path of the configuration map in the executor
|===

50 changes: 50 additions & 0 deletions examples/ny-tlc-report-configmap.yaml
@@ -0,0 +1,50 @@
---
apiVersion: v1
kind: ConfigMap
metadata:
name: cm-job-arguments
data:
job-args.txt: |
s3a://nyc-tlc/trip data/yellow_tripdata_2021-07.csv
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
name: ny-tlc-report-configmap
namespace: default
spec:
version: "1.0"
sparkImage: docker.stackable.tech/stackable/spark-k8s:3.2.1-hadoop3.2-stackable0.4.0
mode: cluster
mainApplicationFile: s3a://stackable-spark-k8s-jars/jobs/ny-tlc-report-1.1.0.jar
mainClass: tech.stackable.demo.spark.NYTLCReport
volumes:
- name: job-deps
persistentVolumeClaim:
claimName: pvc-ksv
args:
- "--input /arguments/job-args.txt"
sparkConf:
"spark.hadoop.fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
"spark.driver.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
"spark.executor.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
volumeMounts:
- name: job-deps
mountPath: /dependencies
configMapMounts:
- configMapName: cm-job-arguments
path: /arguments
executor:
cores: 1
instances: 3
memory: "512m"
volumeMounts:
- name: job-deps
mountPath: /dependencies
configMapMounts:
- configMapName: cm-job-arguments
path: /arguments
31 changes: 31 additions & 0 deletions rust/crd/src/lib.rs
Expand Up @@ -169,6 +169,26 @@ impl SparkApplication {
tmp.iter().flat_map(|v| v.iter()).cloned().collect()
}

pub fn executor_config_map_mounts(&self) -> Vec<ConfigMapMount> {
let tmp = self
.spec
.executor
.as_ref()
.and_then(|executor_conf| executor_conf.config_map_mounts.clone());

tmp.iter().flat_map(|v| v.iter()).cloned().collect()
}

pub fn driver_config_map_mounts(&self) -> Vec<ConfigMapMount> {
let tmp = self
.spec
.driver
.as_ref()
.and_then(|driver_conf| driver_conf.config_map_mounts.clone());

tmp.iter().flat_map(|v| v.iter()).cloned().collect()
}

pub fn executor_volume_mounts(&self) -> Vec<VolumeMount> {
let tmp = self
.spec
Expand Down Expand Up @@ -287,6 +307,13 @@ pub struct CommonConfig {
pub enable_monitoring: Option<bool>,
}

#[derive(Clone, Debug, Default, Deserialize, JsonSchema, PartialEq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct ConfigMapMount {
pub config_map_name: String,
pub path: String,
}

#[derive(Clone, Debug, Default, Deserialize, JsonSchema, PartialEq, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct DriverConfig {
Expand All @@ -295,6 +322,8 @@ pub struct DriverConfig {
pub memory: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub volume_mounts: Option<Vec<VolumeMount>>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub config_map_mounts: Option<Vec<ConfigMapMount>>,
}

impl DriverConfig {
Expand Down Expand Up @@ -323,6 +352,8 @@ pub struct ExecutorConfig {
pub memory: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub volume_mounts: Option<Vec<VolumeMount>>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub config_map_mounts: Option<Vec<ConfigMapMount>>,
}

impl ExecutorConfig {
Expand Down