
Commit 11015b4

committed: updated changelog and updated usage doc

1 parent 8112601 commit 11015b4

File tree

4 files changed: +82 -2 lines changed


CHANGELOG.md

Lines changed: 2 additions & 0 deletions

@@ -9,6 +9,8 @@ All notable changes to this project will be documented in this file.
 - Initial commit
 - ServiceAccount, ClusterRole and RoleBinding for Spark driver ([#39])
 - S3 credentials can be provided via a Secret ([#42])
+- Job information can be passed via a configuration map ([#50])

 [#39]: https://github.com/stackabletech/spark-k8s-operator/pull/39
 [#42]: https://github.com/stackabletech/spark-k8s-operator/pull/42
+[#50]: https://github.com/stackabletech/spark-k8s-operator/pull/50
example-configmap.yaml

Lines changed: 8 additions & 0 deletions (new file)

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: cm-job-arguments # <1>
data:
  job-args.txt: |
    s3a://nyc-tlc/trip data/yellow_tripdata_2021-07.csv # <2>
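The file key `job-args.txt` of this ConfigMap is what the job later reads from its mount path. As an illustration only, a minimal Python sketch of that read logic (the real job, `tech.stackable.demo.spark.NYTLCReport`, is a Scala class whose internals this commit does not show; the function name and parsing rules here are assumptions):

```python
def read_job_args(path="/arguments/job-args.txt"):
    """Return the non-empty, stripped lines of the mounted args file.

    Illustrative assumption: each line of job-args.txt is one dataset URI,
    as in the ConfigMap above. Not the operator's or the job's actual code.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```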
example-sparkapp-configmap.yaml

Lines changed: 42 additions & 0 deletions (new file)

---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: ny-tlc-report-configmap
  namespace: default
spec:
  version: "1.0"
  sparkImage: docker.stackable.tech/stackable/spark-k8s:3.2.1-hadoop3.2-stackable0.4.0
  mode: cluster
  mainApplicationFile: s3a://stackable-spark-k8s-jars/jobs/ny-tlc-report-1.1.0.jar # <3>
  mainClass: tech.stackable.demo.spark.NYTLCReport
  volumes:
    - name: job-deps
      persistentVolumeClaim:
        claimName: pvc-ksv
  args:
    - "--input /arguments/job-args.txt" # <4>
  sparkConf:
    "spark.hadoop.fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
    "spark.driver.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
    "spark.executor.extraClassPath": "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    volumeMounts:
      - name: job-deps
        mountPath: /dependencies
    configMapMounts:
      - configMapName: cm-job-arguments # <5>
        path: /arguments # <6>
  executor:
    cores: 1
    instances: 3
    memory: "512m"
    volumeMounts:
      - name: job-deps
        mountPath: /dependencies
    configMapMounts:
      - configMapName: cm-job-arguments # <5>
        path: /arguments # <6>
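Note that `spec.args` passes `--input /arguments/job-args.txt` as one quoted string, which is split into flag and value before reaching the application's entry point. A Python sketch of how such a flag might be parsed on the job side (the parser is an illustrative assumption, not the actual NYTLCReport code):

```python
import argparse
import shlex


def parse_job_args(argv):
    """Parse the job-side command line.

    Mirrors the SparkApplication above, which supplies
    "--input /arguments/job-args.txt"; this parser is an
    illustrative assumption, not operator or job code.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True,
                        help="path to the mounted file listing input URIs")
    return parser.parse_args(argv)


# A shell-style split turns the quoted YAML string into the two
# tokens argparse expects.
tokens = shlex.split("--input /arguments/job-args.txt")
ns = parse_job_args(tokens)
```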

docs/modules/ROOT/pages/usage.adoc

Lines changed: 30 additions & 2 deletions

@@ -92,15 +92,31 @@ include::example$example-sparkapp-pvc.yaml[]
 include::example$example-sparkapp-s3-private.yaml[]
 ----

-<1> Job python artifact (local)
+<1> Job Python artifact (located in S3)
 <2> Artifact class
-<3> S3 section, specifying the existing secret and S3 end-point ( in this case, Min-IO)
+<3> S3 section, specifying the existing secret and S3 end-point (in this case, MinIO)
 <4> Credentials secret
 <5> Spark dependencies: the credentials provider (the user knows what is relevant here) plus dependencies needed to access external resources...
 <6> ...in this case, in S3, accessed with the credentials defined in the secret
 <7> the name of the volume mount backed by a `PersistentVolumeClaim` that must be pre-existing
 <8> the path on the volume mount: this is referenced in the `sparkConf` section where the extra class path is defined for the driver and executors

+=== JVM (Scala): externally located artifact accessed with job arguments provided via configuration map
+
+[source,yaml]
+----
+include::example$example-configmap.yaml[]
+----
+[source,yaml]
+----
+include::example$example-sparkapp-configmap.yaml[]
+----
+<1> Name of the configuration map
+<2> Argument required by the job
+<3> Job Scala artifact that requires an input argument
+<4> The expected job argument, accessed via the mounted configuration map file
+<5> The name of the configuration map that will be mounted to the driver/executor
+<6> The mount location of the configuration map (this will contain a file `/arguments/job-args.txt`)

 == CRD argument coverage

@@ -187,6 +203,12 @@ Below are listed the CRD fields that can be defined by the user:
 |`spec.driver.volumeMounts.mountPath`
 |Volume mount path

+|`spec.driver.configMapMounts.configMapName`
+|Name of the configuration map to be mounted in the driver
+
+|`spec.driver.configMapMounts.path`
+|Mount path of the configuration map in the driver
+
 |`spec.executor.cores`
 |Number of cores for each executor

@@ -204,5 +226,11 @@ Below are listed the CRD fields that can be defined by the user:

 |`spec.executor.volumeMounts.mountPath`
 |Volume mount path
+
+|`spec.executor.configMapMounts.configMapName`
+|Name of the configuration map to be mounted in the executor
+
+|`spec.executor.configMapMounts.path`
+|Mount path of the configuration map in the executor
 |===
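The new `configMapMounts` fields documented above each require both a `configMapName` and a `path`. A minimal Python sketch of a pre-submit sanity check over a manifest dict (this helper is an illustration, not part of the operator, and the operator's own CRD validation is authoritative):

```python
def validate_config_map_mounts(role_spec):
    """Check that each configMapMounts entry carries both required keys.

    `role_spec` is the dict found under spec.driver or spec.executor in a
    SparkApplication manifest. Illustrative check only, not operator code.
    """
    errors = []
    for i, mount in enumerate(role_spec.get("configMapMounts", [])):
        for key in ("configMapName", "path"):
            if not mount.get(key):
                errors.append(f"configMapMounts[{i}]: missing {key}")
    return errors
```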
