
Commit 0a9a85d

razvan and adwk67 committed
Spark history server (#187)
Fixes #124 #162

Jenkins: [Azure:heavy_check_mark:](https://ci.stackable.tech/view/02%20Operator%20Tests%20(custom)/job/spark-k8s-operator-it-custom/60/)

Co-authored-by: Andrew Kenworthy <[email protected]>
1 parent 8dc8cf2 commit 0a9a85d


54 files changed: +2550 −283 lines

CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -4,11 +4,19 @@ All notable changes to this project will be documented in this file.
 
 ## [Unreleased]
 
+### Added
+
+- Create and manage history servers ([#187])
+
+[#187]: https://github.com/stackabletech/spark-k8s-operator/pull/187
+
 ### Changed
 
 - Updated stackable image versions ([#176])
 - `operator-rs` `0.22.0` → `0.27.1` ([#178])
+- `operator-rs` `0.27.1` -> `0.30.2` ([#187])
 - Don't run init container as root and avoid chmod and chowning ([#183])
+- [BREAKING] Implement fix for S3 reference inconsistency as described in the issue #162 ([#187])
 
 [#176]: https://github.com/stackabletech/spark-k8s-operator/pull/176
 [#178]: https://github.com/stackabletech/spark-k8s-operator/pull/178
@@ -43,7 +51,6 @@ All notable changes to this project will be documented in this file.
 - Update RBAC properties for OpenShift compatibility ([#126]).
 
 [#112]: https://github.com/stackabletech/spark-k8s-operator/pull/112
-[#114]: https://github.com/stackabletech/spark-k8s-operator/pull/114
 [#126]: https://github.com/stackabletech/spark-k8s-operator/pull/126
 
 ## [0.4.0] - 2022-08-03
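
For readers of the changelog entry above: the breaking S3 change replaces the old nested `s3bucket` block with a flat `s3connection` block, while the event-log bucket moves under `logFileDirectory` and the data bucket is addressed only via the `s3a://` path of `mainApplicationFile`. A minimal before/after sketch, distilled from the example files changed in this commit (field names are taken from those examples):

# before: bucket name and connection nested under s3bucket
s3bucket:
  inline:
    bucketName: my-bucket
    connection:
      inline:
        host: test-minio
        port: 9000
        accessStyle: Path

# after: only the connection is declared; the bucket is part of the s3a:// application path
s3connection:
  inline:
    host: test-minio
    port: 9000
    accessStyle: Path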

Cargo.lock

Lines changed: 4 additions & 4 deletions
Some generated files are not rendered by default.

deploy/helm/spark-k8s-operator/crds/crds.yaml

Lines changed: 769 additions & 88 deletions
Large diffs are not rendered by default.

deploy/helm/spark-k8s-operator/templates/roles.yaml

Lines changed: 1 addition & 0 deletions
@@ -84,6 +84,7 @@ rules:
       - spark.stackable.tech
     resources:
       - sparkapplications
+      - sparkhistoryservers
     verbs:
       - get
       - list

deploy/helm/spark-k8s-operator/templates/spark-clusterrole.yaml

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ rules:
       - ""
     resources:
       - configmaps
+      - persistentvolumeclaims
       - pods
       - secrets
       - serviceaccounts

docs/modules/ROOT/examples/example-history-app.yaml

Lines changed: 34 additions & 0 deletions

@@ -0,0 +1,34 @@
+---
+apiVersion: spark.stackable.tech/v1alpha1
+kind: SparkApplication
+metadata:
+  name: spark-pi-s3-1
+spec:
+  version: "1.0"
+  sparkImage: docker.stackable.tech/stackable/spark-k8s:3.3.0-stackable0.3.0
+  sparkImagePullPolicy: IfNotPresent
+  mode: cluster
+  mainClass: org.apache.spark.examples.SparkPi
+  mainApplicationFile: s3a://my-bucket/spark-examples_2.12-3.3.0.jar
+  s3connection: # <1>
+    inline:
+      host: test-minio
+      port: 9000
+      accessStyle: Path
+      credentials:
+        secretClass: s3-credentials-class # <2>
+  logFileDirectory: # <3>
+    s3:
+      prefix: eventlogs/ # <4>
+      bucket:
+        inline:
+          bucketName: spark-logs # <5>
+          connection:
+            inline:
+              host: test-minio
+              port: 9000
+              accessStyle: Path
+              credentials:
+                secretClass: history-credentials-class # <6>
+  executor:
+    instances: 1

docs/modules/ROOT/examples/example-history-server.yaml

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+---
+apiVersion: spark.stackable.tech/v1alpha1
+kind: SparkHistoryServer
+metadata:
+  name: spark-history
+spec:
+  image:
+    productVersion: 3.3.0
+    stackableVersion: 0.3.0
+  logFileDirectory: # <1>
+    s3:
+      prefix: eventlogs/ # <2>
+      bucket: # <3>
+        inline:
+          bucketName: spark-logs
+          connection:
+            inline:
+              host: test-minio
+              port: 9000
+              accessStyle: Path
+              credentials:
+                secretClass: history-credentials-class
+  sparkConf: # <4>
+  nodes:
+    roleGroups:
+      cleaner:
+        replicas: 1 # <5>
+        config:
+          cleaner: true # <6>

docs/modules/ROOT/examples/example-sparkapp-s3-private.yaml

Lines changed: 6 additions & 9 deletions
@@ -9,16 +9,13 @@ spec:
   mode: cluster
   mainApplicationFile: s3a://my-bucket/spark-examples_2.12-3.3.0.jar # <1>
   mainClass: org.apache.spark.examples.SparkPi # <2>
-  s3bucket: # <3>
+  s3connection: # <3>
     inline:
-      bucketName: my-bucket
-      connection:
-        inline:
-          host: test-minio
-          port: 9000
-          accessStyle: Path
-          credentials: # <4>
-            secretClass: s3-credentials-class
+      host: test-minio
+      port: 9000
+      accessStyle: Path
+      credentials: # <4>
+        secretClass: s3-credentials-class
   sparkConf: # <5>
     spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider" # <6>
     spark.driver.extraClassPath: "/dependencies/jars/hadoop-aws-3.2.0.jar:/dependencies/jars/aws-java-sdk-bundle-1.11.375.jar"
docs/modules/ROOT/images/history-server-ui.png

New image added (124 KB, not rendered)

docs/modules/ROOT/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -2,3 +2,4 @@
 * xref:usage.adoc[]
 * xref:job_dependencies.adoc[]
 * xref:rbac.adoc[]
+* xref:history_server.adoc[]
docs/modules/ROOT/pages/history_server.adoc

Lines changed: 66 additions & 0 deletions

@@ -0,0 +1,66 @@
+= Spark History Server
+
+== Overview
+
+The Stackable Spark-on-Kubernetes operator runs Apache Spark workloads in a Kubernetes cluster, whereby driver and executor pods are created for the duration of the job and then terminated. One or more Spark History Server instances can be deployed independently of `SparkApplication` jobs and used as an end-point for Spark logging, so that job information can be viewed once the job pods are no longer available.
+
+== Deployment
+
+The example below demonstrates how to set up the history server running in one Pod with scheduled cleanups of the event logs. The event logs are loaded from the `eventlogs/` folder of an S3 bucket named `spark-logs`. The credentials for this bucket are provided by the secret class `history-credentials-class`. For more details on how the Stackable Data Platform manages S3 resources see the xref:home:concepts:s3.adoc[S3 resources] page.
+
+
+[source,yaml]
+----
+include::example$example-history-server.yaml[]
+----
+
+<1> The location of the event logs. Must be an S3 bucket. Future implementations might add support for other shared filesystems such as HDFS.
+<2> Folder within the S3 bucket where the log files are located. This folder is required and must exist before setting up the history server.
+<3> The S3 bucket definition, here provided in-line.
+<4> Additional history server configuration properties can be provided here as a map. For possible properties see: https://spark.apache.org/docs/latest/monitoring.html#spark-history-server-configuration-options
+<5> This deployment has only one Pod. By increasing the replica count, multiple history servers can be started that all read the same event logs.
+<6> This history server will automatically clean up old log files by using default properties. You can change any of these by using the `sparkConf` map.
+
+NOTE: Only one role group can have scheduled cleanups enabled (`cleaner: true`), and this role group cannot have more than one replica.
+
+The secret with S3 credentials must contain at least the following two keys:
+
+* `accessKey` - the access key of a user with read and write access to the event log bucket.
+* `secretKey` - the secret key of a user with read and write access to the event log bucket.
+
+Any other entries of the Secret are ignored by the operator.
+
+== Application configuration
+
+
+The example below demonstrates how to configure Spark applications to write log events to an S3 bucket.
+
+[source,yaml]
+----
+include::example$example-history-app.yaml[]
+----
+
+<1> Location of the data that is being processed by the application.
+<2> Credentials used to access the data above.
+<3> Instruct the operator to configure the application with logging enabled.
+<4> Folder to store logs. This must match the prefix used by the history server.
+<5> Bucket to store logs. This must match the bucket used by the history server.
+<6> Credentials used to write event logs. These can, of course, differ from the credentials used to process data.
+
+
+
+== History Web UI
+
+To access the history server web UI, use one of the `NodePort` services created by the operator. For the example above, the operator created two services, as shown:
+
+[source,bash]
+----
+$ kubectl get svc
+NAME                         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)           AGE
+spark-history-node           NodePort   10.96.222.233   <none>        18080:30136/TCP   52m
+spark-history-node-cleaner   NodePort   10.96.203.43    <none>        18080:32585/TCP   52m
+----
+
+By setting up port forwarding on port 18080, the UI can be opened by pointing your browser to `http://localhost:18080`:
+
+image::history-server-ui.png[History Server Console]
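
The new documentation above requires the secret behind `history-credentials-class` to contain `accessKey` and `secretKey` entries. A minimal sketch of such a SecretClass and Secret follows, assuming the Stackable secret-operator's `k8sSearch` backend and illustrative resource names; this manifest is not part of this commit:

---
apiVersion: secrets.stackable.tech/v1alpha1
kind: SecretClass
metadata:
  name: history-credentials-class
spec:
  backend:
    k8sSearch:
      searchNamespace:
        pod: {}           # look up matching Secrets in the Pod's namespace
---
apiVersion: v1
kind: Secret
metadata:
  name: history-credentials
  labels:
    secrets.stackable.tech/class: history-credentials-class   # binds the Secret to the class above
stringData:
  accessKey: spark-history-user       # user with read and write access to the event log bucket
  secretKey: spark-history-password

With such a Secret in place, both the history server and the applications writing event logs can reference the same secret class, as the examples above do.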

docs/modules/ROOT/pages/usage.adoc

Lines changed: 26 additions & 33 deletions
@@ -92,57 +92,50 @@ include::example$example-sparkapp-configmap.yaml[]
 
 You can specify S3 connection details directly inside the `SparkApplication` specification or by referring to an external `S3Bucket` custom resource.
 
-To specify S3 connection details directly as part of the `SparkApplication` resource you add an inline bucket configuration as shown below.
+To specify S3 connection details directly as part of the `SparkApplication` resource you add an inline connection configuration as shown below.
 
 [source,yaml]
 ----
-s3bucket: # <1>
+s3connection: # <1>
   inline:
-    bucketName: my-bucket # <2>
-    connection:
-      inline:
-        host: test-minio # <3>
-        port: 9000 # <4>
-        accessStyle: Path
-        credentials:
-          secretClass: s3-credentials-class # <5>
+    host: test-minio # <2>
+    port: 9000 # <3>
+    accessStyle: Path
+    credentials:
+      secretClass: s3-credentials-class # <4>
 ----
-<1> Entry point for the bucket configuration.
-<2> Bucket name.
-<3> Bucket host.
-<4> Optional bucket port.
-<5> Name of the `Secret` object expected to contain the following keys: `ACCESS_KEY_ID` and `SECRET_ACCESS_KEY`
+<1> Entry point for the S3 connection configuration.
+<2> Connection host.
+<3> Optional connection port.
+<4> Name of the `Secret` object expected to contain the following keys: `ACCESS_KEY_ID` and `SECRET_ACCESS_KEY`
 
-It is also possible to configure the bucket connection details as a separate Kubernetes resource and only refer to that object from the `SparkApplication` like this:
+It is also possible to configure the connection details as a separate Kubernetes resource and only refer to that object from the `SparkApplication` like this:
 
 [source,yaml]
 ----
-s3bucket:
-  reference: my-bucket-resource # <1>
+s3connection:
+  reference: s3-connection-resource # <1>
 ----
-<1> Name of the bucket resource with connection details.
+<1> Name of the connection resource with connection details.
 
-The resource named `my-bucket-resource` is then defined as shown below:
+The resource named `s3-connection-resource` is then defined as shown below:
 
 [source,yaml]
 ----
 ---
 apiVersion: s3.stackable.tech/v1alpha1
-kind: S3Bucket
+kind: S3Connection
 metadata:
-  name: my-bucket-resource
+  name: s3-connection-resource
 spec:
-  bucketName: my-bucket-name
-  connection:
-    inline:
-      host: test-minio
-      port: 9000
-      accessStyle: Path
-      credentials:
-        secretClass: minio-credentials-class
+  host: test-minio
+  port: 9000
+  accessStyle: Path
+  credentials:
+    secretClass: minio-credentials-class
 ----
 
-This has the advantage that bucket configuration can be shared across `SparkApplication`s and reduces the cost of updating these details.
+This has the advantage that one connection configuration can be shared across `SparkApplications` and reduces the cost of updating these details.
 
 == Resource Requests
 
@@ -228,8 +221,8 @@ Below are listed the CRD fields that can be defined by the user:
 |`spec.args`
 |Arguments passed directly to the job artifact
 
-|`spec.s3bucket`
-|S3 bucket and connection specification. See the <<S3 bucket specification>> for more details.
+|`spec.s3connection`
+|S3 connection specification. See the <<S3 bucket specification>> for more details.
 
 |`spec.sparkConf`
 |A map of key/value strings that will be passed directly to `spark-submit`

examples/ny-tlc-report-external-dependencies.yaml

Lines changed: 4 additions & 7 deletions
@@ -16,14 +16,11 @@ spec:
   deps:
     requirements:
       - tabulate==0.8.9
-  s3bucket:
+  s3connection:
     inline:
-      bucketName: my-bucket
-      connection:
-        inline:
-          host: test-minio
-          port: 9000
-          accessStyle: Path
+      host: test-minio
+      port: 9000
+      accessStyle: Path
   sparkConf:
     spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
     spark.driver.extraClassPath: "/dependencies/jars/*"

examples/ny-tlc-report-image.yaml

Lines changed: 4 additions & 7 deletions
@@ -17,14 +17,11 @@ spec:
   deps:
     requirements:
       - tabulate==0.8.9
-  s3bucket:
+  s3connection:
     inline:
-      bucketName: my-bucket
-      connection:
-        inline:
-          host: test-minio
-          port: 9000
-          accessStyle: Path
+      host: test-minio
+      port: 9000
+      accessStyle: Path
   sparkConf:
     spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
   executor:

examples/ny-tlc-report.yaml

Lines changed: 4 additions & 7 deletions
@@ -23,14 +23,11 @@ spec:
       name: cm-job-arguments
   args:
     - "--input /arguments/job-args.txt"
-  s3bucket:
+  s3connection:
     inline:
-      bucketName: my-bucket
-      connection:
-        inline:
-          host: test-minio
-          port: 9000
-          accessStyle: Path
+      host: test-minio
+      port: 9000
+      accessStyle: Path
   sparkConf:
     spark.hadoop.fs.s3a.aws.credentials.provider: "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider"
   driver:

rust/crd/Cargo.toml

Lines changed: 2 additions & 2 deletions
@@ -9,10 +9,10 @@ version = "0.7.0-nightly"
 publish = false
 
 [dependencies]
-stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag="0.27.1" }
+stackable-operator = { git = "https://github.com/stackabletech/operator-rs.git", tag="0.30.2" }
 
 semver = "1.0"
-serde = { version = "1.0", features = ["derive"] }
+serde = "1.0"
 serde_json = "1.0"
 serde_yaml = "0.8"
 snafu = "0.7"
