Commit 935d9bf

documentation
1 parent 3c99f0d commit 935d9bf

4 files changed

+55
-0
lines changed

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+---
+apiVersion: spark.stackable.tech/v1alpha1
+kind: SparkHistoryServer
+metadata:
+  name: spark-history
+spec:
+  image:
+    productVersion: 3.3.0
+    stackableVersion: 0.3.0
+  logFileDirectory: # <1>
+    s3:
+      prefix: eventlogs/ # <2>
+      bucket: # <3>
+        inline:
+          bucketName: spark-logs
+          connection:
+            inline:
+              host: test-minio
+              port: 9000
+              accessStyle: Path
+              credentials:
+                secretClass: s3-credentials-class
+  sparkConf: # <4>
+  nodes: # <5>
+    roleGroups:
+      cleaner:
+        replicas: 1
+        config:
+          cleaner: true
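
Reviewer note: the `sparkConf` map in the manifest above is left empty. As a minimal sketch of what it could hold, the fragment below sets three properties taken from the Spark monitoring documentation. The chosen values are illustrative assumptions, not settings from this commit, and they are quoted on the assumption that the map expects string values.

```yaml
# Hypothetical sparkConf entries for the SparkHistoryServer spec (values are examples only).
spec:
  sparkConf:
    # How long event logs are kept before the cleaner may delete them.
    spark.history.fs.cleaner.maxAge: "7d"
    # How often the history server checks the log directory for new or updated logs.
    spark.history.fs.update.interval: "10s"
    # Number of application UIs kept in the history server's cache.
    spark.history.retainedApplications: "50"
```

The fragment would be merged into the `spec` shown above rather than applied on its own.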
Binary file, 124 KB (not shown)

docs/modules/ROOT/nav.adoc

Lines changed: 1 addition & 0 deletions
@@ -2,3 +2,4 @@
 * xref:usage.adoc[]
 * xref:job_dependencies.adoc[]
 * xref:rbac.adoc[]
+* xref:history_server.adoc[]
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+= Spark History Server
+
+== Overview
+
+The Stackable Spark-on-Kubernetes operator runs Apache Spark workloads in a Kubernetes cluster: driver and executor pods are created for the duration of a job and then terminated. One or more Spark History Server instances can be deployed independently of `SparkApplication` jobs and used as an endpoint for Spark logging, so that job information can still be viewed once the job pods are no longer available.
+
+== Example
+
+[source,yaml]
+----
+include::example$example-history-server.yaml[]
+----
+
+<1> The log file directory from which the history server reads event logs. This currently has to be a bucket in an S3 object store (see the `s3` field).
+<2> The log location requires a prefix so that different folders within the bucket can be distinguished correctly.
+<3> The `S3BucketDef` description, here provided inline.
+<4> History server configuration settings can be provided here as a map. For possible properties see: https://spark.apache.org/docs/latest/monitoring.html#spark-history-server-configuration-options
+<5> The history server implements a single role called `nodes`.
+
+== Accessing the job history
+
+The history server exposes a user console on port 18080. By setting up port-forwarding on port 18080, this UI can be opened in a browser to show running and completed jobs:
+
+image::history-server-ui.png[History Server Console]
+
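
Reviewer note: the page describes the history server side only. For job information to appear there, a Spark job has to write its event logs to the same location. The fragment below is a sketch of the job-side settings, assuming that a `SparkApplication` accepts a `sparkConf` map and that the bucket is reached through the `s3a://` scheme. `spark.eventLog.enabled` and `spark.eventLog.dir` are standard Spark properties, and the path combines the `bucketName` and `prefix` from the example manifest. Endpoint and credential settings for the object store are not shown.

```yaml
# Hypothetical fragment of a SparkApplication spec; only the event-log settings are shown.
spec:
  sparkConf:
    # Have the driver record application events.
    spark.eventLog.enabled: "true"
    # Assumed location: bucket "spark-logs" with prefix "eventlogs/", as in the history server example.
    spark.eventLog.dir: "s3a://spark-logs/eventlogs/"
```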
