Commit 3425391

fix resource limit usage (#166)
# Description

- Restores setting the number of executor instances
- Defines Spark conf settings as well as the pod template
  - set kubernetes.driver/executor.* from the CRD directly
  - use the CPU limit to set driver/executor.cores
- Integration test to test/illustrate
  - just resource limits
  - just spark.conf settings
  - just defaults

Fixes #161.

Jenkins: https://ci.stackable.tech/view/02%20Operator%20Tests%20(custom)/job/spark-k8s-operator-it-custom/41/
1 parent 869739d commit 3425391

17 files changed: +472 -468 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -9,10 +9,12 @@ All notable changes to this project will be documented in this file.
 - Bumped image to `3.3.0-stackable0.2.0` in tests and docs ([#145])
 - BREAKING: use resource limit struct instead of passing spark configuration arguments ([#147])
 - Fixed resources test ([#151])
+- Fixed inconsistencies with resources usage ([#166])

 [#145]: https://github.com/stackabletech/spark-k8s-operator/pull/145
 [#147]: https://github.com/stackabletech/spark-k8s-operator/pull/147
 [#151]: https://github.com/stackabletech/spark-k8s-operator/pull/151
+[#166]: https://github.com/stackabletech/spark-k8s-operator/pull/166

 ## [0.5.0] - 2022-09-06

Cargo.lock

Lines changed: 41 additions & 0 deletions
Some generated files are not rendered by default.

deploy/crd/sparkapplication.crd.yaml

Lines changed: 0 additions & 132 deletions
Large diffs are not rendered by default.

deploy/helm/spark-k8s-operator/crds/crds.yaml

Lines changed: 0 additions & 132 deletions
Large diffs are not rendered by default.

deploy/manifests/crds.yaml

Lines changed: 0 additions & 132 deletions
Large diffs are not rendered by default.
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+---
+apiVersion: spark.stackable.tech/v1alpha1
+kind: SparkApplication
+metadata:
+  name: pyspark-streaming
+  namespace: default
+spec:
+  version: "1.0"
+  sparkImage: docker.stackable.tech/stackable/pyspark-k8s:3.3.0-stackable0.1.0
+  mode: cluster
+  mainApplicationFile: local:///stackable/spark/examples/src/main/python/streaming/hdfs_wordcount.py
+  args:
+    - "/tmp2"
+  sparkConf:
+    spark.kubernetes.submission.waitAppCompletion: "false"
+    spark.kubernetes.driver.pod.name: "pyspark-streaming-driver"
+    spark.kubernetes.executor.podNamePrefix: "pyspark-streaming"
+  driver:
+    resources:
+      cpu:
+        min: "1"
+        max: "2"
+      memory:
+        limit: "1Gi"
+  executor:
+    instances: 1
+    resources:
+      cpu:
+        min: "1700m"
+        max: "3"
+      memory:
+        limit: "2Gi"
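
The commit description says the CPU limit is used to set `driver/executor.cores`. As a rough illustration of what such a mapping could look like, the sketch below converts a Kubernetes CPU quantity (`"2"`, `"1700m"`) into a whole number of cores. The function name `cores_from_cpu_quantity` is hypothetical, and rounding fractional values up is an assumption of this sketch, not something the commit page confirms.

```rust
/// Hypothetical helper (not taken from the operator's source): converts a
/// Kubernetes CPU quantity such as "2", "1.5" or "1700m" into a whole number
/// of Spark cores. Rounding up is an assumption of this sketch.
fn cores_from_cpu_quantity(quantity: &str) -> Option<u32> {
    let q = quantity.trim();
    let millicores: u32 = if let Some(m) = q.strip_suffix('m') {
        // Millicore notation, e.g. "1700m".
        m.parse().ok()?
    } else {
        // Whole or fractional cores, e.g. "3" or "1.5".
        let cores: f64 = q.parse().ok()?;
        if !cores.is_finite() || cores < 0.0 {
            return None;
        }
        (cores * 1000.0).round() as u32
    };
    if millicores == 0 {
        return None;
    }
    // Ceiling division: any fractional core counts as a full core.
    Some(millicores / 1000 + u32::from(millicores % 1000 != 0))
}
```

Under these assumptions, the executor limit `max: "3"` from the example maps to 3 cores, while a value such as `"1700m"` would map to 2.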

docs/modules/ROOT/pages/usage.adoc

Lines changed: 5 additions & 0 deletions
@@ -180,6 +180,11 @@ executor:
 WARNING: The default values are _most likely_ not sufficient to run a proper cluster in production. Please adapt according to your requirements.
 For more details regarding Kubernetes CPU limits see: https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/[Assign CPU Resources to Containers and Pods].

+Spark allocates a default amount of non-heap memory based on the type of job (JVM or non-JVM). This is taken into account when defining memory settings based exclusively on the resource limits, so that the "declared" value is the actual total value (i.e. including memory overhead). This may result in minor deviations from the stated resource value due to rounding differences.
+
+NOTE: It is possible to define Spark resources either directly by setting configuration properties listed under `sparkConf`, or by using resource limits. If both are used, then `sparkConf` properties take precedence. It is recommended for the sake of clarity to use *_either_* one *_or_* the other.
+
+
 == CRD argument coverage

 Below are listed the CRD fields that can be defined by the user:
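
The NOTE added to `usage.adoc` in this hunk states that `sparkConf` properties take precedence over values derived from resource limits. A minimal sketch of that precedence rule follows, assuming both sets of properties are held as string maps. The function `merge_spark_conf` and its parameters are hypothetical and are not taken from the operator's code.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: `derived` holds properties computed from resource
/// limits, `user` holds the entries given under `sparkConf`.
fn merge_spark_conf(
    derived: &BTreeMap<String, String>,
    user: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = derived.clone();
    // `extend` overwrites existing keys, so user-supplied values win.
    merged.extend(user.iter().map(|(k, v)| (k.clone(), v.clone())));
    merged
}
```

Starting from the derived values and layering the user's entries on top keeps every derived property that the user did not override, which matches the "either one or the other, but `sparkConf` wins" wording of the NOTE.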

rust/crd/Cargo.toml

Lines changed: 3 additions & 0 deletions
@@ -17,3 +17,6 @@ serde_json = "1.0"
 serde_yaml = "0.8"
 snafu = "0.7"
 strum = { version = "0.24", features = ["derive"] }
+
+[dev-dependencies]
+rstest = "0.15.0"

rust/crd/src/constants.rs

Lines changed: 3 additions & 0 deletions
@@ -22,3 +22,6 @@ pub const SECRET_ACCESS_KEY: &str = "secretAccessKey";
 pub const S3_SECRET_DIR_NAME: &str = "/stackable/secrets";

 pub const SPARK_UID: i64 = 1000;
+pub const MIN_MEMORY_OVERHEAD: u32 = 384;
+pub const JVM_OVERHEAD_FACTOR: f32 = 0.1;
+pub const NON_JVM_OVERHEAD_FACTOR: f32 = 0.4;
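
The three constants added here line up with the documentation change in this commit, which says the declared memory limit is meant to be the total including Spark's memory overhead. The sketch below shows one way such a deduction could be computed. It assumes the limit has already been converted to MiB and that the minimum overhead of 384 is also in MiB. The function name `memory_without_overhead_mib` is hypothetical and this is not the operator's actual implementation.

```rust
// Values copied from the constants added in this commit. Treating them as
// MiB-based is an assumption of this sketch.
const MIN_MEMORY_OVERHEAD: u32 = 384;
const JVM_OVERHEAD_FACTOR: f32 = 0.1;
const NON_JVM_OVERHEAD_FACTOR: f32 = 0.4;

/// Hypothetical helper: given a total memory limit in MiB, returns the amount
/// to hand to Spark so that memory plus overhead stays within the limit.
/// Returns `None` when the limit cannot even cover the minimum overhead.
fn memory_without_overhead_mib(limit_mib: u32, is_jvm_job: bool) -> Option<u32> {
    let factor = f64::from(if is_jvm_job {
        JVM_OVERHEAD_FACTOR
    } else {
        NON_JVM_OVERHEAD_FACTOR
    });
    // Base memory such that base plus proportional overhead fits the limit,
    // rounded down to whole MiB.
    let base = (f64::from(limit_mib) / (1.0 + factor)).floor() as u32;
    // The overhead is never smaller than the fixed minimum.
    let overhead = std::cmp::max(MIN_MEMORY_OVERHEAD, limit_mib - base);
    if overhead >= limit_mib {
        None
    } else {
        Some(limit_mib - overhead)
    }
}
```

With these assumptions, a JVM job with a 1024 MiB limit is dominated by the 384 MiB minimum and leaves 640 MiB, while a non-JVM job with a 2048 MiB limit leaves 1462 MiB. The rounding down is where the "minor deviations" mentioned in the documentation would come from.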
