JVM config overrides #272

Merged 13 commits on Aug 18, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,7 @@ All notable changes to this project will be documented in this file.
### Added

- Default stackableVersion to operator version. It is recommended to remove `spec.image.stackableVersion` from your custom resources ([#267], [#268]).
- Configuration overrides for the JVM security properties, such as DNS caching ([#272]).

### Changed

@@ -16,6 +17,7 @@ All notable changes to this project will be documented in this file.
[#267]: https://github.com/stackabletech/spark-k8s-operator/pull/267
[#268]: https://github.com/stackabletech/spark-k8s-operator/pull/268
[#269]: https://github.com/stackabletech/spark-k8s-operator/pull/269
[#272]: https://github.com/stackabletech/spark-k8s-operator/pull/272

## [23.7.0] - 2023-07-14

39 changes: 38 additions & 1 deletion deploy/config-spec/properties.yaml
@@ -3,4 +3,41 @@
version: 0.1.0
spec:
units: []
properties: []
properties:
- property: &jvmDnsCacheTtl
propertyNames:
- name: "networkaddress.cache.ttl"
kind:
type: "file"
file: "security.properties"
datatype:
type: "integer"
min: "0"
recommendedValues:
- fromVersion: "0.0.0"
value: "30"
roles:
- name: "node"
required: true
asOfVersion: "0.0.0"
comment: "History server - TTL for successfully resolved domain names."
description: "History server - TTL for successfully resolved domain names."

- property: &jvmDnsCacheNegativeTtl
propertyNames:
- name: "networkaddress.cache.negative.ttl"
kind:
type: "file"
file: "security.properties"
datatype:
type: "integer"
min: "0"
recommendedValues:
- fromVersion: "0.0.0"
value: "0"
roles:
- name: "node"
required: true
asOfVersion: "0.0.0"
comment: "History server - TTL for domain names that cannot be resolved."
description: "History server - TTL for domain names that cannot be resolved."
39 changes: 38 additions & 1 deletion deploy/helm/spark-k8s-operator/configs/properties.yaml
@@ -3,4 +3,41 @@
version: 0.1.0
spec:
units: []
properties: []
properties:
- property: &jvmDnsCacheTtl
propertyNames:
- name: "networkaddress.cache.ttl"
kind:
type: "file"
file: "security.properties"
datatype:
type: "integer"
min: "0"
recommendedValues:
- fromVersion: "0.0.0"
value: "30"
roles:
- name: "node"
required: true
asOfVersion: "0.0.0"
comment: "History server - TTL for successfully resolved domain names."
description: "History server - TTL for successfully resolved domain names."

- property: &jvmDnsCacheNegativeTtl
propertyNames:
- name: "networkaddress.cache.negative.ttl"
kind:
type: "file"
file: "security.properties"
datatype:
type: "integer"
min: "0"
recommendedValues:
- fromVersion: "0.0.0"
value: "0"
roles:
- name: "node"
required: true
asOfVersion: "0.0.0"
comment: "History server - TTL for domain names that cannot be resolved."
description: "History server - TTL for domain names that cannot be resolved."
12 changes: 12 additions & 0 deletions deploy/helm/spark-k8s-operator/crds/crds.yaml
@@ -527,6 +527,12 @@ spec:
type: array
type: object
type: object
jvmSecurity:
additionalProperties:
nullable: true
type: string
default: {}
type: object
logging:
default:
enableVectorAgent: null
@@ -4015,6 +4021,12 @@ spec:
minimum: 0.0
nullable: true
type: integer
jvmSecurity:
additionalProperties:
nullable: true
type: string
default: {}
type: object
logging:
default:
enableVectorAgent: null
6 changes: 6 additions & 0 deletions docs/modules/spark-k8s/pages/crd-reference.adoc
@@ -116,4 +116,10 @@ Below are listed the CRD fields that can be defined by the user:
|`spec.logFileDirectory.prefix`
|Prefix to use when storing events for the Spark History server.

|`spec.driver.jvmSecurity`
|A list of JVM security properties to pass on to the driver VM. The TTL of DNS caches is especially important.

|`spec.executor.jvmSecurity`
|A list of JVM security properties to pass on to the executor VM. The TTL of DNS caches is especially important.

|===
27 changes: 26 additions & 1 deletion docs/modules/spark-k8s/pages/usage-guide/history-server.adoc
@@ -31,7 +31,7 @@ The secret with S3 credentials must contain at least the following two keys:

Any other entries of the Secret are ignored by the operator.

== Application configuration
== Spark application configuration


The example below demonstrates how to configure Spark applications to write log events to an S3 bucket.
@@ -65,3 +65,28 @@ spark-history-node-cleaner   NodePort    10.96.203.43   <none>        18080:325
By setting up port forwarding on 18080, the UI can be opened by pointing your browser to `http://localhost:18080`:

image::history-server-ui.png[History Server Console]

== Configuration properties

For a role group of the Spark History Server, you can specify `configOverrides` for the following files:

- `security.properties`

=== The security.properties file

The `security.properties` file is used to configure JVM security properties. Users rarely need to tweak any of these, but there is one use case that stands out, and that users need to be aware of: the JVM DNS cache.

The JVM manages its own cache of successfully resolved host names, as well as a cache of host names that cannot be resolved. Some products of the Stackable platform are very sensitive to the contents of these caches, and their performance is heavily affected by them. As of version 3.4.0, Apache Spark may perform poorly if the positive cache is disabled. To cache resolved host names, and thus speed up queries, you can configure the TTL of entries in the positive cache like this:

[source,yaml]
----
nodes:
configOverrides:
security.properties:
networkaddress.cache.ttl: "30"
networkaddress.cache.negative.ttl: "0"
----

NOTE: The operator configures DNS caching by default as shown in the example above.

For details on JVM security, see https://docs.oracle.com/en/java/javase/11/security/java-security-overview1.html
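The `security.properties` entries above are plain key-value pairs rendered into a Java properties file that is mounted into the container. A minimal Rust sketch of that rendering step, assuming a hypothetical helper `render_security_properties` (for illustration only; this is not the operator's actual code):

```rust
use std::collections::BTreeMap;

/// Render JVM security property overrides into the contents of a
/// `security.properties` file. An explicit `None` value means the
/// property is dropped, so it is omitted from the rendered output.
/// (Hypothetical helper for illustration, not the operator's code.)
fn render_security_properties(props: &BTreeMap<String, Option<String>>) -> String {
    props
        .iter()
        .filter_map(|(k, v)| v.as_ref().map(|v| format!("{k}={v}")))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let mut props = BTreeMap::new();
    props.insert(
        "networkaddress.cache.ttl".to_string(),
        Some("30".to_string()),
    );
    props.insert(
        "networkaddress.cache.negative.ttl".to_string(),
        Some("0".to_string()),
    );
    // BTreeMap iterates in key order, so the negative TTL comes first.
    println!("{}", render_security_properties(&props));
}
```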
6 changes: 4 additions & 2 deletions rust/crd/src/constants.rs
@@ -12,6 +12,7 @@ pub const VOLUME_MOUNT_PATH_EXECUTOR_POD_TEMPLATES: &str =
pub const POD_TEMPLATE_FILE: &str = "template.yaml";

pub const VOLUME_MOUNT_NAME_CONFIG: &str = "config";
pub const VOLUME_MOUNT_PATH_CONFIG: &str = "/stackable/spark/conf";

pub const VOLUME_MOUNT_NAME_JOB: &str = "job-files";
pub const VOLUME_MOUNT_PATH_JOB: &str = "/stackable/spark/jobs";
@@ -27,6 +28,8 @@ pub const VOLUME_MOUNT_PATH_LOG: &str = "/stackable/log";

pub const LOG4J2_CONFIG_FILE: &str = "log4j2.properties";

pub const JVM_SECURITY_PROPERTIES_FILE: &str = "security.properties";

pub const ACCESS_KEY_ID: &str = "accessKey";
pub const SECRET_ACCESS_KEY: &str = "secretKey";
pub const S3_SECRET_DIR_NAME: &str = "/stackable/secrets";
@@ -67,8 +70,7 @@ pub const HISTORY_ROLE_NAME: &str = "node";

pub const HISTORY_IMAGE_BASE_NAME: &str = "spark-k8s";

pub const HISTORY_CONFIG_FILE_NAME: &str = "spark-defaults.conf";
pub const HISTORY_CONFIG_FILE_NAME_FULL: &str = "/stackable/spark/conf/spark-defaults.conf";
pub const SPARK_DEFAULTS_FILE_NAME: &str = "spark-defaults.conf";

pub const SPARK_CLUSTER_ROLE: &str = "spark-k8s-clusterrole";
pub const SPARK_UID: i64 = 1000;
5 changes: 4 additions & 1 deletion rust/crd/src/history.rs
@@ -200,7 +200,10 @@ impl SparkHistoryServer {
> = vec![(
HISTORY_ROLE_NAME.to_string(),
(
vec![PropertyNameKind::File(HISTORY_CONFIG_FILE_NAME.to_string())],
vec![
PropertyNameKind::File(SPARK_DEFAULTS_FILE_NAME.to_string()),
PropertyNameKind::File(JVM_SECURITY_PROPERTIES_FILE.to_string()),
],
self.spec.nodes.clone(),
),
)]
58 changes: 44 additions & 14 deletions rust/crd/src/lib.rs
@@ -538,12 +538,6 @@ impl SparkApplication {
}
}

// s3 with TLS
if tlscerts::tls_secret_names(s3conn, s3_log_dir).is_some() {
submit_cmd.push(format!("--conf spark.driver.extraJavaOptions=\"-Djavax.net.ssl.trustStore={STACKABLE_TRUST_STORE}/truststore.p12 -Djavax.net.ssl.trustStorePassword={STACKABLE_TLS_STORE_PASSWORD} -Djavax.net.ssl.trustStoreType=pkcs12 -Djavax.net.debug=ssl,handshake\""));
submit_cmd.push(format!("--conf spark.executor.extraJavaOptions=\"-Djavax.net.ssl.trustStore={STACKABLE_TRUST_STORE}/truststore.p12 -Djavax.net.ssl.trustStorePassword={STACKABLE_TLS_STORE_PASSWORD} -Djavax.net.ssl.trustStoreType=pkcs12 -Djavax.net.debug=ssl,handshake\""));
}

// repositories and packages arguments
if let Some(deps) = self.spec.deps.clone() {
submit_cmd.extend(
@@ -642,17 +636,23 @@ impl SparkApplication {
value_from: None,
});
}

// Extra JVM opts:
// - java security properties
// - s3 with TLS
let mut daemon_java_opts = vec![format!(
"-Djava.security.properties={VOLUME_MOUNT_PATH_LOG_CONFIG}/{JVM_SECURITY_PROPERTIES_FILE}"
)];
if let Some(s3logdir) = s3logdir {
if tlscerts::tls_secret_name(&s3logdir.bucket.connection).is_some() {
e.push(EnvVar {
name: "SPARK_DAEMON_JAVA_OPTS".to_string(),
value: Some(format!(
"-Djavax.net.ssl.trustStore={STACKABLE_TRUST_STORE}/truststore.p12 -Djavax.net.ssl.trustStorePassword={STACKABLE_TLS_STORE_PASSWORD} -Djavax.net.ssl.trustStoreType=pkcs12"
)),
value_from: None,
});
daemon_java_opts.push(format!("-Djavax.net.ssl.trustStore={STACKABLE_TRUST_STORE}/truststore.p12 -Djavax.net.ssl.trustStorePassword={STACKABLE_TLS_STORE_PASSWORD} -Djavax.net.ssl.trustStoreType=pkcs12"));
}
}
e.push(EnvVar {
name: "SPARK_DAEMON_JAVA_OPTS".to_string(),
value: Some(daemon_java_opts.join(" ")),
value_from: None,
});

e
}
@@ -957,6 +957,8 @@ pub struct DriverConfig {
#[fragment_attrs(serde(default))]
#[fragment_attrs(schemars(schema_with = "pod_overrides_schema"))]
pub pod_overrides: PodTemplateSpec,
#[fragment_attrs(serde(default))]
pub jvm_security: HashMap<String, Option<String>>,
}

impl DriverConfig {
@@ -977,6 +979,18 @@ impl DriverConfig {
volume_mounts: Some(VolumeMounts::default()),
affinity: StackableAffinityFragment::default(),
pod_overrides: PodTemplateSpec::default(),
jvm_security: vec![
(
"networkaddress.cache.ttl".to_string(),
Some("30".to_string()),
),
(
"networkaddress.cache.negative.ttl".to_string(),
Some("0".to_string()),
),
]
.into_iter()
.collect(),
}
}
}
@@ -1011,6 +1025,8 @@ pub struct ExecutorConfig {
#[fragment_attrs(serde(default))]
#[fragment_attrs(schemars(schema_with = "pod_overrides_schema"))]
pub pod_overrides: PodTemplateSpec,
#[fragment_attrs(serde(default))]
pub jvm_security: HashMap<String, Option<String>>,
}

impl ExecutorConfig {
@@ -1033,6 +1049,18 @@ impl ExecutorConfig {
node_selector: Default::default(),
affinity: Default::default(),
pod_overrides: PodTemplateSpec::default(),
jvm_security: vec![
(
"networkaddress.cache.ttl".to_string(),
Some("30".to_string()),
),
(
"networkaddress.cache.negative.ttl".to_string(),
Some("0".to_string()),
),
]
.into_iter()
.collect(),
}
}
}
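User-supplied `jvmSecurity` entries from the custom resource are merged over these defaults, and setting a key to an explicit null drops the default entry when the file is rendered. A sketch of that merge semantics, under the assumption that it behaves like a plain map union (illustrative only, not the operator's actual fragment-merge code):

```rust
use std::collections::HashMap;

/// Merge user-supplied JVM security overrides over the defaults.
/// A `Some` value replaces the default; an explicit `None` survives
/// the merge and causes the property to be omitted when rendered.
/// (Illustrative sketch, not the operator's actual merge code.)
fn merge_jvm_security(
    defaults: HashMap<String, Option<String>>,
    overrides: HashMap<String, Option<String>>,
) -> HashMap<String, Option<String>> {
    let mut merged = defaults;
    merged.extend(overrides); // later entries win on key collisions
    merged
}

fn main() {
    let defaults = HashMap::from([
        ("networkaddress.cache.ttl".to_string(), Some("30".to_string())),
        ("networkaddress.cache.negative.ttl".to_string(), Some("0".to_string())),
    ]);
    // The user raises the positive TTL; the negative TTL keeps its default.
    let overrides = HashMap::from([
        ("networkaddress.cache.ttl".to_string(), Some("60".to_string())),
    ]);
    let merged = merge_jvm_security(defaults, overrides);
    assert_eq!(merged["networkaddress.cache.ttl"], Some("60".to_string()));
    assert_eq!(merged["networkaddress.cache.negative.ttl"], Some("0".to_string()));
}
```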
@@ -1053,7 +1081,7 @@ mod tests {
};
use stackable_operator::k8s_openapi::api::core::v1::PodTemplateSpec;
use stackable_operator::product_logging::spec::Logging;
use std::collections::BTreeMap;
use std::collections::{BTreeMap, HashMap};
use std::str::FromStr;

#[test]
@@ -1419,6 +1447,7 @@ spec:
volume_mounts: None,
affinity: StackableAffinity::default(),
pod_overrides: PodTemplateSpec::default(),
jvm_security: HashMap::new(),
};

let mut props = BTreeMap::new();
@@ -1474,6 +1503,7 @@ spec:
node_selector: None,
affinity: StackableAffinity::default(),
pod_overrides: PodTemplateSpec::default(),
jvm_security: HashMap::new(),
};

let mut props = BTreeMap::new();
2 changes: 1 addition & 1 deletion rust/crd/src/s3logdir.rs
@@ -95,7 +95,7 @@ impl S3LogDir {
}

/// Constructs the properties needed for loading event logs from S3.
/// These properties are later written in the `HISTORY_CONFIG_FILE_NAME_FULL` file.
/// These properties are later written in the `SPARK_DEFAULTS_FILE_NAME` file.
///
/// The following properties related to credentials are not included:
/// * spark.hadoop.fs.s3a.aws.credentials.provider