passing configuration for spark processing job #3732


Closed
ajaiswalgit opened this issue Mar 19, 2023 · 2 comments

Comments

@ajaiswalgit

Describe the feature you'd like
Allow the configuration.json file that a Spark processing job generates from kwargs to be written to a user-specified S3 path. Many organizations do not grant permission to write files at the root of a bucket.

How would this feature be used? Please describe.
We need to pass few Spark configurations in configurations.json to override default behavior. Configs pass through kwargs.configuration creates a file configuration.json at default bucket level. But user do not have write access at bucket level.
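For illustration, the overrides in question are EMR-style classification dicts. A minimal sketch with example property values (the exact keys accepted by the SDK should be checked against the PySparkProcessor documentation; the values here are hypothetical):

```python
# Example of the EMR-style configuration a user might pass via the
# `configuration` kwarg: a list of classifications, each with a
# "Classification" name and a "Properties" mapping of Spark settings.
spark_overrides = [
    {
        "Classification": "spark-defaults",
        "Properties": {
            "spark.executor.memory": "4g",             # example value
            "spark.sql.shuffle.partitions": "200",     # example value
        },
    }
]
```

It is this structure that the SDK serializes to configuration.json and uploads to S3 before the job starts.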

Describe alternatives you've considered
If the user could supply a default S3 prefix in addition to the default S3 bucket, the upload would not fail.

Additional context
The user should be able to pass a user_defined_s3_folder so that configuration.json is created under S3bucket/user_defined_s3_folder, where they do have write permission. For example:

s3_uri = (
    f"s3://{self.sagemaker_session.default_bucket()}/{user_defined_s3_folder}/{self._current_job_name}/"
    f"input/{self._conf_container_input_name}/{self._conf_file_name}"
)
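The URI construction above can be isolated in a small helper. A minimal sketch, assuming the function name, parameter names, and all example values are hypothetical (not part of the SageMaker SDK):

```python
def build_config_s3_uri(bucket, user_folder, job_name,
                        input_name="conf", file_name="configuration.json"):
    """Build the S3 URI for the job's configuration file, inserting the
    user-supplied folder between the bucket and the job-specific prefix
    so the upload lands where the user has write permission."""
    return (
        f"s3://{bucket}/{user_folder}/{job_name}/"
        f"input/{input_name}/{file_name}"
    )

# Example usage with made-up bucket, folder, and job names:
uri = build_config_s3_uri("my-bucket", "team-a/spark", "spark-job-2023")
# → "s3://my-bucket/team-a/spark/spark-job-2023/input/conf/configuration.json"
```

Keeping the prefix configurable in one place means only the caller needs to know which folder the organization has opened for writes.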

@jmahlik
Contributor

jmahlik commented Aug 23, 2023

Looks like this might be related to #3200 and possibly fixed.

@martinRenou
Collaborator

Closing as fixed by #3486
