operator resubmits all spark applications after restart no matter what their status is #457
Labels
customer-request
priority/high
release/24.11.0
release-note
type/bug
Affected Stackable version
24.7
Affected Apache Spark-on-Kubernetes version
3.4.2
Current and expected behavior
When the spark-k8s operator restarts (e.g. after a node restart or an operator upgrade), every Spark application is reconciled and resubmitted at once, regardless of its status. This causes chaos for two reasons: resource consumption spikes, and multiple jobs read and write the same file at the same time. Normally these jobs run in order, because the Airflow DAGs that do the Spark submits sequence them.
Possible solution
Do not submit an application that is already in state succeeded / failed / stopped / killed. Maybe even running apps should not be restarted, because they are already running.
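A minimal sketch of the proposed check, written in Python for illustration only. It is not the operator's actual code: `Phase`, `TERMINAL_PHASES` and `should_submit` are hypothetical names, and the phase values simply mirror the states listed above. It assumes the operator can read a recorded phase for each application, with `None` meaning no status has been recorded yet.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical application phases, mirroring the states named in this issue."""
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    STOPPED = "Stopped"
    KILLED = "Killed"


# Phases in which the application has already finished (assumption: these are final).
TERMINAL_PHASES = frozenset(
    {Phase.SUCCEEDED, Phase.FAILED, Phase.STOPPED, Phase.KILLED}
)


def should_submit(phase, skip_running=True):
    """Return True only if reconciliation should submit the application.

    phase is None when no status has been recorded for the application yet.
    """
    if phase is None:
        return True  # never submitted before, so submit it
    if phase in TERMINAL_PHASES:
        return False  # already finished, do not run it again
    if skip_running and phase is Phase.RUNNING:
        return False  # already running, leave it alone
    return True


# Example: after an operator restart, only the application without a status is submitted.
apps = {"etl-a": None, "etl-b": Phase.SUCCEEDED, "etl-c": Phase.RUNNING}
to_submit = [name for name, phase in apps.items() if should_submit(phase)]
```

The `skip_running` flag reflects the open question above: whether running applications should also be left untouched is a separate decision from skipping finished ones.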
Additional context
No response
Environment
No response
Would you like to work on fixing this bug?
None