Commit 5fca268
feat(glue-alpha): add job run queuing to Glue job (#31830)
### Issue # (if applicable)

Closes #31826

### Reason for this change

[Job](https://docs.aws.amazon.com/cdk/api/v2/docs/@aws-cdk_aws-glue-alpha.Job.html) within [@aws-cdk/aws-glue-alpha](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-glue-alpha-readme.html) does not currently include the [jobRunQueuingEnabled](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_glue.CfnJob.html#jobrunqueuingenabled) property of the [CfnJob](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_glue.CfnJob.html) within [aws-cdk-lib/aws-glue](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_glue-readme.html). Setting this property currently requires a [raw override](https://docs.aws.amazon.com/cdk/v2/guide/cfn_layer.html#develop-customize-override).

### Description of changes

Added `jobRunQueuingEnabled` to construction properties for `Job`, along with validation that this is not enabled when execution class is flexible and/or `maxRetries` exceeds zero ([see](https://aws.amazon.com/blogs/big-data/introducing-job-queuing-to-scale-your-aws-glue-workloads/)).

### Description of how you validated changes

Unit tests and an integration test.

### Checklist

- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
1 parent be6a964 commit 5fca268

10 files changed (+425 -22 lines changed)

packages/@aws-cdk/aws-glue-alpha/README.md

+18
@@ -127,6 +127,24 @@ The `sparkUI` property also allows the specification of an s3 bucket and a bucke
 
 See [documentation](https://docs.aws.amazon.com/glue/latest/dg/add-job.html) for more information on adding jobs in Glue.
 
+### Enable Job Run Queuing
+
+AWS Glue job queuing monitors your account level quotas and limits. If quotas or limits are insufficient to start a Glue job run, AWS Glue will automatically queue the job and wait for limits to free up. Once limits become available, AWS Glue will retry the job run. Glue jobs will queue for limits like max concurrent job runs per account, max concurrent Data Processing Units (DPU), and resource unavailable due to IP address exhaustion in Amazon Virtual Private Cloud (Amazon VPC).
+
+Enable job run queuing by setting the `jobRunQueuingEnabled` property to `true`.
+
+```ts
+new glue.Job(this, 'EnableRunQueuing', {
+  jobName: 'EtlJobWithRunQueuing',
+  executable: glue.JobExecutable.pythonEtl({
+    glueVersion: glue.GlueVersion.V4_0,
+    pythonVersion: glue.PythonVersion.THREE,
+    script: glue.Code.fromAsset(path.join(__dirname, 'job-script', 'hello_world.py')),
+  }),
+  jobRunQueuingEnabled: true,
+});
+```
+
 ## Connection
 
 A `Connection` allows Glue jobs, crawlers and development endpoints to access certain types of data stores. For example, to create a network connection to connect to a data source within a VPC:

packages/@aws-cdk/aws-glue-alpha/lib/job.ts

+18
@@ -502,6 +502,16 @@ export interface JobProps {
    */
   readonly description?: string;
 
+  /**
+   * Specifies whether job run queuing is enabled for the job runs for this job.
+   * A value of true means job run queuing is enabled for the job runs.
+   * If false or not populated, the job runs will not be considered for queueing.
+   * If this field does not match the value set in the job run, then the value from the job run field will be used.
+   *
+   * @default - no job run queuing
+   */
+  readonly jobRunQueuingEnabled?: boolean;
+
   /**
    * The number of AWS Glue data processing units (DPUs) that can be allocated when this job runs.
    * Cannot be used for Glue version 2.0 and later - workerType and workerCount should be used instead.
@@ -722,6 +732,9 @@ export class Job extends JobBase {
       if (props.workerType && (props.workerType !== WorkerType.G_1X && props.workerType !== WorkerType.G_2X)) {
         throw new Error('FLEX ExecutionClass is only available for WorkerType G_1X or G_2X');
       }
+      if (props.jobRunQueuingEnabled === true) {
+        throw new Error('FLEX ExecutionClass is only available if job run queuing is disabled');
+      }
     }
 
     let maxCapacity = props.maxCapacity;
@@ -743,6 +756,10 @@ export class Job extends JobBase {
       throw new Error('Both workerType and workerCount must be set');
     }
 
+    if (props.jobRunQueuingEnabled === true && props.maxRetries !== undefined && !cdk.Token.isUnresolved(props.maxRetries) && props.maxRetries > 0) {
+      throw new Error(`Maximum retries was set to ${props.maxRetries}, must be set to 0 with job run queuing enabled`);
+    }
+
     const jobResource = new CfnJob(this, 'Resource', {
       name: props.jobName,
       description: props.description,
@@ -756,6 +773,7 @@ export class Job extends JobBase {
       glueVersion: executable.glueVersion.name,
       workerType: props.workerType?.name,
       numberOfWorkers: props.workerCount,
+      jobRunQueuingEnabled: props.jobRunQueuingEnabled,
       maxCapacity: props.maxCapacity,
       maxRetries: props.maxRetries,
       executionClass: props.executionClass,
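
The two checks added in this file can be read in isolation as a small predicate over the job props. The following is a minimal, dependency-free sketch of that logic, not the construct's actual code: `QueuingProps`, `ExecutionClassName`, `validateJobRunQueuing` and `errorMessageOf` are hypothetical names introduced for illustration, the execution class is modelled as a plain string rather than the library's enum, and the `cdk.Token.isUnresolved` guard from the diff is left out because it needs `aws-cdk-lib`. The error messages are copied from the diff.

```ts
// Hypothetical, dependency-free model of the checks this diff adds to Job.
// None of these names are part of @aws-cdk/aws-glue-alpha.
type ExecutionClassName = 'FLEX' | 'STANDARD';

interface QueuingProps {
  readonly jobRunQueuingEnabled?: boolean;
  readonly executionClass?: ExecutionClassName;
  readonly maxRetries?: number;
}

function validateJobRunQueuing(props: QueuingProps): void {
  // Nothing to check unless queuing is explicitly enabled.
  if (props.jobRunQueuingEnabled !== true) {
    return;
  }
  // Queuing cannot be combined with the FLEX execution class.
  if (props.executionClass === 'FLEX') {
    throw new Error('FLEX ExecutionClass is only available if job run queuing is disabled');
  }
  // Queuing requires maxRetries to be unset or zero.
  // (The real check also skips unresolved tokens; omitted in this sketch.)
  if (props.maxRetries !== undefined && props.maxRetries > 0) {
    throw new Error(`Maximum retries was set to ${props.maxRetries}, must be set to 0 with job run queuing enabled`);
  }
}

// Small helper so the outcome can be inspected without a test framework.
function errorMessageOf(props: QueuingProps): string | undefined {
  try {
    validateJobRunQueuing(props);
    return undefined;
  } catch (e) {
    return (e as Error).message;
  }
}

const accepted = errorMessageOf({ jobRunQueuingEnabled: true, maxRetries: 0 });
const rejectedFlex = errorMessageOf({ jobRunQueuingEnabled: true, executionClass: 'FLEX' });
const rejectedRetries = errorMessageOf({ jobRunQueuingEnabled: true, maxRetries: 2 });
```

In this sketch `accepted` is `undefined`, while `rejectedFlex` and `rejectedRetries` hold the corresponding error messages. Props that leave `jobRunQueuingEnabled` unset or `false` pass regardless of the other two fields.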

packages/@aws-cdk/aws-glue-alpha/test/integ.job.js.snapshot/aws-glue-job.assets.json

+3 -3 (generated file, diff not rendered)

packages/@aws-cdk/aws-glue-alpha/test/integ.job.js.snapshot/aws-glue-job.template.json

+121
@@ -1754,6 +1754,127 @@
     },
     "WorkerType": "G.1X"
    }
+  },
+  "EtlJobWithRunQueuingServiceRole33547334": {
+   "Type": "AWS::IAM::Role",
+   "Properties": {
+    "AssumeRolePolicyDocument": {
+     "Statement": [
+      {
+       "Action": "sts:AssumeRole",
+       "Effect": "Allow",
+       "Principal": {
+        "Service": "glue.amazonaws.com"
+       }
+      }
+     ],
+     "Version": "2012-10-17"
+    },
+    "ManagedPolicyArns": [
+     {
+      "Fn::Join": [
+       "",
+       [
+        "arn:",
+        {
+         "Ref": "AWS::Partition"
+        },
+        ":iam::aws:policy/service-role/AWSGlueServiceRole"
+       ]
+      ]
+     }
+    ]
+   }
+  },
+  "EtlJobWithRunQueuingServiceRoleDefaultPolicy5725F511": {
+   "Type": "AWS::IAM::Policy",
+   "Properties": {
+    "PolicyDocument": {
+     "Statement": [
+      {
+       "Action": [
+        "s3:GetBucket*",
+        "s3:GetObject*",
+        "s3:List*"
+       ],
+       "Effect": "Allow",
+       "Resource": [
+        {
+         "Fn::Join": [
+          "",
+          [
+           "arn:",
+           {
+            "Ref": "AWS::Partition"
+           },
+           ":s3:::",
+           {
+            "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
+           },
+           "/*"
+          ]
+         ]
+        },
+        {
+         "Fn::Join": [
+          "",
+          [
+           "arn:",
+           {
+            "Ref": "AWS::Partition"
+           },
+           ":s3:::",
+           {
+            "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
+           }
+          ]
+         ]
+        }
+       ]
+      }
+     ],
+     "Version": "2012-10-17"
+    },
+    "PolicyName": "EtlJobWithRunQueuingServiceRoleDefaultPolicy5725F511",
+    "Roles": [
+     {
+      "Ref": "EtlJobWithRunQueuingServiceRole33547334"
+     }
+    ]
+   }
+  },
+  "EtlJobWithRunQueuingA1B098B5": {
+   "Type": "AWS::Glue::Job",
+   "Properties": {
+    "Command": {
+     "Name": "glueetl",
+     "PythonVersion": "3",
+     "ScriptLocation": {
+      "Fn::Join": [
+       "",
+       [
+        "s3://",
+        {
+         "Fn::Sub": "cdk-hnb659fds-assets-${AWS::AccountId}-${AWS::Region}"
+        },
+        "/432033e3218068a915d2532fa9be7858a12b228a2ae6e5c10faccd9097b1e855.py"
+       ]
+      ]
+     }
+    },
+    "DefaultArguments": {
+     "--job-language": "python"
+    },
+    "GlueVersion": "4.0",
+    "JobRunQueuingEnabled": true,
+    "Name": "EtlJobWithRunQueuing",
+    "Role": {
+     "Fn::GetAtt": [
+      "EtlJobWithRunQueuingServiceRole33547334",
+      "Arn"
+     ]
+    }
+   }
   }
  },
  "Parameters": {
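
To confirm that a synthesized template carries the new property, one option is to scan its resources for `AWS::Glue::Job` entries with `JobRunQueuingEnabled` set to `true`. The sketch below assumes only the general CloudFormation template shape (`Resources` keyed by logical ID); the `Template` and `TemplateResource` types and the `glueJobsWithQueuing` helper are hypothetical, and `excerpt` is a hand-trimmed copy of the snapshot hunk above rather than the full file.

```ts
// Minimal shape of a CloudFormation template, enough for this check.
interface TemplateResource {
  Type: string;
  Properties?: Record<string, unknown>;
}

interface Template {
  Resources: Record<string, TemplateResource>;
}

// Returns the logical IDs of Glue jobs whose JobRunQueuingEnabled is literally true.
function glueJobsWithQueuing(template: Template): string[] {
  return Object.entries(template.Resources)
    .filter(([, resource]) =>
      resource.Type === 'AWS::Glue::Job' &&
      resource.Properties?.JobRunQueuingEnabled === true)
    .map(([logicalId]) => logicalId);
}

// Hand-trimmed excerpt of the snapshot hunk; most properties are omitted.
const excerpt: Template = {
  Resources: {
    EtlJobWithRunQueuingServiceRole33547334: { Type: 'AWS::IAM::Role' },
    EtlJobWithRunQueuingA1B098B5: {
      Type: 'AWS::Glue::Job',
      Properties: {
        GlueVersion: '4.0',
        JobRunQueuingEnabled: true,
        Name: 'EtlJobWithRunQueuing',
      },
    },
  },
};

const queuedJobIds = glueJobsWithQueuing(excerpt);
```

For this excerpt, `queuedJobIds` contains the single logical ID `EtlJobWithRunQueuingA1B098B5`. The strict comparison against `true` means a job that omits the property or sets it to `false` is not reported.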

packages/@aws-cdk/aws-glue-alpha/test/integ.job.js.snapshot/cdk.out

+1 -1 (generated file, diff not rendered)

packages/@aws-cdk/aws-glue-alpha/test/integ.job.js.snapshot/integ.json

+1 -1 (generated file, diff not rendered)

packages/@aws-cdk/aws-glue-alpha/test/integ.job.js.snapshot/manifest.json

+21 -2 (generated file, diff not rendered)
