Commit 48ed7fd

chore: Bump airflow to 3.x and update docs and screenshots (#218)
* chore: Bump airflow to 3.x and update docs and screenshots
* removed echoes
1 parent 31f9dad commit 48ed7fd

16 files changed: +43 -42 lines changed
demos/airflow-scheduled-job/03-enable-and-run-spark-dag.yaml

Lines changed: 3 additions & 3 deletions
@@ -17,9 +17,9 @@ spec:
  kubectl rollout status --watch statefulset/airflow-webserver-default
  && kubectl rollout status --watch statefulset/airflow-scheduler-default
  && export AIRFLOW_ADMIN_PASSWORD=$(cat /airflow-credentials/adminUser.password)
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD http://airflow-webserver-default:8080/api/v1/dags/sparkapp_dag
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD -H 'Content-Type:application/json' -XPATCH http://airflow-webserver-default:8080/api/v1/dags/sparkapp_dag -d '{\"is_paused\": false}'
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD -H 'Content-Type:application/json' -XPOST http://airflow-webserver-default:8080/api/v1/dags/sparkapp_dag/dagRuns -d '{}'
+ && export ACCESS_TOKEN=$(curl -XPOST http://airflow-webserver-default:8080/auth/token -H 'Content-Type: application/json' -d '{\"username\": \"admin\", \"password\": \"'$AIRFLOW_ADMIN_PASSWORD'\"}' | jq '.access_token' | tr -d '\"')
+ && curl -H \"Authorization: Bearer $ACCESS_TOKEN\" -H 'Content-Type: application/json' -XPATCH http://airflow-webserver-default:8080/api/v2/dags/sparkapp_dag -d '{\"is_paused\": false}' | jq
+ && curl -H \"Authorization: Bearer $ACCESS_TOKEN\" -H 'Content-Type: application/json' -XPOST http://airflow-webserver-default:8080/api/v2/dags/sparkapp_dag/dagRuns -d '{\"logical_date\": null}' | jq
  "]
  volumeMounts:
  - name: airflow-credentials

demos/airflow-scheduled-job/04-enable-and-run-date-dag.yaml

Lines changed: 3 additions & 3 deletions
@@ -17,9 +17,9 @@ spec:
  kubectl rollout status --watch statefulset/airflow-webserver-default
  && kubectl rollout status --watch statefulset/airflow-scheduler-default
  && export AIRFLOW_ADMIN_PASSWORD=$(cat /airflow-credentials/adminUser.password)
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD http://airflow-webserver-default:8080/api/v1/dags/date_demo
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD -H 'Content-Type:application/json' -XPATCH http://airflow-webserver-default:8080/api/v1/dags/date_demo -d '{\"is_paused\": false}'
- && curl -i -s --user admin:$AIRFLOW_ADMIN_PASSWORD -H 'Content-Type:application/json' -XPOST http://airflow-webserver-default:8080/api/v1/dags/date_demo/dagRuns -d '{}'
+ && export ACCESS_TOKEN=$(curl -XPOST http://airflow-webserver-default:8080/auth/token -H 'Content-Type: application/json' -d '{\"username\": \"admin\", \"password\": \"'$AIRFLOW_ADMIN_PASSWORD'\"}' | jq '.access_token' | tr -d '\"')
+ && curl -H \"Authorization: Bearer $ACCESS_TOKEN\" -H 'Content-Type: application/json' -XPATCH http://airflow-webserver-default:8080/api/v2/dags/date_demo -d '{\"is_paused\": false}' | jq
+ && curl -H \"Authorization: Bearer $ACCESS_TOKEN\" -H 'Content-Type: application/json' -XPOST http://airflow-webserver-default:8080/api/v2/dags/date_demo/dagRuns -d '{\"logical_date\": null}' | jq
  "]
  volumeMounts:
  - name: airflow-credentials
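
Reviewer note: both job manifests above replace basic-auth calls against `/api/v1` with a token exchange at `/auth/token` followed by bearer-authenticated calls against `/api/v2`. The sketch below restates that sequence in Python as a reading aid only; it is not part of the commit. The endpoints and JSON bodies are copied from the diff, while the function names (`token_request`, `unpause_request`, `trigger_request`) are hypothetical. The functions only build the requests, so nothing is sent until one is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

# Service URL as it appears in the diff.
BASE_URL = "http://airflow-webserver-default:8080"


def token_request(username: str, password: str) -> urllib.request.Request:
    """Build the POST /auth/token request that exchanges credentials for a token."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/auth/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def _dag_request(token: str, path: str, method: str, payload: dict) -> urllib.request.Request:
    """Build a bearer-authenticated JSON request against the v2 DAG endpoints."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v2/dags/{path}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method=method,
    )


def unpause_request(token: str, dag_id: str) -> urllib.request.Request:
    """PATCH the DAG so that it is no longer paused."""
    return _dag_request(token, dag_id, "PATCH", {"is_paused": False})


def trigger_request(token: str, dag_id: str) -> urllib.request.Request:
    """POST a new DAG run; the diff sends an explicit null logical_date."""
    return _dag_request(token, f"{dag_id}/dagRuns", "POST", {"logical_date": None})


# Sending the requests (requires a reachable Airflow instance):
# with urllib.request.urlopen(token_request("admin", password)) as resp:
#     token = json.load(resp)["access_token"]
# urllib.request.urlopen(unpause_request(token, "date_demo"))
# urllib.request.urlopen(trigger_request(token, "date_demo"))
```

Because the body is produced by `json.dumps`, a password containing quotes or backslashes is escaped correctly, which the shell string interpolation in the manifests does not guarantee.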

docs/modules/demos/pages/airflow-scheduled-job.adoc

Lines changed: 33 additions & 32 deletions
@@ -46,59 +46,59 @@ To list the installed Stackable services run the following command:
  [source,console]
  ----
  $ stackablectl stacklet list
- ┌─────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────┐
- │ PRODUCT ┆ NAME ┆ NAMESPACE ┆ ENDPOINTS ┆ CONDITIONS │
- ╞═════════╪═════════════════════════════════════════════════════════════════════╪═════════════════════════════════╡
- │ airflow ┆ airflow ┆ default ┆ webserver-airflow http://172.18.0.2:31979 ┆ Available, Reconciling, Running │
- └─────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────┘
+ ┌─────────┬─────────────────────────────────────────────────────────────────────┬─────────────────────────────────┐
+ │ PRODUCT ┆ NAME ┆ NAMESPACE ┆ ENDPOINTS ┆ CONDITIONS │
+ ╞═════════╪═════════════════════════════════════════════════════════════════════╪═════════════════════════════════╡
+ │ airflow ┆ airflow ┆ default ┆ webserver-default-http http://172.19.0.5:30913 ┆ Available, Reconciling, Running │
+ └─────────┴─────────────────────────────────────────────────────────────────────┴─────────────────────────────────┘
  ----
 
  include::partial$instance-hint.adoc[]
 
  == Airflow Webserver UI
 
  Superset gives the ability to execute SQL queries and build dashboards. Open the `airflow` endpoint `webserver-airflow`
- in your browser (`http://172.18.0.2:31979` in this case).
+ in your browser (`http://172.19.0.5:30913` in this case).
 
  image::airflow-scheduled-job/airflow_1.png[]
 
- Log in with the username `admin` and password `adminadmin`. The overview screen shows the DAGs mounted during the demo
- setup (`date_demo`).
+ Log in with the username `admin` and password `adminadmin`.
+ Click in 'Active DAGs' at the top and you will see an overview showing the DAGs mounted during the demo
+ setup (`date_demo` and `sparkapp_dag`).
 
  image::airflow-scheduled-job/airflow_2.png[]
 
- There are two things to notice here. Both DAGs have been enabled, as shown by the slider to the left of the DAG name
- (DAGs are all `paused` initially and can be activated manually in the UI or via a REST call, as done in the setup for
- this demo):
+ There are two things to notice here.
+ Both DAGs have been enabled, as shown by the slider on the far right of the screen for each DAG
+ (DAGs are all `paused` initially and can be activated manually in the UI or via a REST call, as done in the setup for this demo):
 
  image::airflow-scheduled-job/airflow_3.png[]
 
- Secondly, the `date_demo` job has been busy, with several runs already logged. The `sparkapp_dag` has only been run
- once because they have been defined with different schedules.
+ Secondly, the `date_demo` job has been busy, with several runs already logged.
+ The `sparkapp_dag` has only been run once because they have been defined with different schedules.
 
  image::airflow-scheduled-job/airflow_4.png[]
 
- Clicking on the number under `Runs` will display the individual job runs:
+ Clicking on the DAG name and then on `Runs` will display the individual job runs:
 
  image::airflow-scheduled-job/airflow_5.png[]
 
- The `demo_date` job is running every minute. With Airflow, DAGs can be started manually or scheduled to run when certain
- conditions are fulfilled- In this case, the DAG has been set up to run using a cron table, which is part of the DAG
- definition.
+ The `demo_date` job is running every minute.
+ With Airflow, DAGs can be started manually or scheduled to run when certain conditions are fulfilled - in this case, the DAG has been set up to run using a cron table, which is part of the DAG definition.
 
  === `demo_date` DAG
 
- Let's drill down a bit deeper into this DAG. Click on one of the job runs shown in the previous step to display the
- details. The DAG is displayed as a graph (this job is so simple that it only has one step, called `run_every_minute`).
+ Let's drill down a bit deeper into this DAG.
+ At the top under the DAG name there is some scheduling information, which tells us that this job will run every minute continuously:
 
  image::airflow-scheduled-job/airflow_6.png[]
 
- In the top right-hand corner there is some scheduling information, which tells us that this job will run every minute
- continuously:
+ Click on one of the job runs in the list to display the details for the task instances.
+ In the left-side pane the DAG is displayed either as a graph (this job is so simple that it only has one step, called `run_every_minute`), or as a "bar chart" showing each run.
 
  image::airflow-scheduled-job/airflow_7.png[]
 
- Click on the `run_every_minute` box in the centre of the page and then select `Logs`:
+ Click on the `run_every_minute` box in the centre of the page to select the logs:
 
  [WARNING]
  ====
@@ -108,26 +108,27 @@ See the https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/exec
  If you are interested in persisting the logs, take a look at the xref:logging.adoc[] demo.
  ====
 
- image::airflow-scheduled-job/airflow_9.png[]
+ image::airflow-scheduled-job/airflow_8.png[]
 
- To look at the actual DAG code click on `Code`. Here we can see the crontab information used to schedule the job as well
- the `bash` command that provides the output:
+ To look at the actual DAG code click on `Code`.
+ Here we can see the crontab information used to schedule the job as well the `bash` command that provides the output:
 
- image::airflow-scheduled-job/airflow_10.png[]
+ image::airflow-scheduled-job/airflow_9.png[]
 
  === `sparkapp_dag` DAG
 
- Go back to DAG overview screen. The `sparkapp_dag` job has a scheduled entry of `None` and a last-execution time
- (`2022-09-19, 07:36:55`). This allows a DAG to be executed exactly once, with neither schedule-based runs nor any
- https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dag-run.html#backfill[backfill]. The DAG can
- always be triggered manually again via REST or from within the Webserver UI.
+ Go back to DAG overview screen.
+ The `sparkapp_dag` job has a scheduled entry of `None` and a last-execution time.
+ This allows a DAG to be executed exactly once, with neither schedule-based runs nor any
+ https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dag-run.html#backfill[backfill].
+ The DAG can always be triggered manually again via REST or from within the Webserver UI.
 
- image::airflow-scheduled-job/airflow_11.png[]
+ image::airflow-scheduled-job/airflow_10.png[]
 
  By navigating to the graphical overview of the job we can see that DAG has two steps, one to start the job - which runs
  asynchronously - and another to poll the running job to report on its status.
 
- image::airflow-scheduled-job/airflow_12.png[]
+ image::airflow-scheduled-job/airflow_11.png[]
 
  == Summary
 

stacks/airflow/airflow.yaml

Lines changed: 4 additions & 4 deletions
@@ -6,9 +6,8 @@ metadata:
  name: airflow
  spec:
  image:
- productVersion: 2.10.4
+ productVersion: 3.0.1
  clusterConfig:
- listenerClass: external-unstable
  loadExamples: false
  exposeConfig: false
  credentialsSecret: airflow-credentials
@@ -35,6 +34,7 @@ spec:
  memory:
  limit: 2Gi
  gracefulShutdownTimeout: 30s
+ listenerClass: external-unstable
  roleGroups:
  default:
  envOverrides:
@@ -75,7 +75,7 @@ data:
 
  with DAG(
  dag_id='date_demo',
- schedule_interval='0-59 * * * *',
+ schedule='0-59 * * * *',
  start_date=datetime(2021, 1, 1),
  catchup=False,
  dagrun_timeout=timedelta(minutes=5),
@@ -222,7 +222,7 @@ data:
 
  with DAG(
  dag_id='sparkapp_dag',
- schedule_interval=None,
+ schedule=None,
  start_date=datetime(2022, 1, 1),
  catchup=False,
  dagrun_timeout=timedelta(minutes=60),
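
Reviewer note: the two DAG hunks above make the same mechanical change, renaming the `schedule_interval` keyword to `schedule` while leaving its value untouched. For anyone applying the same rename to DAG arguments held in a dictionary, a minimal sketch might look like the following. The helper name `migrate_dag_kwargs` is hypothetical and not part of Airflow or of this commit, and the sketch assumes the rename is the only change needed.

```python
def migrate_dag_kwargs(kwargs: dict) -> dict:
    """Return a copy of DAG keyword arguments with `schedule_interval` renamed to `schedule`."""
    migrated = dict(kwargs)  # leave the caller's dictionary untouched
    if "schedule_interval" in migrated:
        if "schedule" in migrated:
            # Refuse to guess which of the two values should win.
            raise ValueError("both 'schedule' and 'schedule_interval' are set")
        migrated["schedule"] = migrated.pop("schedule_interval")
    return migrated


# Arguments as they appeared before this commit (values copied from the diff).
old_date_demo = {"dag_id": "date_demo", "schedule_interval": "0-59 * * * *", "catchup": False}
old_sparkapp = {"dag_id": "sparkapp_dag", "schedule_interval": None, "catchup": False}

new_date_demo = migrate_dag_kwargs(old_date_demo)
new_sparkapp = migrate_dag_kwargs(old_sparkapp)
```

A value of `None` is carried over as-is, matching the `sparkapp_dag` hunk, where the DAG stays unscheduled and is only triggered explicitly.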
