Configure Apache Airflow for AWS
Amazon Managed Workflows for Apache Airflow (Amazon MWAA) orchestrates your workflows using Directed Acyclic Graphs (DAGs) written in Python. You provide MWAA an Amazon S3 bucket where your DAGs, plugins, and Python requirements reside. You can run and monitor your DAGs using the AWS Management Console, a command line interface (CLI), a software development kit (SDK), or the Apache Airflow user interface (UI).
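For reference, a DAG is just a Python file placed in the DAGs folder. A minimal sketch (the DAG ID, schedule, and task names are illustrative, and it assumes an Airflow 2.x environment) looks like this:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def say_hello():
    # Task logic runs on the MWAA workers.
    print("Hello from Amazon MWAA")


# The DAG ID, start date, and schedule below are placeholder values.
with DAG(
    dag_id="example_hello",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="say_hello", python_callable=say_hello)
```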
Create an S3 bucket by following the steps here.
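If you prefer to script this step, a minimal boto3 sketch (the bucket name and Region are placeholders) creates the bucket and enables Versioning, which the later steps rely on:

```python
import boto3

# Hypothetical bucket name; S3 bucket names must be globally unique.
BUCKET = "my-mwaa-dags-bucket"

s3 = boto3.client("s3", region_name="us-east-1")

# Create the bucket (outside us-east-1, also pass CreateBucketConfiguration).
s3.create_bucket(Bucket=BUCKET)

# MWAA requires the bucket to block all public access.
s3.put_public_access_block(
    Bucket=BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

# Enable Versioning so plugins.zip, requirements.txt, and the startup
# script each receive a Version ID on upload.
s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)
```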
Package and upload your DAG code to Amazon S3. Amazon MWAA loads the following folders and files into Airflow.
Ensure Versioning is enabled on your Amazon S3 bucket for the custom plugins in plugins.zip, the startup shell script file, and the Python dependencies in requirements.txt.
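As a sketch of the upload step (local paths and S3 keys are placeholders), each put_object call on a versioned bucket returns the VersionId you may need later in the wizard:

```python
import boto3

BUCKET = "my-mwaa-dags-bucket"  # hypothetical bucket from the previous step
s3 = boto3.client("s3")

# Local files and their S3 keys; the names here are illustrative.
uploads = [
    ("dags/example_hello.py", "dags/example_hello.py"),
    ("plugins.zip", "plugins.zip"),
    ("requirements.txt", "requirements.txt"),
    ("startup.sh", "startup.sh"),
]

for local_path, key in uploads:
    with open(local_path, "rb") as f:
        resp = s3.put_object(Bucket=BUCKET, Key=key, Body=f)
    # With Versioning enabled, S3 returns a new VersionId per upload.
    print(f"{key}: VersionId={resp.get('VersionId')}")
```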
Refer to the Amazon documentation on DAGs for more details.
In the DuploCloud Portal, navigate to Cloud Services -> Analytics.
Click the Airflow tab.
Click Add. The New Managed Airflow Environment wizard displays.
Provide the required information, such as Airflow Environment Name, Airflow Version, S3 bucket, and DAGs folder location by navigating through the wizard. You can also enable Logging for Managed Airflow.
If you specify plugins.zip, requirements.txt, and a startup script while setting up the Airflow Environment, you must provide the S3 Version ID of each file (for example, lSHNqFtO5Z7_6K6YfGpKnpyjqP2JTvSf). If the Version ID is left blank, the latest Version ID of the file in the S3 bucket is used by default.
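If you did not record the Version IDs at upload time, a head_object call on a versioned bucket returns the latest one (the bucket and keys below are the placeholder names used above):

```python
import boto3

BUCKET = "my-mwaa-dags-bucket"  # hypothetical bucket name
s3 = boto3.client("s3")

# On a versioned bucket, head_object reports the latest VersionId,
# which is the value used when the wizard field is left blank.
for key in ("plugins.zip", "requirements.txt", "startup.sh"):
    head = s3.head_object(Bucket=BUCKET, Key=key)
    print(f"{key}: VersionId={head.get('VersionId')}")
```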
After setup, view the Managed Airflow Environment in the DuploCloud Portal using the Airflow tab. You can view the Airflow Environment in the AWS Console by clicking the WebserverURL.
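You can also fetch the environment status and Webserver URL programmatically; a boto3 sketch (the environment name is a placeholder for the Airflow Environment Name you chose in the wizard) looks like this:

```python
import boto3

mwaa = boto3.client("mwaa")

# "my-airflow-env" stands in for your Airflow Environment Name.
env = mwaa.get_environment(Name="my-airflow-env")["Environment"]
print("Status:", env["Status"])
print("Webserver URL:", env["WebserverUrl"])
```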