on 04-Oct-2023 05:00 - edited on 04-Oct-2023 16:10 by LiefZimmerman
Apache Airflow is an open-source platform used for orchestrating complex workflows and data pipelines. Airflow allows you to define and automate workflows as directed acyclic graphs (DAGs). These DAGs represent a series of tasks and their dependencies, making it easy to express complex data processing or ETL (Extract, Transform, Load) pipelines. Airflow supports a variety of task types, including Python scripts, shell commands, SQL queries, and more.
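To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use Airflow itself: it models a hypothetical three-step ETL pipeline with the standard library's `graphlib` module and derives a valid execution order from the declared dependencies. The task names (`extract`, `transform`, `load`) and the helper `execution_order` are illustrative assumptions, not part of any Airflow API.

```python
from graphlib import TopologicalSorter

# Hypothetical ETL pipeline: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return one valid order in which the tasks can run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Because every task here has exactly one upstream task, only one ordering is valid. A scheduler such as Airflow relies on the same kind of dependency information to decide when each task may start.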
Recently, it was discovered that, due to an oversight by the Apache Airflow maintainers, a task could be executed independently of the DAG it belongs to. This security vulnerability was assigned CVE-2023-39508. Why was this a problem? Consider an example: suppose a DAG has two tasks. The first task checks whether a user belongs to the Active Directory admin group. The second task, which runs an OS command of the user's choice, is executed only if that check passes. Now assume an Apache Airflow user who is not in the admin group wants to execute an OS command. That user could use the “Run Task” feature to run the second task directly, skipping the admin check, and so execute OS commands. To address this security issue, the Apache Airflow project decided to disable the feature, stating that its release was a mistake. You can refer to the commit below to learn more, or see Figure 1 for additional details.
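The two-task scenario above can be modeled with a small toy program. This is only a sketch of the logic flaw, not Airflow code and not a reproduction of the actual exploit: the group membership, the function names (`check_admin`, `run_os_command`, `run_dag`, `run_single_task`), and the user names are all hypothetical, and the "OS command" is simulated by returning a string rather than executing anything.

```python
ADMIN_GROUP = {"alice"}  # hypothetical admin-group membership


def check_admin(user):
    """Task 1: succeed only if the user is in the admin group."""
    return user in ADMIN_GROUP


def run_os_command(command):
    """Task 2: stand-in for running an OS command (nothing is executed here)."""
    return f"executed: {command}"


def run_dag(user, command):
    """Run both tasks in dependency order; task 2 runs only if task 1 passes."""
    if not check_admin(user):
        return None  # the downstream task is skipped
    return run_os_command(command)


def run_single_task(task, *args):
    """Toy analogue of running one task on its own: upstream tasks are never consulted."""
    return task(*args)


# Through the DAG, a non-admin user is blocked.
via_dag = run_dag("bob", "id")
# Running the second task directly skips the admin check entirely.
direct = run_single_task(run_os_command, "id")
print(via_dag, direct)
```

The point of the sketch is that the access control lives in the dependency between the tasks, not in the second task itself. Any mechanism that runs a task in isolation therefore bypasses a check that exists only upstream.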