8. Questions on Airflow Service Issues

Here is a list of FAQs related to Airflow service issues, with corresponding solutions.

  1. Which logs do I look up for Airflow cluster startup issues?

    Refer to the Airflow service logs, which are written while the cluster starts up.

  2. Where can I find Airflow Services logs?

    The Airflow services are the Scheduler, Webserver, Celery, and RabbitMQ. The service logs are available at /media/ephemeral0/logs/airflow on the cluster node. Since an Airflow cluster is a single-node machine, all logs are accessible on that same node. These logs are helpful for troubleshooting cluster bring-up and scheduling issues.
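    As a quick orientation, the snippet below constructs the log location and prints an example inspection command as a dry run. The log file name `scheduler.log` is an assumption for illustration; list the directory to confirm the exact file names on your cluster.

    ```shell
    # Airflow service logs (Scheduler, Webserver, Celery, RabbitMQ) live here:
    LOG_DIR=/media/ephemeral0/logs/airflow

    # Example: follow a log while debugging startup issues (printed as a dry run;
    # the exact log file name is an assumption -- list the directory to confirm)
    echo "tail -f $LOG_DIR/scheduler.log"
    ```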

  3. What is $AIRFLOW_HOME?

    $AIRFLOW_HOME is a location that contains all configuration files, DAGs, plugins, and task logs. It is an environment variable set to /usr/lib/airflow for all machine users.

  4. Where can I find Airflow Configuration files?

    The configuration file is at $AIRFLOW_HOME/airflow.cfg.

  5. Where can I find Airflow DAGs?

    The DAG definition files are available in the $AIRFLOW_HOME/dags folder.

  6. Where can I find Airflow task logs?

    Task logs are available in the $AIRFLOW_HOME/logs folder.

  7. Where can I find Airflow plugins?

    Plugins are available in the $AIRFLOW_HOME/plugins folder.
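    The locations from questions 3 through 7 can be summarized as paths under $AIRFLOW_HOME. A small sketch (the variable names here are illustrative, not part of Airflow itself):

    ```shell
    # Standard locations under $AIRFLOW_HOME, as described above
    AIRFLOW_HOME=/usr/lib/airflow             # value set for all machine users

    CONFIG_FILE="$AIRFLOW_HOME/airflow.cfg"   # main configuration file
    DAGS_DIR="$AIRFLOW_HOME/dags"             # DAG definition files
    TASK_LOGS_DIR="$AIRFLOW_HOME/logs"        # per-task log output
    PLUGINS_DIR="$AIRFLOW_HOME/plugins"       # custom plugins

    printf '%s\n' "$CONFIG_FILE" "$DAGS_DIR" "$TASK_LOGS_DIR" "$PLUGINS_DIR"
    ```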

  8. How do I restart Airflow Services?

    You can perform start, stop, and restart actions on each Airflow service. In the commands below, replace <action> with start, stop, or restart:

    • Run sudo monit <action> scheduler for Airflow Scheduler.
    • Run sudo monit <action> webserver for Airflow Webserver.
    • Run sudo monit <action> worker for Celery workers. A stop operation gracefully shuts down the existing workers. A start operation adds workers up to the number set in the configuration. A restart operation gracefully shuts down the existing workers and then brings up the configured number of workers.
    • Run sudo monit <action> rabbitmq for RabbitMQ.
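    The four commands above can be sketched as a single loop. This is a dry run that only prints the commands (swap echo for direct execution, and "restart" for "start" or "stop", as needed):

    ```shell
    # Dry-run sketch: print the restart command for each Airflow service,
    # using the monit service names documented above
    for svc in scheduler webserver worker rabbitmq; do
      echo "sudo monit restart $svc"
    done
    ```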

  9. How do I invoke Airflow CLI commands within the node?

    Airflow is installed inside a virtual environment at the location specified in the environment variable AIRFLOW_VIRTUALENV_LOC. First, activate the virtual environment by sourcing the following script:

    source ${AIRFLOW_HOME}/airflow/qubole_assembly/scripts/virtualenv.sh activate
    

    After the virtual environment is active, run the required Airflow CLI command.
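    Putting the two steps together, a hedged sketch printed as a dry run; "airflow list_dags" is only an illustrative example of a CLI command, and /usr/lib/airflow is the typical $AIRFLOW_HOME value noted earlier:

    ```shell
    # Dry-run sketch of the full sequence: activate the virtualenv, then run a
    # CLI command ("airflow list_dags" is just an example command)
    AIRFLOW_HOME=/usr/lib/airflow   # typical value on the cluster node
    echo "source ${AIRFLOW_HOME}/airflow/qubole_assembly/scripts/virtualenv.sh activate"
    echo "airflow list_dags"
    ```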