DataCROP Maize Processing Engine Worker Deployment

Use this page when following the manual per-repository setup. If you use Maize MVP, the worker is deployed by the MVP script; refer here only for customization or troubleshooting. See Maize Setup for the two setup options.

This is a demo deployment of the worker-side services for the Maize version of DataCROP. These services handle tasks within the DataCROP Workflow Management Engine. The deployment consists of two services: daghandler and airflow-worker.

Overview

The deployment utilizes Apache Airflow and CeleryExecutor for distributed task execution within the DataCROP system. Below is an explanation of the different components and configurations defined in the docker-compose.yaml file.

Airflow Worker Setup

  • The airflow-worker service is set up using Airflow’s CeleryExecutor to manage distributed task execution.
  • The worker communicates with:
    • Redis: Used as the message broker for Celery.
    • PostgreSQL: Used as the backend for storing task results.

Volumes

The following paths are mounted into the Airflow worker container to persist data and provide necessary resources:

  • DAGs: Task definitions are stored in the ./dags folder.
  • Logs: Logs generated by Airflow are stored in the ./logs folder.
  • Data: Input and output data for tasks are stored in the ./data folder.
  • Plugins: Airflow plugins can be added via the ./plugins folder.
  • .env: The .env file is used to handle dynamic environment variables.
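
Docker typically creates a missing bind-mount source directory as root, which can leave the worker unable to write logs or data. A minimal sketch for creating the folders up front, assuming they sit next to docker-compose.yaml (the helper name prepare_mounts is hypothetical and not part of the repository):

```shell
# Create the mounted folders listed above if they do not exist yet.
# Assumption: they live in the same directory as docker-compose.yaml.
prepare_mounts() {
    base="${1:-.}"
    for dir in dags logs data plugins; do
        mkdir -p "$base/$dir" || return 1
    done
    # The .env file is expected to exist already; only warn if it is missing.
    if [ ! -f "$base/.env" ]; then
        echo "warning: $base/.env not found" >&2
    fi
}

# Usage (from the directory containing docker-compose.yaml):
#   prepare_mounts .
```

Running it as the user that matches AIRFLOW_UID keeps the folders writable from inside the container.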

REQUIREMENTS

PREREQUISITES

Before proceeding, ensure that you have followed the setup instructions for the Airflow Processing Engine.

After completing the setup, follow these steps to configure your environment variables:

  1. In the Processing Engine Worker repository, edit the .env file and make sure every environment variable is set correctly for your deployment. The current values from maize-processing-engine-worker/.env are shown below with sensitive secrets redacted; keep using the real values already present in your .env.

     # HOST              ||  DC.C
     AIRFLOW_IP=<AIRFLOW_HOST_IP>
     AIRFLOW_WEB_SECRET_KEY=[REDACTED – keep existing value in your .env]
     AIRFLOW_FERNET_KEY=[REDACTED – keep existing value in your .env]
     HOST_IP=<WORKER_HOST_IP>
     _PIP_ADDITIONAL_REQUIREMENTS=''
     AIRFLOW_UID=1002
     AIRFLOW_GID=0
    
     # WORKER            ||  DC.W
     WORKER_NAME=<WORKER_NAME>
     WORKER_SSL_KEY_FILE=/security/${WORKER_NAME}/${WORKER_NAME}-key.pem
     WORKER_SSL_CERT_FILE=/security/${WORKER_NAME}/${WORKER_NAME}.pem
     WORKER_SSL_CERT_STORE=/security/ca/rootCA.pem
    
     # Please check the GID of the docker group on the host
     DOCKER_GID=988
    
     # REDIS             ||  DC.C
     REDIS_TLS_PORT=6379
     REDIS_TLS_CERT_FILE=/security/redis/redis.pem
     REDIS_TLS_KEY_FILE=/security/redis/redis-key.pem
     REDIS_TLS_CA_CERT_FILE=/security/ca/rootCA.pem
     REDIS_TLS_CLIENT_CERT_FILE=/security/redis/redis-client.pem
     REDIS_TLS_CLIENT_KEY_FILE=/security/redis/redis-client-key.pem
    
     # CELERY            ||  DC.C
     CELERY_WEB_UNAME=[REDACTED – keep existing value in your .env]
     CELERY_WEB_PSSWD=[REDACTED – keep existing value in your .env]
    

    Adjust these values only if your deployment differs (e.g., different IPs or worker name). Do not publish or rotate the redacted secrets already set in your .env.
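
The DOCKER_GID comment in the excerpt above asks for the GID of the docker group on the host. One way to look it up is sketched here, assuming the host provides getent and the group is named docker (group_gid is a hypothetical helper):

```shell
# Print the numeric GID of a group, given group-database lines on stdin
# (format: name:password:gid:members).
group_gid() {
    awk -F: -v name="$1" '$1 == name { print $3; exit }'
}

# Usage on the worker host:
#   getent group docker | group_gid docker
# Compare the printed number with DOCKER_GID in .env.
```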

Once these parameters are correctly set, you can proceed with the deployment.
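
Before deploying, it can help to confirm that no variable is still empty or left as a <PLACEHOLDER>. The sketch below checks a handful of the variables from the excerpt above; the helper name missing_env_vars and the choice of variables are illustrative assumptions, so extend the list to suit your deployment:

```shell
# Print the names of variables that are missing, empty, or still set to a
# <PLACEHOLDER> value in the given env file.
missing_env_vars() {
    env_file="$1"
    for name in AIRFLOW_IP HOST_IP WORKER_NAME DOCKER_GID REDIS_TLS_PORT; do
        # Accept only "NAME=value" where the value starts with something
        # other than "<" or whitespace.
        if ! grep -Eq "^${name}=[^<[:space:]]" "$env_file"; then
            echo "$name"
        fi
    done
}

# Usage:
#   missing_env_vars .env
```

No output means every checked variable has a value.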

Start the Application

  1. Navigate to the source directory containing the docker-compose.yaml file.
  2. Run the following command:

     docker compose up -d
    

Verify that everything is up and running

Wait for the services to start, then run:

docker compose ps

Confirm both services are up: daghandler and airflow-worker.
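
The same confirmation can be scripted. This is a sketch assuming Docker Compose v2, whose ps subcommand accepts --services and --filter status=running; missing_services is a hypothetical helper name:

```shell
# Read running service names (one per line) on stdin and print whichever
# of the two expected services is absent.
missing_services() {
    running="$(cat)"
    for svc in daghandler airflow-worker; do
        printf '%s\n' "$running" | grep -Fxq -- "$svc" || echo "$svc"
    done
}

# Usage (from the directory containing docker-compose.yaml):
#   docker compose ps --services --filter status=running | missing_services
```

Empty output means both services are reported as running.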

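Then verify that the worker container name matches your configured worker. The command below expands ${WORKER_NAME} from your current shell, so export it first or substitute the value from your .env: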

docker ps --filter "name=${WORKER_NAME}"

The NAMES column should include exactly ${WORKER_NAME} (the airflow-worker service sets container_name from WORKER_NAME).
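
Because the name filter of docker ps also matches partial names, an exact comparison can be useful. A minimal sketch, assuming WORKER_NAME is written unquoted in .env (env_value and has_exact_name are hypothetical helpers):

```shell
# Read the value of a key from an env file (first match, quoting not handled).
env_value() {
    grep -E "^$1=" "$2" | head -n 1 | cut -d= -f2-
}

# Succeed only if stdin contains a line exactly equal to the given name.
has_exact_name() {
    grep -Fxq -- "$1"
}

# Usage:
#   name="$(env_value WORKER_NAME .env)"
#   docker ps --format '{{.Names}}' | has_exact_name "$name" && echo "found $name"
```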

Make Sure Everything Works

  1. Open a browser and navigate to the Flower web app (http://{Your IP}:5555/workers).
  2. Enter the Celery credentials provided by your organization.
  3. After successful authentication, you will be redirected to the workers page, where the newly created worker should appear in the workers table. If its status is marked as online, the setup was completed successfully.
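
For a quick check without a browser, the workers page can be requested with curl. This is only a sketch: it assumes Flower is protected by HTTP basic authentication and reachable on port 5555 as in the URL above, and the helper names are hypothetical:

```shell
# Build the Flower workers URL from step 1 (5555 is the port shown above).
flower_workers_url() {
    printf 'http://%s:%s/workers\n' "$1" "${2:-5555}"
}

# Print the HTTP status code returned for the workers page.
# Only the user name is passed to -u, so curl prompts for the password.
flower_status() {
    curl -s -o /dev/null -w '%{http_code}\n' -u "$2" "$(flower_workers_url "$1")"
}

# Usage (replace both placeholders with your own values):
#   flower_status <FLOWER_HOST_IP> <CELERY_USER>
```

A 200 response suggests the page is reachable with those credentials. Whether the worker is online still has to be read from the workers table.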

Stop Everything

Navigate to the source directory and run the following command:

docker compose down