
I am running Superset via Docker. I enabled the Email Report feature and tried it:

[screenshot of the scheduled email report dialog]

However, I only receive the test email report; no scheduled emails arrive after that.

This is my CeleryConfig in superset_config.py:

from celery.schedules import crontab

class CeleryConfig(object):
    BROKER_URL = 'sqla+postgresql://superset:superset@db:5432/superset'
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = 'db+postgresql://superset:superset@db:5432/superset'
    CELERYD_LOG_LEVEL = 'DEBUG'
    CELERYD_PREFETCH_MULTIPLIER = 10
    CELERY_ACKS_LATE = True
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 120,
            'soft_time_limit': 150,
            'ignore_result': True,
        },
    }
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute=1, hour='*'),
        },
    }

The documentation says I need to run a Celery worker and the beat scheduler:

celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4
celery beat --app=superset.tasks.celery_app:app

I added them to 'docker-compose.yml':

superset-worker:
    build: *superset-build
    command: >
      sh -c "celery worker --app=superset.tasks.celery_app:app -Ofair -f /app/celery_worker.log &&
             celery beat --app=superset.tasks.celery_app:app -f /app/celery_beat.log"
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

The Celery worker does work: it sends the initial test email, and its log file shows up. Celery beat, however, does not seem to run at all, and no 'celery_beat.log' is ever created.

If you'd like deeper insight, here's the commit with the full implementation of the functionality.

How do I correctly configure celery beat? How can I debug this?

3 Answers


I managed to solve it by altering the CeleryConfig implementation and adding a separate beat service to 'docker-compose.yml'.

New CeleryConfig class in 'superset_config.py':

from celery.schedules import crontab

REDIS_HOST = get_env_variable("REDIS_HOST")
REDIS_PORT = get_env_variable("REDIS_PORT")

class CeleryConfig(object):
    BROKER_URL = "redis://%s:%s/0" % (REDIS_HOST, REDIS_PORT)
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = "redis://%s:%s/1" % (REDIS_HOST, REDIS_PORT)
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 120,
            'soft_time_limit': 150,
            'ignore_result': True,
        },
    }
    CELERY_TASK_PROTOCOL = 1
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute='1', hour='*'),
        },
    }
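
Note: get_env_variable is not a Celery or Superset setting; it's the small helper that Superset's Docker superset_config.py defines near the top of the file. If your config doesn't already have it, a minimal sketch along those lines:

import os

def get_env_variable(var_name, default=None):
    """Return the environment variable, a given default, or fail loudly."""
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        # Illustrative error message; Superset's own helper raises similarly.
        raise EnvironmentError("The environment variable %s was missing" % var_name)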

Changes in 'docker-compose.yml':

  superset-worker:
    build: *superset-build
    command: ["celery", "worker", "--app=superset.tasks.celery_app:app", "-Ofair"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

  superset-beat:
    build: *superset-build
    command: ["celery", "beat", "--app=superset.tasks.celery_app:app", "--pidfile=", "-f", "/app/celery_beat.log"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
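
For what it's worth, the original single-container command can never start beat: celery worker runs in the foreground and never exits, so the "&& celery beat ..." part after it is never reached, which is why no 'celery_beat.log' appeared. Running beat as its own service (with --pidfile= so a stale pid file can't block a restart) avoids that, and you can watch the scheduler tick with docker-compose logs -f superset-beat. On Celery 5+ note that the CLI changed: global options come before the sub-command, e.g. celery --app=superset.tasks.celery_app:app beat.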
  • what is superset-build?.. Is the config outdated?
    – d9k
    Commented Jun 13, 2022 at 23:39
  • @d9k this was pre Superset 1.0; superset-build corresponds to the versions prior to 1.0.
    – Snow
    Commented Jun 14, 2022 at 7:39

I believe Celery needs to run inside your Superset container, so you'll need to modify your Dockerfile and entrypoint.
But you should really daemonize Celery first, so you don't have to monitor and restart it yourself (see "how to detect failure and auto restart celery worker" and http://docs.celeryproject.org/en/latest/userguide/daemonizing.html).
For an example of running a daemonized Celery process in Docker, see: Docker - Celery as a daemon - no pidfiles found
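
If you go that route, a minimal supervisord sketch (program names, paths and concurrency are illustrative, not Superset defaults) could look like:

; illustrative supervisord config; adjust names and paths to your image
[program:superset-worker]
command=celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/celery/worker.log

[program:superset-beat]
command=celery beat --app=superset.tasks.celery_app:app --pidfile=
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/celery/beat.log

supervisord then restarts either process if it dies, which covers the monitoring concern the links above discuss.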


You can also add the -B flag to the celery worker command to run beat:

celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4 -B
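
The -B (--beat) option embeds the beat scheduler inside the worker process, so a single container runs both and the CELERYBEAT_SCHEDULE from superset_config.py is still picked up.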
  • that's not recommended in production, when you have multiple workers
    – Snow
    Commented May 12, 2020 at 11:33
  • It is better than two separate commands/processes in Docker. I have had -B in prod for 1.5 years, all fine.
    Commented May 12, 2020 at 13:46
  • how many workers do you have running? The docs say "You can also embed beat inside the worker by enabling the workers -B option, this is convenient if you'll never run more than one worker node, but it's not commonly used and for that reason isn't recommended for production use"
    – Snow
    Commented May 12, 2020 at 13:50
  • If you have only one worker then fine, but you shouldn't use -B with multiple workers
    – Snow
    Commented May 12, 2020 at 13:51
  • there were 6 containers with different params for workers and one of them had the -B option, and all works fine
    Commented May 12, 2020 at 17:55
