Scheduling Tasks in Django

Introduction

This tutorial explains how to handle scheduled jobs in Django websites. Scheduled jobs are jobs that run automatically, without any human intervention, at set intervals or at a set time. A popular use case for scheduled jobs is caching data that remains more or less unchanged for a period of time.

There are two methods to solve this problem:
1. Using Celery
2. Using Django Management Command and Cron


We will discuss both methods in this tutorial.



Method 1: Using Celery

Celery is a library used mainly for asynchronous tasks, but it can be used for scheduling tasks too.

This method requires the Celery library to be installed and configured for scheduling, so we install and set up Celery first.


Setting up Celery

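First we install the django-celery library using pip (django-celery targets Celery 3.x; later Celery releases replace it with the django-celery-beat and django-celery-results packages):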


$ pip install django-celery

Let us assume that your project layout is as follows:

// File Structure
- project/
    - project/__init__.py
    - project/settings.py
    - project/urls.py
    - myapp/
        - forms.py
        - models.py
        - urls.py
        - ...
        - ...
    - manage.py

Now add 'djcelery' to INSTALLED_APPS in project/project/settings.py:

# Python Code
# project/project/settings.py

INSTALLED_APPS = [
    'djcelery',  # <-------------*
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.sites',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    # custom apps
    'myapp',
]

Create the Celery database tables by running the djcelery migrations:


$ python manage.py migrate djcelery

Now, we create a new project/project/celery.py module that defines the Celery instance:

# Python Code
# project/project/celery.py

from __future__ import absolute_import

import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

from django.conf import settings  # noqa

app = Celery('project')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))


Now import this app in your project/project/__init__.py module:

# Python Code
# project/project/__init__.py

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app  # noqa


Configure celery to use the django-celery backend and configure other settings:

# Python Code
# project/project/settings.py

BROKER_URL = 'amqp://'
CELERY_ACCEPT_CONTENT = ['json']  # must include the serializer used below
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'

CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'


Writing a task

First we create a tasks.py file in the app in which we want to implement the task. In our case we create it in project/myapp/tasks.py and add a task function in it as follows:

# Python Code
# project/myapp/tasks.py

import datetime

from celery.task import periodic_task


# Here we assume we want the task to run every 5 minutes.
@periodic_task(run_every=datetime.timedelta(minutes=5))
def myTask():
    # Do something here
    # like accessing remote APIs,
    # calculating resource-intensive data
    # and storing it in the cache,
    # or anything you please
    print("This wasn't so difficult")


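The above myTask() function will run every 5 minutes, provided a Celery worker with the beat scheduler is running (for example, celery -A project worker -B).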
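
As an illustration of what the body of such a task might do, the sketch below recomputes some expensive data and stores it with a timestamp so that views can read the stored copy. It is only a sketch under assumptions: the plain dict stands in for a real cache backend, and expensive_report, refresh_report and get_report are hypothetical names that are not part of Celery or Django.

```python
import time


def expensive_report():
    """Placeholder for slow work (a remote API call, a heavy query, ...)."""
    return {"total": sum(range(1000))}


def refresh_report(cache, now=time.time):
    """Recompute the report and store it in `cache` with a timestamp.

    This is the part a scheduled task would call every 5 minutes.
    """
    cache["report"] = {"data": expensive_report(), "refreshed_at": now()}
    return cache["report"]


def get_report(cache):
    """Return the cached data, or None if no refresh has happened yet."""
    entry = cache.get("report")
    return entry["data"] if entry else None


if __name__ == "__main__":
    store = {}  # hypothetical stand-in for a real cache backend
    refresh_report(store)
    print(get_report(store))
```

Keeping the work in a plain function like refresh_report means the scheduled task only has to call it, and the same function can be exercised without a running worker.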



Method 2: Using Django Management Command and Cron

With this method you don't have to install any libraries or packages. It relies on the good ol' cron job functionality.


Creating a Django Management Command

The first thing we will do is create a custom Django management command. To do this, add a management/commands directory to the application. Django will register a manage.py command for each Python module in that directory whose name doesn’t begin with an underscore. For example:

// File Structure
- myapp/
    - __init__.py
    - models.py
    - management/
        - __init__.py
        - commands/
            - __init__.py
            - _private.py
            - mytask.py
    - tests.py
    - views.py


On Python 2, be sure to include __init__.py files in both the management and management/commands directories, as done above, or your command will not be detected.
The _private.py module will not be available as a management command.
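
The naming rule can be illustrated with a small standard-library sketch. It is not presented as Django's own code: discover_commands and make_demo_app are hypothetical helpers that only mimic the rule described above (every module in management/commands whose name does not start with an underscore becomes a command).

```python
import os
import pkgutil
import tempfile


def discover_commands(app_dir):
    """List command names found in <app_dir>/management/commands."""
    command_dir = os.path.join(app_dir, "management", "commands")
    return sorted(
        name
        for _finder, name, is_pkg in pkgutil.iter_modules([command_dir])
        # Skip sub-packages and modules whose name starts with an underscore.
        if not is_pkg and not name.startswith("_")
    )


def make_demo_app(root):
    """Create the management/commands layout from the example above."""
    command_dir = os.path.join(root, "management", "commands")
    os.makedirs(command_dir)
    for path in (
        os.path.join(root, "management", "__init__.py"),
        os.path.join(command_dir, "__init__.py"),
        os.path.join(command_dir, "_private.py"),
        os.path.join(command_dir, "mytask.py"),
    ):
        open(path, "w").close()  # empty placeholder files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_demo_app(root)
        print(discover_commands(root))
```

For the layout shown above, only mytask is listed; _private is filtered out by the underscore check.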

Now we add the code in myapp/management/commands/mytask.py:

# Python Code
# myapp/management/commands/mytask.py

from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Type the help text here'

    def handle(self, *args, **options):
        # Add your logic here
        # This is the task that will be run
        self.stdout.write('This was extremely simple!!!')


After adding this file, we can run the code in the command's handle() method by running the following command from the Django root:


$ python manage.py mytask


Add Cron entry

On Windows you may have to install cron; on Ubuntu and many other Linux distros cron is included by default. The installation is not covered in this discussion.

We can edit the current user's crontab file (on Linux) by running the command:


$ crontab -e

Add the following line into the file for running the task every 5 minutes:

*/5 * * * * /path/to/virtualenv/bin/python /path/to/project/manage.py mytask


Sidenote

A crontab entry looks like this: <minute[0-59]> <hour[0-23, 0=midnight]> <day[1-31]> <month[1-12]> <weekday[0-6, 0=Sunday]> <command>

01 04 1 1 1 /usr/bin/somedirectory/somecommand

The above example will run /usr/bin/somedirectory/somecommand at 4:01am on January 1st plus every Monday in January.


An asterisk (*) can be used so that every instance (every hour, every weekday, every month, etc.) of a time period is used.

01 04 * * * /usr/bin/somedirectory/somecommand

The above example will run /usr/bin/somedirectory/somecommand at 4:01am every day.
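
To make the field rules concrete, here is a minimal Python sketch that checks whether a crontab expression matches a given moment. It is a simplified model, not cron's own implementation: expand_field and cron_matches are hypothetical helpers, and only numbers, lists, ranges, * and */n steps are handled.

```python
from datetime import datetime

# Allowed (low, high) range for each of the five time fields.
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field, low, high):
    """Expand one field ('*', '*/5', '1-5', '1,15', '07') into a set of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":
            start, end = low, high
        elif "-" in part:
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expression, moment):
    """Return True if the five time fields of `expression` match `moment`."""
    fields = expression.split()[:5]  # ignore the command, if present
    minute, hour, dom, month, dow = [
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    ]
    if not (moment.minute in minute and moment.hour in hour
            and moment.month in month):
        return False
    dom_ok = moment.day in dom
    dow_ok = (moment.isoweekday() % 7) in dow  # cron counts Sunday as 0
    # When both day fields are restricted, the job runs if either one matches.
    if not fields[2].startswith("*") and not fields[4].startswith("*"):
        return dom_ok or dow_ok
    return dom_ok and dow_ok


if __name__ == "__main__":
    # January 8th 2024 is a Monday, so the weekday field alone matches.
    print(cron_matches("01 04 1 1 1", datetime(2024, 1, 8, 4, 1)))
    # A step value matches every fifth minute, a bare number only one minute.
    print(cron_matches("*/5 * * * *", datetime(2024, 1, 8, 4, 10)))
    print(cron_matches("5 * * * *", datetime(2024, 1, 8, 4, 10)))
```

The sketch shows why a bare 5 in the minute field runs a job once an hour, at minute 5, while a step value such as */5 runs it every five minutes.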
