This tutorial explains how to handle scheduled jobs in Django websites. Scheduled jobs are jobs that run automatically, without human intervention, at set intervals or at a set time. A popular use case is caching data that remains more or less unchanged for a period of time.
There are two ways to solve this problem:
1. Using Celery
2. Using a Django management command and cron
We will discuss both methods in this tutorial.
Celery is a library mainly used for asynchronous tasks, but it can also schedule tasks.
The Celery method requires the Celery library to be installed and configured for scheduling, so we start with installation and setup.
First, install the django-celery library using pip:
$ pip install django-celery
Let us assume that your project layout is as follows:
// File Structure
- project/
    - project/__init__.py
    - project/settings.py
    - project/urls.py
    - myapp/
        - forms.py
        - models.py
        - urls.py
        - ...
        - ...
    - manage.py
Now add 'djcelery' in INSTALLED_APPS in project/project/settings.py:
# Python Code
# project/project/settings.py
INSTALLED_APPS = [
    'djcelery',  # <-------------*
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.sites',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    # custom apps
    'myapp',
]
Create the Celery database tables by running the djcelery migrations:
$ python manage.py migrate djcelery
Now, we create a new project/project/celery.py module that defines the Celery instance:
# Python Code
# project/project/celery.py
from __future__ import absolute_import

import os

from celery import Celery

# set the default Django settings module for the 'celery' program.
# The module path must match the project package name ('project' here).
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'project.settings')

from django.conf import settings  # noqa

app = Celery('project')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)


@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
Now import this app in your project/project/__init__.py module:
# Python Code
# project/project/__init__.py

# This will make sure the app is always imported when
# Django starts so that shared_task will use this app.
from .celery import app as celery_app  # noqa
Configure celery to use the django-celery backend and configure other settings:
# Python Code
# project/project/settings.py
BROKER_URL = 'amqp://'
# Accepted content must include the serializer used for tasks below.
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_RESULT_BACKEND = 'djcelery.backends.database:DatabaseBackend'
CELERYBEAT_SCHEDULER = 'djcelery.schedulers.DatabaseScheduler'
With Celery configured, we create a tasks.py file in the app that should own the task. In our case this is project/myapp/tasks.py, and we add a task function to it as follows:
# Python Code
# project/myapp/tasks.py
import datetime

from celery.task import periodic_task


# here we assume we want it to be run every 5 mins
@periodic_task(run_every=datetime.timedelta(minutes=5))
def myTask():
    # Do something here
    # like accessing remote apis,
    # calculating resource intensive computational data
    # and store in cache
    # or anything you please
    print('This wasn\'t so difficult')
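The above myTask() function will run every 5 minutes, provided a Celery worker and the beat scheduler are running (for example, started with: celery -A project worker -B).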
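As an alternative sketch, Celery 3.x also lets the schedule live in the settings file through the CELERYBEAT_SCHEDULE dictionary instead of a decorator. The entry label 'run-mytask-every-5-minutes' is arbitrary, and the task path 'myapp.tasks.myTask' is an assumption: it presumes myTask is registered as an ordinary task under that name.

```python
# project/project/settings.py (fragment)
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    # The key is only a human-readable label for this schedule entry.
    'run-mytask-every-5-minutes': {
        # Assumed task name; adjust it to the name the task is registered under.
        'task': 'myapp.tasks.myTask',
        # Same interval as the decorator-based example: every 5 minutes.
        'schedule': timedelta(minutes=5),
    },
}
```

Keeping the schedule in settings puts all intervals in one place, at the cost of separating the schedule from the task code.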
The second method uses a Django management command together with cron. It does not require installing any additional libraries or packages; it relies on the good ol' cron job functionality.
The first thing we will do is create a custom Django management command. To do this, add a management/commands directory to the application. Django will register a manage.py command for each Python module in that directory whose name doesn't begin with an underscore. For example:
// File Structure
- myapp/
    - __init__.py
    - models.py
    - management/
        - __init__.py
        - commands/
            - __init__.py
            - _private.py
            - mytask.py
    - tests.py
    - views.py
On Python 2, be sure to include __init__.py files in both the management and management/commands directories, as done above, or your command will not be detected.
The _private.py module will not be available as a management command.
Now we add the code in myapp/management/commands/mytask.py:
# Python Code
# myapp/management/commands/mytask.py
from django.core.management.base import BaseCommand


class Command(BaseCommand):
    help = 'Type the help text here'

    def handle(self, *args, **options):
        # Add your logic here
        # This is the task that will be run
        self.stdout.write('This was extremely simple!!!')
With this file in place, we can run the code in the handle() method of mytask.py by running the following command from the Django project root (the directory containing manage.py):
$ python manage.py mytask
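One way to keep such a command easy to test, sketched here under the assumption that the scheduled work is a cache refresh, is to put the job logic in a plain function that handle() calls. The names refresh_cache and fetch_data, and the dict used as a cache, are hypothetical stand-ins and not part of Django.

```python
import time


def refresh_cache(cache, fetch_data, now=time.time):
    """Recompute the expensive data and store it with a timestamp.

    `cache` is any dict-like object and `fetch_data` any callable that
    returns the fresh value; both are hypothetical stand-ins for the
    project's real cache and data source.
    """
    cache['data'] = fetch_data()
    cache['refreshed_at'] = now()
    return cache['data']


if __name__ == '__main__':
    demo_cache = {}
    refresh_cache(demo_cache, lambda: [1, 2, 3])
    print(demo_cache['data'])
```

Inside mytask.py, handle() could then call refresh_cache(...) and report the outcome through self.stdout.write(...). The same function could also be called from a Celery task, so both scheduling methods share one implementation.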
Cron is included by default in Ubuntu and many other Linux distributions. Windows does not ship with cron, so there you would have to install a cron implementation or use a different scheduler. Installation is not covered in this discussion.
We can edit the current user's crontab file (on Linux) by running the command:
$ crontab -e
Add the following line into the file for running the task every 5 minutes:
*/5 * * * * /path/to/virtualenv/bin/python /path/to/project/manage.py mytask
A crontab entry looks like this:
<minute[0-59]> <hour[0-23, 0=midnight]> <day[1-31]> <month[1-12]> <weekday[0-6, 0=sunday]> <command>
01 04 1 1 1 /usr/bin/somedirectory/somecommand
The above example will run /usr/bin/somedirectory/somecommand at 4:01am on January 1st plus every Monday in January.
An asterisk (*) can be used so that every instance (every hour, every weekday, every month, etc.) of a time period is used. For example, the following entry runs the command at 4:01am every day:
01 04 * * * /usr/bin/somedirectory/somecommand
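To make the field semantics concrete, here is a minimal sketch of a matcher for the five time fields of a crontab entry. It is an illustration only, not how cron itself is implemented, and the names cron_matches and _field_matches are hypothetical. It handles only '*', '*/n', single numbers, ranges and comma-separated lists.

```python
from datetime import datetime


def _field_matches(field, value, low, high):
    """Return True if one crontab field accepts `value`.

    Supports '*', '*/n', 'a', 'a-b', 'a-b/n' and comma-separated lists.
    """
    for part in field.split(','):
        step = 1
        if '/' in part:
            part, step_text = part.split('/')
            step = int(step_text)
        if part == '*':
            start, end = low, high
        elif '-' in part:
            start_text, end_text = part.split('-')
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def cron_matches(expression, moment):
    """Return True if the five time fields in `expression` match `moment`."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; Python's isoweekday() counts Sunday as 7.
    cron_weekday = moment.isoweekday() % 7
    time_ok = (_field_matches(minute, moment.minute, 0, 59)
               and _field_matches(hour, moment.hour, 0, 23)
               and _field_matches(month, moment.month, 1, 12))
    day_ok = _field_matches(day, moment.day, 1, 31)
    weekday_ok = _field_matches(weekday, cron_weekday, 0, 6)
    # When both day fields are restricted, cron runs if EITHER one matches.
    if day != '*' and weekday != '*':
        date_ok = day_ok or weekday_ok
    else:
        date_ok = day_ok and weekday_ok
    return time_ok and date_ok


if __name__ == '__main__':
    # '*/5' matches minutes 0, 5, 10, ...; a bare '5' matches only minute 5.
    print(cron_matches('*/5 * * * *', datetime(2023, 1, 1, 0, 10)))  # True
    print(cron_matches('5 * * * *', datetime(2023, 1, 1, 0, 10)))    # False
```

The last two lines show why a job meant to run every 5 minutes needs */5 in the minute field: a bare 5 fires once an hour, at five minutes past.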
We develop web applications to our customers using python/django/angular.
Contact us at hello@cowhite.com