In this article, we discuss how to add search functionality to your website using Haystack and Elasticsearch. We assume you already have a good working knowledge of the Django web framework, so let's get straight into Haystack and Elasticsearch.
Elasticsearch
Elasticsearch is an open-source, distributed, scalable search engine based on Lucene. It can power very fast searches for data discovery applications, and it exposes a RESTful API: almost any action can be performed by sending JSON over HTTP. More details on Elasticsearch can be found on its official page.
Haystack
Haystack provides modular search for Django. It features a unified, familiar API that allows you to plug in different search backends (such as Solr, Elasticsearch, Whoosh, Xapian, etc.) without having to modify your code.
Let's get into setting up and installing Elasticsearch and Haystack.
Installing Elasticsearch
Install Java 8:
Elasticsearch requires Java, so we will install that first. We will install a recent version of Oracle Java 8 because that is what Elasticsearch recommends. It should, however, work fine with OpenJDK, if you decide to go that route.
Add the Oracle Java PPA to apt:
sudo add-apt-repository -y ppa:webupd8team/java
Update your apt package database:
sudo apt-get update
Install the latest stable version of Oracle Java 8 with this command (and accept the license agreement that pops up):
sudo apt-get -y install oracle-java8-installer
Now that Java 8 is installed, let's install Elasticsearch.
Download Elasticsearch from the official website. After downloading the archive, unzip it and navigate to the bin directory. Run the elasticsearch executable there to start the server with the default configuration. To check whether the server is up, open http://127.0.0.1:9200 in your browser.
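The same check can be scripted instead of done in a browser. The sketch below is a minimal example that uses only the Python standard library. The helper names fetch_cluster_info and summarize_cluster_info are made up for illustration, and the code assumes the root URL answers with the usual JSON document containing cluster_name and version.number keys.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def fetch_cluster_info(url="http://127.0.0.1:9200/", timeout=2):
    """Return the JSON document served at the Elasticsearch root URL, or None."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a body that is not JSON.
        return None


def summarize_cluster_info(info):
    """Turn the root response (or None) into a one-line status message."""
    if info is None:
        return "Elasticsearch is not reachable"
    version = info.get("version", {}).get("number", "unknown")
    cluster = info.get("cluster_name", "unknown")
    return "Elasticsearch {} is up (cluster: {})".format(version, cluster)
```

Calling print(summarize_cluster_info(fetch_cluster_info())) from a shell or a deployment script gives a quick yes/no answer without opening a browser.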
You can also point the server at your own config file when starting it, using the following command:
elasticsearch --config=<PATH_TO_YOUR_CONFIG_FILE>/elasticsearch.yml
Elasticsearch can also be installed with a package manager by adding Elastic's package source list.
Run the following command to import the Elasticsearch public GPG key into apt:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
If your prompt seems to hang, it is likely waiting for your user's password (to authorize the sudo command). If this is the case, enter your password.
Create the Elasticsearch source list:
echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-2.x.list
Update the apt package database again:
sudo apt-get update
Install Elasticsearch with this command:
sudo apt-get -y install elasticsearch
Elasticsearch is now installed. Let's edit the configuration:
sudo nano /etc/elasticsearch/elasticsearch.yml
You will want to restrict outside access to your Elasticsearch instance (port 9200), so that outsiders can't read your data or shut down your Elasticsearch cluster through the HTTP API. Find the line that specifies network.host, uncomment it, and replace its value with "localhost" so it looks like this:
/etc/elasticsearch/elasticsearch.yml excerpt (updated)
network.host: localhost
Save and exit elasticsearch.yml.
Now, start Elasticsearch:
sudo systemctl restart elasticsearch
Then, run the following command to start Elasticsearch on boot up:
sudo systemctl daemon-reload
sudo systemctl enable elasticsearch
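You will also need to install the Elasticsearch Python bindings so that Haystack can talk to the server. Pick a client release whose major version matches your server; for the Elasticsearch 2.x packages installed above, that means:
pip install "elasticsearch>=2.0.0,<3.0.0"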
Installing Haystack
Haystack can be installed via pip.
pip install django-haystack
After installation, just add it to your installed apps.
INSTALLED_APPS = [
    # ... your other apps ...
    'haystack',
    # ...
]
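Add the following lines to the settings.py file. This example uses Haystack's original Elasticsearch backend; if you run Elasticsearch 2.x, django-haystack 2.6 and later also provide haystack.backends.elasticsearch2_backend.Elasticsearch2SearchEngine.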
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}
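A hard-coded URL is fine for local development, but you may want a different server per environment. The sketch below shows one possible way to build the same setting from environment variables. The variable names ELASTICSEARCH_URL and HAYSTACK_INDEX_NAME and the helper build_haystack_connections are assumptions made for this example, not part of Haystack; only the ENGINE, URL and INDEX_NAME keys come from the settings shown here.

```python
import os


def build_haystack_connections(environ=os.environ):
    """Build HAYSTACK_CONNECTIONS, falling back to the local defaults."""
    return {
        "default": {
            "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
            # ELASTICSEARCH_URL and HAYSTACK_INDEX_NAME are hypothetical names.
            "URL": environ.get("ELASTICSEARCH_URL", "http://127.0.0.1:9200/"),
            "INDEX_NAME": environ.get("HAYSTACK_INDEX_NAME", "haystack"),
        },
    }


# In settings.py:
HAYSTACK_CONNECTIONS = build_haystack_connections()
```

With no variables set, this produces the local configuration, so development machines need no extra setup.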
Now we need to create a SearchIndex so that Haystack knows what to search on. For this example I will use the following Django model, Musician, from my app called artist.
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

from django.db import models


# Create your models here.
class Musician(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    instrument = models.CharField(max_length=100)

    def __unicode__(self):
        return self.first_name
Now create a file called search_indexes.py in the same directory as your models.py file. In it, define MusicianIndex to tell Haystack which data from the Musician model should be stored in the search engine.
from haystack import indexes

from .models import Musician


class MusicianIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    first_name = indexes.CharField(model_attr='first_name')
    last_name = indexes.CharField(model_attr='last_name')
    instrument = indexes.CharField(model_attr='instrument')

    def get_model(self):
        return Musician

    def index_queryset(self, using=None):
        return self.get_model().objects.all()
In your main templates directory, create a file called search/indexes/artist/musician_text.txt. The path follows the pattern search/indexes/<app_label>/<model_name>_<field_name>.txt, so change it to match your own app and model. This template holds the searchable information for the text field, in this case:
{{ object.first_name }} {{ object.last_name }} {{ object.instrument }}
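To get a feel for what ends up in the text field, the sketch below mimics the template with plain Python string formatting. It does not use Django's template engine or Haystack: FakeMusician, render_document_text and would_match are hypothetical stand-ins, and the whole-word check is only a rough approximation, since the real search engine applies its own analyzers.

```python
from collections import namedtuple

# Stand-in for the Musician model; only the three fields used by the template.
FakeMusician = namedtuple("FakeMusician", ["first_name", "last_name", "instrument"])


def render_document_text(obj):
    """Approximate the rendered musician_text.txt template for one object."""
    return "{} {} {}".format(obj.first_name, obj.last_name, obj.instrument)


def would_match(obj, term):
    """Rough whole-word, case-insensitive check against the document text."""
    tokens = set(render_document_text(obj).lower().split())
    return term.lower() in tokens


musician = FakeMusician("John", "Doe", "Guitar")
document_text = render_document_text(musician)
```

Because all three fields are folded into one document, a search for a first name, a last name, or an instrument can find the same musician.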
You can include all fields or only the fields that you need to search. After this, add the Haystack URLs to the urlpatterns list in your urls.py file (url and include are imported from django.conf.urls):
url(r'^search/', include('haystack.urls')),
We have to build the index before we can search. To do that, run the following command:
python manage.py rebuild_index
To learn more about haystack commands, visit the official haystack documentation page.
Querying the Data
Now that we have the Search Index, we will see how to query that data using the haystack API. See the example code below using the SearchQuerySet class.
from haystack.query import SearchQuerySet

query = SearchQuerySet().filter(content='guitar')
The results can also be iterated over to access individual items, as shown below:
for item in query:
    first_name = item.first_name
    last_name = item.last_name
    instrument = item.instrument
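When results are passed to a template or serialized, it can be handy to copy the stored fields into plain dictionaries first. The following is a minimal sketch of that idea; results_to_dicts is a hypothetical helper, and the demo uses SimpleNamespace objects as stand-ins for real search results so that it runs without Haystack installed.

```python
from types import SimpleNamespace

FIELDS = ("first_name", "last_name", "instrument")


def results_to_dicts(results, fields=FIELDS):
    """Copy the named attributes of each result into a plain dict."""
    return [
        {name: getattr(item, name, None) for name in fields}
        for item in results
    ]


# Stand-in objects; in a real project `results` would be the query above.
fake_results = [
    SimpleNamespace(first_name="John", last_name="Doe", instrument="Guitar"),
    SimpleNamespace(first_name="Jane", last_name="Roe", instrument="Bass guitar"),
]
rows = results_to_dicts(fake_results)
```

Missing attributes come back as None rather than raising, which keeps the helper tolerant of indexes that store different fields.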
If there are multiple SearchIndex classes, we can restrict the search to specific models to speed it up, as shown below:
from haystack.query import SearchQuerySet
from artist.models import Musician  # adjust to your app's import path
query = SearchQuerySet().models(Musician).filter(content='guitar')
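In a view you will usually want to skip the query when the search box is empty and cap the number of results. The sketch below wraps those two concerns in a small function. Both search_musicians and FakeSearchQuerySet are hypothetical names for this example; the fake class imitates only filter() and slicing so the code runs without Haystack, and in a project you would pass SearchQuerySet().models(Musician) instead.

```python
def search_musicians(sqs, term, limit=10):
    """Filter `sqs` by `term`; return at most `limit` results, none if blank."""
    term = (term or "").strip()
    if not term:
        return []
    return list(sqs.filter(content=term)[:limit])


class FakeSearchQuerySet(object):
    """Tiny stand-in that imitates only filter() and slicing."""

    def __init__(self, documents):
        self.documents = list(documents)

    def filter(self, content):
        wanted = content.lower()
        return FakeSearchQuerySet(
            d for d in self.documents if wanted in d.lower().split()
        )

    def __getitem__(self, key):
        return self.documents[key]


fake = FakeSearchQuerySet(["John Doe Guitar", "Jane Roe Piano", "Sam Poe Guitar"])
guitarists = search_musicians(fake, "guitar")
```

Taking the query set as an argument keeps the function easy to test and lets the caller decide which models to search.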
For more Haystack filters and other options, see the official Haystack documentation.
We develop web applications for our customers using Python/Django/Angular.
Contact us at hello@cowhite.com