Building a Full-Text Search App Using Django, Docker and Elasticsearch

Aymane Mimouni
August 13th, 2020 · 2 min read

This article gives a brief introduction to Elasticsearch, shows how to install it, and walks through some usage examples.

Elasticsearch – some basic concepts

Elasticsearch is an open-source, distributed, near-real-time full-text search and analytics engine. It is accessed through a RESTful web service interface and stores data as schema-less JSON (JavaScript Object Notation) documents. It is written in Java, so it runs on many platforms, and it lets users explore very large amounts of data at very high speed.

There are some Elasticsearch basics that, once you've internalized them, make the learning curve less traumatic. I've put together four of the most important concepts:

Fields: the smallest individual unit of data in Elasticsearch. Each field has a defined data type: either a core type (strings, numbers, dates, booleans) or a complex type (object and nested).

Index: a collection of documents and their properties. It can be compared to a database in the world of relational databases.

Documents: a collection of fields expressed in JSON. Every document resides inside an index (in Elasticsearch 7.x, mapping types are deprecated, so an index effectively holds a single document type). In the world of relational databases, a document can be compared to a row in a table.

Mapping: the definition of how a document and its fields are stored and indexed, including the data type of each field. Again, it's like a schema in the world of relational databases.
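To make these four concepts concrete, here is a minimal sketch in plain Python. The index name, field names, and values are made up for illustration (they loosely mirror the blog-post example used later in this article), and the dictionaries only imitate the JSON that Elasticsearch would store. Nothing here talks to a cluster.

```python
import json

# Hypothetical index name, comparable to a database name.
INDEX_NAME = "posts"

# A mapping: declares each field and its data type (the "schema").
POSTS_MAPPING = {
    "properties": {
        "title": {"type": "text"},        # core type: string
        "likes": {"type": "integer"},     # core type: number
        "created_at": {"type": "date"},   # core type: date
        "draft": {"type": "boolean"},     # core type: boolean
        "user": {                         # complex type: object
            "type": "object",
            "properties": {"username": {"type": "text"}},
        },
    }
}

# A document: one JSON object made of fields (the "row").
DOCUMENT = {
    "title": "Full-text search with Django",
    "likes": 3,
    "created_at": "2020-08-13T10:00:00",
    "draft": False,
    "user": {"username": "alice"},
}


def unmapped_fields(document, mapping):
    """Return the document's top-level fields that the mapping does not declare."""
    return sorted(set(document) - set(mapping["properties"]))


if __name__ == "__main__":
    print(INDEX_NAME)
    print(json.dumps(DOCUMENT, indent=2))
    print(unmapped_fields(DOCUMENT, POSTS_MAPPING))
```

The helper is only there to show the relationship between a document and its mapping. By default, Elasticsearch itself adds unmapped fields to the mapping dynamically instead of rejecting them.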

⚠️ It is worth mentioning that Elasticsearch is not designed to be used as a primary database. It's best to run it as an additional service in your project, next to PostgreSQL, MySQL, or another database.

Using Elasticsearch with Django

Many tutorials use Django-Haystack, which is widely used in the Django community as a modular search layer for plugging in Elasticsearch (or another search engine such as Solr, Whoosh, or Xapian). It needs minimal configuration, and its query syntax is similar to Django's ORM.

I used it with Solr in a recent project and was impressed with how easy it was to implement. I loved it, but I will not use it in this article, because I think Elasticsearch itself is simple to use.

I will be using Docker to run Elasticsearch.

Here is the source code I will be using, so you can see exactly what is going on.

Let us begin.

Elasticsearch instance

Edit docker-compose.yml in your project directory to add an ES service:

es:
  image: elasticsearch:7.8.1
  environment:
    - discovery.type=single-node
  ports:
    - "9200:9200"

Then add es to the services that the Django app service depends on, and add ELASTICSEARCH_DSL_HOSTS=es:9200 to docker-compose.env:

web:
  build: .
  command: python /code/manage.py runserver 0.0.0.0:8000
  volumes:
    - .:/code
  ports:
    - 8000:8000
  env_file:
    - docker-compose.env
  depends_on:
    - db
    - es  # <--- here

Now run docker-compose up -d --build

You can check if it works correctly via curl:

curl -X GET localhost:9200/_cluster/health
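The same check can be sketched in Python using only the standard library. The function names and the default URL below are assumptions for illustration. The only thing taken from Elasticsearch itself is the /_cluster/health endpoint and its "status" field, which reports "green", "yellow", or "red".

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200", timeout=5):
    """GET <base_url>/_cluster/health and return the parsed JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_usable(health):
    """Treat "green" and "yellow" as usable; "red" or a missing status is not."""
    return health.get("status") in ("green", "yellow")


# Example (requires the es container to be running):
#     health = fetch_cluster_health()
#     print(health["status"], is_usable(health))
```

Accepting "yellow" is a deliberate choice in this sketch: a single-node cluster commonly reports yellow when an index asks for replicas that cannot be allocated, and it is still usable for development.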

Set up Elasticsearch

Let's install Django Elasticsearch DSL. Use your favorite Python package manager to install the app from PyPI; I use pipenv.

pipenv install django-elasticsearch-dsl

As with most Django applications, you should add django_elasticsearch_dsl to the INSTALLED_APPS within your settings file:

INSTALLED_APPS = [
    ...
    'django_elasticsearch_dsl',
    ...
]

You must then define ELASTICSEARCH_DSL in your Django settings.

import os

# Elasticsearch
ELASTICSEARCH_DSL = {
    'default': {
        # Read the host from the environment; fall back to a local instance.
        'hosts': os.getenv("ELASTICSEARCH_DSL_HOSTS", 'localhost:9200')
    },
}
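If you ever need more than one node, the underlying Elasticsearch client also accepts a list of hosts. The helper below is a hypothetical sketch, not part of django-elasticsearch-dsl: it assumes you choose to store several hosts in ELASTICSEARCH_DSL_HOSTS as a comma-separated string.

```python
import os


def parse_hosts(raw, default="localhost:9200"):
    """Split a comma-separated host string into a list of "host:port" entries."""
    hosts = [part.strip() for part in (raw or "").split(",") if part.strip()]
    # Fall back to the default when the variable is unset or empty.
    return hosts or [default]


# Hypothetical usage in settings.py:
ELASTICSEARCH_DSL = {
    "default": {
        "hosts": parse_hosts(os.getenv("ELASTICSEARCH_DSL_HOSTS")),
    },
}
```

With the single value used in this article (es:9200) the result is simply a one-element list, so the sketch stays compatible with the docker-compose.env shown earlier.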

Index data into Elasticsearch

Let’s consider the following model:

from django.contrib.auth import get_user_model
from django.db import models
from django.utils import timezone

User = get_user_model()


class Post(models.Model):
    title = models.CharField(max_length=128)
    content = models.CharField(max_length=5000)
    created_at = models.DateTimeField(default=timezone.now)
    likes = models.PositiveIntegerField(default=0)
    slug = models.SlugField(max_length=128, db_index=True, null=True)
    draft = models.BooleanField(default=True)

    user = models.ForeignKey(
        User,
        related_name='posts',
        on_delete=models.CASCADE
    )

    def __str__(self):
        return self.title

    class Meta:
        app_label = 'posts'

Then we should run the migrations:

docker-compose run web python manage.py makemigrations
docker-compose run web python manage.py migrate

Now let's define the Elasticsearch index. To do so, define a Document class in documents.py in your app directory.

from django.contrib.auth import get_user_model
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry
from .models import Post, Reply  # Reply is not shown in this article

User = get_user_model()


@registry.register_document
class PostDocument(Document):
    # Index the related user as an object with two sub-fields.
    user = fields.ObjectField(properties={
        'id': fields.IntegerField(),
        'username': fields.TextField(),
    })

    class Index:
        name = 'posts'
        settings = {'number_of_shards': 1,
                    'number_of_replicas': 0}

    class Django:
        model = Post

        # Model fields to copy into the index as-is.
        fields = [
            'title',
            'content',
            'created_at',
            'likes',
            'draft',
            'slug',
        ]

    def get_queryset(self):
        # Load the related user in the same query when indexing.
        return super().get_queryset().select_related(
            'user'
        )

    def get_instances_from_related(self, related_instance):
        # Return the post(s) to re-index when a related object changes.
        if isinstance(related_instance, User):
            return related_instance.posts.all()
        elif isinstance(related_instance, Reply):
            return related_instance.post
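To get a feel for what ends up in the posts index, here is a rough, standard-library-only sketch of the JSON body for a single post. It is not the library's actual serialization code: the to_search_document function and the stand-in objects are hypothetical, and the real output may differ in details such as date formatting.

```python
from datetime import datetime
from types import SimpleNamespace


def to_search_document(post):
    """Build a dict shaped like the fields declared on PostDocument."""
    return {
        "title": post.title,
        "content": post.content,
        "created_at": post.created_at.isoformat(),
        "likes": post.likes,
        "draft": post.draft,
        "slug": post.slug,
        # ObjectField: only the declared sub-fields are copied.
        "user": {"id": post.user.id, "username": post.user.username},
    }


# Stand-ins for a Post instance and its related User (illustrative values).
example_post = SimpleNamespace(
    title="Hello search",
    content="Some content",
    created_at=datetime(2020, 8, 13, 10, 0, 0),
    likes=2,
    draft=False,
    slug="hello-search",
    user=SimpleNamespace(id=1, username="alice", email="alice@example.com"),
)

example_document = to_search_document(example_post)
```

Note that the user's email is not part of the result, because the ObjectField on PostDocument declares only id and username.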

Examples of usage

To populate the database with some content, I wrote a management command. Just run:

docker-compose run web python manage.py load_posts 20

Now, let's hop into the interactive Python shell (docker-compose run web python manage.py shell) and play around with Elasticsearch queries:

>>> from posts.documents import PostDocument
>>> posts = PostDocument.search()
>>> for hit in posts:
...     print(hit.title)
...
Design half three bar quickly material center.
Author true left. Position entire someone study be.
School draw individual sell produce brother.
Truth drug compare TV modern.
Expert apply baby reveal team along.
Beautiful for suddenly half.
Plant argue enough less order receive sing.
Store economy offer decision industry.
Beat chair affect assume score occur include laugh.
Language poor cell fish worry ready industry use.
>>>

Next, here is a list of usage examples you may need:

from elasticsearch_dsl import Q
from posts.documents import PostDocument

search = PostDocument.search()

# Filter by single field equal to a value
search = search.query('match', draft=False)

# Filter by single field containing a value
search = search.filter('match_phrase', title="value")

# Add the query to the Search object
q = Q("multi_match", query='python django', fields=['title', 'content'])
search = search.query(q)

# Query combination
or_q = Q("match", title='python') | Q("match", title='django')
and_q = Q("match", title='python') & Q("match", title='django')
search = search.query(or_q)

# Exclude items from your query
search = search.exclude('match', draft=True)

# Filter documents that contain terms within a provided range.
# eg: the posts created for the past day
search = search.filter('range', created_at={"gte": "now-1d"})

# Ordering
# prefixed by the - sign to specify a descending order.
search = search.sort('-likes', 'created_at')
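Behind the scenes, each of these calls adds a clause to a JSON request body. The sketch below builds such a body by hand with plain dictionaries, for a multi_match query combined with a range filter and a sort. The function names are hypothetical, and the body the library really generates may differ in detail; calling search.to_dict() on a Search object shows the real one.

```python
import json


def to_sort_clause(name):
    """Turn "-likes" into a descending clause and "created_at" into an ascending one."""
    if name.startswith("-"):
        return {name[1:]: {"order": "desc"}}
    return {name: {"order": "asc"}}


def build_search_body(text, fields, created_since, sort):
    """Assemble a raw Elasticsearch request body as a plain dict."""
    return {
        "query": {
            "bool": {
                # Scored full-text clause, like search.query(Q("multi_match", ...)).
                "must": [{"multi_match": {"query": text, "fields": fields}}],
                # Non-scoring clause, like search.filter("range", ...).
                "filter": [{"range": {"created_at": {"gte": created_since}}}],
            }
        },
        "sort": [to_sort_clause(name) for name in sort],
    }


body = build_search_body(
    text="python django",
    fields=["title", "content"],
    created_since="now-1d",
    sort=["-likes", "created_at"],
)

if __name__ == "__main__":
    print(json.dumps(body, indent=2))
```

Seeing the raw body makes the difference between query and filter easier to remember: clauses under "must" contribute to the relevance score, while clauses under "filter" only include or exclude documents.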

Quick quiz for you 😄 you can submit your answer in the comments 👇

How would you get the published posts created in the past week that contain the word 'use' in their title or content?

That's it!

To be honest, this got quite long. If you were patient enough to read it in full and found it interesting, please share it.
