Now that we have activated the PostgreSQL database, we can integrate PostgreSQL's full-text search into our project by adding django.contrib.postgres to INSTALLED_APPS. This module provides several search features.
# blog_project/settings.py
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.sites",
    "django.contrib.sitemaps",
    "django.contrib.staticfiles",
    "django.contrib.postgres",  # enables the PostgreSQL search features
]
Simple search lookups
With django.contrib.postgres installed, the search lookup is already enough for a simple search. I will show you an example in the Python shell:
In [1]: from blog.models import Post
In [2]: Post.objects.filter(title__search='django')
Out[2]: <QuerySet [<Post: Django REST Framework (DRF) - permissions>, <Post: Django REST Framework (DRF) - ModelViewSets>, <Post: What is the difference between the template filter: `|linebreaks` and `|linebreaksbr` in Django Template?>]>
Building a search view
To integrate a full-text search into my blog project, I created a form with a single field called query. This form lets a user search for blog articles by entering a word or a longer phrase.
# blog/forms.py
from django import forms
class SearchForm(forms.Form):
    query = forms.CharField()
In our view, we run a search with the word or phrase entered by the user. We will use the SearchVector class to search the title and body of the Post instances.
# blog/views.py
from django.contrib.postgres.search import SearchVector
from django.shortcuts import render
from blog.forms import SearchForm
from blog.models import Post
def post_search(request):
    form = SearchForm()
    query = None
    results = []
    if "query" in request.GET:
        form = SearchForm(request.GET)
        if form.is_valid():
            query = form.cleaned_data["query"]
            results = Post.published.annotate(
                search=SearchVector("title", "body"),
            ).filter(search=query)
    return render(
        request, "post/search.html", {"form": form, "query": query, "results": results}
    )
Legend:
- form = SearchForm(): Instantiates the SearchForm.
- if "query" in request.GET: If the "query" parameter is present in the GET data of the request, we check that the form is valid and store the search results in the results variable.
- SearchVector(): This class lets you search across multiple model fields, and even across relationships such as ForeignKey or ManyToMany.

Now we need to create an HTML template to display the form on the website. The template contains several conditions. If no search has been performed or the page has just loaded, the template will display the initial form for the user to perform a search.
If the user has already searched for a word or phrase, it will either display the results of the search found, or it will display the message "No results found".
<!-- blog/templates/post/search.html -->
{% extends "base.html" %}
{% load blog_tags %}
{% block title %} Search {% endblock %}
{% block content %}
{% if query %}
<h1>Posts containing "{{ query }}"</h1>
<h3>
{% with results.count as total_results %}
Found {{ total_results }} result{{ total_results|pluralize }}
{% endwith %}
</h3>
{% for post in results %}
<h4>
<a href="{{ post.get_absolute_url }}">
{{ post.title }}
</a>
</h4>
{{ post.body|markdown|truncatewords_html:12 }}
{% empty %}
<p>There are no results for your search.</p>
{% endfor %}
<hr>
<p><a href="{% url 'blog:post_search' %}">Search again</a></p>
{% else %}
<h1>Search for posts</h1>
<form method="get">
{{ form.as_p }}
<input type="submit" value="Search">
</form>
{% endif %}
{% endblock %}
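To make the branching between the view and the template easier to follow, here is a minimal plain-Python sketch of the three possible page states. It does not use Django; the function name page_state and the state labels are my own, and urllib.parse.parse_qs only stands in for request.GET.

```python
from urllib.parse import parse_qs


def page_state(query_string, results):
    """Return which branch of the search template would be rendered."""
    # keep_blank_values=True keeps "query=" in the parsed result, similar to
    # how Django's request.GET keeps keys submitted with an empty value.
    params = parse_qs(query_string, keep_blank_values=True)
    if "query" not in params:
        return "form"  # page just loaded, no search performed yet
    query = params["query"][0].strip()
    if not query:
        return "form"  # empty submission: the form is invalid, query stays None
    return "results" if results else "no results"


print(page_state("", []))
print(page_state("query=django", ["Django REST Framework (DRF) - permissions"]))
print(page_state("query=flask", []))
```

In the real view, the same decision is spread over `"query" in request.GET`, `form.is_valid()` and the `{% if query %}` / `{% empty %}` tags in the template.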
Don't forget to create a URL pattern for the search view.
# blog/urls.py
from django.urls import path
from blog import views
app_name = "blog"
urlpatterns = [
    # ... other URL patterns (truncated in the source; the last one ends with name="post_feed") ...
    path("search/", views.post_search, name="post_search"),
]
Stemming and ranking results
Stemming is reducing a word to its stem or base form. Search engines use stemming when indexing words, so that they can find words with inflections or derivatives. To translate terms into a search query object, Django has a SearchQuery class. The PostgreSQL search engine also removes stop words, which are very common words in a language (such as "the" or "and" in English) that carry little meaning for a search.
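To get a feeling for what stemming and stop-word removal do, here is a toy sketch in plain Python. It is only an illustration: the stop-word set and the suffix list are made up by me, and PostgreSQL relies on proper stemming dictionaries rather than this naive suffix stripping.

```python
# Toy data, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "in", "of", "to", "and"}
SUFFIXES = ("ing", "es", "s")  # checked in this order


def toy_stem(word):
    """Strip the first matching suffix, if a stem of 3+ letters remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Lowercase the text, drop stop words and stem the remaining words."""
    words = text.lower().split()
    return [toy_stem(word) for word in words if word not in STOP_WORDS]


print(normalize("Searching the posts"))
print(normalize("search post"))
```

Both calls end up with the same list of terms, which is why a search for "search post" can match a text that contains "Searching the posts".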
To rank the results by relevance, PostgreSQL provides a ranking function that orders the results based on how often the query terms occur and how close they are to each other.
# blog/views.py
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from django.shortcuts import render
from blog.forms import SearchForm
from blog.models import Post
def post_search(request):
    form = SearchForm()
    query = None
    results = []
    if "query" in request.GET:
        form = SearchForm(request.GET)
        if form.is_valid():
            query = form.cleaned_data["query"]
            search_vector = SearchVector("title", "body")
            search_query = SearchQuery(query)
            results = (
                Post.published.annotate(
                    search=search_vector, rank=SearchRank(search_vector, search_query)
                )
                # Filter with the same SearchQuery object that is used for ranking.
                .filter(search=search_query)
                .order_by("-rank")
            )
    return render(
        request, "post/search.html", {"form": form, "query": query, "results": results}
    )
Legend:
- SearchQuery(): Translates the user-supplied text into a search query object. By default, the words are run through a stemming algorithm, and the search then looks for matches for all of the resulting terms.
- SearchRank(): Ranks the results by relevance. This takes into account how often the query terms occur in the title or body, how close the terms are to each other, and the importance of the part where they occur.

Stemming and removing stop words in different languages
Stemming and stop-word removal can be configured for other languages. Both SearchVector and SearchQuery accept a config argument that selects a different search configuration.
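Why does the configuration matter? The following toy sketch in plain Python shows how the same text produces different terms depending on the language. The two stop-word sets are tiny samples I picked for the illustration and are not PostgreSQL's actual dictionaries.

```python
# Tiny, hand-picked stop-word samples; not PostgreSQL's real dictionaries.
STOP_WORDS = {
    "english": {"the", "a", "of", "and", "in"},
    "spanish": {"el", "la", "de", "y", "en"},
}


def tokens(text, config):
    """Lowercase the text and drop the stop words of the chosen configuration."""
    return [word for word in text.lower().split() if word not in STOP_WORDS[config]]


text = "La historia de Django"
print(tokens(text, "spanish"))
print(tokens(text, "english"))
```

With the Spanish sample, "la" and "de" are dropped; with the English sample they are kept as if they were meaningful terms. This is the reason to pass the same config to both the vector and the query.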
# blog/views.py
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from django.shortcuts import render
from blog.forms import SearchForm
from blog.models import Post
def post_search(request):
    form = SearchForm()
    query = None
    results = []
    if "query" in request.GET:
        form = SearchForm(request.GET)
        if form.is_valid():
            query = form.cleaned_data["query"]
            # Use the same search configuration for the vector and the query.
            search_vector = SearchVector("title", "body", config="spanish")
            search_query = SearchQuery(query, config="spanish")
            results = (
                Post.published.annotate(
                    search=search_vector, rank=SearchRank(search_vector, search_query)
                )
                .filter(search=search_query)
                .order_by("-rank")
            )
    return render(
        request, "post/search.html", {"form": form, "query": query, "results": results}
    )
Legend:
- config="spanish": Selects the Spanish search configuration, so that PostgreSQL uses the parser and dictionaries (stemming rules and stop words) of that language.

Weighting queries
We can also weight the search vectors, so that posts matching in the title rank higher than posts matching only in the body.
# blog/views.py
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from django.shortcuts import render
from blog.forms import SearchForm
from blog.models import Post
def post_search(request):
    form = SearchForm()
    query = None
    results = []
    if "query" in request.GET:
        form = SearchForm(request.GET)
        if form.is_valid():
            query = form.cleaned_data["query"]
            # Title matches (weight A) count more than body matches (weight B).
            search_vector = SearchVector("title", weight="A") + SearchVector(
                "body", weight="B"
            )
            search_query = SearchQuery(query)
            results = (
                Post.published.annotate(
                    search=search_vector, rank=SearchRank(search_vector, search_query)
                )
                # Keep only results with a rank of at least 0.3.
                .filter(rank__gte=0.3)
                .order_by("-rank")
            )
    return render(
        request, "post/search.html", {"form": form, "query": query, "results": results}
    )
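Legend:
- weight="A": This argument sets how much a match in that vector counts towards the rank. The weights are the letters D, C, B and A, which by default correspond to the values 0.1, 0.2, 0.4 and 1.0, so a match in the title counts more than a match in the body.
- rank__gte=0.3: Keeps only the results whose rank is at least 0.3.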
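To illustrate the effect of the weights, here is a deliberately simplified sketch in plain Python. The function toy_rank and the two sample posts are made up by me, and the score is not PostgreSQL's real ranking formula: it only adds up the default weight of each field that contains the term.

```python
# PostgreSQL's default values for the weight labels D, C, B and A.
WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}


def toy_rank(post, term, field_labels):
    """Sum the weight of every field whose text contains the term."""
    term = term.lower()
    return sum(
        WEIGHTS[label]
        for field, label in field_labels.items()
        if term in post[field].lower().split()
    )


# Hypothetical sample posts, only for this illustration.
posts = [
    {"title": "Deploying with Docker", "body": "Notes on running Django in containers"},
    {"title": "Django signals", "body": "How receivers are connected"},
]
labels = {"title": "A", "body": "B"}
ranked = sorted(posts, key=lambda post: toy_rank(post, "django", labels), reverse=True)
print([post["title"] for post in ranked])
```

The post with "Django" in its title comes first, even though the other post also mentions the term in its body.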