Scaling Django Applications

In today's competitive market, a business's potential for success lies in the scalability of its services. As technology improves, customers have ever less patience when browsing the internet. Let's face it: nobody likes to wait. For every second it takes your application to complete a request, your chance of losing a customer increases. Nearly 60 percent of customers will abandon a website if they experience more than three seconds of load time, and the majority will not return. How can businesses avoid this hurdle? Scalability!

What is Scalability? Scalability is the ability of a system to handle an increasing amount of work or traffic without compromising its performance or availability. A scalable system should be able to handle a growing number of users or requests without affecting the user experience or requiring a significant increase in resources. For example, a system may be said to be scalable if it can handle a 100% increase in the number of users or requests without any hitch or degradation in performance.

Why is scalability so important in web applications? Let's say you built an e-commerce application that has about 120 visitors per day, with an average API response time of 1.5 seconds. At this point, you might say, "Oh yeah, the performance of my application is great." It would only take one very successful marketing campaign to shatter that feeling and keep you on your toes trying to fix an imaginary bug. Performance and scalability aren't the same. Performance is the raw speed of your application, i.e. how many milliseconds it takes to process a request. Scalability, on the other hand, is keeping that performance stable when there's an increase in requests or workload.

Now that we've demystified scalability, let's discuss Django and the different ways we can scale Django applications to handle increasing workloads. Django is a popular Python web framework. It is used at big organisations like Disqus, Instagram, Pinterest, etc. These organisations handle a large amount of traffic every day without degradation in performance. How the heck can we achieve this?

Caching

Caching is storing frequently accessed data in memory, which can help reduce the load on the database and improve the response time of the application. Let's face it: database queries take time, especially when there are lots of connections or heavy load involved. Caching reduces how often we hit the database, which improves our response time. Most web applications are data-driven. This means that their main purpose is to present data retrieved from a database or an external resource to the users. It takes time to retrieve, assemble and visualize that data. Without caching, that process is repeated on every client request.

Django has its own cache framework which makes applications more cache friendly. There are different caching tools for Django, such as Memcached and Redis. My favourite tool for caching is Varnish. Varnish is a caching HTTP reverse proxy that sits in front of an HTTP server and caches the whole response. Once installed and started, it mimics the server that sits behind it. At that point the cached response is served to the user without hitting the Django server at all. Awesome, right? Without proper consideration and configuration, caching can be a double-edged sword, as we need to make sure we are not returning stale data to the users.
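As a minimal sketch of Django's cache framework (Varnish sits outside Django entirely), the snippet below configures Redis as the cache backend and caches a single view for 15 minutes. The Redis URL and the Product model are assumptions for illustration; adjust them to your project.

# settings.py -- assumes a local Redis instance and Django 4.0+; change LOCATION to match your setup
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
    }
}

# views.py
from django.http import JsonResponse
from django.views.decorators.cache import cache_page
from myshop.models import Product  # hypothetical app and model

@cache_page(60 * 15)  # cache this response for 15 minutes
def product_list(request):
    data = list(Product.objects.values("id", "name", "price"))
    return JsonResponse(data, safe=False)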

Caching is good, but it's not a way to compensate for a poorly performing system. You still need to optimize your code using best practices, and then caching will do the rest.

Database and Code Optimization

Optimizing the database can help in improving the performance of the application. Techniques like indexing, query optimization and database sharding can help in scaling the database. In Django, using multiple databases can also help improve performance, scalability and fault tolerance. Likewise, writing clean, well-structured, efficient code helps reduce the amount of CPU and memory resources required to handle requests.

Django offers optimization helpers like select_related and prefetch_related, which are designed to stop the deluge of database queries caused by accessing related objects. Both methods are similar but work in different ways. select_related works by creating an SQL join and including the fields of the related object in the SELECT statement; as a result, it fetches the related objects in the same database query. prefetch_related does a separate lookup for each relationship and performs the "joining" in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done with select_related. The example below illustrates this.

# Hits the database to fetch the user
user = UserAccount.objects.get(id=2)
# Hits the database again to fetch the related Wallet object
wallet = user.wallet

# Hits the database to fetch the user
user = UserAccount.objects.select_related("wallet").get(id=2)
# Doesn't hit the database again, because user.wallet was prepopulated by the previous query
wallet = user.wallet
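For completeness, here is a sketch of prefetch_related, assuming UserAccount has a many-to-many (or reverse foreign key) relationship called orders; the relationship name is an assumption for illustration.

# Without prefetch_related: one query for the users, plus one query per user for their orders
users = UserAccount.objects.all()
for user in users:
    print(user.orders.all())  # hits the database on every iteration

# With prefetch_related: one query for the users and one for all related orders,
# stitched together in Python
users = UserAccount.objects.prefetch_related("orders")
for user in users:
    print(user.orders.all())  # served from the prefetch cache, no extra queries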

When inserting or updating large amounts of data, bulk queries are preferable. bulk_create and bulk_update perform multiple insert or update operations in a single query. Another way to reduce response time is to remove middleware that your application is not benefiting from. Each request made to Django passes through every middleware, adding extra overhead, so consider inspecting the MIDDLEWARE setting and removing anything redundant.
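Here is a small sketch of those bulk operations, reusing the hypothetical Wallet model from the earlier example and assuming a backend (such as PostgreSQL) that sets primary keys during bulk_create:

# A single bulk INSERT instead of a thousand separate queries
wallets = [Wallet(user_id=i, balance=0) for i in range(1, 1001)]
wallets = Wallet.objects.bulk_create(wallets)

# A single bulk UPDATE to credit every wallet
for wallet in wallets:
    wallet.balance += 10
Wallet.objects.bulk_update(wallets, ["balance"])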

Consider setting the CONN_MAX_AGE parameter in your Django settings. CONN_MAX_AGE defines the maximum lifetime of your database connections. By default, Django opens a database connection when it first makes a database query. This connection is kept open and reused in subsequent requests. Django closes the connection when it's no longer usable or once it exceeds the maximum age defined by CONN_MAX_AGE. Persistent connections avoid the overhead of re-establishing a connection to the database in each request. The default value of CONN_MAX_AGE is 0. To enable persistent connections, set CONN_MAX_AGE to a positive number of seconds. For unlimited persistent connections, set it to None.
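As a quick illustration (the engine and database name are placeholders):

# settings.py
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # hypothetical database name
        "CONN_MAX_AGE": 600,  # keep connections alive for up to 10 minutes; None means forever
    }
}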

Django REST Framework serializers are great for providing a standard interface for your output representations while handling input validation and so on. For simple cases, though, they are not always necessary. A plain .values() call in your database queries might be considered for performance-critical views. In 2013, Tom Christie, the creator of Django REST Framework, wrote an article on the performance of Django REST Framework and APIs, which showed that DRF serializers account for about 12% of the total time spent processing a request.
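A minimal sketch of the .values() approach for a read-only, performance-critical endpoint, again using the hypothetical Wallet model:

from django.http import JsonResponse

def wallet_list(request):
    # .values() returns plain dictionaries straight from the database,
    # skipping model instantiation and serializer overhead
    wallets = Wallet.objects.values("id", "user_id", "balance")
    return JsonResponse(list(wallets), safe=False)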

In general, you can use profiling tools such as Django Debug Toolbar or Django Silk to identify the slowest parts of your code and optimize them.

Asynchronous Processing

Asynchronous processing can help handle a large number of requests by processing them in the background. This can be achieved using libraries like Celery or RQ, or Python/Django's built-in async support. Async support was introduced in Django 3.0 to enable asynchronous programming. Asynchronous programming is a paradigm that allows an application to perform multiple tasks concurrently, running them in the background without blocking the main thread.
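As a hedged sketch of offloading work with Celery (the task, email addresses and project wiring are assumptions; Celery also needs a broker such as Redis or RabbitMQ configured separately):

# tasks.py
from celery import shared_task
from django.core.mail import send_mail

@shared_task
def send_receipt_email(user_email, order_id):
    # Runs on a Celery worker, not in the request/response cycle
    send_mail(
        subject=f"Receipt for order {order_id}",
        message="Thanks for your purchase!",
        from_email="shop@example.com",
        recipient_list=[user_email],
    )

# Somewhere in a view: the request returns immediately, the email is sent in the background
# send_receipt_email.delay("customer@example.com", 42)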

Django's async support is built on top of Python's asyncio library, which provides a framework for writing asynchronous code via an event loop. With Django's async support, you can write view functions, middleware, and other parts of your application in an asynchronous manner. To use it, point the ASGI_APPLICATION setting at an ASGI application that supports async views, and run your application with an ASGI server such as Daphne or Uvicorn. Async views will still work under WSGI, but with performance penalties and without the ability to serve efficient long-running requests.
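A minimal sketch of that wiring, assuming a project called myproject (recent versions of startproject generate the asgi.py file for you):

# settings.py
ASGI_APPLICATION = "myproject.asgi.application"

# myproject/asgi.py
import os
from django.core.asgi import get_asgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
application = get_asgi_application()

# Run with an ASGI server, e.g.:  uvicorn myproject.asgi:application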

Once async support is enabled, you can write views that are marked as asynchronous by using the async def syntax. At the time of this writing, some parts of Django (4.1), such as the ORM, are still missing full async support. As a workaround, to avoid performance issues in your async code you can use the sync_to_async adapter to interact with the synchronous parts of Django.
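A short sketch of an async view that wraps a synchronous ORM call with sync_to_async, reusing the hypothetical Wallet model from the earlier examples:

from asgiref.sync import sync_to_async
from django.http import JsonResponse

async def wallet_balance(request, user_id):
    # The ORM call is synchronous, so wrap it before awaiting it
    get_wallet = sync_to_async(Wallet.objects.get)
    wallet = await get_wallet(user_id=user_id)
    return JsonResponse({"balance": wallet.balance})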

Infrastructural Scaling

We can't discuss scaling without mentioning the infrastructure. Scaling the infrastructure adds a very significant improvement to overall system scalability. There are two types of infrastructural scaling: vertical scaling and horizontal scaling.

Vertical scaling involves adding more resources to the existing server, such as increasing the RAM, CPU or disk space. Vertical scaling helps in handling more requests and improving the performance of the application. This might be enough initially, before you need to scale horizontally.

Horizontal scaling involves adding more servers to distribute the workload. Distributing the workload across multiple servers can be achieved by load balancing and clustering, which helps in handling more traffic and improving the application's availability. There are several load balancing solutions available for Django applications, including Apache, Nginx, and HAProxy. You can use a cloud provider such as AWS, Google Cloud, or Azure to spin up additional instances of your application server.

Another way to achieve horizontal scaling is to use a scalable architectural pattern such as microservices. Splitting an application into individual services lets you scale each one independently as load increases. Before choosing a microservice architecture, you need to weigh the complexity of managing the services and the extra cost of communication between them.

Conclusion

We've talked about different ways to scale a Django application, such as caching, infrastructural scaling, asynchronous processing, and database and code optimization. While caching and infrastructural scaling might have the highest impact in achieving scalability, it's worth noting that poorly written code and configuration can do great damage to your application's performance. As you make changes to your codebase, keep monitoring your application to make sure those changes have little or no negative impact on performance. Make use of profiling tools like Django Debug Toolbar and Django Silk to profile your app. Keep coding, keep improving, keep scaling! Thanks for your time. See you in the next article!

