How We Build Analytics and ML Platforms on Django

If you have searched for how to build a production-scale data visualization, analytics, and machine learning platform on Django, you have probably noticed that the advice splits into two unhelpful camps: pure Django tutorials that never touch model serving, and ML blog posts that assume you will rewrite everything in a notebook. The interesting system sits between them. This post describes the architecture we reach for when a team already runs its product on Django and now needs analytics and ML on the same data.

The short version: keep Django where it is good (the admin, data management, auth, and the relational core) and add a small number of focused services around it rather than bending Django into roles it dislikes.

The four roles

We split the platform into four responsibilities, each mapped to the tool that fits it.

Django owns data management, the admin, user and tenant boundaries, and the canonical schema. The Django admin alone is a reason to keep Django at the center, because it gives your internal team a usable interface over the data without building one.
FastAPI owns the inference API. Model serving is latency-sensitive and concurrency-heavy, and FastAPI's async request handling and Pydantic validation suit it better than a synchronous Django view.
Celery owns the asynchronous jobs: feature computation, batch scoring, retraining, and report generation. Anything that takes longer than a request should take becomes a Celery task.
PostgreSQL owns durable state, and with the right extensions it also covers time-series and vector search so you do not add a second datastore prematurely.

The reason to separate inference into FastAPI rather than serve models from a Django view is operational, not ideological. The inference service has a different scaling profile, different dependencies (the ML libraries and model weights), and a different deploy cadence than your product code. Splitting it means a model rollout does not redeploy your admin, and a heavy inference load does not starve your request workers. If you want to read how we run the API tier itself, we wrote up our approach to FastAPI development services separately.

Image: close-up of a circuit board and processor. Inference, background jobs, and the web app have different appetites for CPU and memory, so we give each one its own service to scale.

Data management stays in Django

Django defines the schema, the migrations, and the access patterns for the operational data. The ML side reads from the same PostgreSQL instance through read-optimized queries or replicas, but it does not own the schema. This keeps one source of truth and avoids the drift you get when an analytics team forks the data model.

A typical serializer for exposing a feature record through the Django REST Framework looks like this:

# serializers.py
from rest_framework import serializers
from .models import PredictionRequest


class PredictionRequestSerializer(serializers.ModelSerializer):
    # Free-form feature payload; its shape is checked in validate_features().
    features = serializers.JSONField()

    class Meta:
        model = PredictionRequest
        fields = ["id", "tenant", "features", "created_at"]
        read_only_fields = ["id", "created_at"]

    def validate_features(self, value):
        # JSONField accepts any JSON value, so reject lists, strings, and numbers.
        if not isinstance(value, dict):
            raise serializers.ValidationError("features must be an object")
        return value

DRF here is for the operational endpoints: creating records, listing history, and the admin-facing API. The hot inference path goes to FastAPI, not through DRF, precisely because that path needs different performance characteristics.

Celery for the asynchronous work

Anything that runs longer than a web request tolerates belongs in Celery: nightly retraining, batch scoring of a new dataset, recomputing aggregates for a dashboard. A scoring task reads a batch, calls the model, and writes results back to PostgreSQL.

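# tasks.py
from celery import shared_task
from .models import PredictionRequest, PredictionResult
from .ml import load_model

CHUNK_SIZE = 500


@shared_task(bind=True, max_retries=3)
def score_batch(self, batch_id):
    model = load_model()
    requests = PredictionRequest.objects.filter(batch_id=batch_id)

    scored = 0
    results = []
    for req in requests.iterator(chunk_size=CHUNK_SIZE):
        score = model.predict(req.features)
        results.append(PredictionResult(request=req, score=score))
        # Flush each full chunk so the list never holds the whole batch.
        if len(results) >= CHUNK_SIZE:
            PredictionResult.objects.bulk_create(results, batch_size=CHUNK_SIZE)
            scored += len(results)
            results = []

    if results:  # write the final partial chunk
        PredictionResult.objects.bulk_create(results, batch_size=CHUNK_SIZE)
        scored += len(results)
    return {"batch_id": batch_id, "scored": scored}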

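The iterator(chunk_size=...) plus bulk_create(batch_size=...) pairing matters: iterator streams rows from the database instead of loading the whole queryset at once, and batch_size caps the size of each INSERT. For the worker's memory to stay flat over a very large batch, the results list also has to be flushed per chunk rather than accumulated until the end. We size Celery concurrency against the worker's CPU and the model's per-call cost, and we scale the worker pool on queue depth, independent of the API tier.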
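Scaling on queue depth comes down to a small calculation. The sketch below is one way to express it, not our production autoscaler: the function name, the five-minute drain target, and the worker limits are all assumptions chosen for illustration.

```python
import math


def desired_workers(queue_depth, tasks_per_worker_per_min, drain_target_min=5,
                    min_workers=1, max_workers=20):
    """Return how many workers would drain the queue within the target window."""
    if tasks_per_worker_per_min <= 0:
        raise ValueError("tasks_per_worker_per_min must be positive")
    # Tasks a single worker can finish inside the drain window.
    capacity_per_worker = tasks_per_worker_per_min * drain_target_min
    needed = math.ceil(queue_depth / capacity_per_worker)
    # Clamp so an empty queue keeps a floor and a spike cannot scale without bound.
    return max(min_workers, min(max_workers, needed))
```

With 120 queued tasks and workers that each finish about 10 tasks a minute, desired_workers(120, 10) asks for 3 workers. The throughput figure would come from measuring the model's per-call cost, which is why it is an input rather than a constant.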

PostgreSQL, and when to add an extension

Image: close-up of a hard drive, representing durable database storage. PostgreSQL stays the source of truth. Add an extension only when a real query pattern asks for it.

For most platforms a well-indexed PostgreSQL instance carries you a long way before you need anything specialized. Two extensions earn their place when the data shape demands it:

TimescaleDB when the analytics are genuinely time-series (metrics, events, sensor readings) and you want continuous aggregates and time-bucketed queries without hand-rolling rollup tables.
pgvector when you are storing embeddings for similarity search, so the vectors live next to the relational data and you avoid running a separate vector database until scale forces it.

Reaching for these on day one is a mistake as common as reaching for them too late. Start with plain PostgreSQL and add the extension when a query pattern actually hurts.
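To make the similarity-search case concrete, here is a rough sketch of what a cosine-distance ordering computes. pgvector exposes cosine distance as the <=> operator; the Python below is a brute-force stand-in for that ranking, and the document ids and two-dimensional embeddings are made up for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity a cosine-distance ordering ranks by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, rows, k=2):
    """Brute-force top-k over (id, embedding) pairs, closest first."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [row_id for row_id, _ in ranked[:k]]


# Hypothetical sample data: three documents with toy embeddings.
rows = [
    ("doc-a", [1.0, 0.0]),
    ("doc-b", [0.0, 1.0]),
    ("doc-c", [0.9, 0.1]),
]
print(nearest([1.0, 0.0], rows))  # ['doc-a', 'doc-c']
```

A scan like this is linear in the number of rows, which is fine for small tables. When it starts to hurt, that is the query pattern that justifies moving the ranking into the database behind an index.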

For profiling, we run django-silk in staging to find the slow queries before a batch job multiplies them across thousands of rows. It is a staging tool, not a production one, so you disable it before the build ships, the same way you would any request-recording middleware.
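One way to keep the profiler out of production is to gate it behind a flag in settings. This is a minimal sketch: the app label "silk" and the middleware path "silk.middleware.SilkyMiddleware" are the ones django-silk documents, while the ENABLE_SILK environment variable and the with_silk helper are hypothetical names introduced here.

```python
# settings.py fragment (sketch). ENABLE_SILK is a hypothetical environment variable.
import os


def with_silk(installed_apps, middleware, enabled):
    """Return copies of the two settings lists, adding django-silk only when enabled."""
    apps = list(installed_apps)
    mw = list(middleware)
    if enabled:
        apps.append("silk")
        mw.append("silk.middleware.SilkyMiddleware")
    return apps, mw


# Off unless the environment explicitly opts in, so production stays clean by default.
SILK_ENABLED = os.environ.get("ENABLE_SILK", "0") == "1"

INSTALLED_APPS, MIDDLEWARE = with_silk(
    ["django.contrib.admin", "django.contrib.auth"],
    ["django.middleware.security.SecurityMiddleware"],
    SILK_ENABLED,
)
```

The two short lists stand in for a real project's settings. The silk URL include would need the same guard in urls.py, so that the profiler's pages are not routed when the flag is off.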

Visualization

The platform produces numbers; someone has to look at them. We serve charts in the product UI with Plotly or Recharts, fed by aggregate endpoints rather than by querying raw rows from the browser. The aggregation happens in PostgreSQL (or a Celery-built materialized table for expensive rollups), and the API returns compact summarized series. The browser should render a chart, not compute one.
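As an illustration of the payload shape, the sketch below collapses raw points into one averaged point per hour. In the platform itself this grouping would run in PostgreSQL; the Python version only shows what an aggregate endpoint might hand to the chart. The function name and the "t", "avg", and "n" keys are assumptions, not an API from this post.

```python
from collections import defaultdict
from datetime import datetime, timezone


def bucket_hourly(points):
    """Collapse (timestamp, value) pairs into one averaged point per hour."""
    sums = defaultdict(lambda: [0.0, 0])  # hour -> [running total, count]
    for ts, value in points:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        sums[hour][0] += value
        sums[hour][1] += 1
    # Sort by hour so the chart receives the series in time order.
    return [
        {"t": hour.isoformat(), "avg": total / count, "n": count}
        for hour, (total, count) in sorted(sums.items())
    ]


# Hypothetical raw scores: two in the 09:00 hour, one in the 10:00 hour.
raw = [
    (datetime(2024, 1, 1, 9, 5, tzinfo=timezone.utc), 0.2),
    (datetime(2024, 1, 1, 9, 40, tzinfo=timezone.utc), 0.4),
    (datetime(2024, 1, 1, 10, 15, tzinfo=timezone.utc), 0.9),
]
series = bucket_hourly(raw)
```

Three raw rows become two points here. Over a month of per-second data the same reduction is the difference between shipping millions of rows to the browser and shipping a few hundred.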

How the pieces connect

A request to score a batch lands on the Django API, which writes the PredictionRequest rows and enqueues a Celery task. For low-latency single predictions, the client calls the FastAPI inference service directly, which loads the model once at startup and answers in-process. Scheduled work such as retraining and report generation also runs through Celery. All of it reads and writes the same PostgreSQL database, with TimescaleDB or pgvector added only where the access pattern justifies it.

The result is a platform where each part can scale and deploy on its own schedule, and where Django remains the place your team manages data and runs the admin. Nothing here is novel. The value is in keeping the boundaries clean so the system stays understandable as it grows.

If you are building this and want a team that has run Django, FastAPI, and Celery together in production, our Django cloud deployment team does exactly this kind of work, and we design the underlying AWS cloud architecture and IaC so the whole thing is reproducible rather than hand-assembled. Tell us about your data and your goals on our Django platform engineering page and we will give you an honest read on what to build first.