Title: Server-Sent Events and PostgreSQL LISTEN/NOTIFY using Django's StreamingHttpResponse
Date: 2023-05-17
Status: hidden
Tags: django, sse, postgresql
Slug: django-sse-postgresql-listen-notify
Authors: Víðir Valberg Guðmundsson
Summary: A write-up of how I implemented server-sent events using Django 4.2 and PostgreSQL LISTEN/NOTIFY
---
With the release of Django 4.2 we got the following [0]:

> [`StreamingHttpResponse`](https://docs.djangoproject.com/en/4.2/ref/request-response/#django.http.StreamingHttpResponse "django.http.StreamingHttpResponse") now supports async iterators when Django is served via ASGI.

And the documentation has been expanded with the following [1]:

> When serving under ASGI, however, a [`StreamingHttpResponse`](https://docs.djangoproject.com/en/4.2/ref/request-response/#django.http.StreamingHttpResponse "django.http.StreamingHttpResponse") need not stop other requests from being served whilst waiting for I/O. This opens up the possibility of long-lived requests for streaming content and implementing patterns such as long-polling, and server-sent events.

Being a sucker for simplicity, I got quite intrigued by the possibility of serving
server-sent events (also known as SSE) from Django in an asynchronous manner.

So I set out to write a small, drumroll please, chat application!

The code for the chat application can be found at
[github.com/valberg/django-sse](https://github.com/valberg/django-sse).

### What are server-sent events and why do we want to use them?
Server-sent events are "old tech", in the sense that they have been supported
in major browsers since around 2010-2011 [2]. The idea is that the client
"subscribes" to an HTTP endpoint, and the server can then push data to the
client for as long as the connection is open. This is a great performance boost
compared to other techniques such as polling the server.

_But wait, aren't websockets "shinier"?_

It depends. In many situations when developing web applications, we just want a
way to push data to the client, and here a bidirectional connection like
websockets feels like overkill. Also, I would argue that using POST/PUT
requests from the client and SSE to the client might be "just enough" compared
to websockets.

SSE also has the added benefit of having a built-in reconnection mechanism,
which is something we would have to implement ourselves with websockets.
All in all SSE is a much simpler solution than websockets, and in many (most?)
cases that is all we need.
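
To make the wire format concrete, here is a minimal sketch of how a single event is framed on a `text/event-stream` response. The helper name `format_sse` and its parameters are made up for this illustration and are not part of Django. Each field goes on its own line, multi-line data is split into several `data:` lines, and a blank line ends the event. The optional `retry` field tunes the built-in reconnection delay (in milliseconds), and `id` is what the browser sends back as `Last-Event-ID` when it reconnects.

```python
from typing import Optional


def format_sse(
    data: str,
    event: Optional[str] = None,
    event_id: Optional[str] = None,
    retry_ms: Optional[int] = None,
) -> str:
    """Frame one server-sent event (hypothetical helper, not a Django API)."""
    lines = []
    if retry_ms is not None:
        lines.append(f"retry: {retry_ms}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" prefix
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event
    return "\n".join(lines) + "\n\n"


print(repr(format_sse("Hello, world!", event="message_created")))
```

A generator that yields strings built this way is all the streaming view needs to send.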
### A simple implementation
So let's get to some code!

First we need our model for storing the chat messages:

    :::python
    from django.db import models


    class ChatMessage(models.Model):
        user = models.CharField(max_length=255)
        text = models.CharField(max_length=255)

With the model defined we can write our view to stream the messages.
The following is something along the lines of my initial attempt. First we have
to define the view, which in fact will not change for the remainder of this
blog post. The juicy bits are in the `stream_messages()` function. Note that
the view is an async view, denoted by the `async` keyword.

    :::python
    from django.http import HttpRequest, StreamingHttpResponse


    async def stream_messages_view(request: HttpRequest) -> StreamingHttpResponse:
        return StreamingHttpResponse(
            streaming_content=stream_messages(),
            content_type="text/event-stream",
        )

We tell the `StreamingHttpResponse` class to get its streaming content from the
`stream_messages` function. I implemented this as follows initially:

    :::python
    import asyncio
    from collections.abc import AsyncGenerator


    async def stream_messages() -> AsyncGenerator[str, None]:
        latest_message = None

        while True:
            current_message = await ChatMessage.objects.order_by("-id").afirst()

            # If we have a new message, yield that
            if latest_message != current_message:
                yield f"data: {current_message.text}\n\n"
                latest_message = current_message

            await asyncio.sleep(5)

So we've gotten rid of the HTTP overhead of polling by not having to do a
request from the client every 5 seconds. But we are still doing a query to the
database every 5 seconds, and that for each client. This is not ideal and is
probably something we could have done with a synchronous view.
Let's see if we can do better. But first we'll have to talk about how to run
this code.
#### Aside: Use an ASGI server for development
One thing that took me some time to realise is that the Django runserver is not
capable of running async views returning `StreamingHttpResponse`.

Running the above view with the runserver results in the following warning:

    :::text
    .../django/http/response.py:514: Warning: StreamingHttpResponse must
    consume asynchronous iterators in order to serve them synchronously.
    Use a synchronous iterator instead.

Fortunately Daphne, the ASGI server which was developed to power Django Channels, has an async runserver which we can use.

To set this up we'll have to install the `daphne` package, add `daphne` to the top of our installed apps, and set
the `ASGI_APPLICATION` setting to point to our ASGI application.

    :::python
    INSTALLED_APPS = [
        "daphne",
        ...
        "chat",  # Our app
    ]

    ASGI_APPLICATION = "project.asgi.application"

Now we can just run `./manage.py runserver` as before and we are async ready!
### More old tech to the rescue: PostgreSQL LISTEN/NOTIFY
This is where we could reach for more infrastructure to help us give the
database a break. This could be listening for data in Redis (this is what
django-channels does), or even having a queue in RabbitMQ. No matter what, it
is more infrastructure.

But I use PostgreSQL - and PostgreSQL is, like Django, "batteries included".

PostgreSQL has a mechanism called "LISTEN/NOTIFY" where one client can
LISTEN to a channel and then anyone can NOTIFY on that same channel.

This seems like something we can use - but psycopg2 isn't async, so I'm not
even sure if `sync_to_async` would help us here.

### Enter psycopg 3
I had put the whole thing on ice until I realized that another big thing (maybe
a bit bigger than `StreamingHttpResponse`) in Django 4.2 is the support for
psycopg 3 - and psycopg 3 is very much async!

So I went for a stroll in the psycopg 3 documentation and found this gold [3]:

    :::python
    import psycopg

    conn = psycopg.connect("", autocommit=True)
    conn.execute("LISTEN mychan")
    gen = conn.notifies()
    for notify in gen:
        print(notify)
        if notify.payload == "stop":
            gen.close()
    print("there, I stopped")

This does almost what we want! It just isn't async and isn't getting connection
info from Django.
So by combining the snippet from the psycopg 3 documentation and my previous
`stream_messages` I came up with this:

    :::python
    from collections.abc import AsyncGenerator

    import psycopg
    from django.db import connection


    async def stream_messages() -> AsyncGenerator[str, None]:
        # Reuse Django's database settings for a separate async connection
        connection_params = connection.get_connection_params()
        connection_params.pop('cursor_factory', None)
        aconnection = await psycopg.AsyncConnection.connect(
            **connection_params,
            autocommit=True,
        )
        channel_name = "new_message"
        async with aconnection.cursor() as acursor:
            await acursor.execute(f"LISTEN {channel_name}")
            gen = aconnection.notifies()
            async for notify in gen:
                yield f"data: {notify.payload}\n\n"

I was almost about to give up again, since this approach didn't work initially.
All because I for some reason had removed the `autocommit=True` in my attempts
to async-ify the snippet from the psycopg 3 documentation.
#### Aside: Difference between 4.2 and 4.2.1
The code worked initially in Django 4.2, but 4.2.1 fixed a regression regarding
setting a custom cursor in the database configuration.

In 4.2 we get this from `connection.get_connection_params()`:

    :::javascript
    {
        'dbname': 'postgres',
        'user': 'postgres',
        'password': 'postgres',
        'host': 'localhost',
        'port': 5432,
        'context': <psycopg.adapt.AdaptersMap object at 0x7f019cda7a60>,
        'prepare_threshold': None
    }

In 4.2.1 we get this:

    :::javascript
    {
        'dbname': 'postgres',
        'client_encoding': 'UTF8',
        'cursor_factory': <class 'django.db.backends.postgresql.base.Cursor'>,
        'user': 'postgres',
        'password': 'postgres',
        'host': 'localhost',
        'port': '5432',
        'context': <psycopg.adapt.AdaptersMap object at 0x7f56464bcdd0>,
        'prepare_threshold': None
    }

`django.db.backends.postgresql.base.Cursor` is not async iterable.
So we can probably try to set our own `cursor_factory` in settings:

    :::python
    from psycopg import AsyncCursor

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': 'postgres',
            'USER': 'postgres',
            'PASSWORD': 'postgres',
            'HOST': 'localhost',
            'PORT': '5432',
            'OPTIONS': {
                "cursor_factory": AsyncCursor
            }
        }
    }

But alas. For some reason this does not work. I guess that Django does some
wrapping of the cursor - or maybe I've just encountered a bug. The cursor is at
least not treated as an async cursor and thus we get the following error:

    :::pytb
    .../django-sse/venv/lib/python3.11/site-packages/django/db/backends/utils.py:41:
    RuntimeWarning: coroutine 'AsyncCursor.close' was never awaited
      self.close()
    RuntimeWarning: Enable tracemalloc to get the object allocation traceback
    .../django-sse/venv/lib/python3.11/site-packages/django/db/models/sql/compiler.py:1560:
    RuntimeWarning: coroutine 'AsyncCursor.execute' was never awaited
      cursor.execute(sql, params)
    RuntimeWarning: Enable tracemalloc to get the object allocation traceback

So instead I opted for removing the `cursor_factory` in the streaming function.
So that now looks like so:

    :::python
    async def stream_messages() -> AsyncGenerator[str, None]:
        connection_params = connection.get_connection_params()
        # Drop Django's synchronous cursor class before opening
        # our own async connection
        connection_params.pop('cursor_factory', None)
        aconnection = await psycopg.AsyncConnection.connect(
            **connection_params,
            autocommit=True,
        )
        channel_name = "new_message"
        async with aconnection.cursor() as acursor:
            await acursor.execute(f"LISTEN {channel_name}")
            gen = aconnection.notifies()
            async for notify in gen:
                yield f"data: {notify.payload}\n\n"

### Test the endpoint with curl
So now we've got the `LISTEN` part in place.

If we connect to the endpoint using curl (`-N` disables buffering and is a way to consume streaming content with curl):

    :::console
    $ curl -N http://localhost:8000/messages/

And connect to our database and run:

    :::sql
    NOTIFY new_message, 'Hello, world!';

We, excitingly, get the following result:

    :::text
    data: Hello, world!

Amazing!
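
What curl prints is the raw event stream. As a rough sketch of what a client such as `EventSource` does with it, the following stdlib-only parser groups lines into events. `parse_sse` is a hypothetical name, and the sketch deliberately ignores parts of the real parsing rules (the `id` and `retry` fields, for instance).

```python
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Group raw stream lines into events (simplified, hypothetical helper)."""
    event_type = "message"  # default type when no "event:" line was sent
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            # A blank line dispatches the event collected so far, if any
            if data:
                yield {"event": event_type, "data": "\n".join(data)}
            event_type, data = "message", []
            continue
        field, _, value = line.partition(":")
        value = value.removeprefix(" ")  # one optional space after the colon
        if field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


stream = ["data: Hello, world!\n", "\n"]
print(list(parse_sse(stream)))
```

Since the stream above has no `event:` line, the event gets the default type `message`.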
### Issuing the NOTIFY
But we want the `NOTIFY` command to be issued when a new chat message is submitted.
For this we'll have a small utility function which does the heavy lifting. Note
that this is just a very simple synchronous function since everything is just
happening within a single request-response cycle.

    :::python
    import json

    from django.db import connection


    def notify(*, channel: str, event: str, content: str) -> None:
        # Wrap the event name and the content in a single JSON payload
        payload = json.dumps({
            "event": event,
            "content": content,
        })
        with connection.cursor() as cursor:
            cursor.execute(
                f"NOTIFY {channel}, '{payload}'",
            )

And then we can use this in our view (I'm using `@csrf_exempt` here since this is just a quick proof of concept):

    :::python
    import json

    from django.http import HttpRequest, HttpResponse
    from django.views.decorators.csrf import csrf_exempt
    from django.views.decorators.http import require_POST


    @csrf_exempt
    @require_POST
    def post_message_view(request: HttpRequest) -> HttpResponse:
        message = request.POST.get("message")
        user = request.POST.get("user")
        message = ChatMessage.objects.create(user=user, text=message)
        notify(
            channel="lobby",
            event="message_created",
            content=json.dumps({
                "text": message.text,
                "user": message.user,
            })
        )
        return HttpResponse("OK")

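
One thing to be aware of with the f-string in `notify`: a payload containing a single quote (for example a chat message like `it's`) would break the SQL statement. A possible alternative, sketched here under the assumption that the standard PostgreSQL function `pg_notify(channel, payload)` is an acceptable replacement for the `NOTIFY` statement, is to pass both values as query parameters. The function name `notify_with_params` and the cursor argument are hypothetical changes so the sketch can run without a database, and `RecordingCursor` is a stand-in used only for demonstration.

```python
import json


def notify_with_params(cursor, *, channel: str, event: str, content: str) -> None:
    """Send a notification via pg_notify() using bound parameters (sketch)."""
    payload = json.dumps({"event": event, "content": content})
    # pg_notify() is a regular function, so the driver can bind both
    # arguments instead of us interpolating them into the SQL string.
    cursor.execute("SELECT pg_notify(%s, %s)", [channel, payload])


class RecordingCursor:
    """Stand-in for a database cursor; it only records what it is given."""

    def __init__(self) -> None:
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))


cursor = RecordingCursor()
notify_with_params(cursor, channel="lobby", event="message_created", content="it's me")
print(cursor.calls[0])
```

With Django, the same call would sit inside `with connection.cursor() as cursor:`. Also note that PostgreSQL limits the payload to just under 8000 bytes in the default configuration, so very large messages would need a different approach, such as sending only the message id.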

The keen observer will notice that we are storing the payload content as a JSON string within a JSON string.
This is because we have two recipients of the payload. The first is the `stream_messages` function, which reads the
`event` name and passes the content on to the client, and the second is the browser, which parses the content and uses
the `event` to determine what to do with it.

For this we'll have to update our `stream_messages` function as follows:

    :::python
    async def stream_messages() -> AsyncGenerator[str, None]:
        connection_params = connection.get_connection_params()

        # Remove the cursor_factory parameter since I can't get
        # the default from Django 4.2.1 to work.
        # Django 4.2 didn't have the parameter and that worked.
        connection_params.pop('cursor_factory', None)

        aconnection = await psycopg.AsyncConnection.connect(
            **connection_params,
            autocommit=True,
        )
        channel_name = "lobby"
        async with aconnection.cursor() as acursor:
            await acursor.execute(f"LISTEN {channel_name}")
            gen = aconnection.notifies()
            async for notify in gen:
                payload = json.loads(notify.payload)
                event = payload.pop("event")
                # The notify() helper stores the message under "content"
                data = payload.pop("content")
                yield f"event: {event}\ndata: {data}\n\n"

Everything is the same except that we now parse the payload from the `NOTIFY` command and construct the SSE payload with
an `event` and a `data` field. This will come in handy when dealing with the frontend.
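
The round trip of the doubly encoded payload can be traced without a database. The sketch below assumes the key names used by the `notify` function (`event` and `content`); `notification_to_sse` is a hypothetical helper that mirrors what the loop body in `stream_messages` does.

```python
import json


def notification_to_sse(raw_payload: str) -> str:
    """Turn a NOTIFY payload into one SSE frame (hypothetical helper)."""
    payload = json.loads(raw_payload)  # outer JSON: event name + content
    event = payload["event"]
    content = payload["content"]  # still a JSON string, passed through as-is
    return f"event: {event}\ndata: {content}\n\n"


# What the sender would put on the channel
inner = json.dumps({"text": "Hello", "user": "alice"})
outer = json.dumps({"event": "message_created", "content": inner})

frame = notification_to_sse(outer)
print(frame)

# What the browser does with the data field (JSON.parse in JavaScript)
data_line = frame.split("\n")[1].removeprefix("data: ")
print(json.loads(data_line))
```

Because `json.dumps` escapes newlines inside strings, the inner document always fits on a single `data:` line.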

Another way to do this would be to use Django's
[signals](https://docs.djangoproject.com/en/4.2/topics/signals/) or even
writing a PostgreSQL
[trigger](https://www.postgresql.org/docs/15/plpgsql-trigger.html) which issues
the `NOTIFY` command.

### Frontend stuff
Now that we've got the backend in place, we can get something up and running on
the frontend.
We could use HTMX's [SSE
extension](https://htmx.org/extensions/server-sent-events/) but for this
example we'll just use the
[EventSource](https://developer.mozilla.org/en-US/docs/Web/API/EventSource) API
directly.

    :::html
    <template id="message">
        <div style="border: 1px solid black; margin: 5px; padding: 5px;">
            <strong class="user"></strong>: <span class="message"></span>
        </div>
    </template>

    <div id="messages"></div>

    <script>
        const source = new EventSource("/messages/");

        // Note that the event we gave our notify utility function is called "message_created"
        // so that's what we listen for here.
        source.addEventListener("message_created", function(evt) {
            // Parse the payload
            let payload = JSON.parse(evt.data);

            // Get and clone our template
            let template = document.getElementById('message');
            let clone = template.content.cloneNode(true);

            // Update our cloned template
            clone.querySelector('.user').innerText = payload.user;
            clone.querySelector('.message').innerText = payload.text;

            // Append the cloned template to our list of messages
            document.getElementById('messages').appendChild(clone);
        });
    </script>

And that's it! We can now open two browser windows and see the messages appear in real time.
Check out the repo for the full code where I've also added a simple form for submitting new messages.
### Conclusion
Django might not be the shiniest framework out there, but it is solid and boring - which is a good thing!
And with the continued work on async support, it is becoming a viable option for doing real time stuff, especially when paired with other solid and boring tech like PostgreSQL and SSE!
[0]: https://docs.djangoproject.com/en/4.2/releases/4.2/#requests-and-responses
[1]: https://docs.djangoproject.com/en/4.2/ref/request-response/#django.http.StreamingHttpResponse
[2]: https://caniuse.com/eventsource
[3]: https://www.psycopg.org/psycopg3/docs/advanced/async.html#index-4