Let’s discuss WebSocket Scaling

ConcertIDC
5 min read · Jul 2, 2024


WebSocket:

WebSocket is a protocol that enables bidirectional communication over a single persistent TCP connection. It’s commonly used in chat applications, gaming, notifications, and trading platforms to deliver real-time updates.

How does the connection get established? The client initiates a WebSocket connection by sending an HTTP request. The server validates this request and, upon a successful handshake, upgrades the connection from HTTP to WebSocket. After the upgrade, the connection is persistent and enables bidirectional event transmission between the client and server. Because a WebSocket connection can fail, it’s essential for the client to have logic for retrying and re-establishing the connection.
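That retry logic can be sketched with exponential backoff; the base delay, the cap, and the reset-on-open policy below are illustrative choices, not the only way to do it:

```javascript
// Sketch of client-side reconnection with exponential backoff.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  // 1 s, 2 s, 4 s, ... capped at 30 s so retries never back off forever
  return Math.min(capMs, baseMs * 2 ** attempt);
}

function connectWithRetry(url, attempt = 0) {
  const ws = new WebSocket(url);      // browser / modern Node global
  ws.onopen = () => { attempt = 0; }; // connection healthy again: reset the backoff
  ws.onclose = () => {                // covers both failures and drops
    setTimeout(() => connectWithRetry(url, attempt + 1), backoffDelay(attempt));
  };
  return ws;
}
```

Resetting the attempt counter on a successful open matters: without it, a long-lived but occasionally flaky client would eventually always wait the full cap before reconnecting.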

Unlike stateless API servers, WebSocket servers are hard to scale because each connection is persistent.

Let’s explore alternative methods for achieving real-time or near real-time updates apart from WebSocket. Understanding these alternatives clarifies why WebSocket is needed.

Short polling — The client sends regular HTTP requests to the server at set intervals. For example, if a dashboard requires near real-time updates, the client can call the API every 30 to 60 seconds to keep the data relatively fresh.
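A minimal sketch of that loop, assuming a hypothetical `/dashboard` endpoint and a 30-second interval; `fetchFn` is injected so the transport can be swapped (e.g., `fetch` in a browser):

```javascript
// Short polling: hit the endpoint immediately, then on a fixed interval.
function startShortPolling(fetchFn, onData, intervalMs = 30000) {
  const tick = () =>
    fetchFn("/dashboard").then(onData); // empty responses still cost a round trip
  tick();                               // poll once right away...
  return setInterval(tick, intervalMs); // ...then repeatedly; clearInterval stops it
}
```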

Advantage: The simplicity of short polling means there’s no need for specialized setups. Scaling is straightforward — just add servers as needed to handle increased demand.

Disadvantage: In scenarios demanding truly real-time data, like stock movements, short polling falls short because it may not deliver updates fast enough (e.g., within seconds). Additionally, if there are no new events, frequent polling results in empty or redundant calls that waste resources.

Long polling — The client initiates a standard HTTP request, and the server holds it open until either a timeout occurs or there’s new information to respond with. This facilitates real-time updates but increases resource consumption because connections stay open longer.
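The hold-until-event-or-timeout behaviour can be sketched as a race between two promises; the 204-on-timeout convention and the 25-second default below are assumptions for illustration, not part of any standard:

```javascript
// Server side of long polling: respond with the event if one arrives,
// otherwise with an empty response when the timeout fires.
function longPoll(eventPromise, timeoutMs = 25000) {
  const timeout = new Promise((resolve) =>
    setTimeout(() => resolve({ status: 204, body: null }), timeoutMs)
  );
  const event = eventPromise.then((data) => ({ status: 200, body: data }));
  // Whichever settles first becomes the HTTP response; the client
  // immediately issues the next long-poll request either way.
  return Promise.race([event, timeout]);
}
```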

Server-Sent Events (SSE) — Like long polling, it uses standard HTTP, but the server can dispatch multiple events over a single connection. It operates unidirectionally: only the server can initiate and transmit events to the client.
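On the wire, an SSE response (`Content-Type: text/event-stream`) is plain text: each event is a few `field: value` lines terminated by a blank line. A sketch of how a server might frame one event (the helper itself is hypothetical):

```javascript
// Format a single SSE event per the text/event-stream wire format.
function sseFrame(data, { event, id } = {}) {
  let frame = "";
  if (id !== undefined) frame += `id: ${id}\n`;       // lets clients resume after a drop
  if (event !== undefined) frame += `event: ${event}\n`; // named event type
  frame += `data: ${data}\n\n`;                        // blank line ends the event
  return frame;
}
```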

The methods above are either resource-intensive or lack full-duplex communication, unlike WebSocket.

But all of them are relatively easy to scale. Let’s look at the core concepts to understand why scaling plays such a significant role here.

Stateful vs Stateless applications

Before looking into why additional measures are needed to scale WebSocket servers, it’s crucial to understand the difference between stateful and stateless systems.

A stateless system operates without retaining any information about previous requests. It functions solely based on the request payload and concludes its operation upon delivering a successful response.

In contrast to stateless systems, stateful systems maintain information or state about previous interactions or requests. They retain context or data between interactions allowing them to remember past events or conditions influencing subsequent interactions or responses.

  • Consider the example of an HTTP session using a Session ID. When a user logs in, the application server generates a unique Session ID and allocates memory associated with that login. This memory resides on a specific server, so subsequent requests must reach the same server to access the session data; otherwise that data is unavailable.
  • In this scenario additional setup such as Sticky sessions (directing all requests with a specific session ID to the same server) or a shared memory space accessible by all servers is necessary.
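One common way to implement stickiness is to hash the Session ID to a server index, so every request carrying the same ID routes to the same backend. A toy sketch (the djb2 hash is chosen purely for illustration; real load balancers use their own schemes):

```javascript
// Sticky routing: deterministically map a session ID to a server index.
function pickServer(sessionId, serverCount) {
  let h = 5381; // djb2 string hash
  for (const ch of sessionId) h = (h * 33 + ch.charCodeAt(0)) >>> 0;
  return h % serverCount; // same ID -> same server, every time
}
```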

Scaling the HTTP Requests

For stateless applications, configuring a load balancer and adding several servers suffices. Take any load-balancing method — be it Round Robin, Least Connections, or Weighted Round Robin — and the setup is complete.
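For instance, Round Robin reduces to cycling an index over the server list; a minimal sketch:

```javascript
// Round Robin: hand out servers in rotation, wrapping at the end.
function roundRobin(servers) {
  let i = 0;
  return () => servers[i++ % servers.length];
}
```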

For stateful applications handling HTTP requests, follow the same approach and then either configure sticky sessions or adjust the code to read session data from a shared store — and the setup is complete.

Bottlenecks in scaling WebSocket

  • As discussed, WebSocket maintains a persistent connection. Consider Client-1 connected to Server-1 and Client-2 to Server-2.
  • When Client-1 attempts to send a message to Client-2, utilizing the existing WebSocket connection, the message reaches Server-1. However, Server-1 lacks access to Client-2’s connection as it resides on a different server.
  • Simply expanding the server count doesn’t help here; it only increases the chance that two communicating clients land on different servers.

Solutions

  • I’m sharing a solution based on my past experience and learning, but that doesn’t imply it’s the definitive answer. I’m open to hearing about and learning from alternative solutions.
  • With the VirtuOwl project I’m involved in, we’ve established a dedicated Socket server (Socket IO) that we can scale according to our requirements. I’ve been actively running several proof-of-concepts (POCs) in this regard; if you need more details on any specific aspect, feel free to reach out to me.

Solution-1 (Using Apache Kafka and Spring boot WebSocket)

  • Consider Server-1, Server-2, and potentially more servers. We’re configuring Apache Kafka or a suitable Pub/Sub mechanism where all these servers listen to the same Kafka topic or channel.
  • For instance, if Client-1 connected to Server-1 sends a message intended for Client-2, upon Server-1 receiving this data, it publishes the payload to the Kafka topic.
  • Subsequently, all servers (Server-1, Server-2, and so forth) receive this message. Each server checks its memory for Client-2’s information. If Server-2 holds the necessary access, it retrieves the message and forwards it to Client-2.
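The fan-out above can be simulated in a few lines, with an in-process stand-in for the Kafka topic (the `Broker` and `SocketServer` classes are illustrative, not a real Kafka client):

```javascript
// In-process simulation of Solution-1's message flow.
class Broker {
  constructor() { this.subscribers = []; }
  subscribe(handler) { this.subscribers.push(handler); }            // every server subscribes
  publish(message) { this.subscribers.forEach((h) => h(message)); } // fan out to all servers
}

class SocketServer {
  constructor(broker) {
    this.clients = new Map(); // clientId -> send callback (the local "connection")
    broker.subscribe((msg) => this.deliver(msg));
  }
  connect(clientId, send) { this.clients.set(clientId, send); }
  deliver({ to, payload }) {
    const send = this.clients.get(to); // only the server holding the client acts
    if (send) send(payload);           // the rest silently ignore the message
  }
}
```

When Server-1 receives Client-1’s message, it calls `broker.publish({ to: "client-2", payload })`; every server sees it, but only the one holding Client-2’s connection delivers it.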

Solution-2 (Using Socket IO and Redis)

  • Given our existing Socket IO application, configuring it to scale using Redis for Pub/Sub and Session storage is straightforward.
  • To ensure proper functionality, enabling sticky sessions is essential, because Socket IO’s handshake starts with HTTP long-polling before upgrading to WebSocket, and every request in that session must reach the same server.
  • A note for users of the Apache web server: installing the WebSocket proxy module (mod_proxy_wstunnel) may be necessary, as it isn’t available in certain versions. Our product hit an issue where Socket IO fell back to Polling because of this.
  • The primary difference is the amount of setup and logic required with Kafka compared to Socket IO. Socket IO simplifies the process by handling various configurations and functionalities automatically once Redis/Heroku is added as an adapter configuration.
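A sketch of that adapter configuration using the official `@socket.io/redis-adapter` package (the Redis URL and port numbers are assumptions for illustration):

```javascript
const { createServer } = require("http");
const { Server } = require("socket.io");
const { createClient } = require("redis");
const { createAdapter } = require("@socket.io/redis-adapter");

const pubClient = createClient({ url: "redis://localhost:6379" });
const subClient = pubClient.duplicate(); // pub and sub need separate connections

Promise.all([pubClient.connect(), subClient.connect()]).then(() => {
  const io = new Server(createServer(), {
    adapter: createAdapter(pubClient, subClient), // broadcasts now reach all servers
  });
  io.listen(3000);
});
```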

Important note

  • When developing the WebSocket server, focus on using it solely for message pushing and reception. Offload resource-intensive tasks like database operations or memory/CPU-intensive operations to separate servers. This approach allows a single server to efficiently manage numerous concurrent persistent connections.
  • If your application doesn’t demand real-time events, go for short polling.
  • Consider building the socket server as a separate service if feasible. This eases scaling efforts and lets each server maintain many connections.

Velraj Chamundeeswaran

Tech Lead, ConcertIDC
