Queues: a web cluster without upstream orchestration

The story began when I was once asked how to build an HTTP gate (an API endpoint) that proxies requests to a group of lower-level servers (“upstreams”) without any knowledge of their exact IP addresses, their quantity, or their “health” state. In addition, each request must be processed by the least loaded server at that moment, and the response must be sent back synchronously, as a plain HTTP response (not as a secondary callback).

I do not pretend to be original, but I will tell you how we did it: Continue reading “Queues: a web cluster without upstream orchestration”
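The full write-up is behind the link, but the shape of such a gate can be sketched with RabbitMQ’s classic RPC pattern: a shared task queue, a per-gate reply queue, and a correlation id. This is a minimal sketch only, assuming the pika client; the queue name tasks and all other names are illustrative, not the actual code:

```python
# Minimal sketch of the gate, assuming RabbitMQ and the pika client.
# The queue name "tasks" and all other names here are illustrative.
import time
import uuid

import pika


class QueueGate:
    """Publishes each HTTP request into a shared queue and waits for the reply."""

    def __init__(self, host: str = "localhost"):
        self.conn = pika.BlockingConnection(pika.ConnectionParameters(host))
        self.ch = self.conn.channel()
        self.ch.queue_declare(queue="tasks", durable=True)
        # Exclusive auto-named queue where workers send their replies.
        reply = self.ch.queue_declare(queue="", exclusive=True)
        self.reply_queue = reply.method.queue
        self.ch.basic_consume(
            queue=self.reply_queue,
            on_message_callback=self._on_reply,
            auto_ack=True,
        )
        self.corr_id = None
        self.response = None

    def _on_reply(self, ch, method, props, body):
        if props.correlation_id == self.corr_id:
            self.response = body

    def call(self, request_body: bytes, timeout: float = 5.0) -> bytes:
        """One synchronous round-trip: publish the request, block for the reply."""
        self.corr_id = str(uuid.uuid4())
        self.response = None
        self.ch.basic_publish(
            exchange="",
            routing_key="tasks",
            properties=pika.BasicProperties(
                reply_to=self.reply_queue,
                correlation_id=self.corr_id,
                delivery_mode=2,  # persist the message on the broker
            ),
            body=request_body,
        )
        # The gate knows nothing about the workers: the broker hands the
        # message to whichever consumer is free at this moment.
        deadline = time.monotonic() + timeout
        while self.response is None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("no worker replied in time")
            self.conn.process_data_events(time_limit=remaining)
        return self.response
```

On the worker side, channel.basic_qos(prefetch_count=1) is what gives the “least loaded server” behavior: a busy worker simply does not take the next message, so it goes to a free one.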

Key difference between AMQP and HTTP in distributed web applications

Even my experienced colleagues often miss the fundamental difference: AMQP promises you guaranteed delivery (and processing), while HTTP does not.

Disclaimer: I know the RabbitMQ documentation says “not guaranteed”. But Rabbit at least tries to deliver (and does it very well), while HTTP guarantees nothing by design.
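Here is what “at least tries” means in practice: the broker keeps a message until the consumer explicitly acknowledges it, and re-delivers it if the consumer dies mid-processing. A minimal consumer sketch with pika (the queue name tasks and process() are illustrative placeholders):

```python
# Sketch of acknowledged consumption, assuming RabbitMQ and pika.
# The queue name "tasks" and process() are illustrative placeholders.
import pika


def process(body: bytes) -> None:
    print("processing", body)  # stand-in for the real work


def handle(ch, method, properties, body):
    process(body)
    # Only after this ack does the broker consider the message handled.
    # If we crash before reaching it, the message is re-queued and
    # delivered to another consumer.
    ch.basic_ack(delivery_tag=method.delivery_tag)


conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.queue_declare(queue="tasks", durable=True)  # the queue survives broker restarts
ch.basic_consume(queue="tasks", on_message_callback=handle)
ch.start_consuming()
```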

Let me remind you how HTTP works: the client sends a request and “hangs”, waiting for the response. If you abort the execution, or the client is disconnected from the server for any reason, the answer is lost forever. Some systems, AFAIK, will also interrupt the processing if the client drops the connection.
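To make it concrete: once the client gives up waiting, the result of that particular request is unrecoverable. A sketch with the Python requests library (the URL is a placeholder):

```python
import requests

try:
    r = requests.get("http://example.com/api/operation", timeout=2)
    print(r.status_code)
except requests.exceptions.Timeout:
    # The server may still finish (or may have already finished) the work,
    # but there is no way to receive the response of THIS request anymore.
    # Retrying re-executes the operation; it does not fetch the lost answer.
    print("gave up waiting")
```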

If a fatal error occurs on the server side, we may get a response code (500/503), or we may get nothing at all. And we don’t know whether our request was processed completely, or partially, or died just as it started sending the response to us, or whether there was a deadlock between two “parallel” requests, or the request is simply that slow by nature, or the backend was too busy and the balancer rejected us, or… I think you get the idea. Continue reading “Key difference between AMQP and HTTP in distributed web applications”

Where do queues come from on the web?

Let me remind you how websites were built in the past: a server application received a request from the user, processed it, rendered an HTML page (or performed the requested operation and rendered a similar page) and sent it back to the user. The rule was simple: the more RPS you could process, the more visitors you could serve.

As the internet grew, people began to counter the “high load” with typical methods: they set up nginx as a front server and several backend servers (upstreams) with copies of their web application, and spread the load across them. Either randomly (round-robin) or with a little trick: for example, the first upstream got a 1s timeout, the second upstream a 2s timeout, and so on. Of course, there were more clever schemes, too.
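For reference, the basic shape of that setup in nginx (addresses are made up; the escalating per-upstream timeouts need extra tricks, such as chained fallback locations, and are omitted here):

```nginx
# Round-robin across several copies of the application (hypothetical addresses).
upstream app_backends {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backends;
        # If one upstream errors out or times out, try the next one.
        proxy_next_upstream error timeout;
        proxy_connect_timeout 1s;
    }
}
```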

That was the typical approach more than six years ago. Continue reading “Where do queues come from on the web?”