RabbitMQ recent improvements
RabbitMQ is a key component of OpenStack deployments.

Both Nova and Neutron rely heavily on it for internal communication (between the agents running on the computes and the APIs running on the control plane).

RabbitMQ clustering is a must-have so that operators can manage the lifecycle of RabbitMQ. This is also true when RabbitMQ is running in a Kubernetes environment.

OpenStack components consume RabbitMQ through oslo.messaging.

Some recent improvements have been made in oslo.messaging to allow better scaling and management of RabbitMQ queues.

Here is a list of what we did on the OVH side to achieve better stability at large scale.
- Better eventlet / green thread management

The AMQP protocol relies on "heartbeats" to keep idle connections open.

Two patches were made in oslo.messaging to send heartbeats correctly:

The first patch sends heartbeats more often, to respect the protocol definition.

The second patch uses native threads instead of green threads to send heartbeats. Green threads could be paused by eventlet under some circumstances, leading to connections being dropped by RabbitMQ because of missed heartbeats.

While dropping and re-creating a connection is not a big deal on a small deployment, at large scale it leads to message loss and a lot of TCP churn.

Both patches are merged upstream and active by default.
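The relevant knobs live in the `[oslo_messaging_rabbit]` section. Option names and defaults vary by oslo.messaging release, so treat the values below as illustrative rather than recommended:

```ini
[oslo_messaging_rabbit]
# Consider the connection dead if no heartbeat is seen for this many seconds.
heartbeat_timeout_threshold = 60
# How many heartbeats to send per timeout window; raising this sends
# heartbeats more often, matching what the AMQP protocol expects.
heartbeat_rate = 3
# Send heartbeats from a native thread so eventlet cannot pause them.
heartbeat_in_pthread = true
```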
- Replace classic HA with quorum

RabbitMQ is moving away from classic HA queues and replacing them with quorum queues (based on the Raft algorithm).

This is a huge improvement on the RabbitMQ side: it allows better scalability as well as data redundancy.

Quorum queues were only partially implemented in oslo.messaging.

OVH did a patch to finish this implementation (for 'transient' queues).

Using quorum queues is not yet the default, and we would like to enable it by default.
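In recent oslo.messaging releases, quorum queues can already be opted into through the rabbit driver configuration (the option name below reflects the current release and may differ in older ones):

```ini
[oslo_messaging_rabbit]
# Declare RPC queues as quorum queues instead of classic queues,
# so queue contents are replicated via Raft across cluster nodes.
rabbit_quorum_queue = true
```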
- Consistent queue naming

oslo.messaging used to rely on random queue naming.

While this does not seem to be a problem on small deployments, it has two bad side effects:

- it's harder to figure out which service created a specific queue

- as soon as you restart your services, new random queues are created, leaving a lot of orphaned queues in RabbitMQ

These side effects are highly visible at large scale, and even more so when using quorum queues.

We did a patch on oslo.messaging to stop using random names.

This is now merged upstream, but disabled by default. We would like to enable it by default in the future.
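The idea behind the patch can be sketched as follows. This is a simplified illustration, not the actual oslo.messaging code (the real naming scheme includes more identifiers, such as the process name):

```python
import socket
import uuid


def random_reply_queue() -> str:
    # Old behaviour: a fresh random suffix on every service start,
    # so each restart abandons the previous queue as an orphan.
    return "reply_" + uuid.uuid4().hex


def stable_reply_queue(service: str) -> str:
    # Sketch of a consistent scheme: build the name from stable
    # identifiers (hostname + service name), so a restarted service
    # reuses the same queue and the owner is obvious from the name.
    return f"reply_{socket.gethostname()}.{service}"
```

With stable names, `rabbitmqctl list_queues` output becomes self-explanatory, and restarts stop leaking queues.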
- Reduce the number of queues

Both Neutron and Nova rely heavily on RabbitMQ communication.

While Nova is the one sending the most messages (5x more than Neutron), Neutron is the one creating the most queues (10x more than Nova).

RabbitMQ is a message broker, not a queue broker.

Neutron creates a lot of queues without even using them: Neutron instantiates oslo.messaging for one queue, but oslo.messaging creates multiple queues for multiple purposes, even if Neutron does not need them.

With a high number of queues, RabbitMQ does not work correctly (timeouts, CPU usage, network usage, etc.).

OVH did some patches to reduce the number of queues created by Neutron, by patching oslo.messaging and the Neutron code (we divided the Neutron queue count by 5).

We would like to push this upstream.
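To see why unused queues pile up, here is a simplified sketch of what the rabbit driver declares for a single RPC server target (illustrative names; in the real driver the fanout queue carries a per-consumer suffix):

```python
def rpc_server_queues(topic: str, server: str) -> list[str]:
    # Simplified sketch of oslo.messaging's rabbit driver behaviour:
    # one RPC server target declares three queues, even when the
    # service only ever consumes messages from one of them.
    return [
        topic,                 # shared queue, round-robin across servers
        f"{topic}.{server}",   # queue addressed to this specific host
        f"{topic}_fanout",     # fanout queue (per-consumer in reality)
    ]
```

A Neutron agent exposing several topics multiplies this count accordingly; the OVH patches trim the queues the agent never consumes.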
- Replace classic fanouts with streams

Both Neutron and Nova rely on fanout queues to send messages to all computes.

Neutron mostly uses them to trigger a security group update or any other update on an object (populating the remote cache).

When classic queues were used for this, messages were replicated into one queue per compute.

If you had a region with 2k computes, you would send 2k identical messages into 2k queues (1 message per queue). This is not efficient at all.

OVH did a patch to rely on "stream" queues to replace classic fanouts.

With stream queues, all computes listen to the same queue, so only 1 message is sent to 1 queue and is received by 2k computes. This also reduces the number of queues in RabbitMQ.

Those patches are merged upstream but disabled by default.

We would like to enable this by default.
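In recent oslo.messaging releases this behaviour is toggled from the `[oslo_messaging_rabbit]` section (option names are release-dependent, and streams require consumers to declare a non-zero prefetch window):

```ini
[oslo_messaging_rabbit]
# Use one RabbitMQ stream instead of one classic fanout queue per consumer.
rabbit_stream_fanout = true
# Streams require a QoS prefetch count on consumers; the value is a
# tuning choice, not a recommendation.
rabbit_qos_prefetch_count = 32
```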
- Get rid of 'transient' queues

oslo.messaging distinguishes 'transient' queues from other queues, but this distinction no longer makes sense.

Neutron and Nova expect all queues to be fully replicated and highly available; there is no transient concept in the Nova / Neutron code.

This concept leads to bad practices when managing a RabbitMQ cluster, e.g. not replicating the transient queues, which is bad for both Nova and Neutron.

OVH stopped distinguishing transients and manages all queues in a highly available fashion (using quorum queues).

This allows us to stop a RabbitMQ server in the cluster without any impact on the service.

We would like to patch oslo.messaging in the future to stop considering some queues as transient. This would simplify the code a lot.
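Until that happens, the transient queues can already be made durable. In recent oslo.messaging releases the option below (name may vary by release) puts them on quorum queues as well:

```ini
[oslo_messaging_rabbit]
# Declare the 'transient' queues (replies, fanouts) as quorum queues
# too, so no queue is lost when a cluster node goes down.
rabbit_transient_quorum_queue = true
```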