diff --git a/Plan.md b/Plan.md
index f5b8569..4a6d69b 100644
--- a/Plan.md
+++ b/Plan.md
@@ -1,28 +1,34 @@
-
-
 - Issues with rabbit ?
   - flap when rolling out agent / deploying new agent version
   - even crash on big regions
   - network flap / rabbit partition
   - pause-minority
+  - reset cluster was ... the solution
+
+- What's going on with rabbit ?
+  - reproduce workload with rabbit perftest
+  - oslo.metrics
+  - rabbitmq exporter / grafana dashboards
+  - smokeping between nodes
+
+=> we identified issues were mostly related to neutron
+  - rabbit flap flood resources to agents
+
+- How ? RPC implementation in Openstack: aka oslo.messaging
+  - pub/sub
+  - RPC server: setup endpoints / queues / listeners
+  - topic, fanout mechanism
+  - publish: rpc provided methods
+    - call
+    - cast
+    - cast / fanout=true
+  - notifications: kafka
-- Troubleshoot
-  - oslo.metrics
-
 - Journey to get stable
-  - Infra POV
+  - Infra
     - split rabbit-neutron / rabbit-*
     - scale some clusters to 5 node
     - Upgrade to 3.10+
-
-
-
-- Deep dive
-  - oslo.messaging
-    - how RPC is implemented
-    - rpc server
-    - call / cast
-    - topic, fanout, notifications
-  - neutron ovs rpc implementation
-    - too many queues
-    - too many connections
+  - openstack
+  -
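
Note on the `call` / `cast` / `cast / fanout=true` bullets added in this diff: the difference between the three publish modes may be easier to discuss with a toy model. The sketch below is a minimal in-memory stand-in, not the oslo.messaging API. `ToyBroker`, `AgentEndpoint`, `port_update` and the `demo-topic` name are all hypothetical, and the round-robin choice of a single server is an assumption made to mimic several consumers sharing one topic queue.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a message bus (hypothetical, NOT oslo.messaging)."""

    def __init__(self):
        self._servers = defaultdict(list)  # topic -> registered endpoints
        self._cursor = defaultdict(int)    # topic -> round-robin position

    def register(self, topic, endpoint):
        """An RPC server exposes an endpoint on a topic."""
        self._servers[topic].append(endpoint)

    def _pick(self, topic):
        # Assumption: one server per message, chosen round-robin.
        servers = self._servers[topic]
        if not servers:
            raise LookupError(f"no server listening on topic {topic!r}")
        index = self._cursor[topic] % len(servers)
        self._cursor[topic] += 1
        return servers[index]

    def call(self, topic, method, **kwargs):
        """call: deliver to one server and hand its reply back to the caller."""
        return getattr(self._pick(topic), method)(**kwargs)

    def cast(self, topic, method, fanout=False, **kwargs):
        """cast: deliver without a reply; fanout=True reaches every server."""
        targets = self._servers[topic] if fanout else [self._pick(topic)]
        for endpoint in targets:
            getattr(endpoint, method)(**kwargs)


class AgentEndpoint:
    """Hypothetical agent that records every port id it is told about."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def port_update(self, port_id):
        self.seen.append(port_id)
        return f"{self.name} updated {port_id}"


broker = ToyBroker()
agents = [AgentEndpoint("agent-1"), AgentEndpoint("agent-2")]
for agent in agents:
    broker.register("demo-topic", agent)

reply = broker.call("demo-topic", "port_update", port_id="p1")       # one server, reply returned
broker.cast("demo-topic", "port_update", port_id="p2")               # one server, no reply
broker.cast("demo-topic", "port_update", fanout=True, port_id="p3")  # every server, no reply
```

In this model only the fanout cast touches every registered agent, which is the pattern that multiplies load as the number of agents grows.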