How the simulation is done

Report:

Simulation parameters:


Deployment process:

We first use KubeVirt to deploy between 50 and 80 virtualised Kubernetes worker nodes. The exact number we will use going forward is still being worked out; for now we use 75 worker nodes.

We deploy 3 initial bootstrap nodes, followed by 30 of what we call midstrap nodes, followed by the remaining nodes (1K, 2K or 3K). Deploying in stages like this is meant to help the mesh stabilize into a healthy state faster, before we start injecting traffic.
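
The staged rollout described above can be sketched as follows. This is a minimal illustration, not the tooling actually used in the lab: `rollout_plan`, `run_rollout` and the `deploy` / `wait_ready` callbacks are hypothetical names, and the sketch assumes that 1K, 2K or 3K is the total network size, so the last stage is the total minus the 3 bootstrap and 30 midstrap nodes. If those figures instead count only the remaining nodes, the arithmetic changes accordingly.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Stage:
    name: str
    count: int


def rollout_plan(total_nodes: int, bootstrap: int = 3, midstrap: int = 30) -> list[Stage]:
    """Split a target network size into the three deployment stages.

    ASSUMPTION: total_nodes is the size of the whole network (e.g. 1000),
    so the last stage holds whatever is left after bootstrap and midstrap.
    """
    remaining = total_nodes - bootstrap - midstrap
    if remaining < 0:
        raise ValueError("total_nodes is smaller than bootstrap + midstrap")
    return [
        Stage("bootstrap", bootstrap),
        Stage("midstrap", midstrap),
        Stage("remaining", remaining),
    ]


def run_rollout(
    plan: list[Stage],
    deploy: Callable[[Stage], None],
    wait_ready: Callable[[Stage], None],
) -> None:
    """Deploy the stages in order, waiting for each one before starting the next.

    deploy and wait_ready are hypothetical hooks; in a real setup they would
    wrap whatever creates the pods and checks their readiness.
    """
    for stage in plan:
        deploy(stage)
        wait_ready(stage)
```

Under these assumptions, `rollout_plan(1000)` yields stages of 3, 30 and 967 nodes. Keeping the plan separate from the deploy hooks makes it easy to rerun the same sequence for the 2K and 3K cases.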

Once the mesh is stable (all nodes report the topic as healthy), we start the injection: the publisher sends traffic to random nodes through the headless service.
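
The gating and target selection above can be sketched roughly as below. A Kubernetes headless service resolves through DNS to the individual pod IPs, so the publisher can resolve the service name and pick one address at random per message. Everything else here is an assumption for illustration: the function names, the polling interval and timeout, and the `is_healthy` callback, since the source does not say how topic health is queried.

```python
from __future__ import annotations

import random
import socket
import time
from typing import Callable, Sequence


def resolve_peers(service_host: str, port: int) -> list[str]:
    """Return the distinct IP addresses behind a DNS name.

    For a headless service this is expected to be one address per pod.
    service_host is whatever name the service has in the cluster.
    """
    infos = socket.getaddrinfo(service_host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def wait_until_healthy(
    peers: Sequence[str],
    is_healthy: Callable[[str], bool],
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until every peer reports healthy, or give up after timeout seconds.

    is_healthy is a hypothetical probe (ASSUMPTION); the interval and
    timeout defaults are placeholders, not values from the lab.
    """
    deadline = clock() + timeout
    while True:
        if all(is_healthy(peer) for peer in peers):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)


def pick_target(peers: Sequence[str], rng: random.Random | None = None) -> str:
    """Choose one peer uniformly at random as the next injection target."""
    if not peers:
        raise ValueError("no peers to choose from")
    return (rng or random).choice(peers)
```

A publisher loop built on this would call `wait_until_healthy` once and then `pick_target` for each message it injects.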

After 15 minutes, the data is collected, saved and plotted.

Notes:

Current lab specs: 4 physical machines, 448 CPU threads, 2 TiB RAM, 140 TiB SSD

Directly participating in this test: 3 of the 4 machines, 320 CPU threads, 1.5 TiB RAM, 140 TiB SSD

(The machine “inferno” is excluded from running Waku nodes, but it helps by running metrics, storage and query workloads.)

Current issues

  1. We are currently bottlenecked largely by CPU usage, both on the KubeVirt hosts and on the VMs they create. An unknown share of this CPU usage comes from network traffic being processed by the CPUs instead of being offloaded to the network cards. We are working on switching to SR-IOV and hardware offloading for network traffic, which should improve CPU efficiency and allow further gains in performance and scalability.