Summary:
- The Service/Relay ratio appears to still be set to 60:40. Please check the connection-limit errors in Experiment-1.
- In all experiments, the store nodes received all 600 messages. However, get-store-messages failed in Experiment-1 and Experiment-2. Possible reasons include:
  - If a store node terminates a connection because of the connection limit, retrying the same node results in a connection failure. Please see the logs in Experiment-1.
  - On one occasion (see Experiment-2), only 247/600 messages were received from the store nodes. It seems that the message order in store nodes is not guaranteed to be the same. This results in a wrong pagination cursor value, which is why some messages are missed.
- Using store retries seems to have minimized the issues with fetching messages from store. However, we noticed that significantly increasing max-connections for any of the relay nodes still exhausts the connection limit in store nodes. For instance, setting max-connections in fserver, store, and lpserver to 150, 150, and 500 respectively results in store query failures, whereas setting max-connections to 150, 150, and 150 works fine.
- Since data is published on only one pubsub topic, "/waku/2/rs/2/", every service node needs to maintain very few mesh connections in addition to a small number of service-related connections (store-sync is not used in these experiments). This might lead to two assumptions:
  - Ideally, store queries should not hold connections; these should be short-lived. However, judging by the connection-limit issue in store nodes, these connections (or at least their state/counters) appear to be long-lived.
  - The bandwidth data also shows that increasing max-connections in lpserver drastically increases store requests/queries (please see store bandwidth). Interestingly, it also increases fserver bandwidth. Does this mean:
    - lpserver nodes try to exhaust the connection limit while making store requests? Now that store retries are included, maybe we can limit store requests/queries to a smaller number of store nodes at a time?
    - increasing max-connections also increases the mesh size? If so, is that really intended?
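The pagination-cursor hypothesis from the Experiment-2 note can be illustrated with a toy model. This is only a sketch under stated assumptions, not nwaku's store protocol: the integer message ids, the shared timestamps, the tie-breaking rules, the page size, and the helper names `fetch_page` and `fetch_all` are all hypothetical. Two simulated store nodes hold the same 600 messages but order messages with equal timestamps differently. A client that takes its first page from one node and then continues on the other with the same cursor silently skips messages.

```python
def fetch_page(node_order, cursor, page_size):
    """Return the page that follows `cursor` in this node's own ordering."""
    start = 0 if cursor is None else node_order.index(cursor) + 1
    page = node_order[start:start + page_size]
    has_more = start + page_size < len(node_order)
    # The cursor handed back to the client is the last message of the page.
    return page, (page[-1] if has_more else None)


def fetch_all(pick_node, page_size):
    """Page through the store; `pick_node(request)` chooses the node per request."""
    received, cursor, request = set(), None, 0
    while True:
        page, cursor = fetch_page(pick_node(request), cursor, page_size)
        received.update(page)
        request += 1
        if cursor is None:
            return received


def timestamp(msg_id):
    # Assumption: 50 consecutive messages share one timestamp (a publishing burst).
    return msg_id // 50


ids = range(600)
# Assumption: both nodes sort by timestamp but break ties in opposite directions.
node_a = sorted(ids, key=lambda i: (timestamp(i), i))
node_b = sorted(ids, key=lambda i: (timestamp(i), -i))

# Every page served by the same node: nothing is lost.
single_node = fetch_all(lambda request: node_a, page_size=20)
# First page from node_a, remaining pages from node_b (e.g. after a retry).
switched = fetch_all(lambda request: node_a if request == 0 else node_b, page_size=20)

print(len(single_node), len(switched))  # prints "600 570"
```

In this model the cursor returned by the first node points to a different position in the second node's ordering, so the 30 messages with ids 20 to 49 are never returned. The numbers do not reproduce the 247/600 result. They only show how a cursor can become invalid when the ordering differs between nodes.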
Scenario1 Experiments nwaku:v0.37.1-beta
Simulation components
Raw Data: