

Industry Description
SatoriNet operates in the rapidly evolving space of decentralized networking, aiming to predict the future with AI. The platform is structured as a decentralized mesh of nodes (electrons), each of which contributes its own predictions to the network.
Problem Description
Satori reached out following huge success with their platform. So much success, in fact, that all central components of the network were seeing outages multiple times a day. We came in as an emergency consultant, analyzed the problem, and drafted a list of 10 items we needed to work through together in order to overcome the challenge.
Over a period of a couple of weeks, we approached the problem tactically and achieved 100% uptime shortly after. The issues resolved included introducing load balancers, scaling nodes horizontally, optimizing SQL queries, adding monitoring, and adjusting connectivity patterns.
Following the initial recovery, we continued working with Satori to create and implement a long-term architectural strategy, and to remain on hand when issues come up.
Optimizations
The following is a list of the items we implemented in order to achieve 100% uptime.
01
Add Monitoring Systems
You can't fix what you can't measure. We added exception and performance monitoring systems and started analyzing the problem.
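The write-up does not name the tooling involved, but as an illustration, a Sentry-style setup in Python captures both exceptions and performance traces in a few lines (the DSN and handler below are placeholders, not Satori's actual configuration):

    import sentry_sdk

    sentry_sdk.init(
        dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
        traces_sample_rate=0.2,  # sample 20% of transactions for performance data
    )

    def handle_request(payload):
        # Once the SDK is initialized, any unhandled exception raised here
        # is captured and reported automatically, with a stack trace.
        ...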
02
Dockerize, Load Balancer and Horizontal Scaling
The easiest short-term solution to scaling problems is typically to outspend the traffic. We dockerized the platform and set up a horizontally scalable system (multiple identical nodes) behind a generously provisioned load balancer. This cut the constant outages down to a handful.
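One small but essential piece of such a setup: a load balancer can only route around failures if every node exposes a health check it can poll. Assuming a Python platform (the framework, route, and port here are illustrative, not Satori's actual code), a minimal Flask endpoint looks like this:

    from flask import Flask

    app = Flask(__name__)

    @app.route("/health")
    def health():
        # The load balancer polls this endpoint on every node; a node that
        # stops answering 200 is taken out of rotation until it recovers.
        return {"status": "ok"}, 200

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8080)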
03
Query Optimizations
With the above in place, the next step is to close the gap to 100%. This is a matter of rewriting expensive queries, adding database indexes, or eliminating unnecessary queries altogether.
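As a self-contained illustration of the indexing side of this (the table and column names are hypothetical, not Satori's schema), SQLite's EXPLAIN QUERY PLAN shows how an index turns a full table scan into a cheap lookup:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE predictions (node_id INTEGER, stream TEXT, value REAL)")

    # Before: filtering on node_id forces a full table scan.
    plan = con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM predictions WHERE node_id = ?", (42,)
    ).fetchall()
    print(plan)  # detail column reads 'SCAN predictions' (full scan)

    # After: an index on the filtered column turns the scan into a lookup.
    con.execute("CREATE INDEX idx_predictions_node ON predictions (node_id)")
    plan = con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM predictions WHERE node_id = ?", (42,)
    ).fetchall()
    print(plan)  # detail column reads 'SEARCH predictions USING INDEX ...'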
04
Reduce the Load
At this point the system is always up, but we are overspending on hosting. The next step is to reduce provisioning to acceptable CPU and memory levels (75% utilization is a good target, though it depends on the traffic spikes you expect). The level can be adjusted up and down over time as traffic changes.
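One way to reason about the right provisioning level is the standard autoscaler formula: scale the node count so average utilization lands near the target. A small Python sketch (the 75% target mirrors the figure above; the node counts are examples, not Satori's numbers):

    import math

    def desired_replicas(current_replicas: int,
                         current_utilization: float,
                         target_utilization: float = 0.75) -> int:
        # Standard autoscaler formula:
        # desired = ceil(current * currentUtilization / targetUtilization)
        return max(1, math.ceil(
            current_replicas * current_utilization / target_utilization))

    # 8 nodes idling at 30% CPU can be consolidated down to 4 at ~60%...
    print(desired_replicas(8, 0.30))  # -> 4
    # ...while a spike to 95% on 4 nodes calls for scaling up to 6.
    print(desired_replicas(4, 0.95))  # -> 6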
05
Continuous Monitoring and Intervention
Now that the system is set up properly, we can turn on our continuous monitoring capabilities, where the specialist team is automatically notified when issues show up. We commit to begin investigating within 15 minutes, day and night.
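At its core, this kind of alerting reduces to polling the things that matter and paging someone when they misbehave. A deliberately simplified Python sketch (the node URLs and paging hook are placeholders, not our actual monitoring stack):

    import time
    import requests

    NODES = ["https://node1.example.com", "https://node2.example.com"]  # placeholders

    def page_on_call(message: str) -> None:
        # Placeholder for the paging integration (PagerDuty, Opsgenie, ...).
        print(f"ALERT: {message}")

    while True:
        for node in NODES:
            try:
                r = requests.get(f"{node}/health", timeout=5)
                if r.status_code != 200:
                    page_on_call(f"{node} returned {r.status_code}")
            except requests.RequestException as exc:
                page_on_call(f"{node} unreachable: {exc}")
        time.sleep(60)  # check every minute; the 15-minute response clock starts on page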
06
Long Term Strategy
Once the storm has been averted, it is time to invest in being properly prepared for any storms that may lie ahead. This is not simple - we recommend scenario planning and analysis, drill runs, and preparing for the unexpected.
Digitize your business. Talk to us today.
Following our initial call, we'll come back with a full development plan.