When you have a monolithic system of any significant size, it’s likely different parts of it require different types of resources to function optimally. Some parts are CPU intensive, while other parts are RAM intensive. It’s possible that there’s at least one part that would greatly benefit from using GPUs for some intensive computational processing.
Now that more and more systems are moving to the cloud, it’s pretty unlikely you’ll find a single compute instance type that’s great at everything. Instead, all the top cloud vendors provide specialized compute instance types, each optimized for one particular resource: CPU, RAM, or GPU.
Even though they do provide general purpose instance types as well, they’re just that — general purpose. They’re not great at anything; they’re just decent at everything.
If your monolithic system would benefit from maxed out resources of all types (CPU, RAM, and GPU), you’re out of luck. First, GPUs are typically offered only in specialized GPU instance types, so they’re not even available in general purpose instance types.
Second, even if you do find a generic instance type that has enough CPU and RAM resources, those resources are not the best available. For example, a generic instance type might have the same number of CPU cores as a CPU optimized instance type, but the former might have just a decent Intel Xeon processor while the latter has an Intel Xeon Platinum processor with advanced instruction sets.
The same applies to RAM. A generic instance type might have the same amount of RAM as a memory optimized instance type, but the former has good enough memory modules while the latter has throughput optimized DDR4 memory modules.
The point here is that it’s not just about the number of CPU cores or amount of RAM. There can be a huge difference in the performance of those resources between generic instance types and specialized ones, even though a superficial look at the numbers might make you think otherwise.
Another important aspect here is the cost. Even if you find a generic instance type that has the same number of CPU cores as a CPU optimized instance type, the cost of each core is typically higher on the generic instance.
The same applies to RAM. The cost of each GB of RAM in a generic instance type is typically higher than in a memory optimized instance type.
That’s especially true once you realize that a CPU core in a compute optimized instance type is more powerful than one in a generic instance type, and a GB of RAM is faster in a memory optimized instance type than in a generic one.
Basically, you get a bigger bang for your buck per resource type if you go with a more specialized instance type.
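To make the cost argument concrete, here is a minimal sketch of the arithmetic. The instance names, sizes, and hourly prices are invented for illustration and are not taken from any vendor's price list; the naive "price divided by cores" and "price divided by GB" split is also just an assumption about how to compare them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    ram_gb: int
    hourly_usd: float  # invented price, for illustration only

    @property
    def usd_per_vcpu(self) -> float:
        # Whole hourly price attributed to the CPU cores.
        return self.hourly_usd / self.vcpus

    @property
    def usd_per_gb(self) -> float:
        # Whole hourly price attributed to the RAM.
        return self.hourly_usd / self.ram_gb


# Hypothetical catalog: same core count, different RAM and price.
GENERAL = InstanceType("general.xlarge", vcpus=4, ram_gb=16, hourly_usd=0.20)
COMPUTE = InstanceType("compute.xlarge", vcpus=4, ram_gb=8, hourly_usd=0.17)
MEMORY = InstanceType("memory.xlarge", vcpus=4, ram_gb=32, hourly_usd=0.25)

if __name__ == "__main__":
    for it in (GENERAL, COMPUTE, MEMORY):
        print(f"{it.name}: ${it.usd_per_vcpu:.4f}/vCPU-hour, "
              f"${it.usd_per_gb:.4f}/GB-hour")
```

With these made-up numbers, the compute optimized type comes out cheapest per core and the memory optimized type cheapest per GB. Plug in your own vendor's current prices to see whether the same holds for you.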
And as I mentioned before, if your system would greatly benefit from taking advantage of GPUs, you just can’t go with a generic instance type; you have to pick a GPU instance type.
Using GPUs can significantly improve the performance of many types of tasks. Graphics and video processing are the obvious ones, but many other workloads, such as machine learning training and inference, can greatly benefit from GPUs too.
A monolith is deployed as a single unit, so it can’t really take advantage of these specialized instance types: whichever one you pick has to serve every part of the system. If your monolith hits a CPU bottleneck and it’s deployed to a generic instance, you have to scale all resources just to get more CPU. You have to switch to an instance type with more CPU cores, but that also gives you more RAM, which you might not need but still have to pay for.
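Here is a rough sketch of that effect. It assumes a hypothetical generic instance family where every size has a fixed ratio of 1 CPU core to 4 GB of RAM; the names and sizes are made up for illustration.

```python
# Hypothetical generic family: (name, vcpus, ram_gb), fixed 1:4 ratio.
GENERAL_FAMILY = [
    ("general.large", 2, 8),
    ("general.xlarge", 4, 16),
    ("general.2xlarge", 8, 32),
    ("general.4xlarge", 16, 64),
]


def smallest_fit(family, need_vcpus, need_ram_gb):
    """Return the first (smallest) size that covers both requirements."""
    for name, vcpus, ram_gb in family:
        if vcpus >= need_vcpus and ram_gb >= need_ram_gb:
            return name, vcpus, ram_gb
    raise ValueError("no instance in the family is large enough")


def idle_ram_gb(family, need_vcpus, need_ram_gb):
    """RAM you pay for but don't need on the smallest fitting size."""
    _, _, ram_gb = smallest_fit(family, need_vcpus, need_ram_gb)
    return ram_gb - need_ram_gb


if __name__ == "__main__":
    # The monolith needs 10 GB of RAM; only its CPU demand grows.
    for cores in (4, 16):
        name, _, _ = smallest_fit(GENERAL_FAMILY, cores, 10)
        print(cores, "cores ->", name,
              "idle RAM (GB):", idle_ram_gb(GENERAL_FAMILY, cores, 10))
```

In this toy setup, going from 4 to 16 cores drags the RAM from 16 GB to 64 GB even though the RAM requirement never changed.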
Microservices to the Rescue
Microservices solve this problem. You can create a microservice focused on a particular task that requires a particular type of resource more than others.
For instance, if you do a lot of image processing, you create a dedicated microservice for that and deploy it to either CPU or GPU optimized instances.
When a spike in traffic means a huge number of images have to be processed at the same time, you can autoscale just that particular microservice without having to overpay for other resources like RAM that you don’t need anyway.
That traffic spike might not stress any other part of the overall system and hence you don’t need to scale up anything else besides the image processing microservice. In the case of a monolithic system though, you would have to scale up everything just because a single part of the system is being stressed.
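As a minimal sketch of what scaling each service on its own could look like, the snippet below applies a simple proportional rule to each service separately. The rule is similar in spirit to the one the Kubernetes Horizontal Pod Autoscaler documents, but this is not its implementation. The service names, replica counts, CPU readings, and the 60% target are all assumptions made up for the example.

```python
def desired_replicas(current: int, cpu_percent: int,
                     target_percent: int = 60, max_replicas: int = 20) -> int:
    """Proportional rule: ceil(current * observed / target), clamped."""
    # Integer ceiling division avoids floating point rounding surprises.
    wanted = (current * cpu_percent + target_percent - 1) // target_percent
    return max(1, min(max_replicas, wanted))


# Hypothetical snapshot during an image upload spike:
# service name -> (current replicas, average CPU utilization in percent)
services = {
    "image-processing": (3, 95),  # stressed by the spike
    "checkout": (4, 60),          # sitting right at the target
}

plan = {
    name: desired_replicas(replicas, cpu)
    for name, (replicas, cpu) in services.items()
}

if __name__ == "__main__":
    for name, replicas in plan.items():
        print(f"{name}: {services[name][0]} -> {replicas} replicas")
```

Only the stressed service gets more replicas here; the other one stays where it is. With a monolith, there is a single replica count for the whole system, so the same spike would scale everything.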