Waterfall has clearly defined stages of software development: requirements → architecture → design → implementation → and so on. (As a side note, some people don’t differentiate between architecture and design, but I consider them separate. Architecture is strategic, while design is tactical.)
Then along came Agile. Most people started considering Waterfall evil and began opposing everything about it. They threw the baby out with the bathwater: essential practices like architecting got thrown away along with Waterfall.
These days, it seems like the only architecture most people do is picking a technology stack like LAMP or MEAN. And that’s it. After picking a standard stack, most projects end up as just a bunch of ad-hoc features slapped together in an incomprehensible way without any overall strategy.
In the case of LAMP, all the data goes to MySQL, all the files are being written to disk, all the traffic is being served by Apache, and all the requests are being handled synchronously, including hitting third-party APIs. To boot, all this is being hosted on a single VM at some VPS provider. (VPS is not the cloud, BTW.) And then you make changes to the database schema by hand and deploy new code by manually uploading files to the server.
Then the disk fills up, MySQL eats all the RAM, processes contend for resources, HTTP requests time out halfway through because of long-running operations, and Apache is too busy serving static files to bother serving dynamic PHP requests. Then third-party APIs go down, and because of that half of the users get 5xx errors, even though those APIs are not core to the system. And then you keep scaling your VPS instance up to handle peak traffic even if it happens just once a month, but since scaling down requires downtime, you never do it. And you end up with code changes that exist only on the server and are not committed to the code repository (if you happen to have one, that is).
What About Agile?
Doing Agile doesn’t mean architecture is unnecessary anymore. You still need to make high-level strategic decisions, such as whether you’re going to build a microservice or a monolithic architecture. You still need to pick a technology stack. And you need to pick the proper stores for files and data.
And the reality here is that unless your project is small and simple, you’ll likely need more than one type of datastore: both SQL and NoSQL. And it can even be multiple NoSQL datastores like MongoDB, Redis, Elasticsearch, and Cassandra. Yeah, all of those together on a single project. Each of them solves a particular problem well, and you’re supposed to use the right tool for the job instead of using a single one for everything.
The same applies to technology stacks as well. It’s likely you’ll need multiple stacks for different parts of your system. The API could be done in Java, while the admin panel could be done in PHP or JS, and the user dashboard will be done in one of those as well. Then you’ll also have native mobile apps for both iOS and Android, developed with their respective stacks.
And then you need to tie all those parts together. And for that you’ll need streaming and queueing tools, such as Kafka or an AMQP implementation like RabbitMQ.
And then you need to host all that in the cloud. You could build your own cloud if you want, but the point here is that the hosting solution needs to be flexible to support independent autoscaling of each subsystem. Clearly, hosting all that stuff on a single VPS is not the proper solution.
And then you need to automate deployment of all these subsystems because managing all that manually is practically impossible.
And you also need a development/staging environment where all the subsystems are integrated, so you can test the whole thing before deploying to production.
Big Design Up Front?
Reading all this you might think that I’m an advocate of BDUF. But I’m not. You don’t have to introduce all this complexity right away.
Once you understand the overall idea of the project, do enough architecting to support the first core features of the system, but keep it flexible enough that you can modify the architecture for future needs. Then, when things change, rethink the architecture to handle the new demands. By new demands I mean not only new features, but also growth in traffic, growth in the volume of data created by users and the system itself, and so on.
You could still start with a stack like LAMP. But once you need to send out emails, introduce a queue or a stream to push emails to so that the server can respond immediately without waiting for an email to go out.
The same applies to processing images and videos — once you have a need for that, use a queue or a stream. Maybe it even makes sense to do the actual processing in a microservice using a different technology stack that’s more suited to the problem at hand.
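To make the queue idea concrete, here is a minimal sketch in Python. It assumes an in-process queue and a worker thread as a stand-in for a real broker such as RabbitMQ or Kafka; the names handle_signup, deliver, and email_worker are hypothetical and only illustrate the shape of the flow.

```python
import queue
import threading

# Hypothetical in-process stand-in for a real broker (RabbitMQ, Kafka, ...).
email_queue = queue.Queue()
sent_log = []  # records delivered addresses so the demo can be inspected


def deliver(message):
    """Placeholder for the slow SMTP or email-API call."""
    sent_log.append(message["to"])


def email_worker():
    """Consume messages until a None sentinel arrives."""
    while True:
        message = email_queue.get()
        try:
            if message is None:
                return
            deliver(message)
        finally:
            email_queue.task_done()


def handle_signup(address):
    """Request handler: enqueue the email and respond right away."""
    email_queue.put({"to": address, "subject": "Welcome"})
    return {"status": 202, "detail": "email queued"}


worker = threading.Thread(target=email_worker, daemon=True)
worker.start()

response = handle_signup("user@example.com")  # returns without waiting for delivery

email_queue.join()     # demo only: wait until the worker has drained the queue
email_queue.put(None)  # tell the worker to stop
worker.join()
```

The request handler only pays the cost of putting a message on the queue. In a real setup the worker would run in a separate process or service, which is also where a different technology stack could be used for heavy jobs like image or video processing.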
Once you need search, don’t even bother with MySQL full-text support. Go with Elasticsearch or the like. Dedicated search engines are much smarter, faster, and more flexible than relational databases when it comes to search.
Need to combine data from different sources and build a JSON document for faster and simpler access? Go with MongoDB or PostgreSQL with its JSON support.
Need to store hundreds of millions or even billions of rows? Don’t bother with MySQL sharding or table partitioning. Go with Cassandra.
Need to store user uploads and other files? Go with S3 or the like. And if they need to be served, go with a CDN like CloudFront. That will also offload serving static files from your Web servers and let them focus on serving dynamic content.
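One way to keep that move cheap is to hide file storage behind a small interface from day one. The sketch below is an assumption about how that could look, not a prescribed design: FileStore, LocalDiskStore, and store_avatar are hypothetical names, and an S3-backed class with the same two methods would be the later replacement (not shown here).

```python
import tempfile
from pathlib import Path
from typing import Protocol


class FileStore(Protocol):
    """Hypothetical storage interface the application code depends on."""

    def save(self, key: str, data: bytes) -> str: ...

    def load(self, key: str) -> bytes: ...


class LocalDiskStore:
    """Day-one implementation: plain files under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def save(self, key: str, data: bytes) -> str:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return str(path)

    def load(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def store_avatar(store: FileStore, user_id: int, image: bytes) -> str:
    """Application code only knows the interface, not where the bytes end up."""
    return store.save(f"avatars/{user_id}.png", image)


# Demo with a temporary directory standing in for the server's disk.
demo_store = LocalDiskStore(Path(tempfile.mkdtemp()))
location = store_avatar(demo_store, 42, b"png-bytes")
```

Swapping in object storage then means writing one new class and changing the line that constructs the store, while store_avatar and its callers stay untouched.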
Integrating with third-party APIs? Abstract them away as microservices that handle all the caching, asynchronous syncing, error handling, API limits, etc. Then the core system can just hit those microservices without having to bother with all those third-party peculiarities.
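As a rough illustration of what such a wrapper does internally, here is a minimal Python sketch covering just two of those concerns: caching and error handling. ThirdPartyGateway, its fetch callable, and the ttl_seconds parameter are hypothetical; a real service would add rate limiting, background syncing, and persistence.

```python
import time
from typing import Callable, Optional


class ThirdPartyGateway:
    """Hypothetical wrapper that hides a flaky third-party API behind a cache."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # the actual outbound call, injected by the caller
        self._ttl = ttl_seconds  # how long a cached value counts as fresh
        self._clock = clock      # injectable so the behaviour can be tested
        self._cache: dict = {}   # key -> (time stored, value)

    def get(self, key: str) -> Optional[dict]:
        cached = self._cache.get(key)
        now = self._clock()
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # fresh cache hit, no outbound call
        try:
            value = self._fetch(key)
        except Exception:
            # Upstream is down: serve stale data if we have it, otherwise None.
            return cached[1] if cached is not None else None
        self._cache[key] = (now, value)
        return value
```

With this in place, an outage at the third party degrades into stale or missing data for one feature instead of 5xx errors for the core system.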
And so on.
The point here is that you do need some kind of upfront architecture, but it doesn’t have to be exhaustive. It just has to support the first features and at the same time be flexible enough to evolve when things change.
Architecture is an ongoing thing. You can’t just do an initial architecture and keep building features for years without revisiting it. I prefer revisiting architecture for each new feature or even a significant change of an existing feature. Even if a feature worked fine with just PHP and MySQL, a significant change to it might require extracting it to a Java microservice and interacting with it using streaming and REST.
Of course, just because you install an awesome tool like Elasticsearch or Cassandra, it doesn’t mean your architecture becomes awesome right away. You can’t buy awesomeness that easily. It’s very important to learn how to use tools properly, and that takes at least a few months for each tool. But having a proper tool for each job is already a very good start.
Being Agile doesn’t mean not doing architecture. It just means doing enough architecture upfront and then revisiting it in every cycle of your Agile flow.
Your company probably has strategic planning — the overall direction. And then it has tactical planning of execution of each step to facilitate moving in the right direction. The technology side needs strategic planning as well — and that’s what architecture is about. Otherwise it’s just who knows what.