By Prem Balachandar
As Mosaic continued to grow, our five-plus-year-old, Scala-based monolith began showing signs of wear and tear. Release cycles grew longer, developer onboarding took weeks, and bugs surfaced weeks after production release because the business logic was so tightly intertwined. We knew this was neither maintainable nor sustainable, so we began a concerted effort to break the monolith down into smaller, business domain-oriented microservices.
How did we get started?
We decided to start small by forming a two-person engineering team. Their mission was to create a prototype CI/CD pipeline to build and deploy a microservice that would support sending out paper mail via a third-party API. We chose this microservice deliberately: its business logic is simple, so the team could focus on technical components such as the CI/CD pipeline rather than being bogged down by business complexity. The goal was to identify a lean technology stack that would support this pipeline and to make “quick start templates” available, so that other teams could quickly adopt them and begin their own microservice rollouts.
In roughly two months, the team, working in close collaboration with the DevOps team (daily standups, etc.), built out a tech stack composed of a Lagom-based microservice, a CI/CD pipeline based on AWS components (CodeBuild, CodeDeploy, CodePipeline, S3), GitHub for SCM, and a kops-based Kubernetes cluster to host the deployments. The success of this early prototype (rough architecture diagram below) gave the team enough confidence to review it with business and executive stakeholders and position it as a company-wide initiative for further investment and adoption.
Once stakeholders had bought in (achieved through extensive discussions and roadshows with the CTO and other leadership committees), the following happened:
- An annual technology OKR at the company level was created (40% of monolith functionality moved to microservices)
- We brought in Lightbend (maintainers of Scala and of Lagom, an opinionated microservice framework) as our technical partner to direct and advise us on this journey
- Several key engineering decisions were made, such as adopting Kafka for interservice communication, a monorepo to host all of our microservices code, and a bias toward using CQRS/Event Sourcing patterns for implementing reactive microservices
- A small engineering productivity team was formed, dedicated to iterating on and improving the infrastructure and the microservice CI/CD pipeline
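To make the CQRS/Event Sourcing bias in the list above more concrete, here is a minimal, language-neutral sketch written in plain Python. It does not use Lagom, Akka, or Kafka, and it is not our production code: the paper-mail domain is borrowed from the prototype described earlier, and every name in it (MailRequested, MailSent, MailService, pending_view) is a hypothetical chosen for illustration. The idea it shows is that commands are validated and recorded as events in an append-only journal, state is only ever changed by applying those events, and read models are derived by replaying the journal.

```python
from dataclasses import dataclass, field


# Events: immutable facts about things that have already happened.
@dataclass(frozen=True)
class MailRequested:
    mail_id: str
    recipient: str


@dataclass(frozen=True)
class MailSent:
    mail_id: str


@dataclass
class MailState:
    # Maps mail_id -> "requested" or "sent".
    status: dict = field(default_factory=dict)


def apply_event(state, event):
    """Fold one event into the state; the only place state changes."""
    if isinstance(event, MailRequested):
        state.status[event.mail_id] = "requested"
    elif isinstance(event, MailSent):
        state.status[event.mail_id] = "sent"
    return state


class MailService:
    """Command side: validates commands and records the resulting events."""

    def __init__(self):
        self.journal = []  # append-only event log (in memory for this sketch)
        self.state = MailState()

    def _persist(self, event):
        self.journal.append(event)
        apply_event(self.state, event)

    def request_mail(self, mail_id, recipient):
        if mail_id in self.state.status:
            raise ValueError(f"mail {mail_id} already exists")
        self._persist(MailRequested(mail_id, recipient))

    def mark_sent(self, mail_id):
        if self.state.status.get(mail_id) != "requested":
            raise ValueError(f"mail {mail_id} is not awaiting delivery")
        self._persist(MailSent(mail_id))


def replay(journal):
    """Rebuild state from scratch using nothing but the event log."""
    state = MailState()
    for event in journal:
        apply_event(state, event)
    return state


def pending_view(journal):
    """Query side: a read model derived purely from events."""
    return sorted(
        mail_id
        for mail_id, status in replay(journal).status.items()
        if status == "requested"
    )
```

In a real deployment the journal would live in a durable store and the events would be published to other services (for us, over Kafka) rather than kept in an in-memory list; the sketch only shows the separation between the write path and the read path.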
With support from Lightbend, we kicked off a series of training sessions, including workshops and offsites, to bring the rest of the engineering and product organization up to speed. Every team was encouraged to migrate to the monorepo-based CI/CD pipeline and to shape its roadmap to enable and support the development of microservices.
Lightbend and its Akka-based platform stack have been a valuable partner and enabler throughout our journey. Lightbend’s own report, based on a review we participated in, provides more details. In summary, we experienced significant productivity gains. More importantly, Lightbend’s framework and its engineers created strong engagement and incentives within our own engineering teams to migrate away from the monolith.
Rewards from the journey
Our platform’s technology is now far more scalable and responsive to growth and expansion than ever before, while providing a “white glove” experience to our developers. Shipping our microservices is now a routine, automated task that requires little to no effort. Opting to build business domain-based microservices has allowed teams to develop deep business knowledge and expertise, and to truly “own” their services.
Other benefits include:
- Improved developer experience is leading to improved developer productivity
- Complex features no longer always translate to complexity in design or code
- Reduced change-related impact due to the localized nature of changes
Some related “upshot” metrics:
- Developer onboarding onto the new microservices is much shorter (4 days vs. 4 weeks for the monolith)
- Time from code commit to deploy dropped from 2 hours to under 10 minutes
- Number of developer environments reduced from 7 to 2 in the monorepo
- We have migrated approximately half of our monolith’s existing domains and their capabilities into new microservices
Some final thoughts and our “Domain Oriented Services” North Star
We realize we will never entirely slay the monolith. And to be clear, that was not the goal when we began this journey. However, we now enjoy a mindset and culture in which, by default, we look to break the monolith down further along roughly discrete business domains (diagrammed below), while acknowledging that adding to the monolith is still an available option, especially to support urgent business needs.