Microservices War Stories

This post originally appeared on the Fastly blog. You can find the original version here.

Recently I had the opportunity to present on the topic of microservices at OSCON, where I covered some of the pitfalls and lessons learned from building several service-oriented systems. The talk explored some of the problems with building, testing, and deploying a functional microservice architecture, from data loss to dependency nightmares, drawing on collected war stories and personal experience. Here are some of the stories that I shared and lessons I’ve learned.

The popularity of implementing microservices in today’s application landscape continues to rise. There have been countless success stories focused on migrating from a monolithic architecture (a single large application stored in one code repository) to microservices, in which parts of application logic are broken into smaller functional services. As more teams move toward microservices architectures, an increasing number of stories have arisen about the pain of poor choices. Microservices are not the answer to all application problems. Attempts to move away from one giant application to smaller focused services often result in a tightly coupled nest of applications. Some of these problems can be avoided by learning from the mistakes of existing architectures.

Microservice vs monolithic architecture

The microservice architecture is best understood in contrast to a monolithic architecture. Historically, in a monolithic application, all server-side logic for retrieving and manipulating records from a database and presenting information in a user interface lived in a single application and code repository. As applications grew larger and more complex, they began to resemble one giant block, or monolith. In contrast, a microservice architecture breaks application logic into smaller, isolated services that can exist independently. Each service is typically responsible for a small (micro) piece of functionality within the greater application system. To learn more about microservices, I recommend reading the extensive collection of articles written by Martin Fowler and James Lewis.

Why microservices?

The microservice architecture provides several advantages:

1. Independent deployment and scalability of services

Monolithic applications are deployed to a production environment on a server and are typically scaled by increasing the number of servers. The microservice architecture allows separation of the application logic into smaller parts that can be developed and deployed independently. New application functionality can be released continuously in smaller parts instead of waiting for a specified time to deploy the entire application. By splitting application logic into smaller services, parts of the architecture that require more processing power can be duplicated and scaled without unnecessarily replicating the entire application.

2. Compartmentalization of teams and responsibilities

Smaller services allow teams to focus on parts of the application logic without having knowledge of the entire application. For example, developers focused on presentation logic in the user interface can exist as an independent team. At Fastly, our UX team works on an application that consumes the Fastly API and presents this data in a user-friendly interface. This architecture allows the team to focus on the user experience without needing an intimate understanding of the backend application logic. For some companies, application architecture begins to resemble the organization’s communication structures over time, which is commonly known as Conway’s Law.

3. Technical design freedom

Smaller services also allow teams to build using the technology best suited for the task. The UX team can use a popular JavaScript framework to present information in the views. The API team can use the Ruby or Python programming languages, which have helpful web application frameworks for retrieving records from a database, manipulating that data, and returning it via an API. Additionally, application logic that needs to be highly performant, such as real-time data processing, can be moved to services written in programming languages like Go or C.

4. Fault tolerance of the system

Deploying services independently allows the overall system to be more tolerant of problems that may arise in some parts of the application logic. For monolithic applications, errors in one small part of the application logic can result in the entire system being unavailable. With microservices, the system can and should be built to gracefully handle situations where parts of the application are unavailable without affecting overall performance.
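As a rough illustration of graceful degradation, the sketch below wraps a call to a dependent service and falls back to a default value when that service is unavailable. The names `ServiceUnavailable`, `with_fallback`, `fetch_recommendations`, and `render_page` are hypothetical and exist only for this example; a real system would likely add timeouts and retries as well.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) client when a downstream service cannot be reached."""


def with_fallback(call: Callable[[], T], fallback: T) -> T:
    """Run `call`; if the downstream service is unavailable, return `fallback`."""
    try:
        return call()
    except ServiceUnavailable:
        # Degrade gracefully so the rest of the response can still be built.
        return fallback


def fetch_recommendations() -> list[str]:
    # Stand-in for a network call to another service; in this sketch it always fails.
    raise ServiceUnavailable("recommendation service is down")


def render_page() -> dict:
    # The page still renders, just without the optional recommendations.
    recommendations = with_fallback(fetch_recommendations, fallback=[])
    return {"title": "Dashboard", "recommendations": recommendations}
```

Here the failure of one optional service costs the user a single section of the page rather than the whole page.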

These benefits have enticed more and more organizations to move toward a microservice architecture. For some teams, this isn’t always a smooth transition. At Fastly, and at previous jobs, I’ve had the opportunity to work on several service-oriented architectures in high-traffic production scenarios. These experiences haven’t left me immune to the pitfalls associated with working on microservices, but I have been able to learn from past mistakes. Note: not all of the following stories are from my time working at Fastly.

Story 1: Supporting a new content type

A new feature required the API to support multiple media types. As part of this change, a bug was introduced into the codebase that caused an incorrect Content-Type header to be set. As a result, the service responsible for the user interface was unable to process the API responses, causing errors in production. In effect, the bug changed the contract between the UI and the API. With a monolithic architecture, this bug might have been caught earlier in the development process because automated integration tests likely would have failed. In a microservice architecture, a QA team might have been able to prevent this change from being deployed to production. Alternatively, running automated tests for the UI against the updated API in a staging environment might have caught the problem.
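One way to picture such a check is a small contract test that inspects an API response the way the UI would. The sketch below is an assumption-laden illustration, not the test that was actually used: it assumes the UI expects `application/json`, and the function name `contract_violations` is made up for this example.

```python
import json

EXPECTED_MEDIA_TYPE = "application/json"  # assumption: the UI expects JSON


def contract_violations(headers: dict[str, str], body: str) -> list[str]:
    """Return the ways an API response breaks the contract the UI relies on."""
    violations = []
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    content_type = normalized.get("content-type", "")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type != EXPECTED_MEDIA_TYPE:
        violations.append(f"unexpected Content-Type: {content_type!r}")
    try:
        json.loads(body)
    except json.JSONDecodeError:
        violations.append("body is not valid JSON")
    return violations
```

Run against a staging deployment of the updated API, a check like this would fail as soon as the header changed, before the UI ever saw the response in production.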

Story 2: Removing a feature flag

Isolating services in separate code repositories for ease of deployment can add overhead to the development process. Feature flags are a commonly used tool in software development that allow features to be released gradually to a subset of users. An administrator can turn a feature flag on or off for a set of users via controls in the UI. In a microservice architecture, the logic for a feature flag can spread over several services: (1) the user interface presents the current state of the feature flag for a given user; (2) an authorization service determines which users have the feature flag enabled; and (3) another service prepares and returns different data depending on whether a user has the feature flag enabled. Once the feature is enabled for all users and no longer needs to be gated, all application logic for the flag can be removed. In the architecture described here, this requires deleting code in three different code repositories and performing three different deploys to remove the code from the production environment. Ensuring all parts of the application are updated requires contextual overhead and coordination that may not be necessary with a monolithic architecture. Before moving to a microservice architecture, organizations should weigh the benefits of having specialized teams against the additional costs of communicating and coordinating changes across all teams.

Story 3: Tightly coupled services

In some cases, moving application logic into separate services results in services that are too tightly coupled to exist independently. This coupling can cause problems in the production environment as well as in the development and test environments. In the production environment, if one service is unavailable, it is likely that both services will become unavailable.

In a development environment, one service cannot run without the other service also running, which sometimes adds operations overhead for setting up a development environment in which two very different services can run and communicate. Before moving to a microservice architecture, organizations should consider whether they have an infrastructure team that can support a complicated development and deployment process.

In an automated testing environment, tests cannot be run without proper setup and teardown of both services. One solution for isolating services is to build client applications for each service. Clients provide an interface for interacting with another service, both for real interactions in the development and production environments and mock interactions that can be used in automated testing. Cistern is an example of a client framework that can be used for building clients using the Ruby programming language.
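The client idea can be sketched in Python as two interchangeable implementations of the same interface: one that talks to the real service and one that keeps data in memory for tests. This is a loose illustration of the pattern rather than a port of Cistern; the class names, the `get_user` method, and the `/users/<id>` path are all hypothetical.

```python
import json
import urllib.request
from typing import Protocol


class UserClient(Protocol):
    """Interface that both the real and the mock client satisfy."""

    def get_user(self, user_id: int) -> dict: ...


class HttpUserClient:
    """Talks to the real service; the URL layout is an assumption for this sketch."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")

    def get_user(self, user_id: int) -> dict:
        url = f"{self.base_url}/users/{user_id}"
        with urllib.request.urlopen(url) as response:
            return json.load(response)


class MockUserClient:
    """In-memory stand-in for tests; no other service needs to be running."""

    def __init__(self) -> None:
        self._users: dict[int, dict] = {}

    def add_user(self, user_id: int, name: str) -> None:
        self._users[user_id] = {"id": user_id, "name": name}

    def get_user(self, user_id: int) -> dict:
        return self._users[user_id]


def greeting(client: UserClient, user_id: int) -> str:
    """Application code depends only on the interface, not on a running service."""
    return f"Hello, {client.get_user(user_id)['name']}!"
```

In a test, the mock is seeded and passed in place of the real client, for example `client = MockUserClient(); client.add_user(7, "Ada"); greeting(client, 7)`, so the suite needs no setup or teardown of the other service.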

Story 4: String vs. integer

Isolating application logic and development teams means additional communication and effort are required to ensure services work well together. Services need a clear and documented interface for interactions. Miscommunication between services can result in errors and sometimes data loss.

One example of such a miscommunication arose with an API endpoint that allowed related database records to be created, updated, or deleted by making a request and passing a JSON blob of nested objects. If a child object didn’t exist in the database, it would be created. If it did exist in the database, it would be modified or deleted. For the service to find the record in the database, the JSON blob needed to contain an id attribute strictly in the form of an integer. Because the user interface was unaware of the strict integer requirement, requests were being passed with an id attribute in the form of a string. As a result, the service was unable to find matching records in the database. In turn, the application deleted the records that matched the integers and created records matching the strings, resulting in data loss.

This situation might have been avoided if the UI and application logic had existed in one monolithic application. It also could have been avoided if the application logic had been more lenient about which data types it accepted for the id attribute. The example shows that care must be taken when isolating teams of developers and application logic into separate services.
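A more lenient handling of the id attribute might look like the sketch below, which normalizes ids before comparing them with what is stored. It is only an illustration under stated assumptions: the function names are hypothetical, and it assumes that children missing from the payload are the ones to delete, which the original service may or may not have done.

```python
import re


def normalize_id(raw: object) -> int:
    """Coerce an id from a JSON payload to an int, rejecting anything ambiguous."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(raw, bool):
        raise ValueError(f"invalid id: {raw!r}")
    if isinstance(raw, int):
        return raw
    # Accept strings made only of ASCII digits, such as "42".
    if isinstance(raw, str) and re.fullmatch(r"[0-9]+", raw.strip()):
        return int(raw.strip())
    raise ValueError(f"invalid id: {raw!r}")


def plan_child_changes(existing_ids: set[int], children: list[dict]) -> dict[str, list[int]]:
    """Decide which child records to create, update, or delete.

    Assumption for this sketch: existing children absent from the payload are deleted.
    """
    incoming = {normalize_id(child["id"]) for child in children}
    return {
        "create": sorted(incoming - existing_ids),
        "update": sorted(incoming & existing_ids),
        "delete": sorted(existing_ids - incoming),
    }
```

With ids normalized first, a payload such as `[{"id": "1"}, {"id": 2}]` against existing records 1 and 2 plans two updates and no deletions. Comparing the raw string `"1"` to the integer `1` is what would make record 1 look missing. Rejecting malformed ids with an error, rather than guessing, keeps a bad request from silently rewriting data.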

A cautionary tale

Don’t let these examples dissuade you from considering a microservice architecture, but remember that microservices are not the answer to all application problems. The contextual and operational costs of supporting microservices are the price of being able to develop and deploy features independently. Organizations should consider whether they can support the operational and communication overhead that comes from isolating teams and applications into smaller services. Additionally, you should take care not to create knowledge silos when choosing this architecture pattern: communication and documentation are key components of a successful microservice architecture. An automated test suite, including integration tests that cross boundaries between services, helps ensure that the contract between services hasn’t been broken. Build services to exist independently to avoid pain in production and development. If your organization is unable to invest the time to ensure that services can exist independently, microservices might not be the answer.
