In July 2015, Accusoft's SaaS Applications team was tasked with integrating the recently acquired edocr application with Prizm Share, our existing community for publishing and sharing documents. While this integration posed numerous challenges in data migration, feature parity, and cohesive branding, we are going to focus here on the architectural changes that resulted from the project.
Both projects were initially built as LAMP applications: edocr on Drupal 5, and Prizm Share in-house by a fledgling apps development team. As they evolved independently, each saw the expected growth in features, along with a corresponding growth in code base size and technical debt. The decision to merge the two products under a single brand and code base meant choosing a target platform.
Initially, we considered two options:
- Migrate Prizm Share data to the existing edocr platform - With edocr carrying more traffic than Prizm Share, merging onto that base made the most sense from a customer-centric perspective. However, edocr was running on a very dated Drupal 5 instance and needed an approach that could be supported well into the future. We also considered staying on Drupal and upgrading from version 5 to the current release, 8. Our test upgrades went poorly, however, because numerous plugins and custom modules that edocr relied on were not available or supported in the latest version.
- Migrate edocr data to the existing Prizm Share platform - With edocr based on the very outdated Drupal 5 platform, the primary contender was to move the application to our existing PHP Prizm Share code base and add the features it didn't yet support, so that it could serve the existing customers of both products. The problem was that the Prizm Share code was already showing signs of rigidity, the result of a lack of foresight in the initial architecture decisions. New features were increasingly difficult to fit into the existing framework, and the team had been looking to move to something more modular.
What the team eventually decided on was a complete overhaul of the whole system, targeting the existing feature sets of both products along with a specific list of MVP requirements for our new consumer document platform.
To ensure flexibility in the future and to keep a positive connotation on the word "legacy," we decided to adopt a microservice architecture built on a lightweight application framework we developed in Node.js. The framework handles configuration and dependency management for each of our services. By separating our functionality into independently managed components and updating our build and deployment system to deploy them as Docker containers, the team was able to reduce friction with code changes, improve code testability, and drastically reduce the time from commit to production.
How Big is Too Big?
Having decided to reengineer the application on a microservice architecture, the team began the process of designing a scalable, maintainable base on which to rebuild edocr.
One of the first architectural issues we faced was a very common one in software design—how should the components be delimited and broken up? Our engineers had varying opinions on this topic and our outlook on the options changed as we iterated on the design.
Coming from an object-oriented mindset, our initial approach was to separate each "class" into its own service. This made for a very comfortable context switch from the previous object-oriented code base, and the lines were already drawn, so to speak. Our initial design was to house all logic in an API service that could be exposed to our Web application, various in-house mobile apps, and potentially the public at some future date. This layer would depend on services that acted as models for the various types of objects in the application hierarchy.
Based on feedback from the development and architecture teams through the initial process, we began looking at making the services even smaller. There are numerous benefits to breaking down to units of atomic functionality as separate services:
- Maintainability - One of the core goals of a microservice architecture is the ability to avoid bloated, unmaintainable "legacy" code after a period of time. By depending on individual small services that provide a single piece of functionality to the greater application, any one piece can easily be refactored or even replaced entirely with little friction to the overall product.
- Robustness - Code paths involving failing services can be avoided. With a large number of interoperating services in the ecosystem, there will invariably be times when something doesn't go as planned. By having each small piece of functionality in its own service and writing good error handling when talking between services, any unavailable services can be worked around or skipped if the application logic allows for it.
- Testability - Acceptance testing of individual services can be broken down into small, manageable test sets. By coding each service to a very small public interface (often one or two endpoints in our current case), the scope of automated testing for that functionality becomes much narrower.
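The robustness point above can be illustrated with a minimal sketch. It assumes a page whose core data is required while a "related documents" lookup is optional. The helper `callOptional`, the function `buildDocumentPage`, and the `services.documents` and `services.related` objects are hypothetical names chosen for illustration, not part of the framework described here.

```javascript
// Hypothetical helper: call a non-critical service and return a fallback
// value if the call throws, rejects, or takes longer than timeoutMs.
async function callOptional(serviceCall, fallback, timeoutMs = 500) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timed out")), timeoutMs);
  });
  try {
    // Whichever settles first wins: the service response or the timeout.
    return await Promise.race([serviceCall(), timeout]);
  } catch (err) {
    // The service is unavailable or too slow, so skip it.
    return fallback;
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical caller: the document itself is required, so a failure there
// propagates; the related-documents list is optional and degrades to [].
async function buildDocumentPage(services, documentId) {
  const document = await services.documents.get(documentId);
  const related = await callOptional(
    () => services.related.list(documentId),
    []
  );
  return { document, related };
}
```

Whether a given call can be skipped is an application decision. The sketch only shows that the choice can be made per call rather than failing the whole request.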
One of the core concerns with separating into such small services was the need for each service to talk to numerous other services to perform its processing. To facilitate this, we created a service routing component that takes in a JSON configuration file and provides native access to the registered services. Developers do not need to know anything about the implementation of the target service, including its port number or even its hostname if the service is not local. While our code is currently all in Node.js, porting this library to another language would be straightforward, which gives us the flexibility to talk between systems, services, and languages without the boilerplate code of manual HTTP calls.
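A routing component of this kind might look roughly like the following sketch. The configuration shape, the service names, the `createServiceRouter` function, and the use of JSON over HTTP POST are all assumptions made for illustration; the actual component is not shown in this article. The default transport assumes the global `fetch` available in Node.js 18 and later.

```javascript
// Hypothetical registry contents. In practice this object would be parsed
// from the JSON configuration file.
const registry = {
  thumbnails: { host: "thumbs.internal", port: 4010 },
  search: { port: 4020 }, // host omitted: assume the service is local
};

// Build one client object per registered service so that callers never
// deal with hostnames, ports, or raw HTTP requests.
function createServiceRouter(config, transport = fetch) {
  const clients = {};
  for (const [name, { host = "localhost", port }] of Object.entries(config)) {
    const baseUrl = `http://${host}:${port}`;
    clients[name] = {
      baseUrl,
      // Send a JSON body to the given path and return the parsed response.
      async call(path, body) {
        const response = await transport(`${baseUrl}${path}`, {
          method: "POST",
          headers: { "content-type": "application/json" },
          body: JSON.stringify(body),
        });
        if (!response.ok) {
          throw new Error(`${name} responded with ${response.status}`);
        }
        return response.json();
      },
    };
  }
  return clients;
}
```

With this in place, a caller would write something like `const services = createServiceRouter(registry)` followed by `await services.search.call("/query", { q: "invoice" })`. Passing the transport as a parameter is a design choice of the sketch that makes it easy to substitute a fake in tests.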
We kept the single public gateway, which acts as a router and authentication handler for the underlying services. That API gateway and various utility microservices interact with the functionality-providing services via the routing component, leading to a decentralized application structure that aids long-term maintainability.
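The gateway's two responsibilities, authentication and routing, can be sketched as a single request handler. Everything named here is hypothetical: the `routeTable` contents, the `createGateway` function, the `verifyToken` callback, and the simplified request and response objects stand in for whatever HTTP layer and authentication scheme are actually used.

```javascript
// Hypothetical route table: public path prefix -> internal service name.
const routeTable = {
  "/documents": "documents",
  "/search": "search",
};

// Build a handler that authenticates each request, then forwards it to the
// internal service registered for the matching path prefix.
function createGateway({ routes, services, verifyToken }) {
  return async function handle(request) {
    // verifyToken is assumed to return a user object, or null if invalid.
    const user = verifyToken(request.headers.authorization);
    if (!user) {
      return { status: 401, body: { error: "unauthorized" } };
    }
    const prefix = Object.keys(routes).find(
      (p) => request.path === p || request.path.startsWith(p + "/")
    );
    if (!prefix) {
      return { status: 404, body: { error: "not found" } };
    }
    // Strip the public prefix before handing the path to the service.
    const servicePath = request.path.slice(prefix.length) || "/";
    const service = services[routes[prefix]];
    const body = await service.call(servicePath, {
      user,
      payload: request.body,
    });
    return { status: 200, body };
  };
}
```

In this sketch the internal services never validate credentials themselves. They receive an already-verified `user` from the gateway, which keeps authentication logic in one place.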
In retrospect, it would have been beneficial to spend more time vetting the decisions made at this stage before beginning implementation. However, in true agile fashion, the team was able to adjust course during development and get the code base to a place that is much easier to compartmentalize and maintain going forward.
Continuous delivery and confidence in testing
One of the primary goals of our recent redesign of the edocr document application was to continue our progress towards a continuous delivery system and away from a model of scheduled maintenance windows and releases tied to sprints.