Tools to Monitor and Visualize Microservices Architecture

Previous articles in ProgrammableWeb's microservices series look at what microservices are and explain differences between monolithic and microservices architectures.

Once a distributed application is built and deployed, it is crucial to monitor and visualize it to make sure the software is reliable, available, and performs as expected. That isn’t necessarily easy.

The heterogeneous and distributed nature of applications driven by a microservices architecture makes monitoring, visualization, and analysis a difficult prospect. Traditional application monitoring and performance management (APM) solutions are not suited for today’s complex distributed applications.

Fortunately, several new APM solutions have been launched within the past few years to address these issues. These APM solutions take advantage of advanced technologies such as artificial intelligence (AI), machine learning, and graph analysis to monitor, visualize, and analyze microservices architectures. Many of these modern APM solutions also include distributed tracing and topology visualization capabilities necessary for effectively managing microservices architectures.

Distributed Tracing and Topology Visualization

Among the current open source distributed tracing systems are Zipkin, HTrace, X-Trace, and Trace. There is also the OpenTracing Project, which aims to provide vendor-neutral APIs so that distributed tracing and context propagation can be implemented on all popular platforms.

Adrian Cockcroft, a technology fellow at Battery Ventures and former chief architect at Netflix, described distributed tracing in a recent presentation saying, “Distributed tracing systems collect end-to-end latency graphs (traces) in near real-time. You can compare traces to understand why certain requests take longer than others.”
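To make the idea of comparing traces concrete, here is a minimal Python sketch. It is not how Zipkin or any other tracing system stores data; the Span class, the service names, and the timings are all hypothetical and exist only to illustrate the comparison. Each trace is a list of timed spans, and the code sums the time spent in each service, then ranks services by how much extra time the slower request spent in them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Span:
    """One timed unit of work inside a request (hypothetical structure)."""
    trace_id: str    # identifies one end-to-end request
    service: str     # service that did the work (made-up names)
    start_ms: float  # start time relative to the start of the request
    end_ms: float    # end time relative to the start of the request

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def time_per_service(spans):
    """Sum span durations by service for a single trace."""
    totals = defaultdict(float)
    for span in spans:
        totals[span.service] += span.duration_ms
    return dict(totals)


def compare_traces(fast, slow):
    """Return (service, extra_ms) pairs, largest slowdown first."""
    fast_totals = time_per_service(fast)
    slow_totals = time_per_service(slow)
    services = set(fast_totals) | set(slow_totals)
    deltas = {
        name: slow_totals.get(name, 0.0) - fast_totals.get(name, 0.0)
        for name in services
    }
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: two requests that follow the same path.
fast_trace = [
    Span("t1", "gateway", 0, 5),
    Span("t1", "catalog", 5, 25),
    Span("t1", "pricing", 25, 40),
]
slow_trace = [
    Span("t2", "gateway", 0, 5),
    Span("t2", "catalog", 5, 28),
    Span("t2", "pricing", 28, 140),
]

worst_service, extra_ms = compare_traces(fast_trace, slow_trace)[0]
print(worst_service, extra_ms)
```

In this made-up data, the pricing service accounts for almost all of the extra latency in the slow request, which is the kind of answer a trace comparison is meant to surface.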

Not all of the open source distributed tracing systems available include topology visualization capabilities, which can be an important feature. Topology visualization maps or diagrams the layout of applications in a microservices architecture and in other distributed applications. Doing so is critical when you need to discover performance issues and other problems.
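One way a topology map could be derived is from the parent/child relationships already present in trace data. The Python sketch below assumes each span records its own ID, its parent's ID, and the service that handled it; those field names and the services are hypothetical, and real tools may discover topology quite differently. The result is a caller-to-callee map that can be written out in Graphviz DOT format for drawing.

```python
from collections import defaultdict


def build_topology(spans):
    """Map each calling service to the sorted list of services it calls.

    Each span is a dict with the hypothetical keys "id", "parent_id",
    and "service".
    """
    service_of = {span["id"]: span["service"] for span in spans}
    edges = defaultdict(set)
    for span in spans:
        parent = span["parent_id"]
        if parent is None or parent not in service_of:
            continue  # root span, or parent missing from this sample
        caller, callee = service_of[parent], span["service"]
        if caller != callee:  # ignore calls that stay inside one service
            edges[caller].add(callee)
    return {caller: sorted(callees) for caller, callees in edges.items()}


def to_dot(topology):
    """Render the topology as Graphviz DOT text."""
    lines = ["digraph services {"]
    for caller in sorted(topology):
        for callee in topology[caller]:
            lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative spans from a single request.
spans = [
    {"id": "a", "parent_id": None, "service": "gateway"},
    {"id": "b", "parent_id": "a", "service": "catalog"},
    {"id": "c", "parent_id": "a", "service": "pricing"},
    {"id": "d", "parent_id": "b", "service": "inventory"},
]

topology = build_topology(spans)
print(to_dot(topology))
```

Feeding spans from many requests into the same function would gradually fill in the full map of which services depend on which.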

Screenshot of the SimianViz demo showing a more complex Netflix visualization.

Adrian Cockcroft recently released a new open source tool, SimianViz (formerly Spigo), that generates large-scale simulations of complex microservices. Companies can use these simulations for visualizing topologies and for stress testing microservices monitoring solutions without having to set up large test configurations.

Major technology companies, such as Netflix and LinkedIn, have built their own distributed tracing and performance monitoring solutions. Netflix built its several distributed tracing tools because it needs scalability; most commercial tools are unable to scale at the level Netflix requires. Netflix also uses a variety of visualization tools, including on-demand CPU flame graphs for analyzing and optimizing Java and Node.js application performance.

LinkedIn has a real-time distributed tracing system that uses Apache Samza results to build real-time call graphs. The call graphs are used for performance optimization and root cause analysis of the LinkedIn distributed architecture.

Most companies don't have the extensive resources of companies like LinkedIn and Netflix, so building a custom distributed tracing and performance monitoring solution from the ground up may not be possible. Fortunately, there are a number of tools that developers at companies of any size can use for monitoring and visualizing distributed applications.

Monitoring and Visualization of Microservices Architectures

There’s a lot going on in any system built on a microservices architecture. A microservices architecture typically consists of dozens, sometimes hundreds, of fine-grained services; every user transaction goes through many of those services. In addition, transactions are often asynchronous, involving multiple concurrent service requests. Traditional APM products are typically unable to monitor distributed applications that process multiple concurrent service requests.
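The Python sketch below illustrates why concurrent requests complicate monitoring and how a shared trace ID helps. It is a simplified illustration, not any vendor's implementation: the service names, delays, and the in-memory "recorded" list are all hypothetical. Two transactions run at the same time, each fanning out to three services concurrently, so the service calls complete in interleaved order. Tagging every call with the transaction's trace ID is what allows the calls to be grouped back together afterward.

```python
import asyncio
import contextvars
import uuid

# Carries the current transaction's trace ID across concurrent tasks.
trace_id_var = contextvars.ContextVar("trace_id", default=None)

# Stand-in for a monitoring backend: (trace_id, service) per finished call.
recorded = []


async def call_service(name, delay_s):
    """Simulate a downstream service call (hypothetical services)."""
    await asyncio.sleep(delay_s)
    recorded.append((trace_id_var.get(), name))
    return name


async def handle_transaction():
    """One user transaction that fans out to three services concurrently."""
    trace_id = uuid.uuid4().hex
    trace_id_var.set(trace_id)
    # Tasks created by gather copy the current context, so each call
    # sees this transaction's trace ID.
    await asyncio.gather(
        call_service("catalog", 0.02),
        call_service("pricing", 0.01),
        call_service("inventory", 0.03),
    )
    return trace_id


async def main():
    # Two transactions in flight at once; their calls interleave.
    return await asyncio.gather(handle_transaction(), handle_transaction())


trace_ids = asyncio.run(main())

# Group the interleaved records back into per-transaction views.
calls_by_trace = {tid: set() for tid in trace_ids}
for tid, service in recorded:
    calls_by_trace[tid].add(service)
print(calls_by_trace)
```

Without the trace ID, the six recorded calls would be an undifferentiated stream; with it, each transaction's set of service calls can be reassembled even though the calls finished out of order.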

The inherent complexity and high scalability requirements of microservices architectures have led to the creation of application monitoring and visualization tools that use machine learning, graph analysis, distributed tracing, topology visualization, and other cutting-edge technologies.

Here are just a few examples of these solutions, so you can get an idea of the tools available to help you understand what’s going on in your software.


Image Credit: AppDynamics

While AppDynamics has been around for quite some time, the company launched its machine learning-powered APM product in June 2015 to monitor, manage, and analyze complex architectures such as microservices. AppDynamics shows application performance in real time and automatically discovers application topology and interdependencies. Its APM tool also includes distributed tracing, topology visualization, and dynamic tagging.

Janet Wagner is a technical writer and contributor to ProgrammableWeb covering breaking news, in-depth analysis, and product reviews. She specializes in creating well-researched, in-depth content about APIs, machine learning, deep learning, computer vision, analytics, GIS/maps, and other advanced technologies.