What is Zipkin and how does it work?
Zipkin is a project that originated at Twitter in 2010 and is based on the Google Dapper paper. Observing a system from different angles is critical when troubleshooting, especially when the system is complex and distributed.
This blog provides comprehensive information about Zipkin: how to install it on RHEL, how to activate Zipkin in your application, and how to configure Elasticsearch for storage.
Zipkin helps gather the timing data needed to troubleshoot latency problems in service architectures. Its features include both the collection and lookup of this data. It also helps you find out exactly where a request to the application spent its time, whether in an internal call inside the code or in an internal or external API call to another service. To make this possible, you instrument the system so that services share a context; microservices usually share context by correlating requests with a unique ID.
Here’s an example sequence of HTTP tracing where the user code calls the resource /foo. This results in a single span, sent asynchronously to Zipkin after the user code receives the HTTP response.
Trace instrumentation reports spans asynchronously so that delays or failures in the tracing system do not delay or break user code.
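To make the context sharing concrete: Zipkin-instrumented services typically propagate trace identifiers between hops using B3 headers. Below is a minimal sketch of what one instrumented service forwards to the next; the header names are B3's, while the values are generated locally here purely for illustration (real instrumentation libraries create and forward them automatically).

```shell
# Generate illustrative B3 trace context values (8 random bytes -> 16 hex chars).
TRACE_ID=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')
SPAN_ID=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')

# The headers an instrumented hop would attach to its outgoing request.
B3_HEADERS="X-B3-TraceId: ${TRACE_ID}
X-B3-SpanId: ${SPAN_ID}
X-B3-Sampled: 1"
echo "$B3_HEADERS"
```

Every service that forwards these headers ends up reporting spans under the same trace ID, which is what lets Zipkin stitch the hops into one trace.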
To install Zipkin on RHEL:
We'll run the Zipkin tracing system using one of the following two options:
- Using Java (jar file)
- Running in Docker Container
Install Zipkin Using Docker
Prerequisite: Docker.
After installing Docker, run the following command to install Zipkin:
docker run -d -p 9411:9411 openzipkin/zipkin
Installing Zipkin Using Java
Install Java by running the following commands (the first two steps are needed only if Java is not already installed):
sudo yum -y install epel-release
sudo yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel jq vim
sudo alternatives --config java
To check that Java has been installed, run:
java -version
After installing the prerequisites, download the latest release of Zipkin:
curl -sSL https://zipkin.io/quickstart.sh | bash -s
Run the jar to start Zipkin:
java -jar zipkin.jar
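Once the server is up, you can confirm that it accepts spans by posting one in Zipkin's v2 JSON format to the HTTP collector endpoint. The service name, IDs, and timings below are made-up illustrative values; a 202 response from the server means the span was accepted.

```shell
# A minimal span in Zipkin v2 JSON format; ids and timings are illustrative.
SPAN='[{
  "traceId": "80f198ee56343ba8",
  "id": "e457b5a2e4d86bd1",
  "name": "get /foo",
  "timestamp": 1600000000000000,
  "duration": 150000,
  "localEndpoint": { "serviceName": "frontend" }
}]'

# POST it to the collector endpoint of a locally running Zipkin server.
curl -s -X POST 'http://localhost:9411/api/v2/spans' \
  -H 'Content-Type: application/json' \
  -d "$SPAN" || echo "Zipkin not reachable"
```

Note that the timestamp and duration are in microseconds, which is what the v2 span format expects.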
Configure Systemd:
Running Zipkin with the java -jar command will not survive system reboots. If your system supports systemd, you can create a service for it so systemd manages the application. First, move the jar file to the /opt directory:
sudo mkdir /opt/zipkin
sudo mv zipkin.jar /opt/zipkin
ls /opt/zipkin
First, create a system group and user for Zipkin:
sudo groupadd -r zipkin
sudo useradd -r -s /bin/false -g zipkin zipkin
sudo chown -R zipkin:zipkin /opt/zipkin
Then create a systemd service file.
sudo vim /etc/systemd/system/zipkin.service
Paste the following Zipkin service definition into the file:
[Unit]
Description=Zipkin distributed tracing service
Documentation=https://zipkin.io/
[Service]
WorkingDirectory=/opt/zipkin
ExecStart=/usr/bin/java -jar zipkin.jar
User=zipkin
Group=zipkin
Type=simple
Restart=on-failure
RestartSec=10
[Install]
WantedBy=multi-user.target
Next, reload systemd so the new unit takes effect:
sudo systemctl daemon-reload
Then start the service and enable it at boot:
sudo systemctl start zipkin.service
sudo systemctl enable zipkin.service
Check the status by typing:
sudo systemctl status zipkin.service
To enable Zipkin in a microservice:
The first change is in build.gradle, where we add the Spring Cloud dependencies for both Sleuth and Zipkin:
implementation('org.springframework.cloud:spring-cloud-sleuth-zipkin')
implementation 'org.springframework.cloud:spring-cloud-starter-sleuth'
The second change is to configure the URL where Spring publishes trace data to Zipkin, in the application configuration (application.properties shown here):
spring.zipkin.baseUrl=http://localhost:9411/
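If you keep your configuration in application.yml instead, the equivalent looks like the fragment below; it also sets Sleuth's sampling rate, where a probability of 1.0 exports every trace (convenient for testing, but usually lowered in production to reduce overhead):

```yaml
spring:
  zipkin:
    base-url: http://localhost:9411/
  sleuth:
    sampler:
      probability: 1.0
```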
Every service that needs the distributed tracing feature requires these two changes.
Zipkin provides a nice interface for viewing traces based on service, time, and annotations. Browse to http://localhost:9411/zipkin/ to access Zipkin Web UI and find traces.
Zipkin helps you find out exactly where a request to the application spent its time, whether in an internal call inside the code or in an internal or external API call to another service. You can see how much time the request spent in each function. For example, you might find that the problem is a call to a particular microservice and focus on reducing the latency in that service first. Or you could use these traces to understand the workflow of a request. What if you're calling a dependency more than once? With Zipkin, it's easy to spot those types of issues.
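The same trace data is also available over Zipkin's HTTP API, which is handy for scripted checks alongside the UI. A small sketch; the service name here is an assumption, so substitute one of your own instrumented services:

```shell
# Query Zipkin's v2 API for recent traces of one service.
SERVICE_NAME=frontend   # hypothetical service name; substitute your own
TRACES_URL="http://localhost:9411/api/v2/traces?serviceName=${SERVICE_NAME}&limit=10"

# List the service names Zipkin knows about, then fetch recent traces.
curl -s 'http://localhost:9411/api/v2/services' || echo "Zipkin not reachable"
curl -s "$TRACES_URL" || echo "Zipkin not reachable"
```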
To check if the Zipkin service is working on the RHEL host:
curl localhost:9411/zipkin/
You'll receive the HTML of the Zipkin UI, which begins like this (the bundled JavaScript is trimmed for brevity):
<!doctype html><html><head><base href="/zipkin/"><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><link href="./static/css/2.287eba14.chunk.css" rel="stylesheet"><link href="./static/css/main.dc336df7.chunk.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div>...</body></html>
You could also set up an SSH tunnel and check whether you can access the UI from your local system:
ssh -L 9411:localhost:9411 raj@34.69.77.64
Then open http://localhost:9411/zipkin/ in a browser: you should see the Zipkin UI.
Configure Elasticsearch as the storage type:
The Zipkin server bundles extensions for span collection and storage. Spans can be collected over HTTP, Kafka, or RabbitMQ transports; by default they are stored in memory, but they can be stored in Elasticsearch for long-term retention of the trace data.
You can specify this configuration when running the jar:
java -DSTORAGE_TYPE=elasticsearch -DES_HOSTS=http://127.0.0.1:9200 -jar zipkin.jar
Alternatively, you can create an env file containing the following configuration:
STORAGE_TYPE=elasticsearch
ES_HOSTS=elastic:9200
and pass the env file to docker run:
docker run -d --env-file=/home/zipkin.env -p 9411:9411 openzipkin/zipkin
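With Elasticsearch configured, you can verify that spans are actually landing in the cluster. Zipkin's Elasticsearch storage writes to daily indices whose names start with zipkin; the exact index pattern can vary by Zipkin version, so treat it as an assumption to check against your own cluster:

```shell
# Index pattern Zipkin's Elasticsearch storage typically uses (assumed here).
ES_INDICES_URL='http://127.0.0.1:9200/_cat/indices/zipkin*?v'

# List Zipkin indices in Elasticsearch, then confirm the Zipkin server itself
# reports healthy with the new storage backend.
curl -s "$ES_INDICES_URL" || echo "Elasticsearch not reachable"
curl -s 'http://127.0.0.1:9411/health' || echo "Zipkin not reachable"
```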
Conclusion:
Enterprises are increasingly adopting microservice architectures, developing and deploying more microservices every day. Often, these services are deployed in separate runtime containers and managed by different teams and organizations. Large enterprises can have tens of thousands of microservices. Visibility into the health and performance of this diverse service topology is extremely important for quickly determining the root cause of issues, as well as for increasing overall reliability and efficiency. The ability to sample requests, Zipkin's instrumentation libraries, and native support for Elasticsearch storage are the key reasons we use Zipkin to find latency issues in our services.
Credits: Nidhi Mittal