In this post, we will discuss what Docker HEALTHCHECK is and when it should be used, walk through a code example that demonstrates how the HEALTHCHECK command works, and show how to debug it.
What is Docker HEALTHCHECK
Docker HEALTHCHECK is a command used to test the status of a container in a systematic way. The command runs periodically, and if it fails repeatedly (3 consecutive times by default), the container is marked as unhealthy.
When should you use Docker HEALTHCHECK?
Docker HEALTHCHECK should be used to verify that a containerized application has started successfully and that it is still working, since a container can keep running after the application inside it has hit a fatal error.
For example, when running a database container, it’s essential to verify that the database is responding to queries. If the database is not responding, the container should be restarted automatically, or an alert should be sent to the system administrator.
What happens when no Docker HEALTHCHECK is defined
Assume we wrote a Dockerfile to containerize a Flask application app.py that has a single endpoint, /killswitch. This endpoint will simulate an unexpected application error by stopping the Flask application process and setting its exit code to 42.
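A minimal sketch of such an app.py might look as follows. The use of os._exit to terminate the process with exit code 42 is an assumption about how the killswitch is implemented, as is leaving the server start-up to the flask command line.

```python
import os

from flask import Flask

app = Flask(__name__)


@app.route("/killswitch")
def killswitch():
    # Assumption: terminate the whole process immediately with exit code 42.
    # os._exit skips normal cleanup, which mimics an abrupt application crash.
    os._exit(42)
```

os._exit is used here rather than sys.exit because sys.exit only raises an exception inside the request handler, which would not stop the server process.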
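A Dockerfile for this application could be sketched as below. The base image tag, port 5000 and the flask run entry point are assumptions chosen for illustration.

```dockerfile
# Assumed base image; any recent Python image should work.
FROM python:3.11-slim
WORKDIR /app
RUN pip install --no-cache-dir flask
COPY app.py .
EXPOSE 5000
# Exec form makes the Flask process PID 1, so its exit code
# becomes the exit code of the container.
CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]
```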
Let’s see what happens when we don’t define the Docker HEALTHCHECK command in the Dockerfile. Build the image no-check and run it.
Use the docker ps command to monitor the status of the no-check container.
Use the curl command to call the /killswitch endpoint and stop the Flask application process.
You’ll notice that the status of the container has changed to Exited (42).
This shows that if we don’t define the HEALTHCHECK command in the Dockerfile, then the status of a container changes according to the exit status code of the process run by the CMD command in the Dockerfile.
How to use the HEALTHCHECK command in a Dockerfile
Now, let’s see how to use the HEALTHCHECK command in a Dockerfile. We add the /ping endpoint to our Flask application app.py, which returns an HTTP status code of 200 OK when the application is running fine; otherwise 500 Internal Server Error is returned.
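One way the extended app.py could look is sketched below. In this sketch /killswitch flips a flag instead of exiting, so the process keeps running while /ping reports the failure. Both the flag (the hypothetical state dictionary) and this behaviour are assumptions.

```python
from flask import Flask

app = Flask(__name__)

# Hypothetical in-memory flag; /killswitch flips it to simulate a fault.
state = {"healthy": True}


@app.route("/ping")
def ping():
    # 200 OK while the application is fine, 500 once the fault is triggered.
    if state["healthy"]:
        return "pong", 200
    return "application error", 500


@app.route("/killswitch")
def killswitch():
    # Simulate an application error without terminating the process.
    state["healthy"] = False
    return "killswitch tripped", 200
```

A module-level flag like this only works when the application runs in a single process, which is the case for the Flask development server.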
We also add a HEALTHCHECK command to the Dockerfile. It uses curl to send a GET request to the /ping endpoint every 3 seconds, and each check is allowed at most 2 seconds to complete. If a request fails, the command terminates with an exit code equal to 1, indicating a failure.
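A Dockerfile with such a health check might look like the sketch below. The base image, port 5000 and the curl installation step are assumptions; slim base images usually do not ship curl, so it has to be installed for the check to work.

```dockerfile
# Assumed base image; any recent Python image should work.
FROM python:3.11-slim
WORKDIR /app
# curl is needed by the health check below.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir flask
COPY app.py .
EXPOSE 5000
# Probe /ping every 3 seconds; each probe may run for at most 2 seconds.
# curl -f returns a non-zero code on HTTP errors, which is mapped to exit code 1.
HEALTHCHECK --interval=3s --timeout=2s \
    CMD curl -f http://localhost:5000/ping || exit 1
CMD ["flask", "run", "--host=0.0.0.0", "--port=5000"]
```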
How to debug the HEALTHCHECK command
Let’s finally see the HEALTHCHECK command in action! Build the image with-check and run it.
Instead of using docker ps
, this time we use docker inspect
command to debug the health checks of the with-check
container. The command outputs a JSON that we then pipe into jq, a nifty tool for manipulating JSON data:
|
|
In the State.Health section of the output, let’s focus on the following fields: Status, FailingStreak and Log:
- the Status field indicates the current health status of the container;
- the FailingStreak field indicates the number of consecutive times that the container has failed the health checks. Since the container is currently healthy, this value is equal to 0;
- the Log field is an array of the last 5 health check executions for the container. Each element of the array represents a single health check.
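The same three fields can also be read programmatically. The sketch below is one possible approach, not part of the original walkthrough: the helper names read_health and summarize_health are hypothetical, and it assumes the docker CLI is on the PATH and a container named with-check is running.

```python
import json
import subprocess


def read_health(container: str) -> dict:
    """Return the State.Health section of `docker inspect` for a container."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .State.Health}}", container],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarize_health(health: dict) -> str:
    """Condense the Status, FailingStreak and Log fields into one line."""
    # Log can be empty or null before the first health check has run.
    exit_codes = [entry["ExitCode"] for entry in health.get("Log") or []]
    return (
        f"status={health['Status']} "
        f"failing_streak={health['FailingStreak']} "
        f"recent_exit_codes={exit_codes}"
    )


# Usage (requires a running container named with-check):
# print(summarize_health(read_health("with-check")))
```

Keeping the parsing separate from the docker call makes the summary easy to try out on saved docker inspect output as well.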
It’s time to trip our killswitch and see its effects on the container health.
After a few seconds, docker inspect shows that the container is still running even though its status is now unhealthy, since the last 3 health check executions terminated with an exit code of 1. We can manually shut it down and start a new container or let Docker Swarm do the dirty work for us.
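The transition from healthy to unhealthy can be sketched as a small simulation. This is a simplified model, not Docker's actual implementation: the function name simulate_health is hypothetical, the start period is ignored, and the threshold of 3 consecutive failures matches Docker's default.

```python
def simulate_health(exit_codes, retries=3):
    """Replay health check exit codes and return (status, failing_streak).

    Simplified model: ignores the start period and treats any non-zero
    exit code as a failed check.
    """
    status = "starting"
    failing_streak = 0
    for code in exit_codes:
        if code == 0:
            # A single success resets the streak and marks the container healthy.
            failing_streak = 0
            status = "healthy"
        else:
            failing_streak += 1
            # The status flips only after enough consecutive failures.
            if failing_streak >= retries:
                status = "unhealthy"
    return status, failing_streak


print(simulate_health([0, 0, 1, 1]))     # two failures in a row: still healthy
print(simulate_health([0, 0, 1, 1, 1]))  # three failures in a row: unhealthy
```

This is why the container does not turn unhealthy on the very first failed check after the killswitch is tripped.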
Notice that you can customize the number of consecutive failures required for bringing the container into the unhealthy state by adding the --retries=N option to the HEALTHCHECK command (the default is 3).
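As an illustration, the fragment below raises the threshold to 5 consecutive failures. The value 5 and port 5000 are assumptions picked for the example.

```dockerfile
# Mark the container unhealthy only after 5 consecutive failed probes.
HEALTHCHECK --interval=3s --timeout=2s --retries=5 \
    CMD curl -f http://localhost:5000/ping || exit 1
```

A higher value makes the check more tolerant of brief hiccups, at the cost of reacting more slowly to a real failure.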
Conclusion
Docker HEALTHCHECK is an important command that lets us define a command to periodically check the health of a running container. In this post, we went over the basic usage of the HEALTHCHECK command, discussed when to use it, walked through a code example that uses it in a Dockerfile to check whether a Flask application is healthy, and showed how to debug it.
Please drop a comment to let me know if you found this useful!