Best Practices for Stopping Applications Gracefully
Regardless of the type of application, it is desirable for a service to receive a stop notification before it is stopped, and to have a certain amount of time to release resources, close connections, and stop accepting external requests before it exits. This guide covers graceful ("elegant") stop configuration for all applications and walks through the complete graceful stop flow for each type of service, from development to deployment on AppEngine(K8s).
The simplest way to stop a container gracefully
Before a container exits, it is removed from the service's endpoint list so that external requests are no longer routed to it, and hooks can be executed before the exit.
Here, we configure the sleep 10 command to run in the container before the container is stopped. During this time the container no longer receives new external requests and has 10 seconds to finish processing the requests already in flight. After the 10 seconds, the container is stopped and deleted.
Linux common signals
To understand how graceful stop works, let's first review the common Linux signals that matter for containers.
Signals are a form of interprocess communication. A signal is a message delivered by the kernel to a process, telling the process that something has happened. A process registers handlers for the signals it is interested in. For example:
- To exit gracefully (cleaning up resources after receiving the exit signal), a well-behaved program handles the SIGTERM signal. Unlike SIGTERM, the SIGKILL signal ends a process abruptly and cannot be handled;
- Many daemons implement hot reloading of configuration files by handling the SIGHUP signal.
Use the kill -l command to display the list of signals supported by Linux. Signals 1-31 are the traditional UNIX signals; they are unreliable (non-real-time) signals. The higher-numbered signals were added later and are called reliable (real-time) signals; on Linux, kill -l lists them as 34-64.
The following describes the commonly used signals:
SIGHUP(1)
Sent by the system to all processes running in a session when the user's terminal connection ends (hangup). It is commonly used to trigger hot reloading of configuration files. The wget command handles SIGHUP(1) so that it can continue downloading after you log out of Linux. Similarly, services such as Docker, Nginx, and LVS handle SIGHUP(1) to reload their configuration files without restarting.
SIGINT(2)
The program interrupt signal. It is issued when the user types the INTR character (usually Ctrl+C) and tells the foreground process group to terminate.
SIGQUIT(3)
Similar to SIGINT, but triggered by the QUIT character (usually Ctrl+backslash). Nginx handles this signal to stop the service gracefully.
SIGKILL(9)
Ends the program immediately. This signal cannot be blocked, handled, or ignored, so a program cannot catch it.
SIGTERM(15)
The terminate signal, also known as the exit request signal. Unlike SIGKILL, it can be blocked and handled, so a program can register a handler for it to stop the service gracefully. The kill command sends this signal by default.
SIGCHLD(17)
Sent to the parent process when a child process ends. Nginx is a multi-process program; its master process uses this signal to learn that a worker process has exited.
To learn more about Linux signals, see the signal man page (man 7 signal).
Docker container service support for signals
Docker supports Linux signals in several of its commands.
- 1) docker stop command signal support
When we use the docker stop command to stop a container, Docker by default gives the application in the container 10 seconds to terminate. We can customize this timeout by specifying the --time/-t parameter when executing the docker stop command.
→ docker stop --help
Usage: docker stop [OPTIONS] CONTAINER [CONTAINER…]
Stop one or more running containers
Options:
--help Print usage
-t, --time int Seconds to wait for stop before killing it (default 10)
When the docker stop command is executed, it first sends the SIGTERM signal to the process with PID 1 (the main process) in the container, then waits for the application in the container to terminate. If the wait reaches the configured timeout (10 seconds by default), it sends the SIGKILL signal to forcibly kill the process. The application in the container can choose to ignore the SIGTERM signal, but once the timeout is reached, it is forcibly killed by the system.
- 2) docker kill command signal support
By default, the docker kill command gives the application in the container no opportunity for a graceful shutdown. It directly sends the SIGKILL signal to force the program in the container to terminate.
The help output of the docker kill command shows that, besides sending SIGKILL by default, it also lets us send other signals of our choice:
→ docker kill --help
Usage: docker kill [OPTIONS] CONTAINER [CONTAINER…]
Kill one or more running containers
Options:
--help Print usage
-s, --signal string Signal to send to the container (default "KILL")
For example, to send a SIGINT signal to a program in a container, we can run:
docker kill --signal=SIGINT container_name
Unlike the docker stop command, docker kill has no timeout setting. It sends the SIGKILL signal, or another user-specified signal, immediately.
- 3) docker rm command signal support
The docker rm command deletes a container that has stopped running. We can add the --force or -f parameter to delete a container that is still running. With this parameter, Docker sends a SIGKILL signal to the running container to stop it and then deletes it.
For example, to forcibly delete a running container named web:
docker rm -fv web
- 4) docker daemon process signal support
The docker daemon process handles the SIGHUP signal: after receiving it, the daemon reloads the daemon.json configuration file.
We send a SIGHUP signal to the dockerd process:
root@vm10-1-1-28:~# kill -SIGHUP $(pidof dockerd)
root@vm10-1-1-28:~# # or
root@vm10-1-1-28:~# systemctl reload docker
In the docker daemon's log, you can see that the daemon receives this signal and reloads the daemon.json configuration file:
root@vm10-1-1-28:~# journalctl -u docker.service -f
-- Logs begin at Sun 2018-01-07 09:17:01 CST. --
Jan 18 16:20:11 vm10-1-1-28.ksc.com dockerd[26668]: time="2018-01-18T16:20:11.262904839+08:00" level=info msg="Got signal to reload configuration, reloading from: /etc/docker/daemon.json"
Jan 18 16:21:41 vm10-1-1-28.ksc.com systemd[1]: Reloading Docker Application Container Engine.
Therefore, after you modify the /etc/docker/daemon.json file, you can send Docker a SIGHUP signal to reload the configuration file without restarting the docker daemon.
AppEngine(K8s) service signal support
AppEngine(K8s) uses Kubernetes as its orchestration engine and provides graceful termination support for running containers.
Whenever a Pod is deleted, whether because a user deletes an application through the interface or the command line or because the application scales out or in, the following process is triggered to terminate and remove the Pod:
- 1) After the service receives the delete Pod request, the graceful exit time of the Pod is updated according to the wait time;
- 2) Because the Pod now has a graceful exit time set, its state shown in the interface and on the client command line changes to "Terminating";
- 3) (Concurrent with Step 2) The background service kubelet starts shutting down the Pod:
  - If a preStop hook is defined in the Pod, it is invoked inside the Pod. If the hook is still running when the graceful exit time expires, a one-time extension of 2 seconds is granted;
  - The process with PID 1 in the Pod is sent a SIGTERM signal;
- 4) (Concurrent with Step 2) The background service Service manager removes the Pod from the service's endpoint list, so it is no longer considered part of the set of running Pods. A Pod that shuts down slowly can continue to serve external requests until the load balancer removes it from rotation;
- 5) When the graceful exit time is reached, any process still running in the Pod is sent a SIGKILL signal to forcibly kill it;
- 6) The background service kubelet completes the deletion of the Pod by setting the graceful exit time to 0 (indicating immediate deletion). The Pod is removed from the API and is no longer visible to clients.
From the above flow, we can see that there are two ways to add a graceful exit to an application running in AppEngine V2:
- If the process with PID 1 in the container is our application, we can receive and handle the SIGTERM signal in the program to achieve a graceful exit;
- We can also set a preStop hook and specify in the hook how to stop the container gracefully. The preStop hook currently supports two methods: sending an HTTP request to the container and executing a command in the container.
Note that if we do not set the wait time mentioned above, the default is 30 seconds. If it is set to 0, the SIGKILL signal is sent immediately to kill all processes in the Pod. If you set it, choose a value that suits your service to avoid deadlocks or other problems in the program.
Graceful service stop cases
Regardless of the service, a graceful stop means notifying the program before it is stopped so that it has a certain amount of time to finish processing, save its execution state, and exit cleanly. Below are two common cases that cover the scenarios of essentially all services: receiving and handling the TERM signal in the program, and specifying a preStop hook.
Both cases are described step by step; choose the method that suits your situation.
1. Receiving and processing signals within the program
From the section "Docker container service support for signals" above, we know that the docker kill command is suited to forcibly terminating the program and stopping the container quickly. If you want the program to shut down gracefully, docker stop is the better choice: after receiving the SIGTERM signal, the program has some time to finish processing and save its execution state, and can then exit cleanly.
Next we write a simple Go program that receives and handles signals. After it starts, it blocks and waits for a system signal; when it detects one of the registered signals, it prints a message to the console and exits.
// main.go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	fmt.Println("Program started…")
	// Buffered so that a signal arriving before we start receiving is not lost.
	ch := make(chan os.Signal, 1)
	// Register for SIGTERM(15) and SIGINT(2).
	signal.Notify(ch, syscall.SIGTERM, syscall.SIGINT)
	// Block until one of the registered signals arrives.
	s := <-ch
	switch s {
	case syscall.SIGINT:
		fmt.Println("SIGINT received!")
		// Do something…
	case syscall.SIGTERM:
		fmt.Println("SIGTERM received!")
		// Do something…
	}
	fmt.Println("Exiting…")
}
Next, use a cross-compilation method to compile the program so that the program can run under Linux:
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o graceful
After compiling, we test the program:
- 1) Test receiving the SIGINT signal. Start the process in the foreground and press Ctrl + C to send the SIGINT(2) signal.
lynzabo@ubuntu ~/r/g/s/edit> ./graceful
Program started…
^CSIGINT received!
Exiting…
lynzabo@ubuntu ~/r/g/s/edit>
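- 2) Test receiving the SIGTERM signal. Start the process in the background and use the kill command, which sends SIGTERM(15) by default.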
lynzabo@ubuntu ~/r/g/s/edit> ./graceful &
Program started…
lynzabo@ubuntu ~/r/g/s/edit> ps -ef | grep graceful
lynzabo 21223 21082 0 15:57 pts/21 00:00:00 ./graceful
lynzabo 21287 21082 0 15:57 pts/21 00:00:00 grep --color=auto graceful
lynzabo@ubuntu ~/r/g/s/edit> kill 21223
SIGTERM received!
Exiting…
“./graceful &” has ended
lynzabo@ubuntu ~/r/g/s/edit>
- 3) Package the above program into a container image and run it.
Dockerfile
FROM alpine:latest
LABEL maintainer "opl-xws@xiaomi.com"
ADD graceful /graceful
CMD ["/graceful"]
A common pitfall when handling SIGTERM signals
The CMD and ENTRYPOINT instructions in a Dockerfile define the container's start command. We will not repeat the differences between the two here and only cover the issue that needs attention when using them.
Both instructions support the following formats:
- shell format: CMD <command>
- exec format: CMD ["Executable", "Parameter 1", "Parameter 2"...]
- Parameter list format: CMD ["Parameter 1","Parameter 2"...]. After specifying the ENTRYPOINT instruction, use CMD to supply its parameters.
The exec format is generally recommended. It is parsed as a JSON array, so be sure to use double quotes " instead of single quotes '.
If you use the shell format, the actual command is executed as an argument to sh -c. For example:
CMD echo $HOME
is actually executed as:
CMD [ "sh", "-c", "echo $HOME" ]
The main process of the container is therefore sh. When a signal is sent to the container, it is delivered to the sh process rather than to our program, and sh does not pass it on. Our program never receives the signal and cannot exit gracefully.
Image build process:
lynzabo@ubuntu ~/r/g/s/edit> docker build -t cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0 .
Sending build context to Docker daemon 1.953 MB
Step 1/4 : FROM alpine:latest
---> 3fd9065eaf02
Step 2/4 : LABEL maintainer "opl-xws@xiaomi.com"
---> Using cache
---> 6cc05b3f0ed0
Step 3/4 : ADD graceful /graceful
---> Using cache
---> 4a47b371a124
Step 4/4 : CMD /graceful
---> Using cache
---> f1841c0035af
Successfully built f1841c0035af
lynzabo@ubuntu ~/r/g/s/edit>
- 4) Start the container:
lynzabo@ubuntu ~/r/g/s/edit> docker run -d --name graceful cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0
08d871007b58e55e9552cff23960c80faf51bf8637014a745dec060b80ac9a6f
lynzabo@ubuntu ~/r/g/s/edit> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
08d871007b58 cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0 "/graceful" 10 seconds ago Up 9 seconds graceful
lynzabo@ubuntu ~/r/g/s/edit>
- 5) Look at the container output and you can see that the program has started normally:
lynzabo@ubuntu ~/r/g/s/edit> docker logs graceful
Program started…
lynzabo@ubuntu ~/r/g/s/edit>
- 6) Next, use docker stop to check whether the program responds to the SIGTERM signal.
As described above, docker stop sends the SIGTERM signal and, by default, waits 10 seconds before sending the SIGKILL signal to forcibly stop all processes in the container. If the program's cleanup is too complex to finish within 10 seconds, the timeout can be extended. Here we tell docker stop to wait up to 2 minutes before forcibly killing the container:
lynzabo@ubuntu ~/r/g/s/edit> docker stop --time=120 graceful
graceful
lynzabo@ubuntu ~/r/g/s/edit> docker logs graceful
Program started…
SIGTERM received!
Exiting…
lynzabo@ubuntu ~/r/g/s/edit>
The log above shows that our program does handle the SIGTERM signal sent by Docker.
- 7) The following explains how to deploy the graceful program to AppEngine V2 and let AppEngine V2 control the graceful stop of the Pod.
- i. Create a new application and fill in the application information. To give the application a graceful exit time of 2 minutes, set the graceful exit time in the interface to 120 seconds. The application is then created successfully.
- ii. The application launches successfully.
- iii. Open the application startup log and keep following it.
Program started…
- iv. When we click Delete Application in the interface, the log shows that our program received the SIGTERM signal sent by AppEngine V2.
Program started…
SIGTERM received!
Exiting…
The steps above show how to implement a graceful stop inside the program itself. Next, we look at how to stop a service gracefully through the preStop hook.
2. Stop the service using AppEngine's preStop hook
Sometimes we want to stop the service gracefully by executing a command or sending an HTTP request before the service stops.
Examples:
- For a Spring Boot application, the Spring Boot Actuator provides a graceful stop endpoint. To stop the service, send an HTTP POST request to its shutdown endpoint.
- For the Nginx service, run the following command to stop the service gracefully:
kill -QUIT <Nginx master process PID>
Below we use the Nginx service to explain how to use the preStop hook to stop a service.
First, a review of Nginx basics:
Nginx is a multi-process service consisting of a master process and a number of worker processes. The master process is only responsible for verifying the configuration file syntax and creating the worker processes. The actual work of accepting client requests and carrying out the configuration directives is done by the worker processes. The master and worker processes interact mainly through Linux signals. Nginx provides many commands and handles many signals to implement configuration syntax checking, graceful stop, smooth restart, upgrade, and other functions. Here we only briefly describe the signals and the internal steps triggered by the commands related to stopping Nginx.
There are many ways to stop nginx. Normally, nginx is stopped by sending a signal to its master process.
- i. Stop nginx gracefully
[root@localhost ~]# nginx -s quit
[root@localhost ~]# kill -QUIT <Nginx master process PID>
[root@localhost ~]# kill -QUIT $(cat /usr/local/nginx/logs/nginx.pid)
When the master process receives the SIGQUIT signal, it forwards the signal to all worker processes. Each worker process closes its listening ports so that it no longer accepts new connections, closes idle connections, waits for the requests on active connections to finish, and then calls ngx_worker_process_exit to exit. After all worker processes have exited, the master process calls the ngx_master_process_exit function to exit.
- ii. Stop nginx quickly
[root@localhost ~]# nginx -s stop
[root@localhost ~]# kill -TERM <Nginx master process PID>
[root@localhost ~]# kill -INT <Nginx master process PID>
On Linux, TERM is conventionally the graceful exit signal and INT is the interrupt signal (SIGINT), but Nginx does not follow that convention: both signals trigger a fast stop. Nginx performs its graceful stop on the SIGQUIT(3) signal instead.
When the master process receives the SIGTERM or SIGINT signal, it forwards the signal to the worker processes, and each worker process directly calls the ngx_worker_process_exit function to exit. After all worker processes have exited, the master process calls the ngx_master_process_exit function to exit. In addition, if a worker process fails to exit normally, the master process waits 1 second and then sends the SIGKILL signal to forcibly terminate it.
- iii. Force all nginx processes to stop
[root@localhost ~]# nginx -s stop
[root@localhost ~]# pkill -9 nginx
The pkill -9 command sends the SIGKILL signal directly to all nginx processes.
The flow for using the AppEngine V2 preStop hook with the Nginx service is as follows:
- 1) Run the Nginx image provided by Docker Hub.
The default command for starting Nginx in the official Nginx Dockerfile is as follows:
Dockerfile
...
CMD ["nginx", "-g", "daemon off;"]
The CMD above specifies that nginx is launched directly in the foreground.
So that nginx writes signal-related messages to its log, we set the nginx log level to notice: in the nginx.conf file, change the level of the error_log directive from error to notice.
nginx.conf
user nginx;
worker_processes 1;
error_log /var/log/nginx/error.log notice;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
#tcp_nopush on;
keepalive_timeout 65;
#gzip on;
include /etc/nginx/conf.d/*.conf;
}
- 2) We start running the container locally
root@vm10-1-1-28:~/nginx# docker run -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro nginx
2018/01/26 04:53:51 [notice] 1#1: using the "epoll" event method
2018/01/26 04:53:51 [notice] 1#1: nginx/1.13.8
2018/01/26 04:53:51 [notice] 1#1: built by gcc 6.3.0 20170516 (Debian 6.3.0-18)
2018/01/26 04:53:51 [notice] 1#1: OS: Linux 4.4.0-62-generic
2018/01/26 04:53:51 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2018/01/26 04:53:51 [notice] 1#1: start worker processes
2018/01/26 04:53:51 [notice] 1#1: start worker process 5
- 3) Open a new terminal and send a QUIT signal to the container to stop Nginx gracefully
root@vm10-1-1-28:~# docker exec -ti bc0a0272448a nginx -s quit
root@vm10-1-1-28:~# # or
root@vm10-1-1-28:~# docker kill --signal=SIGQUIT bc0a0272448a
bc0a0272448a
root@vm10-1-1-28:~#
View the container log again
2018/01/26 04:55:04 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down
2018/01/26 04:55:04 [notice] 5#5: gracefully shutting down
2018/01/26 04:55:04 [notice] 5#5: exiting
2018/01/26 04:55:04 [notice] 5#5: exit
2018/01/26 04:55:04 [notice] 1#1: signal 17 (SIGCHLD) received from 5
2018/01/26 04:55:04 [notice] 1#1: worker process 5 exited with code 0
2018/01/26 04:55:04 [notice] 1#1: exit
root@vm10-1-1-28:~/nginx#
root@vm10-1-1-28:~/nginx#
We see that after nginx receives the SIGQUIT signal, it stops the service gracefully.
- 4) The following explains how to make this service use the preStop hook provided by AppEngine(K8s) to control the graceful stop of the Pod.
- i. Create a new application and fill in the Nginx application information. To give the Nginx application a graceful exit time of 2 minutes, set the graceful exit time in the interface to 120 seconds.
- ii. Select Exec as the preStop hook type and enter ["nginx","-s","quit"] as the command that stops the Nginx service gracefully.
- iii. Click Save application to finish creating the application.
Our Nginx application is now configured to stop gracefully. Whenever a Pod is stopped, AppEngine(K8s) first invokes the preStop hook to stop the service.
Q&A
Q1: I registered a handler for the SIGTERM signal in my program. How can I tell whether the handler has been executed?
A: Check the container's execution log directly.
Q2: I defined a preStop hook. How can I tell whether the hook has been executed, and what does the result look like?
A: Our platform does not provide a way to view the hook's execution result. If the preStop hook fails, a FailedPreStopHook event appears in Events. The event output looks like this:
Warning FailedPreStopHook 1s kubelet, 10.1.0.105 Exec lifecycle hook ([1nginx -s quit]) for Container "nginx" in Pod "nginx-7df4f86d64-z69cx_default(712f57eb-0274-11e8-a931-06ff9d8abab7)" failed - error: command '/1nginx -s quit' exited with 126: , message: "rpc error: code = 2 desc = \"oci runtime error: exec failed: exec: \\\"/usr/sbin/1nginx\\\": stat /usr/sbin/1nginx: no such file or directory\""
Normal Killing 1s kubelet, 10.1.0.105 Killing container with id docker://nginx:Need to kill Pod