Best Practices for Gracefully Stopping Applications

Whatever the type of application, it should receive a stop notification before the service is stopped and have a certain amount of time to finish in-flight work and release resources before exiting: close its connections and stop accepting external requests. This guide covers graceful stop configuration for all kinds of applications and walks through the complete flow for each type of service, from development to deployment on AppEngine(K8s).

The simplest way to stop a container gracefully

The simplest configuration to stop a container gracefully

Before a container exits, it is removed from the list of containers that provide the service, so that external requests are no longer routed to it, and hooks can be executed before it exits.

Here, we configure the sleep 10 command to run in the container before the container is stopped. During this time the container no longer receives new external requests and has 10 seconds to finish processing the requests already in flight. After 10 seconds, the container is forcibly deleted.

Common Linux signals

To understand how to stop an application gracefully, let's first review the common Linux signals that matter for containers.

Signals are a form of interprocess communication. A signal is a message sent by the kernel to the process, telling the process that something has happened. The process needs to register handlers for the signals it is interested in. For example:

  1. To exit gracefully (clearing up resources after receiving the exit signal), a well-behaved program handles the SIGTERM signal. Unlike SIGTERM, the SIGKILL signal ends a process forcibly and cannot be handled;
  2. Many daemons implement hot reloading of their configuration files by handling the SIGHUP signal.

Use the kill -l command to display the list of signals supported by Linux. Signals 1-31 are the standard signals inherited from traditional UNIX; they are unreliable (non-real-time) signals. The higher-numbered signals, up to 64, were added later and are called reliable (real-time) signals.

The following describes the commonly used signals:

  1. SIGHUP(1)

    When the user's terminal connection ends, the system sends this signal to all processes running in that session. It is also commonly used to trigger hot reloading of configuration files. The wget command handles the SIGHUP(1) signal so that wget can continue downloading even if you log out of Linux. Similarly, services such as Docker, Nginx, and LVS handle the SIGHUP(1) signal to implement hot reloading of their configuration files.

  2. SIGINT(2)

    The program interrupt signal. It is issued when the user types the INTR character (usually Ctrl+C) and tells the foreground process group to terminate.

  3. SIGQUIT(3)

    Similar to SIGINT, but triggered by the QUIT character (usually Ctrl+\). Nginx stops the service gracefully when it receives this signal.

  4. SIGKILL(9)

    Ends the program immediately. This signal cannot be blocked, handled, or ignored, so a program cannot catch it.

  5. SIGTERM(15)

    The terminate signal, also known as the exit request signal. Unlike SIGKILL, it can be blocked and handled, so we can register a handler for it in the program to stop the service gracefully. The kill command sends this signal by default.

  6. SIGCHLD(17)

    When a child process ends, this signal is sent to its parent process. Nginx is a multi-process program; its master process uses this signal to learn that a worker process has exited.

To learn more about Linux signals, see the signal manual page (man 7 signal).

Docker container service support for signals

Docker supports Linux signals in several of its commands.

  • 1) docker stop command signal support

When we use the docker stop command to stop a container, Docker by default gives the application in the container 10 seconds to terminate. We can change this timeout with the --time/-t parameter of docker stop.

→ docker stop --help
Usage:  docker stop [OPTIONS] CONTAINER [CONTAINER…]
Stop one or more running containers
Options:
      --help      Print usage
  -t, --time int  Seconds to wait for stop before killing it (default 10)

When the docker stop command is executed, Docker first sends the SIGTERM signal to the process with PID 1 (the main process) in the container and then waits for the application to terminate. If the wait reaches the configured timeout (10 seconds by default), Docker sends the SIGKILL signal to kill the process forcibly. The application in the container can choose to ignore the SIGTERM signal, but once the timeout is reached it is killed by the system regardless.

  • 2) docker kill command signal support

By default, the docker kill command gives the application in the container no chance to shut down gracefully. It directly sends the SIGKILL signal to force the program in the container to terminate.

Looking at the help of the docker kill command, we see that, in addition to sending the SIGKILL signal by default, it also allows us to send some customized system signals:

→ docker kill --help
Usage:  docker kill [OPTIONS] CONTAINER [CONTAINER…]
Kill one or more running containers
Options:
      --help            Print usage
  -s, --signal string  Signal to send to the container (default "KILL")

For example, if we want to send a SIGINT signal to a program in docker, we can do this:

docker kill --signal=SIGINT container_name

Unlike the docker stop command, docker kill does not have any time-out setting. It will send the SIGKILL signal directly, or other user-specified signals.

  • 3) docker rm command signal support

The docker rm command is used to delete a container that has stopped running. We can add a --force or -f parameter to force the running container to be deleted. After using this parameter, docker will send a SIGKILL signal to the running container, forcing the container to be stopped and then deleting it.

For example, to forcibly delete a running container named web:

docker rm -fv web

  • 4) docker daemon process signal support

The docker daemon process handles the SIGHUP signal: on receiving it, the daemon reloads the daemon.json configuration file.

We send a SIGHUP signal to the dockerd process:

root@vm10-1-1-28:~# kill -SIGHUP $(pidof dockerd)
root@vm10-1-1-28:~# or
root@vm10-1-1-28:~# systemctl reload docker

Looking at the docker daemon's log, you can see that the docker daemon received this signal and reloaded the daemon.json configuration file:

root@vm10-1-1-28:~# journalctl -u docker.service -f
-- Logs begin at Sun 2018-01-07 09:17:01 CST. --
Jan 18 16:20:11 vm10-1-1-28.ksc.com dockerd[26668]: time="2018-01-18T16:20:11.262904839+08:00" level=info msg="Got signal to reload configuration, reloading from: /etc/docker/daemon.json"
Jan 18 16:21:41 vm10-1-1-28.ksc.com systemd[1]: Reloading Docker Application Container Engine.

Therefore, after you modify the /etc/docker/daemon.json file, you can send Docker a SIGHUP signal to reload the configuration file without restarting the docker daemon.

AppEngine(K8s) service signal support

AppEngine(K8s) uses Kubernetes as its orchestration engine and currently supports graceful termination of running containers.

Whenever a Pod is deleted, whether because a user deletes an application through the interface or the command line or because the application scales up or down, the following process is triggered to terminate and remove the Pod:

  1. After the service receives the request to delete the Pod, it updates the Pod's graceful exit time according to the configured wait time;
  2. Because a graceful exit time has been set, the Pod's state shown in the interface and on the client command line changes to "Terminating";
  3. (Concurrent with Step 2) The kubelet background service starts shutting down the Pod:
    1. If a preStop hook is defined for the Pod, it is invoked inside the Pod. If the hook is still running when the graceful exit time expires, a one-time extension of 2 seconds is granted;
    2. The SIGTERM signal is sent to the process with PID 1 in each container of the Pod;
  4. (Concurrent with Step 2) The Service manager background service removes the Pod from the Service's list of endpoints, and the Pod is no longer considered part of the set of running Pods. A Pod that shuts down slowly can continue to serve external traffic until the load balancer removes it;
  5. When the graceful exit time is reached, any process still running in the Pod is sent the SIGKILL signal and forcibly killed;
  6. The kubelet completes the deletion of the Pod by setting the graceful exit time to 0 (immediate deletion). The Pod is removed from the API and is no longer visible to clients.

From the above flow, we can see that there are two ways to add an elegant exit to an application running in AppEngine V2:

  1. If the process with PID as 1 in the container is our application, we can receive and process the SIGTERM signal in the program to achieve an elegant exit;
  2. We can also set a preStop hook and specify in the hook how to stop the container gracefully. The preStop hook currently supports two methods: sending an HTTP request to the container and executing a command in the container.

Note that if we do not set the wait time mentioned above, the default is 30 seconds. If it is set to 0, the SIGKILL signal is sent immediately to kill all processes in the Pod. If you set it, choose a value appropriate to your service to avoid deadlocks or other problems in the program.

Graceful service stop cases

Whatever the service, stopping it gracefully means notifying the program before it is stopped, so that it has a certain amount of time to finish processing, save its execution state, and exit cleanly. Below we have prepared two common cases that cover the scenarios of essentially all services: receiving and handling the TERM signal in the program, and specifying a preStop hook.

Both cases are described step by step below; choose the one that suits your situation.

1. Receiving and processing signals within the program

From the section "Docker container service support for signals" above, we know that the docker kill command is suited to forcibly terminating the program and stopping the container quickly. If you want the program to shut down gracefully, docker stop is the better choice: after receiving the SIGTERM signal, the program has some time to finish processing and save its execution state before it exits.

Next, we write a simple Go program that receives and handles signals. After it starts, it blocks and waits for a system signal; when it receives one of the signals it listens for, it prints a message to the console and exits.

// main.go
package main

import (
    "fmt"
    "os"
    "os/signal"
    "syscall"
)

func main() {
    fmt.Println("Program started…")

    // Buffered channel: signal.Notify does not block when delivering a signal.
    ch := make(chan os.Signal, 1)
    // Listen for SIGTERM(15) and SIGINT(2).
    signal.Notify(ch, syscall.SIGTERM, syscall.SIGINT)

    // Block until one of the signals arrives.
    s := <-ch
    switch s {
    case syscall.SIGINT:
        fmt.Println("SIGINT received!")
        // Do something…
    case syscall.SIGTERM:
        fmt.Println("SIGTERM received!")
        // Do something…
    }
    fmt.Println("Exiting…")
}
    

Next, cross-compile the program so that it can run on Linux:

CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o graceful

After compiling, we test it:

  • 1) Test receiving the SIGINT signal. Start the process in the foreground and press Ctrl+C to send the SIGINT(2) signal.
lynzabo@ubuntu ~/r/g/s/edit> ./graceful
Program started…
^CSIGINT received!
Exiting…
lynzabo@ubuntu ~/r/g/s/edit>
  • 2) Test receives SIGTERM signal
lynzabo@ubuntu ~/r/g/s/edit> ./graceful &
Program started…
lynzabo@ubuntu ~/r/g/s/edit> ps -ef | grep graceful
lynzabo  21223  21082  0 15:57 pts/21  00:00:00 ./graceful
lynzabo  21287  21082  0 15:57 pts/21  00:00:00 grep --color=auto graceful
lynzabo@ubuntu ~/r/g/s/edit> kill 21223
SIGTERM received!
Exiting…
“./graceful &” has ended
lynzabo@ubuntu ~/r/g/s/edit>
  • 3) Pack the above program into the container and run it.

Dockerfile

FROM alpine:latest

LABEL maintainer "opl-xws@xiaomi.com"

ADD graceful /graceful

CMD ["/graceful"]

A common pitfall when handling the SIGTERM signal

The CMD and ENTRYPOINT instructions in a Dockerfile define the container's start command. We will not go into the differences between the two here and only cover what you must watch out for when using them.

Both of these commands support the following formats:

  • shell format: CMD <command>
  • exec format: CMD ["Executable", "Parameter 1", "Parameter 2"...]
  • Parameter list format: CMD ["Parameter 1","Parameter 2"...]. After specifying the ENTRYPOINT command, specify the specific parameters with CMD.

The exec format is generally recommended. It is parsed as a JSON array, so be sure to use double quotes " rather than single quotes '.

If you use the shell format, the actual command is executed as an argument to sh -c. For example:

CMD echo $HOME

In the actual implementation, it will be changed to:

CMD [ "sh", "-c", "echo $HOME" ]

Therefore, the main process (PID 1) of the container is sh, not our program. When a signal is sent to the container, it is delivered to the sh process, and sh does not forward it to our program. Our program therefore never receives the signal and gets no chance to clean up before the container stops.

Image build process:

lynzabo@ubuntu ~/r/g/s/edit> docker build -t cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0 .
Sending build context to Docker daemon 1.953 MB
Step 1/4 : FROM alpine:latest
---> 3fd9065eaf02
Step 2/4 : LABEL maintainer "opl-xws@xiaomi.com"
---> Using cache
---> 6cc05b3f0ed0
Step 3/4 : ADD graceful /graceful
---> Using cache
---> 4a47b371a124
Step 4/4 : CMD /graceful
---> Using cache
---> f1841c0035af
Successfully built f1841c0035af
lynzabo@ubuntu ~/r/g/s/edit>
  • 4) Start the container:
lynzabo@ubuntu ~/r/g/s/edit> docker run -d --name graceful cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0
08d871007b58e55e9552cff23960c80faf51bf8637014a745dec060b80ac9a6f
lynzabo@ubuntu ~/r/g/s/edit> docker ps
CONTAINER ID        IMAGE                                                    COMMAND            CREATED            STATUS              PORTS                    NAMES
08d871007b58        cnbj6-repo.cloud.mi.com/k8s/graceful-golang-case:1.0.0  "/graceful"        10 seconds ago      Up 9 seconds                                graceful
lynzabo@ubuntu ~/r/g/s/edit>
  • 5) Look at the container output and you can see that the program has started normally:
lynzabo@ubuntu ~/r/g/s/edit> docker logs graceful
Program started…
lynzabo@ubuntu ~/r/g/s/edit>
  • 6) Next, we use docker stop to check whether the program responds to the SIGTERM signal.

By default, docker stop sends the SIGTERM signal, waits up to 10 seconds, and then sends the SIGKILL signal to forcibly stop all processes in the container. If the program's cleanup is too complex to finish within 10 seconds, we can lengthen the timeout. Here we tell docker stop to kill the container forcibly only after 2 minutes:

lynzabo@ubuntu ~/r/g/s/edit> docker stop --time=120 graceful
graceful
lynzabo@ubuntu ~/r/g/s/edit> docker logs graceful
Program started…
SIGTERM received!
Exiting…
lynzabo@ubuntu ~/r/g/s/edit>

Looking at the above log, we can see that our program can indeed handle the SIGTERM signal sent by Docker.

  • 7) The following explains how to deploy the signal-handling program to AppEngine V2 and let AppEngine V2 control the graceful stop of the Pod.

    • i. Create a new application and fill in the application information. To give the application a graceful exit time of 2 minutes, set the graceful exit time in the interface to 120 seconds. The application is then created successfully.
    • ii. The application launches successfully.
    • iii. Open the application's startup log and keep watching it.
      Program started…

  • iv. When we click Delete Application in the interface, the log shows that our program received the SIGTERM signal sent by AppEngine V2.
    Program started…
    SIGTERM received!
    Exiting…
    

The procedure above shows how to stop a service gracefully by handling signals in our own program. Next, we look at how to stop a service gracefully through the preStop hook.

2. Stop the service using AppEngine's preStop hook

Sometimes we want to stop the service gracefully by executing a command or sending an HTTP request before the service stops.

Example:

  • For a Spring Boot application, Spring Boot Actuator provides a graceful stop method for the service: to stop the service, send an HTTP POST request to its shutdown endpoint.
  • For the Nginx service, to stop the service, run the command kill -QUIT <Nginx master process PID>.

Below we will use the Nginx service to explain how to use the preStop hook to stop the service:

First, review the basics of Nginx:

Nginx is a multi-process service consisting of a master process and a number of worker processes. The master process is only responsible for checking the configuration file syntax and creating the worker processes. The actual work of receiving client requests and carrying out the configuration directives is done by the worker processes. The master and worker processes interact mainly through Linux signals. Nginx provides many commands and signals for checking configuration syntax, stopping the service gracefully, restarting smoothly, upgrading, and so on. Here we only briefly describe how the signals are handled and what happens inside Nginx when the stop-related commands are run.

There are many ways to stop nginx. Normally, nginx is stopped by sending a system signal to its master process.

  • i. Stop nginx gracefully
[root@localhost ~]# nginx -s quit
[root@localhost ~]# kill -QUIT <Nginx master process PID>
[root@localhost ~]# kill -QUIT $(cat /usr/local/nginx/logs/nginx.pid)

When the master process receives the SIGQUIT signal, it forwards the signal to all worker processes. Each worker process then closes its listening ports so that it accepts no new connections, closes idle connections, waits for all active connections to finish normally, and then invokes ngx_worker_process_exit to exit. After all worker processes have exited, the master process invokes ngx_master_process_exit to exit.

  • ii. Stop nginx quickly
[root@localhost ~]# nginx -s stop
[root@localhost ~]# kill -TERM <Nginx master process PID>
[root@localhost ~]# kill -INT <Nginx master process PID>

On Linux, the TERM signal is conventionally the graceful exit signal and the INT signal is the interrupt signal, but Nginx treats both of them as a request for a fast stop. To stop gracefully, Nginx uses the SIGQUIT(3) signal instead.

When the master process receives the SIGTERM or SIGINT signal, it forwards the signal to the worker processes, and each worker process directly invokes the ngx_worker_process_exit function to exit. After all worker processes have exited, the master process invokes the ngx_master_process_exit function to exit. In addition, if a worker process fails to exit normally, the master process waits 1 second and then sends the SIGKILL signal to terminate it forcibly.

  • iii. Force all nginx processes to stop

[root@localhost ~]# nginx -s stop
[root@localhost ~]# pkill -9 nginx

Send the SIGKILL signal directly to all nginx processes.

The flow for using the AppEngine V2 preStop hook with the Nginx service is as follows:

  • 1) Run the Nginx image provided by Docker Hub.

The default command for starting Nginx in the official Nginx Dockerfile is as follows:

Dockerfile

...
CMD ["nginx", "-g", "daemon off;"]

The CMD above starts nginx directly in the foreground.

To make nginx log the signals it receives, we set the nginx log level to notice: in the nginx.conf file, change the level in the error_log directive from error to notice.

nginx.conf

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log notice;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}

http {
    include      /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush    on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;
}
  • 2) We start running the container locally
root@vm10-1-1-28:~/nginx# docker run -v $(pwd)/nginx.conf:/etc/nginx/nginx.conf:ro nginx
2018/01/26 04:53:51 [notice] 1#1: using the "epoll" event method
2018/01/26 04:53:51 [notice] 1#1: nginx/1.13.8
2018/01/26 04:53:51 [notice] 1#1: built by gcc 6.3.0 20170516 (Debian 6.3.0-18)
2018/01/26 04:53:51 [notice] 1#1: OS: Linux 4.4.0-62-generic
2018/01/26 04:53:51 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2018/01/26 04:53:51 [notice] 1#1: start worker processes
2018/01/26 04:53:51 [notice] 1#1: start worker process 5
  • 3) We start a new terminal and send a QUIT signal to the container, elegantly stopping Nginx
root@vm10-1-1-28:~# docker exec -ti bc0a0272448a nginx -s quit
root@vm10-1-1-28:~# or
root@vm10-1-1-28:~# docker kill --signal=SIGQUIT bc0a0272448a
bc0a0272448a
root@vm10-1-1-28:~#

View the container log again:

2018/01/26 04:55:04 [notice] 1#1: signal 3 (SIGQUIT) received, shutting down
2018/01/26 04:55:04 [notice] 5#5: gracefully shutting down
2018/01/26 04:55:04 [notice] 5#5: exiting
2018/01/26 04:55:04 [notice] 5#5: exit
2018/01/26 04:55:04 [notice] 1#1: signal 17 (SIGCHLD) received from 5
2018/01/26 04:55:04 [notice] 1#1: worker process 5 exited with code 0
2018/01/26 04:55:04 [notice] 1#1: exit
root@vm10-1-1-28:~/nginx#
root@vm10-1-1-28:~/nginx#

We see that after nginx receives the SIGQUIT signal, it then elegantly stops the service operation.

  • 4) The following explains how to make this service use the preStop hook provided by AppEngine(K8s) to control the graceful stop of the Pod.
    • i. Create a new application and fill in the Nginx application information. To give the Nginx application a graceful exit time of 2 minutes, set the graceful exit time in the interface to 120 seconds.
    • ii. Select Exec as the preStop hook type and enter the command that gracefully stops the Nginx service: ["nginx","-s","quit"].
    • iii. Click Save application to finish creating the application.

Our Nginx application is now configured to stop gracefully. Whenever a Pod is stopped, AppEngine(K8s) first invokes the preStop hook to stop the service.

Q&A

Q1: I registered a handler for the SIGTERM signal in my program. How can I tell whether the handler was executed?

A: If the handler writes a log line, as the example program does, check the container's execution log directly.

Q2: I configured a preStop hook. How can I tell whether the hook was executed, and what does the result look like?

A: Our platform does not provide a way to view the execution result. If the preStop hook fails, a FailedPreStopHook event appears in Events. The event output looks like this:

Warning  FailedPreStopHook      1s    kubelet, 10.1.0.105  Exec lifecycle hook ([1nginx -s quit]) for Container "nginx" in Pod "nginx-7df4f86d64-z69cx_default(712f57eb-0274-11e8-a931-06ff9d8abab7)" failed - error: command '/1nginx -s quit' exited with 126: , message: "rpc error: code = 2 desc = \"oci runtime error: exec failed: exec: \\\"/usr/sbin/1nginx\\\": stat /usr/sbin/1nginx: no such file or directory\""
 Normal  Killing                1s    kubelet, 10.1.0.105  Killing container with id docker://nginx:Need to kill Pod
