Modifying a pod's DNS configuration file /etc/resolv.conf

Kubernetes provides a native way to change a pod's /etc/resolv.conf, namely spec.dnsConfig and spec.dnsPolicy (see [Customizing DNS Service] for details), but this approach causes the pod to be recreated.
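
For reference, a minimal sketch of the native approach, written as a small shell helper that prints a merge patch setting `dnsPolicy: None` together with an explicit `dnsConfig`. The function name `dns_config_patch`, the Deployment name `my-app`, and the `cluster.local` search domains are assumptions for illustration. Applying such a patch to a workload's pod template triggers a rollout, which is the pod recreation mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a JSON merge patch for a workload's pod template.
# Assumes the cluster domain is cluster.local and ndots should stay at 5.
# $1: nameserver IP, $2: namespace of the workload.
dns_config_patch() {
    local ip=$1 namespace=$2
    printf '{"spec":{"template":{"spec":{"dnsPolicy":"None","dnsConfig":{'
    printf '"nameservers":["%s"],' "$ip"
    printf '"searches":["%s.svc.cluster.local","svc.cluster.local","cluster.local"],' "$namespace"
    printf '"options":[{"name":"ndots","value":"5"}]}}}}}\n'
}

# Usage (my-app is a placeholder Deployment name):
#   kubectl -n ops patch deployment my-app --type merge \
#     -p "$(dns_config_patch 169.254.20.10 ops)"
```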

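Our scenario: pods should resolve names through a node-local DNS (localdns) instead of going to the centralized CoreDNS. The kubelet cluster DNS setting has already been changed to the localdns address, but pods created before that change still use CoreDNS, so their nameserver has to be switched to localdns. We are not allowed to delete those pods or restart their containers. (Admittedly this is not a good way to use containers, since it treats them as pets; it is dictated by company culture, because the business applications do not implement graceful shutdown.) In short, pods that resolve names directly against CoreDNS must be switched to the node local DNS without being restarted.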

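Method 1: edit /etc/resolv.conf directly inside a container. Because the container runtime is docker, /etc/resolv.conf in every container of a pod is mounted from the same file, so modifying /etc/resolv.conf in a single container is enough to change the nameserver for the whole pod.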

All containers in a pod use the same resolv.conf file

# docker ps
a15fc9af0da0   75c6ae8cb401                                                         "/app/push-netstat -…"   5 seconds ago   Up 4 seconds             k8s_main-container_push-netstat-2pwbn_ops_b90f9867-6c13-4242-9904-26354f2b65df_0
9007620aa2cb   k8s-google-containers/pause-amd64:3.2      "/pause"                 6 seconds ago   Up 5 seconds             k8s_POD_push-netstat-2pwbn_ops_b90f9867-6c13-4242-9904-26354f2b65df_0
# docker inspect 9007620aa2cb 
[
    {
        "Id": "9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0",
        "Created": "2022-12-01T04:51:14.862746022Z",
        "Path": "/pause",
        "Image": "sha256:80d28bedfe5dec59da9ebf8e6260224ac9008ab5c11dbbe16ee3ba3e4439ac2c",
        "ResolvConfPath": "/data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf",

# docker inspect a15fc9af0da0
[
    {
        "Id": "a15fc9af0da02a93a9c84f3528cc3970a44157c3b63d1d2efa17e8694cded9b5",
        "Created": "2022-12-01T04:51:15.425130718Z",
        "ResolvConfPath": "/data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf",

#cat /proc/1/mountinfo 

952 703 253:1 /usr/share/zoneinfo/Asia/Shanghai /usr/share/zoneinfo/UCT ro,noatime - ext4 /dev/vda1 rw,stripe=32411,data=ordered
953 705 253:17 /kubernetes/kubelet/pods/b90f9867-6c13-4242-9904-26354f2b65df/containers/main-container/7adc41b0 /dev/termination-log rw,relatime - ext4 /dev/vdb1 rw,data=ordered
954 703 253:17 /kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf /etc/resolv.conf rw,relatime - ext4 /dev/vdb1 rw,data=ordered
955 703 253:17 /kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/hostname /etc/hostname rw,relatime - ext4 /dev/vdb1 rw,data=ordered
956 703 253:17 /kubernetes/kubelet/pods/b90f9867-6c13-4242-9904-26354f2b65df/etc-hosts /etc/hosts rw,relatime - ext4 /dev/vdb1 rw,data=ordered
957 705 0:122 / /dev/shm rw,nosuid,nodev,noexec,relatime - tmpfs shm rw,size=65536k
958 703 0:120 / /run/secrets/kubernetes.io/serviceaccount ro,relatime - tmpfs tmpfs rw,size=1048576k
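
The same relationship can be checked from inside a container without docker access. A minimal sketch, assuming the mountinfo layout shown above (field 4 is the source path within its filesystem, field 5 is the mount point); `resolv_conf_source` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print the source path of the bind mount behind /etc/resolv.conf.
# $1 is a mountinfo file; it defaults to the current process's own view.
resolv_conf_source() {
    awk '$5 == "/etc/resolv.conf" { print $4 }' "${1:-/proc/self/mountinfo}"
}

# Usage inside a container:
#   resolv_conf_source
```

The printed path is relative to the root of the filesystem that holds the file, which is presumably why the output above shows /kubernetes/docker/... while docker inspect reports /data/kubernetes/docker/... (the /dev/vdb1 filesystem appears to be mounted at /data on this host). Containers of the same pod should print the same path.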

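Drawbacks: the container must contain a program that can modify the file in place, such as echo (vim and sed cannot be used to edit it directly, because they work by creating a new file and then replacing the original with it), and the container must run as root (otherwise the write fails with Permission denied).

Advantages: intuitive and simple to carry out.

Error when modifying the file with sed in a container started as root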

# sed -i '$a \sdasd' /etc/resolv.conf 
sed: cannot rename /etc/sedz6ITFW: Device or resource busy

Error when modifying the file in a container running as a non-root user

$ echo a >>  /etc/resolv.conf 
bash: /etc/resolv.conf: Permission denied
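
The difference between the two edit styles comes down to inodes: a bind mount keeps pointing at the original inode, so only writes into the existing file are visible through it, and renaming a new file over a mount point fails outright (the `Device or resource busy` error above). A minimal sketch on a scratch file, not on a real resolv.conf; the helper names are made up for illustration, and `append_via_rename` only imitates what `sed -i` does.

```shell
#!/usr/bin/env bash
# Sketch: show which edit style keeps the inode of a file.

inode_of() {
    ls -i -- "$1" | awk '{ print $1 }'
}

# Append by writing into the existing file, as `echo ... >> file` does.
append_in_place() {
    printf '%s\n' "$2" >> "$1"
}

# Append by writing a temporary copy and renaming it over the original,
# imitating the write-then-rename behaviour of `sed -i`.
append_via_rename() {
    local tmp
    tmp=$(mktemp "$1.XXXXXX")
    { cat -- "$1"; printf '%s\n' "$2"; } > "$tmp"
    mv -f -- "$tmp" "$1"
}

scratch=$(mktemp)
printf 'nameserver 10.96.0.10\n' > "$scratch"   # placeholder address

inode_start=$(inode_of "$scratch")
append_in_place "$scratch" "options ndots:5"
inode_after_append=$(inode_of "$scratch")
append_via_rename "$scratch" "options timeout:1"
inode_after_rename=$(inode_of "$scratch")

echo "start=$inode_start append=$inode_after_append rename=$inode_after_rename"
rm -f -- "$scratch"
```

The first two numbers should match and the third should differ, which is why an in-place append is the only one of the two that a running container can observe through its bind mount.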

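Method 2: modify the source file on the host that is mounted as /etc/resolv.conf in the docker container, that is, the file at ResolvConfPath. Again, vim and sed cannot be used to edit it directly; the echo command works.

The idea for this method comes from the way dockershim rewrites /etc/resolv.conf in the sandbox: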

	// Rewrite resolv.conf file generated by docker.
	// NOTE: cluster dns settings aren't passed anymore to docker api in all cases,
	// not only for pods with host network: the resolver conf will be overwritten
	// after sandbox creation to override docker's behaviour. This resolv.conf
	// file is shared by all containers of the same pod, and needs to be modified
	// only once per pod.
	if dnsConfig := config.GetDnsConfig(); dnsConfig != nil {
		containerInfo, err := ds.client.InspectContainer(createResp.ID)
		if err != nil {
			return nil, fmt.Errorf("failed to inspect sandbox container for pod %q: %v", config.Metadata.Name, err)
		}

		if err := rewriteResolvFile(containerInfo.ResolvConfPath, dnsConfig.Servers, dnsConfig.Searches, dnsConfig.Options); err != nil {
			return nil, fmt.Errorf("rewrite resolv.conf failed for pod %q: %v", config.Metadata.Name, err)
		}
	}

https://github.com/kubernetes/kubernetes/blob/e17755904ae82971e954820187961636c350c6ff/pkg/kubelet/dockershim/docker_sandbox.go#L147-L162

Drawbacks: requires root on the host, and only works when docker is the container runtime.

Advantages: simple.
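
Before changing anything, it may help to list which sandboxes on a node still point at the old server. A rough sketch, assuming the docker data root seen above (`/data/kubernetes/docker`) and a placeholder old CoreDNS service IP of `10.96.0.10`; `stale_resolv_confs` is an illustrative name, and the match assumes the line is written exactly as `nameserver <ip>`.

```shell
#!/usr/bin/env bash
# List resolv.conf files under a docker containers directory that still
# contain the given nameserver address.
# $1: containers directory, $2: nameserver IP to look for.
stale_resolv_confs() {
    local dir=$1 ip=$2 f
    for f in "$dir"/*/resolv.conf; do
        [ -f "$f" ] || continue
        # -x: whole-line match, -F: treat the dots in the IP literally.
        if grep -qxF "nameserver $ip" "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

# Usage on the node (10.96.0.10 is a placeholder for the old CoreDNS IP):
#   stale_resolv_confs /data/kubernetes/docker/containers 10.96.0.10
```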

First, an edit made with vim does not take effect inside the container:

# change options ndots:5 to options ndots:4
# vim /data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf
nameserver 169.254.20.10
search xxx-ops.svc.cluster.local svc.cluster.local cluster.local xxx.com
options ndots:4

# the change is not visible inside the container
# kubectl exec -it -n ops push-netstat-2pwbn cat /etc/resolv.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
nameserver 169.254.20.10
search xxx-ops.svc.cluster.local svc.cluster.local cluster.local xxx.com
options ndots:5

# cat /data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf
nameserver 169.254.20.10
search xxx-ops.svc.cluster.local svc.cluster.local cluster.local xxx.com
options ndots:4

# a new container started in the same network namespace does see the modified /etc/resolv.conf
#docker run -it --name test-resolv-conf --network container:9007620aa2cb  ubuntu bash
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
e96e057aae67: Pull complete 
Digest: sha256:4b1d0c4a2d2aaf63b37111f34eb9fa89fa1bf53dd6e4ca954d47caebca4005c2
Status: Downloaded newer image for ubuntu:latest

root@push-netstat-5pxlr:/# cat /etc/resolv.conf 
nameserver 169.254.20.10
search xxx-ops.svc.cluster.local svc.cluster.local cluster.local xxx.com
options ndots:4

However, when the mounted source file on the host is modified with the echo command, the container does see the change:

# echo 11 >> /data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf
# ll -i /data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf
3276908 -rw-r--r-- 1 root root 132 Dec  1 13:01 /data/kubernetes/docker/containers/9007620aa2cb23f2d91a3fb194771c39be45d75e73ef36a3f78b99a4574a1cb0/resolv.conf

# kubectl exec -it -n ops push-netstat-2pwbn /bin/cat /etc/resolv.conf 
nameserver 169.254.20.10
search xxx-ops.svc.cluster.local svc.cluster.local cluster.local xxx.com
options ndots:5
11
# kubectl exec -it -n ops push-netstat-2pwbn /bin/ls /etc/resolv.conf -i
3276908 /etc/resolv.conf
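
Putting this together, the nameserver can be replaced with a single in-place write. A minimal sketch, assuming it runs as root on the host against the file at ResolvConfPath; `set_nameserver` is a hypothetical helper, and it drops all existing `nameserver` lines in favour of one new entry.

```shell
#!/usr/bin/env bash
# set_nameserver FILE IP
# Replace every nameserver line in FILE with a single "nameserver IP" line.
# The new content is built in memory and written back with `>`, which
# truncates and rewrites the existing file, so its inode is unchanged.
set_nameserver() {
    local file=$1 ip=$2 rest
    # grep exits non-zero when nothing is left; that is not an error here.
    rest=$(grep -v '^nameserver[[:space:]]' -- "$file" || true)
    {
        printf 'nameserver %s\n' "$ip"
        if [ -n "$rest" ]; then
            printf '%s\n' "$rest"
        fi
    } > "$file"
}

# Usage on the node, with the sandbox container ID from the example above:
#   set_nameserver "$(docker inspect -f '{{.ResolvConfPath}}' 9007620aa2cb)" 169.254.20.10
```

The truncate-then-write sequence is not atomic, so a resolver reading the file at that instant could see partial content. The file is small and the window is short, but it is worth keeping in mind on busy nodes.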

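The content of /etc/resolv.conf inside the container has been updated, but will the application re-read /etc/resolv.conf?

Different applications behave differently:

Java (JVM): by default it reads /etc/resolv.conf only at startup and does not re-read it while running.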

As a program (any process, JVM included) has its very first DNS request it reads and caches forever the entire contents of /etc/resolv.conf by default. It never refreshes that info later, even when it encounters a total DNS failure. The program would need to have some specific system calls programmed to behave in more user-friendly manner. This SO question explains the details.

https://serverfault.com/a/901790/405334

networkaddress.cache.ttl

Specified in java.security to indicate the caching policy for successful name lookups from the name service. The value is specified as integer to indicate the number of seconds to cache the successful lookup.

A value of -1 indicates “cache forever”. The default behavior is to cache forever when a security manager is installed, and to cache for an implementation specific period of time, when a security manager is not installed.

https://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html

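Go: in a test using the net/http package of go1.17, Go re-read /etc/resolv.conf.

Every language has a default DNS resolution implementation, but any program can also implement its own name resolution. So whether a given program re-reads /etc/resolv.conf still has to be verified (for example, by capturing packets).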

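There are three ways to modify a pod's /etc/resolv.conf: the Kubernetes-native spec.dnsConfig and spec.dnsPolicy, editing /etc/resolv.conf directly inside a container, and modifying resolv.conf on the host by relying on how the docker runtime works.

The first two methods are generic; the last one only applies when docker is the container runtime. After /etc/resolv.conf is modified in a running container, the application will not necessarily re-read its content.

This approach makes it possible to gracefully replace the DNS server of the nodes in a Kubernetes cluster without interrupting DNS resolution in pods. For details, see the article on gracefully replacing a node's DNS server IP in a Kubernetes cluster without affecting applications.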
