[Troubleshooting] When "tls: failed to verify certificate" occurs in microk8s
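I was using microk8s for the first time in a while and could not get into a pod with kubectl exec.
Starting DNS and refreshing the certificates did not fix it.
I suspect many people run into this at least once, so I am leaving this article in the hope that it helps someone.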
First, the running state and the error message
Running state
yu@itbs:/var/snap/microk8s/current/certs$ kk get po
NAME READY STATUS RESTARTS AGE
api-deployment-866979c88f-g75jd 1/1 Running 1 32m
emotion-analyzer-deployment-5bcb9d8796-g87wm 1/1 Running 1 32m
importance-analyzer-deployment-5b85694c6c-qm84w 1/1 Running 1 32m
postgres-deployment-7895468b75-6m668 1/1 Running 1 32m
yu@itbs:/var/snap/microk8s/current/certs$
Error message
yu@itbs:/var/snap/microk8s/current/certs$ kk exec -it postgres-deployment-7895468b75-6m668 -- bash
error: Internal error occurred: error sending request: Post "https://172.21.0.184:10250/exec/default/postgres-deployment-7895468b75-6m668/postgres?command=bash&input=1&output=1&tty=1": tls: failed to verify certificate: x509: certificate is valid for 172.20.110.99, 172.17.0.1, not 172.21.0.184
yu@itbs:/var/snap/microk8s/current/certs$
Next, the microk8s details. Keep in mind that the master node has been assigned 172.21.0.184 as its INTERNAL-IP.
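That address is the same one the error message rejects. As a reading aid, the sketch below splits such a message into the addresses the certificate covers and the address that was actually requested. The function names cert_valid_ips and cert_rejected_ip are hypothetical helpers written for illustration, not part of kubectl or microk8s, and the parsing assumes the exact wording of the error shown above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of kubectl or microk8s).
# They assume the message contains "certificate is valid for A, B, not C".

# Print each address the certificate is valid for, one per line.
cert_valid_ips() {
  printf '%s\n' "$1" \
    | sed -n 's/.*certificate is valid for \(.*\), not .*/\1/p' \
    | tr -d ' ' | tr ',' '\n'
}

# Print the address that was requested but is not covered by the certificate.
cert_rejected_ip() {
  printf '%s\n' "$1" \
    | sed -n 's/.*certificate is valid for .*, not \([^ ]*\).*/\1/p'
}

msg='tls: failed to verify certificate: x509: certificate is valid for 172.20.110.99, 172.17.0.1, not 172.21.0.184'
echo "covered:"
cert_valid_ips "$msg"
echo "requested: $(cert_rejected_ip "$msg")"
```

If the requested address equals the INTERNAL-IP reported by kubectl get node -o wide, the certificate was issued for addresses the node no longer uses.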
Status
yu@itbs:/var/snap/microk8s/current/certs$ k status
microk8s is running
high-availability: no
datastore master nodes: 127.0.0.1:19001
datastore standby nodes: none
addons:
enabled:
dns # (core) CoreDNS
ha-cluster # (core) Configure high availability on the current node
helm # (core) Helm - the package manager for Kubernetes
helm3 # (core) Helm 3 - the package manager for Kubernetes
hostpath-storage # (core) Storage class; allocates storage from host directory
metrics-server # (core) K8s Metrics Server for API access to service metrics
storage # (core) Alias to hostpath-storage add-on, deprecated
disabled:
cert-manager # (core) Cloud native certificate management
cis-hardening # (core) Apply CIS K8s hardening
community # (core) The community addons repository
dashboard # (core) The Kubernetes dashboard
gpu # (core) Alias to nvidia add-on
host-access # (core) Allow Pods connecting to Host services smoothly
ingress # (core) Ingress controller for external access
kube-ovn # (core) An advanced network fabric for Kubernetes
mayastor # (core) OpenEBS MayaStor
metallb # (core) Loadbalancer for your Kubernetes cluster
minio # (core) MinIO object storage
nvidia # (core) NVIDIA hardware (GPU and network) support
observability # (core) A lightweight observability stack for logs, traces and metrics
prometheus # (core) Prometheus operator for monitoring and logging
rbac # (core) Role-Based Access Control for authorisation
registry # (core) Private image registry exposed on localhost:32000
rook-ceph # (core) Distributed Ceph storage using Rook
yu@itbs:/var/snap/microk8s/current/certs$
yu@itbs:/var/snap/microk8s/current/certs$ kk cluster-info
Kubernetes control plane is running at https://127.0.0.1:16443
CoreDNS is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
yu@itbs:/var/snap/microk8s/current/certs$
yu@itbs:/var/snap/microk8s/current/certs$ kk get node -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
itbs Ready <none> 11d v1.32.8 172.21.0.184 <none> Ubuntu 24.04.3 LTS 6.8.0-79-generic containerd://1.6.36
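yu@itbs:/var/snap/microk8s/current/certs$
Finally, the cause. In the target file, IP.1 and IP.2 under "alt_names" listed only the loopback address 127.0.0.1 and the in-cluster service address 10.152.183.1. The loopback address is the local host itself. 10.152.183.1 is, in a default MicroK8s setup, the ClusterIP of the kubernetes Service, through which Pods reach the API server (CoreDNS, which resolves service names such as postgres-service and emotion-analyzer-service so that containers can talk to each other, is a separate Service with its own address).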
The problem is that the master node's INTERNAL-IP, 172.21.0.184, does not appear in this list.
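The check and the fix can be sketched as a small shell function, shown here only as an illustration. add_san_ip is a hypothetical helper, not a microk8s command. It assumes the template contains a line consisting solely of #MOREIPS, as in the file shown in this article, and that entries use the form "IP.N = address". It keeps a copy of the original as <file>.bk before writing.

```shell
#!/bin/sh
# Hypothetical helper (not part of microk8s): add an address to the
# alt_names entries of an OpenSSL config, just above the "#MOREIPS" line.
# Usage: add_san_ip <template-file> <ip>
add_san_ip() {
  template=$1
  ip=$2

  # Exact match on the value, so 172.21.0.18 does not match 172.21.0.184.
  if awk -v ip="$ip" '$1 ~ /^IP\.[0-9]+$/ && $3 == ip { found = 1 }
                      END { exit !found }' "$template"; then
    echo "$ip is already listed in $template"
    return 0
  fi

  # Assumption: the file has a line that is exactly "#MOREIPS".
  if ! grep -qxF '#MOREIPS' "$template"; then
    echo "no #MOREIPS line found in $template" >&2
    return 1
  fi

  # Next index = highest existing IP.N + 1.
  next=$(awk '$1 ~ /^IP\.[0-9]+$/ { n = substr($1, 4) + 0; if (n > max) max = n }
              END { print max + 1 }' "$template")

  # Keep a backup, then write the new entry above the marker line.
  cp "$template" "$template.bk" || return 1
  awk -v line="IP.$next = $ip" '$0 == "#MOREIPS" { print line } { print }' \
    "$template.bk" > "$template"
  echo "added IP.$next = $ip"
}

# Example (needs write access to the file, typically via root):
#   add_san_ip /var/snap/microk8s/current/certs/csr.conf.template 172.21.0.184
```

Running the function a second time with the same address leaves the file unchanged, which makes it safe to repeat after checking the node's INTERNAL-IP again.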
Target file
yu@itbs:/var/snap/microk8s/current/certs$ cat csr.conf.template
[ req ]
default_bits = 2048
prompt = no
default_md = sha256
req_extensions = req_ext
distinguished_name = dn
[ dn ]
C = GB
ST = Canonical
L = Canonical
O = Canonical
OU = Canonical
CN = 127.0.0.1
[ req_ext ]
subjectAltName = @alt_names
[ alt_names ]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster
DNS.5 = kubernetes.default.svc.cluster.local
IP.1 = 127.0.0.1
IP.2 = 10.152.183.1
#MOREIPS
[ v3_ext ]
authorityKeyIdentifier=keyid,issuer:always
basicConstraints=CA:FALSE
keyUsage=keyEncipherment,dataEncipherment,digitalSignature
extendedKeyUsage=serverAuth,clientAuth
subjectAltName=@alt_names
yu@itbs:/var/snap/microk8s/current/certs$
Fix
yu@itbs:/var/snap/microk8s/current/certs$ diff csr.conf.template csr.conf.template.bk
27d26
< IP.3 = 172.21.0.184
yu@itbs:/var/snap/microk8s/current/certs$
Apply the certificates (note: change into the certs directory before running the command)
yu@itbs:/var/snap/microk8s/current/certs$ sudo microk8s refresh-certs -e ca.crt
Taking a backup of the current certificates under /var/snap/microk8s/8355/certs-backup/
Creating new certificates
Signature ok
subject=C = GB, ST = Canonical, L = Canonical, O = Canonical, OU = Canonical, CN = 127.0.0.1
Getting CA Private Key
Signature ok
subject=CN = front-proxy-client
Getting CA Private Key
1
Creating new kubeconfig file
Restarting service kubelite.
Restarting service cluster-agent.
The CA certificates have been replaced. Kubernetes will restart the pods of your workloads.
Any worker nodes you may have in your cluster need to be removed and re-joined to become aware of the new CA.
yu@itbs:/var/snap/microk8s/current/certs$
Update the target file (~/.kube/config)
yu@itbs:/var/snap/microk8s/current/certs$ microk8s.config > ~/.kube/config
yu@itbs:/var/snap/microk8s/current/certs$
Set permissions
yu@itbs:/var/snap/microk8s/current/certs$ chmod 600 ~/.kube/config
yu@itbs:/var/snap/microk8s/current/certs$
Verification
yu@itbs:/var/snap/microk8s/current/certs$ kk get po
NAME READY STATUS RESTARTS AGE
api-deployment-866979c88f-4d7kk 1/1 Running 0 5m43s
emotion-analyzer-deployment-5bcb9d8796-6fzcl 1/1 Running 0 5m43s
importance-analyzer-deployment-5b85694c6c-tj5t5 1/1 Running 0 5m43s
postgres-deployment-7895468b75-jh89k 1/1 Running 0 5m44s
yu@itbs:/var/snap/microk8s/current/certs$ kk exec -it postgres-deployment-7895468b75-jh89k -- bash
postgres-deployment-7895468b75-jh89k:/#
Summary
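The TLS error occurred because the host IP address 172.21.0.184 was not among the IP addresses in csr.conf.template. For kubectl exec, the API server connects to the kubelet on the node at 172.21.0.184:10250, as the URL in the error shows. The certificate presented there did not list that address, so verification failed and the connection was rejected.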
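One way to confirm the result after refreshing the certificates is to look at the Subject Alternative Name list of the certificate actually served on port 10250. The sketch below is a hypothetical helper, san_has_ip, which reads the text output of openssl x509 -noout -text on standard input. It assumes the usual OpenSSL layout, where the SAN entries appear on the line after "X509v3 Subject Alternative Name" in the form "IP Address:a.b.c.d".

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the certificate text on stdin lists
# exactly the given IP in its Subject Alternative Name extension.
# Usage: openssl x509 -in <cert> -noout -text | san_has_ip <ip>
san_has_ip() {
  # Take the line following the SAN header, split it on commas,
  # trim surrounding spaces, then require an exact whole-line match.
  awk '/Subject Alternative Name/ { getline; print; exit }' \
    | tr ',' '\n' \
    | sed 's/^ *//; s/ *$//' \
    | grep -Fxq "IP Address:$1"
}

# Example against the live endpoint from the error message (not run here):
#   openssl s_client -connect 172.21.0.184:10250 </dev/null 2>/dev/null \
#     | openssl x509 -noout -text \
#     | san_has_ip 172.21.0.184 && echo "172.21.0.184 is covered"
```

Checking the live endpoint rather than a file on disk avoids guessing which certificate file is being served.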
General-purpose AI assistants did not give me an answer to this, so I hope this article helps someone.