JUST WRITE

Periodically Deleting Pods - Setting Up a CronJob in Kubernetes


천재보단범재 2023. 8. 6. 18:57

Setting Up a CronJob in Kubernetes

Periodically Deleting Pods

I run CVAT on a Kubernetes cluster, installed with Helm.

 

Related post: Installing CVAT on Kubernetes - Installing CVAT with Helm (developnote-blog.tistory.com)

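While using CVAT, I noticed that a large number of Pods had accumulated in the namespace where it was installed.

Many of them were left over with a STATUS of Completed.

Completed means that the process in the Pod's container exited successfully, which corresponds to the container's Terminated state.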

$ kubectl get pods -n cvat

NAME                                                         READY   STATUS      RESTARTS       AGE
pod/complete-pod-clean-nzk74                                 0/1     Completed   0              24h
pod/cvat-backend-server-9684f5576-47tfw                      1/1     Running     2 (16h ago)    22h
pod/cvat-backend-server-9684f5576-r89h8                      0/1     Completed   0              22h
pod/cvat-backend-server-9684f5576-xgf9s                      0/1     Completed   0              23h
pod/cvat-backend-server-9684f5576-z7rkh                      0/1     Completed   8 (44h ago)    12d
pod/cvat-backend-utils-95cf97f47-5965r                       0/1     Completed   0              22h
pod/cvat-backend-utils-95cf97f47-6v5xx                       0/1     Completed   1 (44h ago)    2d1h
pod/cvat-backend-utils-95cf97f47-tqtx8                       1/1     Running     2 (16h ago)    22h
pod/cvat-backend-utils-95cf97f47-wj2j9                       0/1     Completed   0              23h
pod/cvat-backend-worker-annotation-7fd565bd6f-7tvkj          1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-annotation-7fd565bd6f-f5nmv          0/1     Completed   0              22h
pod/cvat-backend-worker-annotation-7fd565bd6f-klwpx          0/1     Completed   0              23h
pod/cvat-backend-worker-export-5447bd5f56-6qgpv              1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-export-5447bd5f56-7446p              0/1     Completed   1 (44h ago)    2d1h
pod/cvat-backend-worker-export-5447bd5f56-77gp5              0/1     Completed   1 (44h ago)    2d1h
pod/cvat-backend-worker-export-5447bd5f56-ktz9c              1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-export-5447bd5f56-p9xph              0/1     Completed   0              23h
pod/cvat-backend-worker-export-5447bd5f56-ptl4h              0/1     Completed   0              23h
pod/cvat-backend-worker-export-5447bd5f56-v9wtf              0/1     Completed   0              22h
pod/cvat-backend-worker-import-589856456f-5vt9k              0/1     Completed   0              23h
pod/cvat-backend-worker-import-589856456f-bbcg7              0/1     Completed   0              22h
pod/cvat-backend-worker-import-589856456f-fxt6t              0/1     Completed   1              2d1h
pod/cvat-backend-worker-import-589856456f-hs98w              1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-import-589856456f-jjmqs              0/1     Completed   0              22h
pod/cvat-backend-worker-import-589856456f-v2h5r              1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-import-589856456f-wdgll              0/1     Completed   1 (44h ago)    2d1h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-7qzbx      0/1     Completed   0              22h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-9f7wm      0/1     Completed   0              24h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-h8rlt      0/1     Completed   0              22h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-lxvj2      0/1     Completed   1 (44h ago)    3d23h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-n8cx4      0/1     Completed   0              23h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-sb69q      0/1     Completed   0              22h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-v8mcn      1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-qualityreports-5f76fdcbc7-x2frk      0/1     Completed   0              22h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-2jhdx            0/1     Completed   0              23h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-8mv8q            0/1     Completed   1 (44h ago)    3d23h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-c96lw            0/1     Completed   0              22h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-cg8rj            1/1     Running     2 (16h ago)    22h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-d2ppg            0/1     Completed   0              24h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-mfv9s            0/1     Completed   0              22h
pod/cvat-backend-worker-webhooks-6bfbcfdf94-rjcz6            0/1     Completed   0              22h
pod/cvat-frontend-6c6bb88fcc-68n4d                           1/1     Running     16 (16h ago)   19d
pod/cvat-nuclio-controller-5cf85b6f65-rbrzw                  1/1     Running     16 (16h ago)   19d
pod/cvat-nuclio-dashboard-7cdf97d98c-jbqv6                   1/1     Running     11 (16h ago)   12d
pod/cvat-opa-67bd5b59d6-qsq4k                                1/1     Running     16 (16h ago)   19d
pod/cvat-postgresql-0                                        1/1     Running     0              6m59s
pod/cvat-redis-master-0                                      1/1     Running     16 (16h ago)   19d
pod/cvat-redis-replicas-0                                    1/1     Running     19 (16h ago)   19d
pod/cvat-redis-replicas-1                                    1/1     Running     19 (16h ago)   19d
pod/cvat-redis-replicas-2                                    1/1     Running     18 (16h ago)   19d
pod/delete-pod-clean-sbr88                                   0/1     Completed   0              24h
pod/nuclio-openvino-omz-public-yolo-v3-tf-5d45676d68-qbj85   1/1     Running     3 (16h ago)    22h
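
To narrow a listing like the one above down to finished Pods, a field selector on `status.phase` can be used. The sketch below wraps that in a small helper. The function name `list_pods_by_phase` is hypothetical, and it assumes `kubectl` is already configured for the cluster. The STATUS column is derived by kubectl, so the filter is applied to the Pod phase (for example `Succeeded`) rather than to the string `Completed`.

```shell
# Hypothetical helper: list Pods in a namespace that are in a given phase.
# Valid phases are Pending, Running, Succeeded, Failed, and Unknown.
list_pods_by_phase() {
  local namespace="$1" phase="$2"
  kubectl get pods -n "$namespace" --field-selector="status.phase=${phase}"
}

# Example: list_pods_by_phase cvat Succeeded
```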

So how can Pods with a STATUS of Completed be cleaned up?

And if such Pods keep being created, how can they be cleaned up on a regular basis?

In this post, I'll walk through the Kubernetes CronJob.

CronJob

A CronJob is an object that runs Jobs on a time-based schedule.

In Kubernetes, a Job is an object that carries out a task by running it in a Pod.

Below is a Job defined in YAML.

It creates a Pod with a perl-based container and runs the given command in it.

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl:5.34.0
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

A Job runs its task once, and the Pod terminates when the task is finished.

A CronJob creates Jobs according to a schedule written in crontab format, and each Job carries out the task.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox:1.28
            imagePullPolicy: IfNotPresent
            command:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
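
The `schedule` field uses the standard five-field cron format. As a reading aid, the sketch below splits an expression into its named fields. The helper name `describe_cron` is hypothetical, and it only labels the fields without validating them.

```shell
# Hypothetical helper: label the five fields of a cron expression.
describe_cron() {
  local minute hour dom month dow
  # read splits the expression on whitespace; "*" is kept literal.
  read -r minute hour dom month dow <<EOF
$1
EOF
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$minute" "$hour" "$dom" "$month" "$dow"
}

describe_cron "* * * * *"   # every minute
describe_cron "0 2 * * *"   # once a day at 02:00
```

Unless `.spec.timeZone` is set on the CronJob, the schedule is interpreted in the time zone of the kube-controller-manager, which is often UTC.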

As shown below, Kubernetes provides job and cronjob as API resources.

$ kubectl api-resources | grep job
cronjobs   cj   batch/v1   true   CronJob
jobs            batch/v1   true   Job

Creating the Pod Cleanup CronJob

Creating the Role

Deleting Pods requires permission to do so.

So I create a Role that allows deleting Pods, and a ServiceAccount that holds that permission.

In total, I created three objects:

  • ServiceAccount
  • Role
  • RoleBinding

apiVersion: v1
kind: ServiceAccount
metadata:
  name: internal-deployer
  namespace: cvat
  
---

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: delete-pods
  namespace: cvat
rules:
  - apiGroups: [""]
    resources:
      - pods
    verbs:
      - get
      - list
      - delete

---

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: internal-deployer-rb
  namespace: cvat
subjects:
  - kind: ServiceAccount
    name: internal-deployer
roleRef:
  kind: Role
  name: delete-pods
  apiGroup: rbac.authorization.k8s.io
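
Once these objects exist, the binding can be checked before any CronJob relies on it. The sketch below uses `kubectl auth can-i` with impersonation. The helper name `sa_can_i` is hypothetical, and it assumes the current user is allowed to impersonate ServiceAccounts (for example, a cluster admin).

```shell
# Hypothetical helper: ask the API server whether a ServiceAccount may perform
# a verb on Pods in a namespace, by impersonating it with --as.
sa_can_i() {
  local namespace="$1" service_account="$2" verb="$3"
  kubectl auth can-i "$verb" pods -n "$namespace" \
    --as="system:serviceaccount:${namespace}:${service_account}"
}

# Example: sa_can_i cvat internal-deployer delete
```

The command prints `yes` or `no`. With the Role above, `delete` should be allowed, while a verb that is not listed, such as `create`, should not.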

Creating the CronJob

Next, I create the CronJobs using the ServiceAccount bound to the Role above.

For the container image, I used bitnami/kubectl, which ships with kubectl.

Using a field selector, I created CronJobs that delete Pods whose phase (status.phase) is Succeeded or Failed.

The schedule is written as a crontab expression.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-complete-pod-clean
  namespace: cvat
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: internal-deployer
          containers:
          - name: complete-pod-clean
            image: bitnami/kubectl:1.27.3
            imagePullPolicy: IfNotPresent
            command:
            - "/bin/bash" 
            - "-c" 
            - "kubectl delete pods --field-selector=status.phase=Succeeded -n cvat" 
          restartPolicy: Never

---

apiVersion: batch/v1
kind: CronJob
metadata:
  name: daily-delete-pod-clean
  namespace: cvat
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: internal-deployer
          containers:
          - name: delete-pod-clean
            image: bitnami/kubectl:1.27.3
            imagePullPolicy: IfNotPresent
            command:
            - "/bin/bash" 
            - "-c" 
            - "kubectl delete pods --field-selector=status.phase=Failed -n cvat" 
          restartPolicy: Never
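
The two CronJobs above differ only in the phase they select. As an alternative sketch, not the setup used in this post, the same deletion can be written as one loop over both phases, which is also handy for previewing the result by hand. The helper name `clean_finished_pods` is hypothetical. Any extra arguments are passed through to `kubectl delete`.

```shell
# Hypothetical helper: delete Pods in the Succeeded and Failed phases.
# Extra arguments (e.g. --dry-run=client) are forwarded to kubectl delete.
clean_finished_pods() {
  local namespace="$1"; shift
  local phase
  for phase in Succeeded Failed; do
    kubectl delete pods -n "$namespace" \
      --field-selector="status.phase=${phase}" "$@"
  done
}

# Preview without deleting: clean_finished_pods cvat --dry-run=client
# Actually delete:          clean_finished_pods cvat
```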

I create the resources from these YAML files and check the result.
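
The post does not show the command used to create the resources. A minimal sketch with `kubectl apply` might look like the following. The helper name and the file names `rbac.yaml` and `cronjob.yaml` are assumptions, so substitute whatever names the YAML above was saved under.

```shell
# Hypothetical helper: apply each manifest file given as an argument,
# stopping at the first failure.
apply_manifests() {
  local file
  for file in "$@"; do
    kubectl apply -f "$file" || return 1
  done
}

# Example: apply_manifests rbac.yaml cronjob.yaml
```

Applying the RBAC file first means the ServiceAccount already exists when the first scheduled Job starts.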

$ kubectl describe role -n cvat delete-pods
Name:         delete-pods
Labels:       <none>
Annotations:  <none>
PolicyRule:
  Resources  Non-Resource URLs  Resource Names  Verbs
  ---------  -----------------  --------------  -----
  pods       []                 []              [get list delete]

$ kubectl describe rolebinding -n cvat internal-deployer-rb
Name:         internal-deployer-rb
Labels:       <none>
Annotations:  <none>
Role:
  Kind:  Role
  Name:  delete-pods
Subjects:
  Kind            Name               Namespace
  ----            ----               ---------
  ServiceAccount  internal-deployer


$ kubectl get cronjob -n cvat
NAME                       SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
daily-complete-pod-clean   0 2 * * *   False     0        8h              6d3h
daily-delete-pod-clean     0 2 * * *   False     0        8h              6d3h

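Running a Job from the CronJob

Even after creating a CronJob, there is no immediate way to tell whether it was set up correctly.

That is because a Job is only created when the scheduled time arrives.

There is, however, a way to create a Job right away and check.

With the --from option, a Job can be created directly from the CronJob.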

$ kubectl create job --from=cronjob/daily-complete-pod-clean manual-complete-pod-clean-001 -n cvat

$ kubectl get job -n cvat
NAME                            COMPLETIONS   DURATION   AGE
manual-complete-pod-clean-001   1/1           20s        115s

$ kubectl get pods -n cvat
NAME                                                     READY   STATUS      RESTARTS       AGE
...                             
manual-complete-pod-clean-001-k6g5b                      0/1     Completed   0              86s
...

The Job then creates a Pod.

Checking that Pod's log shows that the Pods were deleted as intended.

$ kubectl logs manual-complete-pod-clean-001-k6g5b -n cvat
pod "cvat-backend-server-9684f5576-r89h8" deleted
pod "cvat-backend-server-9684f5576-xgf9s" deleted
pod "cvat-backend-server-9684f5576-z7rkh" deleted
pod "cvat-backend-utils-95cf97f47-5965r" deleted
pod "cvat-backend-utils-95cf97f47-6v5xx" deleted
pod "cvat-backend-utils-95cf97f47-wj2j9" deleted
pod "cvat-backend-worker-annotation-7fd565bd6f-f5nmv" deleted
pod "cvat-backend-worker-annotation-7fd565bd6f-klwpx" deleted
pod "cvat-backend-worker-export-5447bd5f56-7446p" deleted
pod "cvat-backend-worker-export-5447bd5f56-77gp5" deleted
pod "cvat-backend-worker-export-5447bd5f56-p9xph" deleted
pod "cvat-backend-worker-export-5447bd5f56-ptl4h" deleted
pod "cvat-backend-worker-export-5447bd5f56-v9wtf" deleted
pod "cvat-backend-worker-import-589856456f-5vt9k" deleted
pod "cvat-backend-worker-import-589856456f-bbcg7" deleted
pod "cvat-backend-worker-import-589856456f-fxt6t" deleted
pod "cvat-backend-worker-import-589856456f-jjmqs" deleted
pod "cvat-backend-worker-import-589856456f-wdgll" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-7qzbx" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-9f7wm" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-h8rlt" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-lxvj2" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-n8cx4" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-sb69q" deleted
pod "cvat-backend-worker-qualityreports-5f76fdcbc7-x2frk" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-2jhdx" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-8mv8q" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-c96lw" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-d2ppg" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-mfv9s" deleted
pod "cvat-backend-worker-webhooks-6bfbcfdf94-rjcz6" deleted
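
The manual steps above (create the Job, wait, read the log) can be chained into one helper. This is a sketch under a few assumptions: the helper name `run_cronjob_now` and the Job name in the example are hypothetical, and the 120-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: create a one-off Job from a CronJob, wait for it to
# complete, then print its logs.
run_cronjob_now() {
  local namespace="$1" cronjob="$2" job="$3"
  kubectl create job "$job" --from="cronjob/${cronjob}" -n "$namespace" &&
    kubectl wait --for=condition=complete "job/${job}" -n "$namespace" --timeout=120s &&
    kubectl logs "job/${job}" -n "$namespace"
}

# Example: run_cronjob_now cvat daily-complete-pod-clean manual-complete-pod-clean-002
```

Using `job/<name>` with `kubectl logs` avoids having to look up the generated Pod name first.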

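Running the CronJob

I verified that the CronJob works by creating a Job from it manually.

Now I need to check that Jobs are also created on schedule.

A few days after creating the CronJobs, I checked whether the Jobs had run properly.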

$ kubectl get job -n cvat

NAME                                          COMPLETIONS   DURATION   AGE
job.batch/daily-complete-pod-clean-28188120   1/1           22s        2d7h
job.batch/daily-complete-pod-clean-28189560   1/1           25s        31h
job.batch/daily-complete-pod-clean-28191000   1/1           4s         7h49m
job.batch/daily-delete-pod-clean-28188120     1/1           16s        2d7h
job.batch/daily-delete-pod-clean-28189560     1/1           17s        31h
job.batch/daily-delete-pod-clean-28191000     1/1           4s         7h49m
job.batch/manual-complete-pod-clean-001       1/1           20s        6d3h

The Jobs were created and completed as expected.
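
The numeric suffix on each scheduled Job name can also be used to confirm when it was meant to run. In current Kubernetes versions the CronJob controller names each Job after the CronJob plus the scheduled time in minutes since the Unix epoch. This is an implementation detail rather than a documented guarantee, so treat the sketch below as a convenience. The helper name is hypothetical, and it assumes GNU `date`.

```shell
# Hypothetical helper: convert the numeric suffix of a CronJob-created Job
# name into its scheduled time in UTC. Assumes GNU date (-d "@<seconds>").
job_scheduled_time() {
  local minutes="${1##*-}"   # text after the last "-"
  date -u -d "@$((minutes * 60))" +%Y-%m-%dT%H:%M:%SZ
}

job_scheduled_time daily-complete-pod-clean-28188120   # 2023-08-06T02:00:00Z
```

Consecutive suffixes in the listing above differ by 1440 minutes, that is, one day, and the decoded time of 02:00 UTC matches the `0 2 * * *` shown in the SCHEDULE column.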

To pause a CronJob without deleting it, use the patch command.

Setting the CronJob's suspend field to true suspends it, so it no longer creates Jobs.

$ kubectl patch cronjob daily-complete-pod-clean -p '{"spec":{"suspend":true}}' -n cvat
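
Resuming works the same way with `suspend` set to `false`. The sketch below wraps both directions in one helper. The name `set_cronjob_suspend` is hypothetical.

```shell
# Hypothetical helper: set .spec.suspend on a CronJob to true or false.
set_cronjob_suspend() {
  local namespace="$1" cronjob="$2" suspend="$3"   # suspend: true | false
  kubectl patch cronjob "$cronjob" -n "$namespace" \
    -p "{\"spec\":{\"suspend\":${suspend}}}"
}

# Pause:  set_cronjob_suspend cvat daily-complete-pod-clean true
# Resume: set_cronjob_suspend cvat daily-complete-pod-clean false
```

Suspending only stops new Jobs from being created. Jobs that the CronJob has already started are not affected.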
