
Restarting Ceph Nodes

Restarting a single node

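# Keep Ceph from marking the node's OSDs "out" while it is down.
for _option in noout; do
  kubectl rook-ceph ceph osd set "${_option}"
done
# Stop new pods from being scheduled on the node, then evict the running ones.
kubectl cordon <node>
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data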

Log in to the node, or use IPMI or a similar out-of-band interface, and restart it.
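Before uncordoning, it can help to confirm that the node has rejoined the cluster. The following is a minimal sketch, not part of the original procedure: `wait_node_ready` is a hypothetical helper name and the 900-second default timeout is an assumption. It relies only on `kubectl wait`.

```shell
# Hypothetical helper: block until the node's Ready condition is True.
# The default timeout (900s) is an assumed value; adjust it for your hardware.
wait_node_ready() {
  _node="$1"
  _timeout="${2:-900s}"
  kubectl wait --for=condition=Ready "node/${_node}" --timeout="${_timeout}"
}

# Usage (after the node has actually gone down and is booting again):
#   wait_node_ready <node>
```

Kubernetes may keep reporting a node as Ready for a short while after it goes offline, so run the helper only once the reboot is under way.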

# Allow scheduling on the node again, then clear the noout flag.
kubectl uncordon <node>
for _option in noout; do
  kubectl rook-ceph ceph osd unset "${_option}"
done

Restarting all nodes

Scale down every service that uses storage
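The original does not say how to scale the services down. One possible approach, sketched here under assumptions, is to record each Deployment's replica count before scaling it to zero so the counts can be restored later. The function names, the namespace argument, and the state-file path are all hypothetical, and the sketch covers Deployments only (StatefulSets would need the same treatment).

```shell
# Hypothetical helper: save "<name> <replicas>" for every Deployment in a
# namespace to a state file, then scale each Deployment to zero.
scale_down_namespace() {
  _ns="$1"
  _state_file="$2"
  kubectl -n "${_ns}" get deployment \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.replicas}{"\n"}{end}' \
    > "${_state_file}"
  while read -r _name _replicas; do
    [ -n "${_name}" ] || continue
    kubectl -n "${_ns}" scale deployment "${_name}" --replicas=0 < /dev/null
  done < "${_state_file}"
}

# Hypothetical helper: restore the replica counts recorded above.
restore_namespace() {
  _ns="$1"
  _state_file="$2"
  while read -r _name _replicas; do
    [ -n "${_name}" ] || continue
    kubectl -n "${_ns}" scale deployment "${_name}" --replicas="${_replicas}" < /dev/null
  done < "${_state_file}"
}

# Usage (namespace and path are placeholders):
#   scale_down_namespace my-app /tmp/my-app.replicas
#   restore_namespace    my-app /tmp/my-app.replicas
```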

Set the Ceph OSD flags

# Freeze the cluster: no OSD state changes, no data movement, client I/O paused.
for _option in noout nodown norebalance nobackfill norecover pause; do
  kubectl rook-ceph ceph osd set "${_option}"
done

Before restarting the nodes, use the commands above to stop Ceph from checking OSD state or attempting recovery.
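To confirm that the flags took effect, the `flags` line of `ceph osd dump` can be inspected. The following is a minimal sketch; `osd_flags_present` is a hypothetical helper, and it assumes the dump contains a line of the form `flags <comma-separated list>`. Note that `pause` is reported as two flags, `pauserd` and `pausewr`.

```shell
# Hypothetical helper: succeed only if every flag passed as an argument
# appears in the "flags" line printed by `ceph osd dump`.
osd_flags_present() {
  _flags_line=$(kubectl rook-ceph ceph osd dump | awk '$1 == "flags" {print $2}')
  for _flag in "$@"; do
    case ",${_flags_line}," in
      *",${_flag},"*) ;;                        # flag is set
      *) echo "missing flag: ${_flag}" >&2
         return 1 ;;
    esac
  done
}

# Usage:
#   osd_flags_present noout nodown norebalance nobackfill norecover pauserd pausewr
```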

Scale down the Rook components

  1. Rook Operator
  2. CephFS plugin provisioner
  3. RBD plugin provisioner
  4. OSD
  5. MON
  6. MGR
  7. Etc.
# Stop the operator first so it does not recreate the deployments scaled down below.
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=0
for _category in rook-ceph-rgw csi-cephfsplugin-provisioner csi-rbdplugin-provisioner rook-ceph-osd rook-ceph-mon rook-ceph-mgr rook-ceph-exporter rook-ceph-crashcollector; do
  for _item in $(kubectl get deployment -n rook-ceph | awk '/^'"${_category}"'/{print $1}'); do
    kubectl -n rook-ceph scale deployment "${_item}" --replicas=0
    # readyReplicas is omitted from the status once no pod is ready.
    while [[ $(kubectl get deployment -n rook-ceph "${_item}" -o jsonpath='{.status.readyReplicas}') != "" ]]; do
      sleep 5
    done
  done
done

Maintain or restart the nodes

Scale up the Rook components

  1. MON
  2. OSD, MGR
  3. Etc.
  4. Rook Operator
# MONs first.
for _item in $(kubectl get deployment -n rook-ceph | awk '/^rook-ceph-mon/{print $1}'); do
  kubectl -n rook-ceph scale deployment "${_item}" --replicas=1
  # .status.replicas counts the pods created for the deployment, not the ready ones.
  while [[ $(kubectl get deployment -n rook-ceph "${_item}" -o jsonpath='{.status.replicas}') != "1" ]]; do
    sleep 5
  done
done
# Then the MGRs and OSDs.
for _category in rook-ceph-mgr rook-ceph-osd; do
  for _item in $(kubectl get deployment -n rook-ceph | awk '/^'"${_category}"'/{print $1}'); do
    kubectl -n rook-ceph scale deployment "${_item}" --replicas=1
    while [[ $(kubectl get deployment -n rook-ceph "${_item}" -o jsonpath='{.status.replicas}') != "1" ]]; do
      sleep 5
    done
  done
done
# Then the remaining components.
for _category in rook-ceph-exporter rook-ceph-crashcollector; do
  for _item in $(kubectl get deployment -n rook-ceph | awk '/^'"${_category}"'/{print $1}'); do
    kubectl -n rook-ceph scale deployment "${_item}" --replicas=1
    while [[ $(kubectl get deployment -n rook-ceph "${_item}" -o jsonpath='{.status.replicas}') != "1" ]]; do
      sleep 5
    done
  done
done
# Finally, start the operator again.
kubectl -n rook-ceph scale deployment rook-ceph-operator --replicas=1

Once the Operator is running, it automatically restores the remaining components.
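One way to confirm that the Operator is back is to wait for its rollout to finish. This is a minimal sketch: `wait_operator_ready` is a hypothetical helper and the 300-second default timeout is an assumed value. The namespace and deployment name are the ones used in the commands above.

```shell
# Hypothetical helper: wait until the rook-ceph-operator Deployment has
# finished rolling out. The default timeout (300s) is an assumption.
wait_operator_ready() {
  _timeout="${1:-300s}"
  kubectl -n rook-ceph rollout status deployment/rook-ceph-operator \
    --timeout="${_timeout}"
}

# Usage:
#   wait_operator_ready && kubectl -n rook-ceph get pods
```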

Unset the Ceph OSD flags

# Clear every flag that was set before the restart.
for _option in noout nodown norebalance nobackfill norecover pause; do
  kubectl rook-ceph ceph osd unset "${_option}"
done

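After the restart, check the state of the Kubernetes cluster. If everything looks healthy, clear all the flags that were set before the restart using the unset commands above.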
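The health check can be partly scripted. The following is a minimal sketch, assuming that "healthy" means every node reports a STATUS beginning with `Ready` in `kubectl get nodes`; `all_nodes_ready` is a hypothetical helper and not part of the original procedure.

```shell
# Hypothetical helper: fail if any node's STATUS column does not start with
# "Ready". A cordoned node ("Ready,SchedulingDisabled") still counts as ready.
all_nodes_ready() {
  _nodes=$(kubectl get nodes --no-headers) || return 1
  _not_ready=$(printf '%s\n' "${_nodes}" | awk 'NF > 0 && $2 !~ /^Ready/ {print $1}')
  if [ -n "${_not_ready}" ]; then
    echo "nodes not ready: ${_not_ready}" >&2
    return 1
  fi
}

# Usage: inspect Ceph itself only once the nodes are back.
#   all_nodes_ready && kubectl rook-ceph ceph status
```

Checking `ceph status` as well before unsetting the flags shows whether all MONs and OSDs have rejoined.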