CEPH - CRUSHTOOL (Ver. 10.2.11_jewel, OS: CentOS 7): a failure point (bug?) when applying a modified CRUSH map
* In rare cases, a CRUSH map that has been edited and re-applied this way is not picked up correctly. The map was originally applied by following the crushtool guide (URL=crushtool_링크), but a problem occurred, and the result is shown below.
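* For reference, the map was being managed with the usual crushtool round trip, roughly as below (a sketch; the file paths are just examples):

ceph osd getcrushmap -o /tmp/crushmap             # dump the binary CRUSH map from the cluster
crushtool -d /tmp/crushmap -o /tmp/crushmap.txt   # decompile it into editable text
vi /tmp/crushmap.txt                              # edit buckets / rules
crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new   # recompile
ceph osd setcrushmap -i /tmp/crushmap.new         # inject the new map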
[ Status check ]

[root@MGMT11:25:40:~]# ceph osd tree
ID WEIGHT   TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
-2  0.90599 root ssd
-7  0.45399     ssd_osd OSD-20
 4  0.45399         osd.4           up  1.00000          1.00000
-8  0.45200     ssd_osd OSD-21
 5  0.45200         osd.5           up  1.00000          1.00000
-1 58.15997 root hdd
-3 14.53999     hdd_osd OSD-0
 0 14.53999         osd.0           up  1.00000          1.00000
-4 14.53999     hdd_osd OSD-1
 1 14.53999         osd.1           up  1.00000          1.00000
-5 14.53999     hdd_osd OSD-2
 2 14.53999         osd.2           up  1.00000          1.00000
-6 14.53999     hdd_osd OSD-3
 3 14.53999         osd.3           up  1.00000          1.00000

* Hmm.. looks fine..?? Nothing seemed wrong on the surface, so I checked the cluster status..

[root@MGMT11:25:48:~]# ceph -s
    cluster 427f2e6a-5722-4365-a475-8fcdc218a418
     health HEALTH_WARN
            128 pgs stuck unclean
     monmap e2: 4 mons at {MON-0=192.168.1.13:6789/0,MON-1=192.168.1.14:6789/0,MON-2=192.168.1.15:6789/0,MON-3=192.168.1.16:6789/0}
            election epoch 6, quorum 0,1,2,3 MON-0,MON-1,MON-2,MON-3
     osdmap e79: 6 osds: 6 up, 6 in; 128 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v249: 256 pgs, 2 pools, 0 bytes data, 0 objects
            659 MB used, 60483 GB / 60484 GB avail
                 128 active+clean
                 128 active+remapped

[root@MGMT11:26:18:~]# ceph health detail
HEALTH_WARN 128 pgs stuck unclean
pg 1.7e is stuck unclean for 701.594548, current state active+remapped, last acting [2,1]
pg 1.7f is stuck unclean for 699.224062, current state active+remapped, last acting [0,2]
pg 1.7c is stuck unclean for 699.223706, current state active+remapped, last acting [0,3]
pg 1.7d is stuck unclean for 699.273517, current state active+remapped, last acting [1,2]
pg 1.7a is stuck unclean for 701.337639, current state active+remapped, last acting [3,2]
.
.
.

* ..............
* That threw me.. Since it looked like the OSDs and the bucket rules were not matching up properly, I went for everyone's favourite fix and restarted the OSDs. The PGs (and PGPs) started re-shuffling and for a moment seemed to be coming back... and then:

[root@MGMT11:41:47:~]# ceph -s
    cluster 427f2e6a-5722-4365-a475-8fcdc218a418
     health HEALTH_WARN
            255 pgs degraded
            228 pgs stale
            156 pgs stuck unclean
            255 pgs undersized
     monmap e2: 4 mons at {MON-0=192.168.1.13:6789/0,MON-1=192.168.1.14:6789/0,MON-2=192.168.1.15:6789/0,MON-3=192.168.1.16:6789/0}
            election epoch 6, quorum 0,1,2,3 MON-0,MON-1,MON-2,MON-3
     osdmap e116: 6 osds: 6 up, 6 in; 28 remapped pgs
            flags sortbitwise,require_jewel_osds
      pgmap v339: 256 pgs, 2 pools, 0 bytes data, 0 objects
            664 MB used, 60483 GB / 60484 GB avail
                 128 stale+active+undersized+degraded
                 100 stale+active+undersized+degraded+remapped
                  27 active+undersized+degraded+remapped
                   1 active+remapped

[root@MGMT11:42:04:~]# ceph osd tree
ID  WEIGHT   TYPE NAME          UP/DOWN REWEIGHT PRIMARY-AFFINITY
 -9        0 root ssd
 -6        0     ssd_osd OSD-20
 -7        0     ssd_osd OSD-21
 -8        0 root hdd
 -2        0     hdd_osd OSD-0
 -3        0     hdd_osd OSD-1
 -4        0     hdd_osd OSD-2
 -5        0     hdd_osd OSD-3
 -1 59.06596 root default
  5  0.45200     osd.5              up  1.00000          1.00000
  4  0.45399     osd.4              up  1.00000          1.00000
  3 14.53999     osd.3              up  1.00000          1.00000
  2 14.53999     osd.2              up  1.00000          1.00000
  1 14.53999     osd.1              up  1.00000          1.00000
  0 14.53999     osd.0              up  1.00000          1.00000

* Sheer panic... The OSD buckets themselves were all there, grouped by type.. but the part that actually matters, the osd devices, had all ended up in the default bucket that I had deleted... Wait.. why..??
* Checking the crushmap confirmed it: the osds had been inserted back as items under the default bucket... Why? I had removed it....
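* Before digging into the map itself, a quick note on the PG states shown earlier: active+remapped means the up set that CRUSH now computes for a PG no longer matches the acting set that is currently serving it, which is exactly what happens once the item entries go missing from the buckets. To see it per PG (a sketch, using a PG id from the health detail output above):

ceph pg map 1.7e              # prints the osdmap epoch plus the up set and acting set for that PG
ceph pg dump_stuck unclean    # lists every PG that has been stuck in a non-clean state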
[ Checking the crushmap ]

[root@MGMT11:42:07:~]# ceph osd getcrushmap -o /tmp/crushmap
got crush map from osdmap epoch 116
[root@MGMT11:42:36:~]# crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
[root@MGMT11:42:36:~]# cat /tmp/crushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
# types
type 0 osd
type 1 ssd_osd
type 2 hdd_osd
type 3 root
# buckets
root default {
id -1 # do not change unnecessarily
# weight 59.066 # the default bucket I had deleted suddenly reappeared and took all the devices with it...
alg straw2
hash 0 # rjenkins1
item osd.5 weight 0.452
item osd.4 weight 0.454
item osd.3 weight 14.540
item osd.2 weight 14.540
item osd.1 weight 14.540
item osd.0 weight 14.540
}
hdd_osd OSD-0 {
id -2 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
hdd_osd OSD-1 {
id -3 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
hdd_osd OSD-2 {
id -4 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
hdd_osd OSD-3 {
id -5 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
ssd_osd OSD-20 {
id -6 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
ssd_osd OSD-21 {
id -7 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
}
root hdd {
id -8 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
item OSD-0 weight 0.000
item OSD-1 weight 0.000
item OSD-2 weight 0.000
item OSD-3 weight 0.000
}
root ssd {
id -9 # do not change unnecessarily
# weight 0.000
alg straw
hash 0 # rjenkins1
item OSD-20 weight 0.000
item OSD-21 weight 0.000
}
# rules
rule hdd {
ruleset 0
type replicated
min_size 1
max_size 10
step take hdd
step chooseleaf firstn 0 type hdd_osd
step emit
}
rule ssd {
ruleset 1
type replicated
min_size 1
max_size 10
step take ssd
step chooseleaf firstn 0 type ssd_osd
step emit
}
# end crush map

* At this point I was more than a little flustered... If you are reading this blog, don't panic: re-apply the map you applied the first time. Still broken?? Mine failed "again" too.. haha. Looking more closely, the map had been applied with the OSD entries under # devices and the "item osd.4 weight 0.454" style lines (OSD name and weight) under each bucket stripped out. Put those entries back into the crushmap and apply it again and everything comes back to normal, so please don't conclude it is unrecoverable and wipe the cluster and reinstall..
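* A quick way to confirm exactly which lines were lost is to pull the live map again and diff it against the text map you intended to apply (a sketch; the file names are just examples):

ceph osd getcrushmap -o /tmp/crushmap.live
crushtool -d /tmp/crushmap.live -o /tmp/crushmap.live.txt
diff -u /tmp/crushmap.txt /tmp/crushmap.live.txt    # the dropped "item osd.N weight ..." lines show up as removals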
[ Editing and restoring the crushmap ]
i. Edit the crushmap
[root@MGMT01:04:11:~]# vi /tmp/crushmap.txt
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1
tunable straw_calc_version 1
tunable allowed_bucket_algs 54
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
# types
type 0 osd
type 1 ssd_osd
type 2 hdd_osd
type 3 root
# buckets
hdd_osd OSD-0 {
id -10 # do not change unnecessarily
# weight 14.540
alg straw
hash 0 # rjenkins1
item osd.0 weight 14.540
}
hdd_osd OSD-1 {
id -11 # do not change unnecessarily
# weight 14.540
alg straw
hash 0 # rjenkins1
item osd.1 weight 14.540
}
hdd_osd OSD-2 {
id -12 # do not change unnecessarily
# weight 14.540
alg straw
hash 0 # rjenkins1
item osd.2 weight 14.540
}
hdd_osd OSD-3 {
id -13 # do not change unnecessarily
# weight 14.540
alg straw
hash 0 # rjenkins1
item osd.3 weight 14.540
}
root hdd {
id -1 # do not change unnecessarily
# weight 58.160
alg straw
hash 0 # rjenkins1
item OSD-0 weight 14.540
item OSD-1 weight 14.540
item OSD-2 weight 14.540
item OSD-3 weight 14.540
}
ssd_osd OSD-20 {
id -20 # do not change unnecessarily
# weight 0.454
alg straw
hash 0 # rjenkins1
item osd.4 weight 0.454
}
ssd_osd OSD-21 {
id -21 # do not change unnecessarily
# weight 0.454
alg straw
hash 0 # rjenkins1
item osd.5 weight 0.454
}
root ssd {
id -2 # do not change unnecessarily
# weight 0.908
alg straw
hash 0 # rjenkins1
item OSD-20 weight 0.454
item OSD-21 weight 0.454
}
# rules
rule hdd {
ruleset 0
type replicated
min_size 1
max_size 10
step take hdd
step chooseleaf firstn 0 type hdd_osd
step emit
}
rule ssd {
ruleset 1
type replicated
min_size 1
max_size 10
step take ssd
step chooseleaf firstn 0 type ssd_osd
step emit
}
# end crush map
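* Before injecting the recompiled map (step ii below), it can be worth dry-running it with crushtool to confirm that both rules really map replicas onto the intended OSDs (a sketch; the rule ids match the rulesets above, and the replica count and sample range are just examples):

crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.test
crushtool -i /tmp/crushmap.test --test --show-mappings --rule 0 --num-rep 2 --min-x 0 --max-x 9   # hdd rule, expect osd.0-3
crushtool -i /tmp/crushmap.test --test --show-mappings --rule 1 --num-rep 2 --min-x 0 --max-x 9   # ssd rule, expect osd.4-5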
ii. Restore the crushmap
[root@MGMT10:51:15:~]# crushtool -c /tmp/crushmap.txt -o /tmp/crushmap-new.bin
[root@MGMT11:13:45:~]# crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.coloc
[root@MGMT11:14:19:~]# ceph osd setcrushmap -i /tmp/crushmap.coloc
[root@MGMT01:09:13:~]# ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-2 0.90799 root ssd
-20 0.45399 ssd_osd OSD-20
4 0.45399 osd.4 up 1.00000 1.00000
-21 0.45399 ssd_osd OSD-21
5 0.45399 osd.5 up 1.00000 1.00000
-1 58.15997 root hdd
-10 14.53999 hdd_osd OSD-0
0 14.53999 osd.0 up 1.00000 1.00000
-11 14.53999 hdd_osd OSD-1
1 14.53999 osd.1 up 1.00000 1.00000
-12 14.53999 hdd_osd OSD-2
2 14.53999 osd.2 up 1.00000 1.00000
-13 14.53999 hdd_osd OSD-3
3 14.53999 osd.3 up 1.00000 1.00000
[root@MGMT01:11:03:~]# ceph -s
cluster 427f2e6a-5722-4365-a475-8fcdc218a418
health HEALTH_OK
monmap e2: 4 mons at {MON-0=192.168.1.13:6789/0,MON-1=192.168.1.14:6789/0,MON-2=192.168.1.15:6789/0,MON-3=192.168.1.16:6789/0}
election epoch 6, quorum 0,1,2,3 MON-0,MON-1,MON-2,MON-3
osdmap e125: 6 osds: 6 up, 6 in
flags sortbitwise,require_jewel_osds
pgmap v424: 256 pgs, 2 pools, 0 bytes data, 0 objects
667 MB used, 60483 GB / 60484 GB avail
256 active+clean
* Restore complete.
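* Finally, since the symptom can apparently recur, it may be worth keeping a compiled copy of the known-good map on hand so it can be re-injected in one step (a sketch; the path is just an example):

ceph osd getcrushmap -o /root/crushmap.known-good   # snapshot taken while the cluster is HEALTH_OK
# if the item entries ever disappear again:
ceph osd setcrushmap -i /root/crushmap.known-good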