Estimating Ceph Cluster Recovery Time

1. Preface

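This chapter is simple: it estimates how long the cluster will take to recover. The estimate comes from a simple calculation on the cluster status and is displayed dynamically, so it updates as recovery progresses.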
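As a rough illustration of that calculation: the estimate is the number of objects still to be recovered divided by the current recovery rate, then split into days, hours, minutes and seconds. The sketch below uses made-up numbers, and the helper names `estimate_seconds` and `split_duration` are hypothetical, chosen only for this example.

```python
def estimate_seconds(objects_left, objects_per_sec):
    """Estimated seconds remaining, or None when nothing is being recovered."""
    if objects_per_sec <= 0:
        return None
    return objects_left // objects_per_sec


def split_duration(seconds):
    """Split a duration in seconds into (days, hours, minutes, seconds)."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return days, hours, minutes, secs


# Made-up sample values: 1,000,000 objects left at 250 objects per second.
remaining = estimate_seconds(1000000, 250)
print(remaining)                  # 4000
print(split_duration(remaining))  # (0, 1, 6, 40)
```

Returning `None` for a zero rate is one possible design choice; it keeps "no progress" distinct from "finished".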

2. Code

#!/usr/bin/env python
# -*- coding: UTF-8 -*-
from __future__ import print_function

import json
import subprocess


def main():
    gettime()


def conversecs(sec):
    # Split a duration in seconds into days, hours, minutes and seconds.
    d = sec // 86400
    h = sec % 86400 // 3600
    m = sec % 3600 // 60
    s = sec % 60
    return "remain time:%s day %s hour %s min %s sec" % (d, h, m, s)


def gettime():
    try:
        # Query the cluster status as JSON, giving up after 10 seconds.
        recover_time = subprocess.check_output(
            'timeout 10 ceph -s -f json 2>/dev/null', shell=True)
        pgmap = json.loads(recover_time.decode('utf-8'))["pgmap"]
        if 'degraded_objects' in pgmap:
            degraded_objects = pgmap["degraded_objects"]
            if pgmap.get("recovering_objects_per_sec", 0) != 0:
                recovering_objects_per_sec = pgmap["recovering_objects_per_sec"]
                resec = degraded_objects // recovering_objects_per_sec
                print("recovery objects: %s" % degraded_objects)
                print("recovery speed :%s" % recovering_objects_per_sec)
                print(conversecs(resec))
            else:
                # No recovery rate reported: assume 1 object per second.
                resec = degraded_objects // 1
                print("recovery objects: %s" % degraded_objects)
                print("recovery speed :0")
                print(conversecs(resec))
        else:
            print("recover all done!")
    except Exception:
        # ceph timed out, failed, or returned something that is not valid JSON.
        print("Ceph Cluster health?try ceph -s")


if __name__ == '__main__':
    main()

Run it:

watch python  recoverytime.py 

The same kind of estimate can also be obtained with a shell script:

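#!/bin/sh
while true
do
    start=`ceph -s -f json-pretty|grep misplaced_objects|cut -d ":" -f 2|cut -d ',' -f 1`
    sleep 5
    end=`ceph -s -f json-pretty|grep misplaced_objects|cut -d ":" -f 2|cut -d ',' -f 1`
    # Objects migrated during the 5-second window
    speed=$((start-end))
    #echo $end
    #echo $speed
    # No progress in this window: skip the estimate to avoid dividing by zero
    if [ "$speed" -le 0 ]; then
        echo No migration progress in the last 5 seconds
        continue
    fi
    # Multiply before dividing so integer division loses less precision
    second=$((end*5/speed))

    hour=$(( second/3600 ))
    min=$(( (second-hour*3600)/60 ))
    sec=$(( second-hour*3600-min*60 ))
    echo Current time: `date`
    echo Objects left to migrate: $end
    echo Migration speed: $((speed/5)) objects/s
    echo Time remaining: ${hour}h ${min}m ${sec}s

done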

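This is the shell-script way of getting the estimate: it samples the misplaced object count twice, 5 seconds apart, and extrapolates from the difference.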
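The sampling idea can be sketched in Python as well. This is only an illustration with made-up numbers: the function name `eta_from_samples` is hypothetical, and the two counts are assumed to have been read from the cluster status by some other means.

```python
def eta_from_samples(start, end, interval):
    """Estimate seconds remaining from two counts taken `interval` seconds apart.

    Returns None when the count did not go down, since no rate can be derived.
    """
    moved = start - end
    if moved <= 0:
        return None
    # Multiply before dividing so integer division loses less precision.
    return end * interval // moved


# Made-up samples: 90,000 objects, then 89,500 five seconds later (100 objects/s).
print(eta_from_samples(90000, 89500, 5))  # 895
```

A short sampling window makes the estimate jumpy; a longer interval smooths it at the cost of slower updates.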

3. Result

4. Progress

So far only recovery is counted. Backfill also needs to be taken into account and will be added later.
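One possible direction, shown here only as a hedged sketch and not as the planned implementation: count misplaced objects alongside degraded ones before dividing by the recovery rate. It rests on two assumptions that would need checking against a real cluster: that `misplaced_objects` in `pgmap` is a fair proxy for the work left to backfill, and that `recovering_objects_per_sec` covers both kinds of work. The helper names and the sample JSON are made up for illustration.

```python
import json


def objects_left(pgmap):
    """Sum degraded and misplaced object counts; missing keys count as 0."""
    return pgmap.get("degraded_objects", 0) + pgmap.get("misplaced_objects", 0)


def eta_seconds(pgmap):
    """Seconds remaining, 0 when nothing is left, None when the rate is unknown."""
    left = objects_left(pgmap)
    if left == 0:
        return 0
    # Assumption: this rate applies to recovery and backfill alike.
    rate = pgmap.get("recovering_objects_per_sec", 0)
    if rate <= 0:
        return None
    return left // rate


# Hand-written sample, not real cluster output.
sample = json.loads(
    '{"pgmap": {"degraded_objects": 1200, "misplaced_objects": 4800,'
    ' "recovering_objects_per_sec": 300}}'
)
print(eta_seconds(sample["pgmap"]))  # 20
```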