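Monitoring a process's memory with Python
Test scenario:
- A client program leaked memory after running for a long time. The developers have now fixed it, and we need to verify whether the problem still exists and produce a test report.
Approach:
- We need a tool that continuously records the memory and CPU usage of the program's process while it keeps running.
Method:
- Implement this monitoring in Python.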
import sys
import time

import psutil

# The PID to monitor is passed as the first command-line argument.
# sys.argv is a list; fewer than 2 entries means only the script name was given,
# so exit instead of continuing.
if len(sys.argv) < 2:
    print("No PID given for the process to monitor")
    sys.exit(1)

# Look up the process
print("PID: " + sys.argv[1])
pid = int(sys.argv[1])
p = psutil.Process(pid)

# Poll the process and append its CPU and memory usage to a CSV file
interval = 60  # polling interval in seconds
num = 100      # number of samples to take

with open("process_monitor_" + p.name() + '_' + str(pid) + ".csv", "a+") as f:
    # CSV header: time, CPU usage, memory usage, resident memory in MB
    f.write("time,cpu_percent(%),mem_percent(%),mem_rss_MB\n")
    while num > 0:
        num = num - 1
        current_time = time.strftime('%Y%m%d-%H%M%S', time.localtime(time.time()))
        # Measure over 1 second; a call without an interval returns 0.0 the first time.
        cpu_percent = p.cpu_percent(interval=1)
        mem_percent = p.memory_percent()
        mem_info = p.memory_info().rss  # resident set size in bytes
        mem_MB = mem_info / 1024 / 1024
        print('Memory used by the process (bytes):', mem_info)
        print('Memory used by the process: %.4f MB' % mem_MB)
        line = current_time + ',' + str(cpu_percent) + ',' + str(mem_percent) + ',' + str(mem_MB)
        print(line)
        f.write(line + "\n")
        f.flush()  # keep the file current during a long run
        time.sleep(interval)
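Since the goal is a verification report, the recorded CSV still has to be turned into a verdict. A minimal sketch of one way to do that is shown here: it fits a straight line to the memory column and reports the slope in MB per sample. The helper names `load_samples` and `rss_slope_mb_per_sample` are hypothetical, and it assumes the memory value in MB is the fourth column, as in the script above. A positive slope is only a hint that memory keeps growing, not proof of a leak.

```python
import csv


def _is_number(text):
    """True if text parses as a float (used to skip header rows)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def load_samples(path):
    """Read the monitor CSV, keeping only 4-column rows with a numeric MB value.

    The monitor opens the file in append mode, so a header row may appear
    once per run; those rows are skipped here.
    """
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if len(row) == 4 and _is_number(row[3])]


def rss_slope_mb_per_sample(rows):
    """Least-squares slope of the MB column against the sample index."""
    values = [float(row[3]) for row in rows]
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den
```

Multiplying the slope by the number of samples per hour (60 with the 60-second interval above) gives an approximate growth rate in MB per hour for the report.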
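Monitoring a process and restarting it with Python
Our company's game server has been going offline frequently lately, and the boss had staff log in to the server at random times to check whether it had died; they were practically turning into robots. That is how this Python script for automatically monitoring the process came about.
The approach:
1. Set up a thread timer that runs a system command every 20 s to check whether a process with the given name exists.
2. If it does not exist, restart it; if it exists, do nothing further.
The code is simple: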
import datetime
import os
import subprocess
from threading import Timer

# Command that starts the server. A raw string keeps the backslashes literal.
START_CMD = r"start C:\Progra~1\CloudControlServer\CloudControlServer.exe"

count = 0        # number of checks that found the process missing
error_count = 0  # number of checks that found the process running


def restart_process(process_name):
    global count, error_count, timer
    # Ask Windows for the list of running processes.
    red = subprocess.Popen('tasklist', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    tasklist_str = red.communicate()[0].decode(encoding='gbk')
    # The executable name is the last path component, e.g. CloudControlServer.exe
    re_path = process_name.split("\\")[-1]
    formattime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    if re_path not in tasklist_str:
        # obj = connect_emai()
        # sendmail('Program hung, restarting...', obj)
        # Send an HTTP request
        # url = "http://159.138.131.148/server_offline.html"
        # request = urllib.request(url)
        count += 1
        print(formattime + ' check #' + str(count) + ': process missing, restarting')
        os.system(process_name)
        # sendmail('Restarted successfully!', obj)
        print('yes,connected')
    else:
        error_count += 1
        print(formattime + ' check #' + str(error_count) + ': process is running')
    # Schedule the next check in 20 seconds.
    timer = Timer(20, restart_process, (START_CMD,))
    timer.start()


timer = Timer(20, restart_process, (START_CMD,))
timer.start()
Done!
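One caveat with the check above: it looks for the executable name as a substring of the whole tasklist output, so a differently named process that merely contains the same text would also count as "running". A minimal sketch of a stricter check follows, assuming the CSV output mode of tasklist (`/FO CSV /NH`). The helper names `parse_tasklist_csv`, `find_pids` and `query_tasklist` are hypothetical, and the gbk decoding simply mirrors the assumption already made in the code above.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (image_name, pid) pairs."""
    pairs = []
    for row in csv.reader(io.StringIO(text, newline='')):
        # Expected columns: image name, PID, session name, session number, memory.
        if len(row) >= 2 and row[1].isdigit():
            pairs.append((row[0], int(row[1])))
    return pairs


def find_pids(text, exe_name):
    """Return the PIDs whose image name equals exe_name (case-insensitive)."""
    wanted = exe_name.lower()
    return [pid for name, pid in parse_tasklist_csv(text) if name.lower() == wanted]


def query_tasklist():
    """Run tasklist (Windows only) and return its CSV output as text."""
    result = subprocess.run(['tasklist', '/FO', 'CSV', '/NH'], capture_output=True)
    # gbk is an assumption carried over from the article's Chinese-locale setup.
    return result.stdout.decode('gbk', errors='replace')
```

With this, `find_pids(query_tasklist(), 'CloudControlServer.exe')` returns an empty list when the server is down, and otherwise the full PIDs, which is also what a later kill step needs.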
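Then came a new requirement: monitor the CPU, and if the CPU load stays above 80%, kill the process and restart it. For this I brought in psutil, an excellent cross-platform library (the snippet below reads the load through wmic; psutil appears in the web API further down). The implementation is as follows: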
import os
import re
import subprocess
import time
from threading import Timer

START_CMD = r"start C:\Progra~1\CloudControlServer\CloudControlServer.exe"


def read_cpu_load():
    # Read the overall CPU load (%) through wmic.
    res = subprocess.Popen('wmic cpu get LoadPercentage', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
    res_str = res.communicate()[0].decode(encoding='gbk')
    return int(re.findall(r'\d+', res_str)[0])


def look_cpu(process_name):
    global timer
    num = read_cpu_load()
    if num > 80:
        print('CPU load is above 80%')
        time.sleep(10)
        num_twice = read_cpu_load()
        # If the two readings are within 5 points of each other, treat the load
        # as sustained: kill the process and restart it.
        if abs(num - num_twice) < 5:
            tasklist = subprocess.Popen('tasklist | findstr CloudControlServer.exe', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
            res = tasklist.communicate()[0].decode(encoding='gbk')
            # Columns are: image name, PID, ... Taking the whole second column
            # avoids the original regex \d{1,4}, which cut off longer PIDs.
            fields = res.split()
            if len(fields) > 1 and fields[1].isdigit():
                cmd = 'taskkill -f /pid %s' % fields[1]
                time.sleep(0.5)
                print(cmd)
                os.system(cmd)
                os.system(process_name)
    print('Monitoring CPU, current load: %s' % num)
    # Schedule the next check in 30 seconds.
    timer = Timer(30, look_cpu, (START_CMD,))
    timer.start()


timer = Timer(30, look_cpu, (START_CMD,))
timer.start()
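The code above compares two readings taken 10 seconds apart, so a second reading slightly below 80% can still trigger a restart. A minimal sketch of a stricter reading of "stays above 80%" is given here: every one of several consecutive samples must exceed the threshold. The function name `sustained_above` and its parameters are hypothetical; the CPU reader is passed in as a callable so the rule does not depend on how the load is measured.

```python
import time


def sustained_above(read_cpu, threshold=80.0, samples=3, pause=10.0, sleep=time.sleep):
    """Return True only if `samples` consecutive readings all exceed `threshold`.

    read_cpu: callable returning the current CPU load in percent.
    pause:    seconds to wait between readings.
    sleep:    injectable for testing; defaults to time.sleep.
    """
    for i in range(samples):
        if read_cpu() <= threshold:
            return False  # one reading at or below the threshold ends the check
        if i < samples - 1:
            sleep(pause)
    return True
```

As a usage example, with psutil installed the reader could be `lambda: psutil.cpu_percent(interval=1)`, and the kill-and-restart step would run only when `sustained_above(...)` returns True.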
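On the third day, though, the boss had yet another requirement: a web front end that exposes the CPU and memory information through an API and supports remote restart. My approach was to use the HTTP server modules in Python's standard library, which avoids raw socket programming: you only specify an IP and a port. Here I used wsgiref.simple_server.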
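import json
import os
import re
import subprocess
from wsgiref.simple_server import make_server

import psutil

START_CMD = r"start C:\Progra~1\CloudControlServer\CloudControlServer.exe"


def remote_restart_process(cmd):
    # Not shown in the original article; assumed here to simply run the start command.
    return os.system(cmd)


# WSGI application function
def application(environ, start_response):
    path = environ.get('PATH_INFO')
    # CPU status
    if path == '/cpu':
        res = subprocess.Popen('wmic cpu get LoadPercentage', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
        res_str = res.communicate()[0].decode(encoding='gbk')
        resp = {'cpu': re.findall(r'\d+', res_str)[0]}
        body = json.dumps(resp).encode(encoding='utf-8')
    # CPU + memory status
    elif path == '/state':
        # Note: the first psutil.cpu_percent() call after startup returns a meaningless 0.0.
        cpu = psutil.cpu_percent()
        memory = psutil.virtual_memory()
        memory_lv = float(memory.used) / float(memory.total) * 100
        res = {'cpu': cpu, 'memory': memory_lv}
        body = json.dumps(res).encode(encoding='utf-8')
    # Restart-process API
    elif path == '/restart_process':
        # os.system('shutdown.exe -r')
        remote_restart_process(START_CMD)
        body = b'success'
    else:
        # Unknown path: the original returned None here, which breaks the WSGI server.
        start_response('404 Not Found', [])
        return [b'not found']
    start_response('200 OK', [])
    return [body]


# The three API endpoints:
#   ip:8060/cpu              CPU load
#   ip:8060/state            CPU + memory status
#   ip:8060/restart_process  restart the process

# Start the web server that serves the API on port 8060.
httpserver = make_server('', 8060, application)
httpserver.serve_forever()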
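To show how the API might be consumed from another machine, here is a minimal client sketch using only the standard library. The names `fetch_json` and `over_limit` are hypothetical, and so is the 90% memory limit; the 80% CPU limit is taken from the requirement described earlier. It assumes the `/state` endpoint returns a JSON object with `cpu` and `memory` keys, as in the server code above.

```python
import json
import urllib.request


def fetch_json(base_url, path, timeout=5):
    """GET base_url + path and decode the JSON body."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
        return json.loads(resp.read().decode('utf-8'))


def over_limit(state, cpu_limit=80.0, memory_limit=90.0):
    """Return the names of the metrics in a /state reply that exceed their limit.

    float() is applied because the /cpu endpoint reports its value as a string.
    """
    limits = {'cpu': cpu_limit, 'memory': memory_limit}
    return [key for key, limit in limits.items() if float(state.get(key, 0)) > limit]
```

For example, `over_limit(fetch_json('http://ip:8060', '/state'))` yields an empty list while the server is healthy; a non-empty result could be the trigger for calling `/restart_process`.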
The above is based on my personal experience; I hope it serves as a useful reference.
Original link: https://blog.csdn.net/weixin_43533308/article/details/123929124