For the past 3 months, my MongoDB server has been getting slow every 2 hours and 10 minutes, very precisely.
My server configuration:
- A replica set with 3 members, for the purpose of data backup; 1 of them has a 3600-second delay.
- No separate slave servers; the 3 masters are all in the replica set.
- I use mongoose + node.js to provide a REST API.
- About 9 reads and 1.5 writes per second on average, based on 24 hours of statistics data.
What I did after searching Stack Overflow and Google:
- Restarting the server does not change the slow interval of 2 hours and 10 minutes.
- Created indexes on the fields used in queries; no impact.
- Deleted the data files on 1 server and used another to recover it, then deleted another and recovered it back; no impact.
- Shifted the primary server; no impact.
- Ran 'currentOp' when the database was slow; I can see a lot of queries hung there. There are too many logs to paste here, but I didn't see any abnormal query.
- In the mongo console, checked "serverStatus" when the database was slow; the command kept waiting until the database recovered.
- No memory usage increase in the "top" command when the database is slow.
- REST APIs that do not access the database work well.
I guess there might be locking, and a potential cause may be index building. There is something special about my database:
- I have 14000 collections in 1 database, and the number is increasing. There may be 1 to 3000 records in each collection.
- Both the number of collections and the number of records are increasing dynamically.
- Index fields are specified when creating a new collection.
I have been obsessed with this issue for 3 months. Any comments/suggestions would be highly appreciated!
Here are some logs from the log file:
fri jul 5 15:20:11.040 [conn2765] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 222694, after repl: 222694, @ end: 222694 }
fri jul 5 17:30:09.367 [conn4711] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 199498, after repl: 199498, @ end: 199528 }
fri jul 5 19:40:12.697 [conn6488] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 204061, after repl: 204061, @ end: 204081 }
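To read these lines more easily, here is a minimal node.js sketch that works out which serverStatus section consumed the time and how far apart the slow events are. It assumes the "after ..." values are cumulative elapsed milliseconds (the non-decreasing numbers suggest this). The helper names `parseSlowLine`, `slowestSection` and `formatInterval` are made up for illustration, and the log lines are shortened copies of the ones above.

```javascript
// Shortened copies of the three log lines above (leading zero-valued sections dropped).
const logLines = [
  'fri jul 5 15:20:11.040 [conn2765] serverstatus slow: { after locks: 0, after opcountersrepl: 0, after recordstats: 222694, after repl: 222694, @ end: 222694 }',
  'fri jul 5 17:30:09.367 [conn4711] serverstatus slow: { after locks: 0, after opcountersrepl: 0, after recordstats: 199498, after repl: 199498, @ end: 199528 }',
  'fri jul 5 19:40:12.697 [conn6488] serverstatus slow: { after locks: 0, after opcountersrepl: 0, after recordstats: 204061, after repl: 204061, @ end: 204081 }',
];

// Extract the time of day (in ms) and the list of { name, elapsedMs } sections.
function parseSlowLine(line) {
  const t = line.match(/(\d{2}):(\d{2}):(\d{2})\.(\d{3})/);
  const msOfDay =
    Number(t[1]) * 3600000 + Number(t[2]) * 60000 + Number(t[3]) * 1000 + Number(t[4]);
  const body = line.slice(line.indexOf('{') + 1, line.lastIndexOf('}'));
  const sections = body.split(',').map((part) => {
    const [name, value] = part.split(':');
    return { name: name.trim(), elapsedMs: Number(value) };
  });
  return { msOfDay, sections };
}

// Values are treated as cumulative, so the cost of a section is the
// difference from the previous section's value.
function slowestSection(sections) {
  let previous = 0;
  let worst = { name: null, deltaMs: -1 };
  for (const { name, elapsedMs } of sections) {
    const deltaMs = elapsedMs - previous;
    if (deltaMs > worst.deltaMs) worst = { name, deltaMs };
    previous = elapsedMs;
  }
  return worst;
}

// Render a millisecond interval as "Hh Mm Ss" (fractions of a second dropped).
function formatInterval(ms) {
  const totalSeconds = Math.floor(ms / 1000);
  const h = Math.floor(totalSeconds / 3600);
  const m = Math.floor((totalSeconds % 3600) / 60);
  const s = totalSeconds % 60;
  return `${h}h ${m}m ${s}s`;
}

const parsed = logLines.map(parseSlowLine);
const worstSections = parsed.map((p) => slowestSection(p.sections));
const intervals = parsed.slice(1).map((p, i) => formatInterval(p.msOfDay - parsed[i].msOfDay));

console.log(worstSections); // every entry names "after recordstats"
console.log(intervals);     // [ '2h 9m 58s', '2h 10m 3s' ]
```

Under that assumption, almost all of the roughly 200 to 220 seconds in each slow serverStatus call is spent in the recordStats section, and the events are about 2 hours 10 minutes apart, which matches the interval described above.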
Here is a screenshot of the Pingdom report: the server is down for 4 minutes every 2 hours and 7 minutes. In the beginning, the server was down for 2 minutes every 2 hours and 6 minutes.
[Edit 1] More monitoring results from the host provider:
- CPU: http://i.minus.com/izbnympzlslrr.png
- Disk IO: http://i.minus.com/ivgrhr0ghoz92.png
- Connections: http://i.minus.com/itbfyq0ssmlns.png
The periodic increase in connections happens because connections are waiting, and the current connection count accumulates until the database is unblocked. It is not caused by heavy traffic.
We found the cause of this specific 2:10 issue. In our case, it was the execution of dbStats by MMS. We had to upgrade the cluster, and the issue got resolved.
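If you want to check whether dbStats is expensive on your own database, one option is to time the command directly. This is only a sketch: it assumes a `db` handle that exposes an async `command()` method, as the official `mongodb` node.js driver does (with mongoose, `mongoose.connection.db` should give you such a handle). The helper name `timeDbStats` and the 1000 ms threshold are made up for illustration.

```javascript
// Time a single dbStats command against the given database handle.
// `db` is assumed to expose an async command(), like the Db object of the
// official `mongodb` driver; `slowThresholdMs` is an arbitrary cut-off.
async function timeDbStats(db, slowThresholdMs = 1000) {
  const startedAt = Date.now();
  const stats = await db.command({ dbStats: 1 });
  const elapsedMs = Date.now() - startedAt;
  return {
    collections: stats.collections, // number of collections reported by dbStats
    elapsedMs,
    slow: elapsedMs >= slowThresholdMs,
  };
}

// Usage sketch (the connection string and database name are placeholders):
// const { MongoClient } = require('mongodb');
// const client = await MongoClient.connect('mongodb://localhost:27017');
// console.log(await timeDbStats(client.db('mydb')));
```

Running this around the time of a stall, and again at a quiet moment, should show whether dbStats itself takes minutes on a database with thousands of collections.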