node.js - MongoDB slows down every 2 hours and 10 minutes exactly


For the past 3 months, my MongoDB server has been getting slow every 2 hours and 10 minutes, very precisely.

My server configuration:

  • A replica set with 3 members, for the purpose of data backup; 1 of them has a 3600-second delay.
  • No slave servers, just the 3 masters in the replica set.
  • I use mongoose + Node.js to provide a REST API.
  • About 9 reads and 1.5 writes per second on average, according to 24 hours of statistics data.

What I did after searching Stack Overflow and Google:

  • Restarting the server does not change the slow interval of 2 hours and 10 minutes.
  • Created indexes on the fields used in queries: no impact.
  • Deleted the data files on 1 server and let it recover from another, then deleted another and recovered it back: no impact.
  • Switched the primary server: no impact.
  • Ran 'currentOp' while the database was slow: I can see a lot of queries hung there. There are too many logs to paste here, and I didn't see any abnormal query.
  • In the mongo console, checked 'serverStatus' while the database was slow: the command waits until the database has recovered.
  • No memory usage increase in the 'top' command while the database is slow.
  • REST APIs that do not access the database work well.
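
The currentOp check in the list above can be narrowed down by filtering for operations that have been running for a long time, instead of reading the full output. This is only a rough sketch: the helper name `longRunningOps` and the sample data are made up for illustration, and it assumes the `inprog` array has the usual `opid`, `active`, `op`, `ns`, `secs_running` and `waitingForLock` fields (in the mongo shell the array would come from `db.currentOp().inprog`).

```javascript
// Return the active operations that have been running for at least
// `minSeconds`, longest first. `inprog` is the array from the currentOp output.
function longRunningOps(inprog, minSeconds) {
  return inprog
    .filter((op) => op.active && (op.secs_running || 0) >= minSeconds)
    .sort((a, b) => b.secs_running - a.secs_running)
    .map(({ opid, op, ns, secs_running, waitingForLock }) => ({
      opid,
      op,
      ns,
      secs_running,
      waitingForLock,
    }));
}

// Hypothetical sample data, NOT taken from the real server.
const sampleInprog = [
  { opid: 101, active: true, op: "query", ns: "mydb.users", secs_running: 2, waitingForLock: false },
  { opid: 102, active: true, op: "command", ns: "mydb.$cmd", secs_running: 210, waitingForLock: false },
  { opid: 103, active: true, op: "query", ns: "mydb.items", secs_running: 95, waitingForLock: true },
  { opid: 104, active: false, op: "none", ns: "", waitingForLock: false },
];

const suspects = longRunningOps(sampleInprog, 30);
console.log(suspects);
```

The operation at the top of such a list is a reasonable first suspect, since everything queued behind it will show up as waiting.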

I guess there might be some locking, and a potential cause may be index building. There is something special about my database:

  • I have 14000 collections in 1 database, and the number is increasing. There may be 1 to 3000 records in 1 collection.
  • Both the number of collections and the number of records are increasing dynamically.
  • Index fields are specified when creating a new collection.

I have been obsessed with this issue for 3 months. Any comments/suggestions are highly appreciated!

Here are some logs from the log file:

fri jul 5 15:20:11.040 [conn2765] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 222694, after repl: 222694, @ end: 222694 }

fri jul 5 17:30:09.367 [conn4711] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 199498, after repl: 199498, @ end: 199528 }

fri jul 5 19:40:12.697 [conn6488] serverstatus slow: { after basic: 0, after asserts: 0, after backgroundflushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globallock: 0, after indexcounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersrepl: 0, after recordstats: 204061, after repl: 204061, @ end: 204081 }
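
As a quick sanity check on the interval, the timestamps and the `after recordstats` values from the three log lines above can be compared directly. This is a small sketch, assuming the `after ...` values are in milliseconds; the variable and function names are made up for illustration.

```javascript
// Timestamps and "after recordstats" values copied from the three log lines.
const slowEvents = [
  { time: "15:20:11.040", recordStatsMs: 222694 },
  { time: "17:30:09.367", recordStatsMs: 199498 },
  { time: "19:40:12.697", recordStatsMs: 204061 },
];

// Convert "HH:MM:SS.mmm" to milliseconds since midnight.
function toMillis(hms) {
  const [h, m, s] = hms.split(":").map(Number);
  return Math.round(((h * 60 + m) * 60 + s) * 1000);
}

const millis = slowEvents.map((e) => toMillis(e.time));

// Gap between consecutive slow events, in minutes.
const gapsMinutes = millis.slice(1).map((t, i) => (t - millis[i]) / 60000);

// How long each serverStatus call was stuck, in minutes.
const stallMinutes = slowEvents.map((e) => e.recordStatsMs / 60000);

console.log(gapsMinutes.map((g) => g.toFixed(2)));  // [ '129.97', '130.06' ]
console.log(stallMinutes.map((s) => s.toFixed(2))); // [ '3.71', '3.32', '3.40' ]
```

The gaps come out at almost exactly 130 minutes, and each stall lasts a bit more than 3 minutes.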

Here is a screenshot of the Pingdom report: the server is down for 4 minutes every 2 hours and 7 minutes. In the beginning, the server was down for 2 minutes every 2 hours and 6 minutes.

[Edit 1] More monitoring results from the host provider: CPU http://i.minus.com/izbnympzlslrr.png Disk I/O http://i.minus.com/ivgrhr0ghoz92.png Connections http://i.minus.com/itbfyq0ssmlns.png The connection count increases periodically because connections are waiting, and the current connection count accumulates until the database is unblocked. This is not because of huge traffic.

We found this specific 2:10 issue. In our case, it was the execution of dbStats by MMS. We had to upgrade the cluster and the issue got resolved.
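
To check whether dbStats is expensive on a given database, one option is to time a single call by hand. The sketch below is only an illustration: `timeDbStats` is a made-up helper, and it assumes a `db` object whose `command()` method returns a promise, as the `Db` object of recent MongoDB Node.js drivers does (with mongoose, `mongoose.connection.db` should give access to it). A stand-in object is used here so the snippet runs without a server.

```javascript
// Time a single dbStats call and report how many collections it covered.
async function timeDbStats(db) {
  const start = Date.now();
  const stats = await db.command({ dbStats: 1 });
  return { elapsedMs: Date.now() - start, collections: stats.collections };
}

// Hypothetical stand-in for a connected Db; replace it with a real
// connection to measure an actual server.
const fakeDb = {
  command: async (cmd) => ({ ok: 1, collections: 14000 }),
};

timeDbStats(fakeDb).then((result) => console.log(result));
```

If a manual call takes minutes on a database with many thousands of collections, that would be consistent with a monitoring agent triggering the stall on a fixed schedule.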

