Bug #1089

pavot.april.org outage following an oom-killer overflow in the vservers

Added by Loïc Dachary over 9 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Task
Target version:
Start date:
12/10/2012
Due date:
12/14/2012
% Done:

100%

Estimated time:
(Total: 0.00 h)
Spent time:
3.00 h (Total: 18.00 h)
Difficulty:
2 Easy

Description

Diagnosis

Just before pavot.april.org went down, the oom-killer fired:
Dec 10 18:33:51 pavot kernel: [292373.923965] Killed process amavisd-new(27372:#16)

This kill was preceded by many others on the same day. #16 means that process 27372 in vserver context 16 was killed. The cause appears to be an instability in vserver.
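
To make the `NAME(PID:#CTX)` notation concrete, here is a minimal sketch that splits such a kernel line into its three fields. The helper name `parse_killed` is an assumption made for illustration; it is not part of any vserver tooling.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): split a
# "Killed process NAME(PID:#CTX)" kernel line into its three fields.
parse_killed() {
    # Prints "NAME PID CTX" for a matching line, nothing otherwise.
    printf '%s\n' "$1" |
        sed -n 's/.*Killed process \([^(]*\)(\([0-9]*\):#\([0-9]*\)).*/\1 \2 \3/p'
}

# Example with the line quoted in the diagnosis.
parse_killed 'Dec 10 18:33:51 pavot kernel: [292373.923965] Killed process amavisd-new(27372:#16)'
```

The third field is the vserver context ID, which can be matched against the CTX column of `vserver-stat`.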

Solution

  • raise the limits in /etc/vservers/.../rlimits/rss....
    • the vservers handling spam and mail need a lot of memory: mail gets ~9GB, spamvir ~8GB
    • nginx, although not memory-hungry, exhausts its 512MB of RAM under heavy traffic, so it gets 1GB
    • amphetamine hosts redmine: it gets twice the RAM, which will only be needed occasionally, when redmine becomes memory-hungry
      The current limits are (in MB):
      /etc/vservers/amphetamine/rlimits/rss.hard : 1953
      /etc/vservers/bots/rlimits/rss.hard : 585
      /etc/vservers/dns/rlimits/rss.hard : 585
      /etc/vservers/ergine/rlimits/rss.hard : 585
      /etc/vservers/harmine/rlimits/rss.hard : 1953
      /etc/vservers/lamp/rlimits/rss.hard : 585
      /etc/vservers/mail/rlimits/rss.hard : 9375
      /etc/vservers/nginx/rlimits/rss.hard : 1171
      /etc/vservers/spamvir/rlimits/rss.hard : 7812
      total = 24604

  • add a nagios probe that alerts when the oom-killer kicks in on pavot.april.org
  • document the corrective action to take: a) find the cause, b) raise the hard limit if necessary
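
A minimal sketch of what such a nagios probe could look like, assuming the usual plugin exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN). The function name `check_oom` and the default log path are assumptions; this is not the check that was actually deployed for this ticket.

```shell
#!/bin/sh
# Sketch of a Nagios-style check. The function name and the default log
# path are assumptions (kern.log is where the OOM lines are grepped in
# this ticket).
check_oom() {
    log=${1:-/var/log/kern.log}
    if [ ! -r "$log" ]; then
        echo "UNKNOWN: cannot read $log"
        return 3
    fi
    # grep -c prints the number of matching lines; it exits 1 when that
    # number is 0, hence the "|| true".
    count=$(grep -c 'Out of memory:' "$log" || true)
    if [ "$count" -gt 0 ]; then
        echo "CRITICAL: $count oom-killer events in $log"
        return 2
    fi
    echo "OK: no oom-killer events in $log"
    return 0
}
```

Called as `check_oom /var/log/kern.log`, this stays CRITICAL until the log is rotated. A real probe would probably only look at recent lines.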

Alternatives

  • remove the limits /etc/vservers/*/rlimits/rss.*

TODO


Subtasks

Bug #1090: puppet class for oomkiller (Closed, Loïc Dachary)

Related issues

Related to Admins - Request #1065: deadline 10 and 11 December 2012 (Closed, 12/10/2012 to 12/11/2012)

Associated revisions

Revision 00fbc0b8 (diff)
Added by Loïc Dachary over 9 years ago

bind nrpe to the IP of the server instead of * to avoid a conflict between vserver guests and the vserver host refs #1089

Revision ae22ef9e (diff)
Added by Loïc Dachary over 9 years ago

use hostname instead of IP for allowed hosts nagios.vm.april-int,nagios-hetzner.vm.april-int,nagios.novalocal which will work if the reverse is properly defined refs #1089

History

#1

Updated by Loïc Dachary over 9 years ago

Dec 10 18:19:01 pavot kernel: [291484.458700] 6142458 pages non-shared
Dec 10 18:19:01 pavot kernel: [291484.458704] Out of memory: kill process dovecot(5657:#17) score 46580 or a child
Dec 10 18:19:01 pavot kernel: [291484.458751] Killed process managesieve-log(5712:#17)
#2

Updated by Loïc Dachary over 9 years ago

The last lines of syslog

Dec 10 18:33:51 pavot kernel: [292373.833415] Node 0 DMA free:15820kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB \
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15280kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB sla\
b_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable\
? yes
Dec 10 18:33:51 pavot kernel: [292373.833425] lowmem_reserve[]: 0 1970 24190 24190
Dec 10 18:33:51 pavot kernel: [292373.833428] Node 0 DMA32 free:90428kB min:1620kB low:2024kB high:2428kB active_anon:1161720kB inactive_anon:386956\
kB active_file:12kB inactive_file:96kB unevictable:0kB isolated(anon):896kB isolated(file):0kB present:2018144kB mlocked:0kB dirty:0kB writeback:52k\
B mapped:96kB shmem:12kB slab_reclaimable:2016kB slab_unreclaimable:2200kB kernel_stack:1456kB pagetables:9660kB unstable:0kB bounce:0kB writeback_t\
mp:0kB pages_scanned:3840 all_unreclaimable? no
Dec 10 18:33:51 pavot kernel: [292373.833438] lowmem_reserve[]: 0 0 22220 22220
Dec 10 18:33:51 pavot kernel: [292373.833441] Node 0 Normal free:18708kB min:18276kB low:22844kB high:27412kB active_anon:21318120kB inactive_anon:1\
522128kB active_file:1176kB inactive_file:1876kB unevictable:0kB isolated(anon):6660kB isolated(file):124kB present:22753280kB mlocked:0kB dirty:0kB\
 writeback:932kB mapped:3924kB shmem:3472kB slab_reclaimable:19536kB slab_unreclaimable:24740kB kernel_stack:2176kB pagetables:76092kB unstable:0kB \
bounce:0kB writeback_tmp:0kB pages_scanned:2656 all_unreclaimable? no
Dec 10 18:33:51 pavot kernel: [292373.833451] lowmem_reserve[]: 0 0 0 0
Dec 10 18:33:51 pavot kernel: [292373.833454] Node 0 DMA: 1*4kB 1*8kB 2*16kB 1*32kB 2*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 1582\
0kB
Dec 10 18:33:51 pavot kernel: [292373.833463] Node 0 DMA32: 13*4kB 295*8kB 355*16kB 83*32kB 41*64kB 32*128kB 29*256kB 20*512kB 14*1024kB 18*2048kB 1\
*4096kB = 90428kB
Dec 10 18:33:51 pavot kernel: [292373.833470] Node 0 Normal: 3743*4kB 7*8kB 2*16kB 0*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB \
= 18708kB
Dec 10 18:33:51 pavot kernel: [292373.833478] 17980 total pagecache pages
Dec 10 18:33:51 pavot kernel: [292373.833480] 16195 pages in swap cache
Dec 10 18:33:51 pavot kernel: [292373.833482] Swap cache stats: add 894911, delete 878716, find 311776/332827
Dec 10 18:33:51 pavot kernel: [292373.833483] Free swap  = 0kB
Dec 10 18:33:51 pavot kernel: [292373.833485] Total swap = 1951856kB
Dec 10 18:33:51 pavot kernel: [292373.923907] 6291456 pages RAM
Dec 10 18:33:51 pavot kernel: [292373.923910] 105740 pages reserved
Dec 10 18:33:51 pavot kernel: [292373.923911] 77661 pages shared
Dec 10 18:33:51 pavot kernel: [292373.923912] 6136879 pages non-shared
Dec 10 18:33:51 pavot kernel: [292373.923916] Out of memory: kill process amavisd-new(1743:#16) score 53166 or a child
Dec 10 18:33:51 pavot kernel: [292373.923965] Killed process amavisd-new(27372:#16)
Dec 10 18:33:51 pavot kernel: [292373.932594] nrpe invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
Dec 10 18:33:51 pavot kernel: [292373.932598] nrpe cpuset=/ mems_allowed=0
Dec 10 18:33:51 pavot kernel: [292373.932601] Pid: 30865, comm: nrpe Not tainted 2.6.32-bpo.3-vserver-amd64 #1
Dec 10 18:33:51 pavot kernel: [292373.932603] Call Trace:
Dec 10 18:33:51 pavot kernel: [292373.932612]  [<ffffffff810bec61>] ? oom_kill_process+0x7f/0x246
Dec 10 18:33:51 pavot kernel: [292373.932615]  [<ffffffff810bf24f>] ? __out_of_memory+0x1b4/0x1cb
Dec 10 18:33:51 pavot kernel: [292373.932618]  [<ffffffff810bf3a6>] ? out_of_memory+0x140/0x172
Dec 10 18:33:51 pavot kernel: [292373.932622]  [<ffffffff810d0d84>] ? congestion_wait+0x74/0x80
Dec 10 18:33:51 pavot kernel: [292373.932626]  [<ffffffff81065a0e>] ? autoremove_wake_function+0x0/0x2e
Dec 10 18:33:51 pavot kernel: [292373.932629]  [<ffffffff810c3031>] ? __alloc_pages_nodemask+0x4b6/0x5cd
Dec 10 18:33:51 pavot kernel: [292373.932634]  [<ffffffff812fcb5d>] ? io_schedule+0x93/0xb7
Dec 10 18:33:51 pavot kernel: [292373.932636]  [<ffffffff810c459d>] ? __do_page_cache_readahead+0x9b/0x1a4
Dec 10 18:33:51 pavot kernel: [292373.932639]  [<ffffffff81065a3c>] ? wake_bit_function+0x0/0x23
Dec 10 18:33:51 pavot kernel: [292373.932642]  [<ffffffff810c46c2>] ? ra_submit+0x1c/0x20
Dec 10 18:33:51 pavot kernel: [292373.932645]  [<ffffffff810bd3b9>] ? filemap_fault+0x183/0x2fc
Dec 10 18:33:51 pavot kernel: [292373.932648]  [<ffffffff810d3235>] ? __do_fault+0x52/0x3f1
Dec 10 18:33:51 pavot kernel: [292373.932652]  [<ffffffff810eeb9d>] ? virt_to_head_page+0x9/0x2b
Dec 10 18:33:51 pavot kernel: [292373.932655]  [<ffffffff810d55be>] ? handle_mm_fault+0x3a2/0x853
Dec 10 18:33:51 pavot kernel: [292373.932659]  [<ffffffff81243646>] ? sys_accept4+0x1ab/0x1d7
Dec 10 18:33:51 pavot kernel: [292373.932663]  [<ffffffff8103288f>] ? do_page_fault+0x266/0x282

#3

Updated by Loïc Dachary over 9 years ago

  • Assignee set to Loïc Dachary
#4

Updated by Loïc Dachary over 9 years ago

root@pavot:~# tail /etc/vservers/*/rlimits/rss.hard
==> /etc/vservers/amphetamine/rlimits/rss.hard <==
250000

==> /etc/vservers/bots/rlimits/rss.hard <==
150000

==> /etc/vservers/dns/rlimits/rss.hard <==
150000

==> /etc/vservers/ergine/rlimits/rss.hard <==
150000

==> /etc/vservers/harmine/rlimits/rss.hard <==
500000

==> /etc/vservers/lamp/rlimits/rss.hard <==
150000

==> /etc/vservers/mail/rlimits/rss.hard <==
600000

==> /etc/vservers/nginx/rlimits/rss.hard <==
150000

==> /etc/vservers/spamvir/rlimits/rss.hard <==
500000
#5

Updated by Loïc Dachary over 9 years ago

The list of processes killed by the oom-killer. #17 means the process belongs to vserver 17, as in the following list:

root@pavot:~# vserver-stat 
CTX   PROC    VSZ    RSS  userTIME   sysTIME    UPTIME NAME
10      18 597.4M  66.5M   0m04s80   0m00s71   1h08m29 nginx
11      11 349.9M  27.3M   0m01s36   0m00s21   1h09m32 bots
12      14 400.5M  61.5M   0m07s11   0m05s24   1h08m57 dns
15      32   2.5G   231M   0m04s29   0m01s22   1h09m29 lamp
16      25     1G 944.6M   2m28s87   0m12s28   1h08m29 spamvir
17     103   5.3G 760.7M   1m29s24   0m23s88   1h08m41 mail
22      67 955.9M  65.3M   0m01s76   0m00s59   1h09m32 harmine
32     126   1.8G 184.6M   2m29s84   0m15s28   1h09m32 amphetamine
50      13   844M  33.7M   0m00s16   0m00s22   1h08m46 ergine
55       7 191.5M   9.9M   1m27s68   0m09s24   1h08m27 munin

   3666:Dec 10 18:23:45 pavot kernel: [291763.066656] Out of memory: kill process dovecot(5657:#17) score 51310 or a child
   3725:Dec 10 18:24:13 pavot kernel: [291798.354352] Out of memory: kill process pickup(3808:#12) score 10029 or a child
   3785:Dec 10 18:24:13 pavot kernel: [291798.454499] Out of memory: kill process amavisd-new(1743:#16) score 53633 or a child
   3844:Dec 10 18:24:13 pavot kernel: [291798.546546] Out of memory: kill process nginx(6127:#10) score 14150 or a child
   3902:Dec 10 18:24:13 pavot kernel: [291798.663742] Out of memory: kill process dovecot(5657:#17) score 50031 or a child
   3962:Dec 10 18:24:13 pavot kernel: [291798.792552] Out of memory: kill process apache2(4868:#32) score 299647 or a child
   4070:Dec 10 18:25:19 pavot kernel: [291834.014521] Out of memory: kill process dovecot(5657:#17) score 47217 or a child
   4130:Dec 10 18:28:07 pavot kernel: [291896.799504] Out of memory: kill process dovecot(5657:#17) score 45601 or a child
   4191:Dec 10 18:28:07 pavot kernel: [291896.987174] Out of memory: kill process apache2(3677:#15) score 96865 or a child
   4297:Dec 10 18:28:07 pavot kernel: [292003.056218] Out of memory: kill process dovecot(5657:#17) score 44336 or a child
   4356:Dec 10 18:28:07 pavot kernel: [292036.155362] Out of memory: kill process apache2(4868:#32) score 244351 or a child
   4413:Dec 10 18:28:07 pavot kernel: [292041.538757] Out of memory: kill process dovecot(5657:#17) score 54319 or a child
   4472:Dec 10 18:28:53 pavot kernel: [292065.711568] Out of memory: kill process dovecot(5657:#17) score 55473 or a child
   4534:Dec 10 18:28:53 pavot kernel: [292076.071285] Out of memory: kill process apache2(4868:#32) score 354943 or a child
   4596:Dec 10 18:28:53 pavot kernel: [292076.163528] Out of memory: kill process apache2(4868:#32) score 352296 or a child
   4656:Dec 10 18:28:53 pavot kernel: [292076.255375] Out of memory: kill process apache2(3963:#22) score 343426 or a child
   4716:Dec 10 18:28:53 pavot kernel: [292076.347133] Out of memory: kill process apache2(3963:#22) score 337269 or a child
   4785:Dec 10 18:28:53 pavot kernel: [292076.438465] Out of memory: kill process apache2(3963:#22) score 337269 or a child
   4854:Dec 10 18:28:53 pavot kernel: [292076.530359] Out of memory: kill process apache2(3963:#22) score 337269 or a child
   4913:Dec 10 18:28:53 pavot kernel: [292076.622648] Out of memory: kill process amavisd-new(1743:#16) score 53334 or a child
   4968:Dec 10 18:30:19 pavot kernel: [292133.931105] Out of memory: kill process munin-graph(3841:#55) score 229092 or a child
   5031:Dec 10 18:30:19 pavot kernel: [292134.126113] Out of memory: kill process apache2(4868:#32) score 181441 or a child
   5094:Dec 10 18:30:19 pavot kernel: [292134.329241] Out of memory: kill process postgres(3831:#32) score 26692 or a child
   5153:Dec 10 18:32:36 pavot kernel: [292310.075645] Out of memory: kill process apache2(5319:#17) score 105124 or a child
   5211:Dec 10 18:32:36 pavot kernel: [292310.768597] Out of memory: kill process apache2(3963:#22) score 177187 or a child
   5270:Dec 10 18:32:36 pavot kernel: [292311.643728] Out of memory: kill process apache2(3677:#15) score 89124 or a child
   5326:Dec 10 18:32:40 pavot kernel: [292316.918963] Out of memory: kill process apache2(5319:#17) score 100755 or a child
   5386:Dec 10 18:33:51 pavot kernel: [292369.434667] Out of memory: kill process apache2(4868:#32) score 299870 or a child
   5446:Dec 10 18:33:51 pavot kernel: [292369.831159] Out of memory: kill process apache2(5319:#17) score 113158 or a child
   5505:Dec 10 18:33:51 pavot kernel: [292373.730348] Out of memory: kill process apache2(3677:#15) score 81351 or a child
   5566:Dec 10 18:33:51 pavot kernel: [292373.923916] Out of memory: kill process amavisd-new(1743:#16) score 53166 or a child

#6

Updated by Loïc Dachary over 9 years ago

http://linux-vserver.org/Memory_Limits : a page is 4KiB

root@pavot:~# echo 'int main () { printf ("%dKiB\n", getpagesize ()/1024); return 0; }' | gcc -xc - -o getpagesize && ./getpagesize
<stdin>: In function ‘main’:
<stdin>:1: warning: incompatible implicit declaration of built-in function ‘printf’
*4KiB*
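
Since the rss limits are expressed in pages, a small conversion helper makes the values easier to read. The sketch below assumes the 4 KiB page size measured above; `pages_to_mib` is a hypothetical name. On most systems `getconf PAGESIZE` also reports the page size, in bytes.

```shell
#!/bin/sh
# pages_to_mib is a hypothetical helper name. The page size defaults to
# the 4 KiB measured above and can be overridden with a second argument.
pages_to_mib() {
    pages=$1
    page_kib=${2:-4}
    # Integer arithmetic: the result is truncated, as with expr.
    echo $(( pages * page_kib / 1024 ))
}

pages_to_mib 250000   # amphetamine rss.hard, in pages
pages_to_mib 600000   # mail rss.hard, in pages
```

With the original limits this prints 976 and 2343.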

#7

Updated by Loïc Dachary over 9 years ago

root@pavot:~# for limit in /etc/vservers/*/rlimits/rss.* ; do echo -n "$limit : " ; expr $(cat $limit) \* 4 / 1024 ; done
/etc/vservers/amphetamine/rlimits/rss.hard : 976
/etc/vservers/amphetamine/rlimits/rss.soft : 683
/etc/vservers/bots/rlimits/rss.hard : 585
/etc/vservers/bots/rlimits/rss.soft : 488
/etc/vservers/dns/rlimits/rss.hard : 585
/etc/vservers/dns/rlimits/rss.soft : 488
/etc/vservers/ergine/rlimits/rss.hard : 585
/etc/vservers/ergine/rlimits/rss.soft : 488
/etc/vservers/harmine/rlimits/rss.hard : 1953
/etc/vservers/harmine/rlimits/rss.soft : 1562
/etc/vservers/lamp/rlimits/rss.hard : 585
/etc/vservers/lamp/rlimits/rss.soft : 488
/etc/vservers/mail/rlimits/rss.hard : 2343
/etc/vservers/mail/rlimits/rss.soft : 1953
/etc/vservers/nginx/rlimits/rss.hard : 585
/etc/vservers/nginx/rlimits/rss.soft : 488
/etc/vservers/spamvir/rlimits/rss.hard : 1953
/etc/vservers/spamvir/rlimits/rss.soft : 1562
#8

Updated by Vincent-Xavier JUMEL over 9 years ago

The last action I performed is logged in #1087

#9

Updated by Loïc Dachary over 9 years ago

root@pavot:~# total=0 ; for limit in /etc/vservers/*/rlimits/rss.hard ; do echo -n "$limit : " ; i=$(expr $(cat $limit) \* 4 / 1024) ; echo $i ; tot\
al=$(expr $total + $i) ; done ; echo total = $total
/etc/vservers/amphetamine/rlimits/rss.hard : 976
/etc/vservers/bots/rlimits/rss.hard : 585
/etc/vservers/dns/rlimits/rss.hard : 585
/etc/vservers/ergine/rlimits/rss.hard : 585
/etc/vservers/harmine/rlimits/rss.hard : 1953
/etc/vservers/lamp/rlimits/rss.hard : 585
/etc/vservers/mail/rlimits/rss.hard : 2343
/etc/vservers/nginx/rlimits/rss.hard : 585
/etc/vservers/spamvir/rlimits/rss.hard : 1953
total = 10150
#10

Updated by Loïc Dachary over 9 years ago

commit de62446e48f4b5ae8e1b553dbff4561b1f0093ce
Author: Loic Dachary <loic@dachary.org>
Date:   Mon Dec 10 20:59:03 2012 +0100

    the vservers handling spam and mail need a lot of memory: mail gets ~9GB, spamvir ~8GB

commit 2b47b7ebb9ae7a5ab1caa129f78e430427cfe89e
Author: Loic Dachary <loic@dachary.org>
Date:   Mon Dec 10 20:52:55 2012 +0100

    nginx, although not memory-hungry, exhausts its 512MB of RAM under heavy traffic, so it gets 1GB

commit a97a222942168bda67c3e5260006a9cf143bf5a3
Author: Loic Dachary <loic@dachary.org>
Date:   Mon Dec 10 20:46:22 2012 +0100

    amphetamine hosts redmine: it gets twice the RAM, which will only be needed occasionally, when redmine becomes memory-hungry

root@pavot:/etc/vservers# total=0 ; for limit in /etc/vservers/*/rlimits/rss.hard ; do echo -n "$limit : " ; i=$(expr $(cat $limit) \* 4 / 1024) ; e\
cho $i ; total=$(expr $total + $i) ; done ; echo total = $total
/etc/vservers/amphetamine/rlimits/rss.hard : 1953
/etc/vservers/bots/rlimits/rss.hard : 585
/etc/vservers/dns/rlimits/rss.hard : 585
/etc/vservers/ergine/rlimits/rss.hard : 585
/etc/vservers/harmine/rlimits/rss.hard : 1953
/etc/vservers/lamp/rlimits/rss.hard : 585
/etc/vservers/mail/rlimits/rss.hard : 9375
/etc/vservers/nginx/rlimits/rss.hard : 1171
/etc/vservers/spamvir/rlimits/rss.hard : 7812
total = 24604
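
As a sanity check, the new total can be compared with the "6291456 pages RAM" line from the syslog excerpt in #2. The sketch below assumes 4 KiB pages; `overcommit_mib` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: difference between the sum of the hard limits
# (in MiB) and the physical RAM reported in pages by the kernel log.
overcommit_mib() {
    total_limits_mib=$1
    ram_pages=$2
    ram_mib=$(( ram_pages * 4 / 1024 ))   # assumes 4 KiB pages
    echo $(( total_limits_mib - ram_mib ))
}

# 24604 MiB of limits against 6291456 pages of RAM (24576 MiB).
overcommit_mib 24604 6291456
```

A positive result means the hard limits add up to more than physical RAM. Taken on their own, they would then not stop the host from running out of memory if every guest reached its ceiling at once.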
#11

Updated by theo _ over 9 years ago

root@pavot:/var/log# grep 'Out of memory:' kern.log
Dec 10 11:16:42 pavot kernel: [266160.089326] Out of memory: kill process master(5755:#17) score 458028 or a child
Dec 10 11:16:42 pavot kernel: [266160.195420] Out of memory: kill process master(5755:#17) score 458028 or a child
Dec 10 11:16:42 pavot kernel: [266160.300382] Out of memory: kill process master(5755:#17) score 458028 or a child
Dec 10 11:16:43 pavot kernel: [266160.548650] Out of memory: kill process master(5755:#17) score 461412 or a child
Dec 10 11:16:43 pavot kernel: [266160.652553] Out of memory: kill process master(5755:#17) score 461412 or a child
Dec 10 11:16:43 pavot kernel: [266160.787239] Out of memory: kill process master(5755:#17) score 460143 or a child
Dec 10 11:16:43 pavot kernel: [266160.891264] Out of memory: kill process master(5755:#17) score 460143 or a child
Dec 10 11:16:43 pavot kernel: [266160.994435] Out of memory: kill process master(5755:#17) score 460143 or a child
Dec 10 11:16:43 pavot kernel: [266161.098702] Out of memory: kill process master(5755:#17) score 458490 or a child
Dec 10 11:16:43 pavot kernel: [266161.145094] Out of memory: kill process master(5755:#17) score 460164 or a child
Dec 10 11:16:43 pavot kernel: [266161.230687] Out of memory: kill process master(5755:#17) score 460164 or a child
Dec 10 11:16:44 pavot kernel: [266161.543390] Out of memory: kill process master(5755:#17) score 462288 or a child
Dec 10 11:16:45 pavot kernel: [266162.910507] Out of memory: kill process master(5755:#17) score 464494 or a child
Dec 10 11:16:45 pavot kernel: [266162.910537] Out of memory: kill process master(5755:#17) score 464494 or a child
Dec 10 11:16:45 pavot kernel: [266162.910545] Out of memory: kill process master(5755:#17) score 464494 or a child
Dec 10 11:16:45 pavot kernel: [266162.922710] Out of memory: kill process master(5755:#17) score 464564 or a child
Dec 10 11:16:45 pavot kernel: [266162.922717] Out of memory: kill process master(5755:#17) score 464535 or a child
Dec 10 11:16:45 pavot kernel: [266162.922863] Out of memory: kill process master(5755:#17) score 464563 or a child
Dec 10 11:16:45 pavot kernel: [266162.929563] Out of memory: kill process master(5755:#17) score 464489 or a child
Dec 10 11:16:45 pavot kernel: [266162.929703] Out of memory: kill process master(5755:#17) score 464491 or a child
Dec 10 11:16:45 pavot kernel: [266162.929796] Out of memory: kill process master(5755:#17) score 464491 or a child
Dec 10 11:16:45 pavot kernel: [266162.971394] Out of memory: kill process master(5755:#17) score 465879 or a child
Dec 10 11:16:45 pavot kernel: [266162.971399] Out of memory: kill process master(5755:#17) score 465879 or a child
Dec 10 11:16:45 pavot kernel: [266162.971480] Out of memory: kill process master(5755:#17) score 465879 or a child
Dec 10 11:16:45 pavot kernel: [266163.255175] Out of memory: kill process master(5755:#17) score 466756 or a child
Dec 10 18:09:49 pavot kernel: [290943.102890] Out of memory: kill process mysqld_safe(5430:#17) score 340609 or a child
Dec 10 18:09:49 pavot kernel: [290943.199312] Out of memory: kill process apache2(4868:#32) score 621366 or a child
Dec 10 18:09:49 pavot kernel: [290943.314867] Out of memory: kill process apache2(4868:#32) score 618719 or a child
Dec 10 18:09:49 pavot kernel: [290943.719573] Out of memory: kill process apache2(4868:#32) score 506594 or a child
Dec 10 18:11:57 pavot kernel: [291070.159233] Out of memory: kill process apache2(4868:#32) score 356671 or a child
Dec 10 18:12:34 pavot kernel: [291100.534839] Out of memory: kill process apache2(5319:#17) score 90063 or a child
Dec 10 18:15:23 pavot kernel: [291276.670508] Out of memory: kill process mysqld_safe(5430:#17) score 100187 or a child
Dec 10 18:16:12 pavot kernel: [291327.995421] Out of memory: kill process amavisd-new(1743:#16) score 54035 or a child
Dec 10 18:16:12 pavot kernel: [291328.086800] Out of memory: kill process apache2(5319:#17) score 68960 or a child
Dec 10 18:16:36 pavot kernel: [291346.094843] Out of memory: kill process apache2(3963:#22) score 343426 or a child
Dec 10 18:16:36 pavot kernel: [291346.190348] Out of memory: kill process apache2(3677:#15) score 76418 or a child
Dec 10 18:18:04 pavot kernel: [291391.063416] Out of memory: kill process nginx(6127:#10) score 14218 or a child
Dec 10 18:18:04 pavot kernel: [291417.127648] Out of memory: kill process dovecot(5657:#17) score 58585 or a child
Dec 10 18:18:04 pavot kernel: [291417.230963] Out of memory: kill process nginx(6127:#10) score 13033 or a child
Dec 10 18:18:04 pavot kernel: [291417.526503] Out of memory: kill process apache2(3677:#15) score 75908 or a child
Dec 10 18:19:01 pavot kernel: [291483.850692] Out of memory: kill process nginx(6127:#10) score 14188 or a child
Dec 10 18:19:01 pavot kernel: [291484.458704] Out of memory: kill process dovecot(5657:#17) score 46580 or a child
Dec 10 18:22:05 pavot kernel: [291632.392359] Out of memory: kill process pickup(24841:#12) score 10029 or a child
Dec 10 18:22:05 pavot kernel: [291632.490404] Out of memory: kill process apache2(3677:#15) score 104582 or a child
Dec 10 18:23:45 pavot kernel: [291745.659508] Out of memory: kill process apache2(4868:#32) score 473149 or a child
Dec 10 18:23:45 pavot kernel: [291745.754892] Out of memory: kill process apache2(4868:#32) score 470502 or a child
Dec 10 18:23:45 pavot kernel: [291745.850265] Out of memory: kill process dovecot(5657:#17) score 54191 or a child
Dec 10 18:23:45 pavot kernel: [291753.219565] Out of memory: kill process pickup(3787:#12) score 10029 or a child
Dec 10 18:23:45 pavot kernel: [291753.506827] Out of memory: kill process dovecot(5657:#17) score 53038 or a child
Dec 10 18:23:45 pavot kernel: [291763.066656] Out of memory: kill process dovecot(5657:#17) score 51310 or a child
Dec 10 18:24:13 pavot kernel: [291798.354352] Out of memory: kill process pickup(3808:#12) score 10029 or a child
Dec 10 18:24:13 pavot kernel: [291798.454499] Out of memory: kill process amavisd-new(1743:#16) score 53633 or a child
Dec 10 18:24:13 pavot kernel: [291798.546546] Out of memory: kill process nginx(6127:#10) score 14150 or a child
Dec 10 18:24:13 pavot kernel: [291798.663742] Out of memory: kill process dovecot(5657:#17) score 50031 or a child
Dec 10 18:24:13 pavot kernel: [291798.792552] Out of memory: kill process apache2(4868:#32) score 299647 or a child
Dec 10 18:25:19 pavot kernel: [291834.014521] Out of memory: kill process dovecot(5657:#17) score 47217 or a child
Dec 10 18:28:07 pavot kernel: [291896.799504] Out of memory: kill process dovecot(5657:#17) score 45601 or a child
Dec 10 18:28:07 pavot kernel: [291896.987174] Out of memory: kill process apache2(3677:#15) score 96865 or a child
Dec 10 18:28:07 pavot kernel: [292003.056218] Out of memory: kill process dovecot(5657:#17) score 44336 or a child
Dec 10 18:28:07 pavot kernel: [292036.155362] Out of memory: kill process apache2(4868:#32) score 244351 or a child
Dec 10 18:28:07 pavot kernel: [292041.538757] Out of memory: kill process dovecot(5657:#17) score 54319 or a child
Dec 10 18:28:53 pavot kernel: [292065.711568] Out of memory: kill process dovecot(5657:#17) score 55473 or a child
Dec 10 18:28:53 pavot kernel: [292076.071285] Out of memory: kill process apache2(4868:#32) score 354943 or a child
Dec 10 18:28:53 pavot kernel: [292076.163528] Out of memory: kill process apache2(4868:#32) score 352296 or a child
Dec 10 18:28:53 pavot kernel: [292076.255375] Out of memory: kill process apache2(3963:#22) score 343426 or a child
Dec 10 18:28:53 pavot kernel: [292076.347133] Out of memory: kill process apache2(3963:#22) score 337269 or a child
Dec 10 18:28:53 pavot kernel: [292076.438465] Out of memory: kill process apache2(3963:#22) score 337269 or a child
Dec 10 18:28:53 pavot kernel: [292076.530359] Out of memory: kill process apache2(3963:#22) score 337269 or a child
Dec 10 18:28:53 pavot kernel: [292076.622648] Out of memory: kill process amavisd-new(1743:#16) score 53334 or a child
Dec 10 18:30:19 pavot kernel: [292133.931105] Out of memory: kill process munin-graph(3841:#55) score 229092 or a child
Dec 10 18:30:19 pavot kernel: [292134.126113] Out of memory: kill process apache2(4868:#32) score 181441 or a child
Dec 10 18:30:19 pavot kernel: [292134.329241] Out of memory: kill process postgres(3831:#32) score 26692 or a child
Dec 10 18:32:36 pavot kernel: [292310.075645] Out of memory: kill process apache2(5319:#17) score 105124 or a child
Dec 10 18:32:36 pavot kernel: [292310.768597] Out of memory: kill process apache2(3963:#22) score 177187 or a child
Dec 10 18:32:36 pavot kernel: [292311.643728] Out of memory: kill process apache2(3677:#15) score 89124 or a child
Dec 10 18:32:40 pavot kernel: [292316.918963] Out of memory: kill process apache2(5319:#17) score 100755 or a child
Dec 10 18:33:51 pavot kernel: [292369.434667] Out of memory: kill process apache2(4868:#32) score 299870 or a child
Dec 10 18:33:51 pavot kernel: [292369.831159] Out of memory: kill process apache2(5319:#17) score 113158 or a child
Dec 10 18:33:51 pavot kernel: [292373.730348] Out of memory: kill process apache2(3677:#15) score 81351 or a child
Dec 10 18:33:51 pavot kernel: [292373.923916] Out of memory: kill process amavisd-new(1743:#16) score 53166 or a child
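
To see which guests are hit most often, the lines above can be tallied per vserver context. This is a minimal sketch that reads kernel log lines on stdin; `oom_per_context` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: count "Out of memory" lines per vserver context.
# Reads kernel log lines on stdin and prints "CTX COUNT", sorted by context.
oom_per_context() {
    sed -n 's/.*Out of memory: kill process [^(]*([0-9]*:#\([0-9]*\)).*/\1/p' |
        sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

For example, `oom_per_context < /var/log/kern.log` prints one line per context ID, which can then be matched against the CTX column of `vserver-stat`.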
#12

Updated by Loïc Dachary over 9 years ago

  • Subject changed from pavot.april.org outage to pavot.april.org outage following an oom-killer overflow in the vservers
#13

Updated by theo _ over 9 years ago

root@pavot:/etc/vservers# for dir in *; do [ -f $dir/rlimits/rss.hard ] && echo /usr/sbin/vlimit -c $(vserver-stat|grep $dir|awk '{print $1}') --rss $(cat $dir/rlimits/rss.hard); done
/usr/sbin/vlimit -c 32 --rss 500000
/usr/sbin/vlimit -c 11 --rss 150000
/usr/sbin/vlimit -c 12 --rss 150000
/usr/sbin/vlimit -c 50 --rss 150000
/usr/sbin/vlimit -c 22 --rss 500000
/usr/sbin/vlimit -c 15 --rss 150000
/usr/sbin/vlimit -c 17 --rss 2400000
/usr/sbin/vlimit -c 10 --rss 300000
/usr/sbin/vlimit -c 16 --rss 2000000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 32 --rss 500000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 11 --rss 150000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 12 --rss 150000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 50 --rss 150000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 22 --rss 500000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 15 --rss 150000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 17 --rss 2400000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 10 --rss 300000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 16 --rss 2000000
#14

Updated by theo _ over 9 years ago

root@pavot:/etc/vservers# for dir in *; do [ -f $dir/rlimits/rss.soft ] && echo /usr/sbin/vlimit -c $(vserver-stat|grep $dir|awk '{print $1}') -S --rss $(cat $dir/rlimits/rss.soft); done
/usr/sbin/vlimit -c 32 -S --rss 350000
/usr/sbin/vlimit -c 11 -S --rss 125000
/usr/sbin/vlimit -c 12 -S --rss 125000
/usr/sbin/vlimit -c 50 -S --rss 125000
/usr/sbin/vlimit -c 22 -S --rss 400000
/usr/sbin/vlimit -c 15 -S --rss 125000
/usr/sbin/vlimit -c 17 -S --rss 2000000
/usr/sbin/vlimit -c 10 -S --rss 250000
/usr/sbin/vlimit -c 16 -S --rss 1600000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 32 -S --rss 350000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 11 -S --rss 125000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 12 -S --rss 125000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 50 -S --rss 125000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 22 -S --rss 400000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 15 -S --rss 125000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 17 -S --rss 2000000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 10 -S --rss 250000
root@pavot:/etc/vservers# /usr/sbin/vlimit -c 16 -S --rss 1600000
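
The loops above find the context ID with `vserver-stat|grep $dir`, which would also match a guest whose name merely contains the string. A slightly stricter lookup is sketched below, assuming the `vserver-stat` column layout shown in #5; `ctx_for_name` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: look up the context ID of a guest by exact name,
# reading `vserver-stat` output on stdin. Comparing the last column (NAME)
# exactly avoids substring matches; NR > 1 skips the header line.
ctx_for_name() {
    awk -v name="$1" 'NR > 1 && $NF == name { print $1 }'
}

# Example on two lines taken from the vserver-stat output in #5.
ctx_for_name mail <<'EOF'
CTX   PROC    VSZ    RSS  userTIME   sysTIME    UPTIME NAME
16      25     1G 944.6M   2m28s87   0m12s28   1h08m29 spamvir
17     103   5.3G 760.7M   1m29s24   0m23s88   1h08m41 mail
EOF
```

In the loops above it would replace the `grep $dir|awk '{print $1}'` part, as in `vserver-stat | ctx_for_name "$dir"`.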

#15

Updated by Loïc Dachary over 9 years ago

  • Status changed from In progress to Resolved

The nagios alert is installed

#16

Updated by Loïc Dachary over 9 years ago

  • Status changed from Resolved to In progress

nrpe does not start because pavot has no IP in the VPN. It should have 192.168.2.254, as set in /etc/network/interfaces, but it has 192.168.1.254 in /etc/hosts. Since pavot serves as the VPN gateway, the discrepancy must be understood before changing anything, otherwise the connection could be lost.

#17

Updated by Loïc Dachary over 9 years ago

  • Status changed from In progress to Resolved

pavot's IP is 192.168.2.254; having 192.168.1.254 was a mistake

#18

Updated by Quentin Gibeaux about 3 years ago

  • Status changed from Resolved to Closed
