memory leak?

For those who cannot wait for the official releases, we'll occasionally post test releases here. This includes the NV+/NV/Duo/1100/1000/X6/600 models.
WARNING: use at your own risk!

Moderator: chirpa

Postby warewolf » Fri Jan 18, 2008 12:05 pm

I'm not sure where to post this, but I think I have a memory leak somewhere in kernel space in this "final beta" release of the ReadyNAS OS.

I'm no stranger to Linux, but from what I can tell it's not an application memory leak, but a kernel memory leak.

Specs:

ReadyNAS NV+ running RAIDiator 4.00c1-p2, 1 GB of memory (tested twice), 4x1TB disks.
Addons added: ssh, root ssh, apt

The kernel runs out of memory, then invokes the OOM killer and kills off nearly everything. The first time the box died I had to hit the power button; it was completely unmanageable via the network (ssh and https were both no go, since the processes had been killed).
warewolf
ReadyNAS Newbie
 
Posts: 25
Joined: Fri Jan 18, 2008 10:30 am

Postby chirpa » Fri Jan 18, 2008 12:09 pm

Please send us your logs, instructions in my signature below.
chirpa
Jedi Council
 
Posts: 11174
Joined: Mon Sep 24, 2007 11:52 am
Location: T.A.R.D.I.S.
ReadyNAS: Repertoire

Postby warewolf » Fri Jan 18, 2008 12:57 pm

chirpa wrote:Please send us your logs, instructions in my signature below.


sent, Subject: ATTN: chirpa (memory leak) from warewolf

Postby Skywalker » Fri Jan 18, 2008 1:11 pm

Are you running SlimServer? I've known SlimServer to leak memory in the past (like a lot of other big perl programs tend to do).
Skywalker
Jedi Council
 
Posts: 2763
Joined: Fri Nov 19, 2004 10:47 am
Location: Fremont, CA
ReadyNAS: NV

Postby warewolf » Fri Jan 18, 2008 1:32 pm

Skywalker wrote:Are you running SlimServer? I've known SlimServer to leak memory in the past (like a lot of other big perl programs tend to do).


I have all the streaming services turned off, because I have nothing that supports them :(

Postby chirpa » Fri Jan 18, 2008 3:29 pm

There are lines in your system.log that I have not seen on any other ReadyNAS. Since you have SSH enabled and are running custom programs, I can only suggest you stop running those and see if the system behaves.

Have you looked into how Munin (your CPU/memory monitoring graphs) could affect performance, or at what other custom apps are running?

Without Slim and the other streaming services running, you should not be down to 16 MB free out of 1024 MB; that is way off.

Postby warewolf » Fri Jan 18, 2008 10:42 pm

Munin is taking up extremely little memory, and it's the only resident daemon process I added beyond what comes Out Of The Box[tm]. The thing that confuses me is that munin calculates its "apps" value this way:
Code: Select all
print "apps.value ", $mems{'MemTotal'}
        -$mems{'MemFree'}
        -$mems{'Buffers'}
        -$mems{'Cached'}
        -$mems{'SwapCached'}
        -$mems{'Slab'}
        -$mems{'PageTables'}
        -$mems{'VmallocUsed'}
        ,"\n";


And I can't find a way to invert that calculation, to add up stuff to equal the same value.
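
For what it's worth, the formula above reads as a residual rather than a sum: "apps" is whatever remains of MemTotal after every category munin plots separately has been subtracted, which would explain why nothing adds up to it directly. A minimal Python sketch of the same arithmetic, assuming Python 3 is available (on the NAS, or on a machine holding a saved copy of /proc/meminfo); the function names are made up for illustration and are not part of munin:

```python
# Sketch: recompute munin's "apps" figure from /proc/meminfo text.
# The field names are the ones used in the Perl fragment above;
# parse_meminfo, apps_kb and breakdown are hypothetical helper names.

SUBTRACTED = ("MemFree", "Buffers", "Cached", "SwapCached",
              "Slab", "PageTables", "VmallocUsed")


def parse_meminfo(text):
    """Return {field: value} for every 'Name:   value [kB]' line."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def apps_kb(fields):
    """MemTotal minus every category that is accounted for separately."""
    return fields["MemTotal"] - sum(fields.get(k, 0) for k in SUBTRACTED)


def breakdown(fields):
    """Subtracted categories, largest first, followed by the residual."""
    rows = sorted(((k, fields.get(k, 0)) for k in SUBTRACTED),
                  key=lambda kv: kv[1], reverse=True)
    rows.append(("apps (residual)", apps_kb(fields)))
    return rows
```

Feeding it the contents of /proc/meminfo, e.g. breakdown(parse_meminfo(open("/proc/meminfo").read())), lists which category is holding the memory; a large Slab row would point at the kernel rather than at any process.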

What do you suggest I do? A firmware reinstall? I honestly don't believe anything I've done to the OS is the direct cause of this; I don't have any processes showing up as memory hogs. I'm almost certain that I've managed to tickle some kernel bug over time.

Speaking of kernel bugs, where's the kernel source (and any modification patches) for 2.6.17.8ReadyNAS? I'd like to see if I can figure out what's going on, but right now I'm poking in the dark.

Postby Skywalker » Tue Jan 22, 2008 3:05 pm

warewolf wrote:Speaking of kernel bugs, where's the kernel source (and any modification patches) for 2.6.17.8ReadyNAS? I'd like to see if I can figure out what's going on, but this is poking in the dark.

Good luck finding the problem. I haven't seen the ReadyNAS kernel leak memory before, but you can grab the source code for the latest kernel from here.

Postby warewolf » Sat Jan 26, 2008 11:22 am

Updates:

I'm finding that kernels before 2.6.19 have had VM leakage issues, relating to prefetching (Read-Ahead) of data from disk.

I noticed you guys have kernel profiling turned on, any chance I can get the System.map file for
Code: Select all
Linux readynas 2.6.17.8ReadyNAS #1 Mon Dec 17 19:35:18 PST 2007 padre unknown


?

Postby warewolf » Mon Jan 28, 2008 12:14 am

Okay: data collected, and now I present a challenge!

I've intentionally terminated nearly every process on my ReadyNAS in an attempt to free as much memory as possible. But free(1) still shows 125184 kB in use. My question is: where is it?
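
One way to approach the question, sketched in Python under the assumption that Python 3 can be run on the box: add up the resident set size (the VmRSS line in /proc/<pid>/status) of every remaining process and compare the total with what free(1) reports as used. The helper names are hypothetical. Because shared pages are counted once per process, the sum overestimates userspace memory, so if even that total falls far short of the used figure, the remainder is being held by the kernel or its caches.

```python
import os


def rss_kb_from_status(text):
    """Return VmRSS in kB from /proc/<pid>/status text.

    Kernel threads have no VmRSS line, so they count as 0.
    """
    for line in text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0


def total_rss_kb(proc_root="/proc"):
    """Sum VmRSS over every numeric (per-process) directory in proc_root."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                total += rss_kb_from_status(fh.read())
        except OSError:
            continue  # the process exited while we were scanning
    return total
```

Calling total_rss_kb() with no arguments scans the live /proc tree; passing another directory lets it run against a saved copy.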

I've uploaded all the relevant data to my website, and I would greatly appreciate someone at Infrant/Netgear taking a look at it. There's obviously something very wrong with the 4.0 beta release, and unfortunately I can't roll back to an earlier release because I have 4x1TB drives.

http://www.richardharman.com/readynas/

My own investigation would be less hindered if I could compile a new kernel for the ReadyNAS, warranty void or not.

Postby ryanrk » Tue Jan 29, 2008 5:30 pm

warewolf wrote:I'm not sure where to post this, but I think I have a memory leak somewhere in kernel space in this "final beta" release of the ReadyNAS OS.

I'm no stranger to Linux, but from what I can tell it's not an application memory leak, but a kernel memory leak.

Specs:

ReadyNAS NV+ running RAIDiator 4.00c1-p2, 1 GB of memory (tested twice), 4x1TB disks.
Addons added: ssh, root ssh, apt

The kernel runs out of memory, then invokes the OOM killer and kills off nearly everything. The first time the box died I had to hit the power button; it was completely unmanageable via the network (ssh and https were both no go, since the processes had been killed).


This has happened to me too, about a week ago. I had no access to the box and had to hit the power button; in fact, the power button didn't seem to work, so I had to unplug it. At the time I also suspected a memory leak, but I wanted to wait and see how long it takes to get back to that state, in case it's time-based.
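
To check whether it is time-based, memory could be sampled at a fixed interval and the trend examined afterwards. A rough Python sketch, assuming Python 3 on the NAS and a writable log path; the field selection, the function names, and the 300-second default are arbitrary choices for illustration:

```python
import time

# Columns written after the timestamp, all in kB as reported by the kernel.
FIELDS = ("MemFree", "Buffers", "Cached", "Slab")


def sample_row(meminfo_text, timestamp):
    """Build one CSV row: integer timestamp, then each field in FIELDS."""
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and name in FIELDS and parts:
            values[name] = parts[0]
    # A field missing from the input is left as an empty column.
    return ",".join([str(int(timestamp))] + [values.get(f, "") for f in FIELDS])


def log_samples(path, interval_s=300, count=None):
    """Append one row per interval; count=None means run until killed."""
    taken = 0
    while count is None or taken < count:
        with open("/proc/meminfo") as src, open(path, "a") as out:
            out.write(sample_row(src.read(), time.time()) + "\n")
        taken += 1
        if count is None or taken < count:
            time.sleep(interval_s)
```

Left running from boot, the resulting file would show whether MemFree drains steadily and whether Slab grows along with it, which would hint at growth on the kernel side rather than in any one application.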
ryanrk
ReadyNAS Newbie
 
Posts: 30
Joined: Fri Sep 08, 2006 2:09 pm

Found it - we need a kernel revision bump

Postby warewolf » Wed Jan 30, 2008 11:39 am

http://kerneltrap.org/mailarchive/linux ... /26/164963

The kernel on the ReadyNAS, 2.6.17.8, is way out of date and, according to the kernel developers, has known memory leaks.

Can we get another beta release with a kernel >= 2.6.22.5?

Edit: Independent verification from a kernel developer I located on IRC:

Code: Select all
13:38 <warewolf> sweet, slabtop works like a champ
13:39 <warewolf>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
13:39 <warewolf>  24480  23899  97%    0.05K     90      272      1440K buffer_head
13:39 <warewolf>  10112  10106  99%    0.97K    632       16     10112K task_struct
13:39 <warewolf> looks like task_struct is leaking?
13:40 <jdike> there aren't 10K processes on the system?
13:40 <warewolf> there's 74.
13:40 <warewolf> # ps auwwwx  | wc -l 74
13:40 <jdike> yeah
13:40 <jdike> 10106 would be a bit excessive


Okay guys I've done the work for you guys, and identified the problem. What next?
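
The check from the IRC log (objects in the task_struct slab cache versus tasks that actually exist) can be repeated without slabtop. A Python sketch, assuming Python 3, root access (reading /proc/slabinfo normally requires it), and the slabinfo layout in which the second column is the active-object count; the function names are made up here:

```python
import os


def slab_active_objects(slabinfo_text, cache="task_struct"):
    """Return the active-object count for one slab cache, or None if absent."""
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if len(fields) > 1 and fields[0] == cache:
            return int(fields[1])
    return None


def count_tasks(proc_root="/proc"):
    """Count threads via /proc/<pid>/task; each thread owns one task_struct."""
    total = 0
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue
        try:
            total += len(os.listdir(os.path.join(proc_root, pid, "task")))
        except OSError:
            continue  # the process exited while we were scanning
    return total


def leak_ratio(active_objects, tasks):
    """Slab objects per live task; a healthy system should sit close to 1."""
    return active_objects / tasks if tasks else float("inf")
```

With the figures quoted above (10106 active objects against 74 lines of ps output), the ratio comes out at well over a hundred task_struct objects per process, which is the mismatch jdike was reacting to.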

Postby Skywalker » Wed Jan 30, 2008 12:59 pm

You linked to a very generic, blanket statement that there were memory leaks fixed between 2.6.20 and 2.6.22.5. It doesn't refer to architecture, or usage, or anything else. I didn't see anything relevant in the kernel changelog, so you can't just assume that there's a kernel memory leak in the context of its use on the ReadyNAS because some guy on LKML said some memory leaks were fixed. Also, we don't use an off-the-shelf processor. There is no upstream support for our chip, so all kernel porting has to be done in-house, and it's no small task, as you can probably see by doing a diff between the stock upstream 2.6.17.8 and our kernel source code.
Now, if there is a reproducible memory leak in the kernel, we're more than happy to fix it. And if you can give us enough information to reproduce the issue here, that would be wonderful. But we haven't been able to reproduce your issue here yet, and we've had systems up and running for well over a month at a time. Here's some data from a normal bootup with 1GB RAM.
Code: Select all
nas-01-0E-1C:~# more /proc/meminfo
MemTotal:      1010720 kB
MemFree:        957312 kB
Buffers:         14080 kB
Cached:          37392 kB
SwapCached:          0 kB
Active:          37504 kB
Inactive:        21520 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      1010720 kB
LowFree:        957312 kB
SwapTotal:      255968 kB
SwapFree:       255968 kB
Dirty:              32 kB
Writeback:           0 kB
Mapped:          14400 kB
Slab:             5040 kB
CommitLimit:    862400 kB
Committed_AS:    21424 kB
PageTables:          0 kB
VmallocTotal:   131008 kB
VmallocUsed:      1056 kB
VmallocChunk:   129408 kB

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root         1  0.4  0.0  2000  880 ?        Ss   09:32   0:04 init [3] 
root         2  0.0  0.0     0    0 ?        RN   09:32   0:00 [ksoftirqd/0]
root         3  0.0  0.0     0    0 ?        S<   09:32   0:00 [events/0]
root         4  0.0  0.0     0    0 ?        S<   09:32   0:00 [khelper]
root         5  0.0  0.0     0    0 ?        S<   09:32   0:00 [kthread]
root        10  0.0  0.0     0    0 ?        S<   09:32   0:00 [kblockd/0]
root        13  0.0  0.0     0    0 ?        S<   09:32   0:00 [khubd]
root        41  0.0  0.0     0    0 ?        S    09:32   0:00 [pdflush]
root        42  0.0  0.0     0    0 ?        S    09:32   0:00 [pdflush]
root        43  0.0  0.0     0    0 ?        S    09:32   0:00 [kswapd0]
root        44  0.0  0.0     0    0 ?        S<   09:32   0:00 [aio/0]
root        45  0.0  0.0     0    0 ?        S<   09:32   0:00 [cifsoplockd]
root        46  0.0  0.0     0    0 ?        S<   09:32   0:00 [cifsdnotifyd]
root        92  0.0  0.0     0    0 ?        S<   09:32   0:00 [kvblade]
root        93  0.0  0.0     0    0 ?        S    09:32   0:00 [mtdblockd]
root       106  0.0  0.0     0    0 ?        S    09:32   0:00 [hotplug-sata]
root       116  0.0  0.0     0    0 ?        S    09:33   0:00 [djsyncd]
root       117  0.0  0.0     0    0 ?        S    09:33   0:00 [djcheckd]
root       120  0.0  0.0     0    0 ?        S    09:33   0:00 [hotplug-gmac]
root       293  0.0  0.0     0    0 ?        S<   09:33   0:00 [kjournald]
root       576  0.0  0.0     0    0 ?        S<   09:33   0:00 [kjournald]
daemon     616  0.0  0.0  2240  624 ?        Ss   09:33   0:00 /sbin/portmap
root       625  0.0  0.0  2128  960 ?        Ss   09:33   0:00 /sbin/syslogd -m 0
daemon     629  0.0  0.0  2480  864 ?        Ss   09:33   0:00 /usr/sbin/atd
root       635  0.0  0.0  2016  720 ?        Ss   09:33   0:00 /sbin/klogd -x -c 3
root       637  0.0  0.0  2000  704 ?        Ss   09:33   0:00 /usr/sbin/inetd
admin      653  0.0  0.1  3648 1808 ?        Ss   09:33   0:00 avahi-daemon: running [nas-01-0E-1C.local]
root       654  0.0  0.1  2656 1264 ?        Ss   09:33   0:00 /usr/sbin/cron
root       661  0.0  0.1  4672 1792 ?        Ss   09:33   0:00 /usr/sbin/cupsd
root       684  0.0  0.4 10160 4688 ?        Ss   09:33   0:00 /usr/sbin/smbd -D
root       698  0.0  0.3 10160 3680 ?        S    09:33   0:00 /usr/sbin/smbd -D
root       880  0.0  0.0  2000  272 ?        Ss   09:33   0:00 udhcpc -i eth0 -H nas-01-0E-1C -n
root       896  0.0  0.1  2224 1120 ?        Ss   09:33   0:00 /frontview/bin/monitor_enclosure
root       999  0.0  0.2  5680 2432 ttyS1    Ss   09:34   0:00 -bash
root      1007  0.0  0.2  8704 2560 ?        Ss   09:34   0:00 nmbd -D
root      1019  0.0  0.0  2000  832 ?        Ss   09:34   0:00 /usr/sbin/upnpd -a 192.168.7.178
root      1023  0.0  0.1  3424 1072 ?        S    09:34   0:00 /usr/sbin/cnid_metad
root      1028  0.0  0.2  8080 2480 ?        S    09:34   0:00 /usr/sbin/afpd -U uams_dhx.so,uams_clrtxt.so,uams_guest.so -c 50 -n nas-01-0E-1C

Re: Found it - we need a kernel revision bump

Postby Skywalker » Wed Jan 30, 2008 1:02 pm

warewolf wrote:Edit: Independent verification from a kernel developer I located on IRC:

Code: Select all
13:38 <warewolf> sweet, slabtop works like a champ
13:39 <warewolf>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
13:39 <warewolf>  24480  23899  97%    0.05K     90      272      1440K buffer_head
13:39 <warewolf>  10112  10106  99%    0.97K    632       16     10112K task_struct
13:39 <warewolf> looks like task_struct is leaking?
13:40 <jdike> there aren't 10K processes on the system?
13:40 <warewolf> there's 74.
13:40 <warewolf> # ps auwwwx  | wc -l 74
13:40 <jdike> yeah
13:40 <jdike> 10106 would be a bit excessive


Okay guys I've done the work for you guys, and identified the problem. What next?

All right, that's helpful. We'll keep trying to reproduce it. What has the system done? Is this just right after a normal boot? What services are/have been running?

Postby warewolf » Thu Jan 31, 2008 3:57 pm

There's a test patch for tracking down kernel memory leaks:

http://homepage.ntlworld.com/cmarinas/k ... ak-0.7.bz2

This patch applies cleanly to the Infrant 2.6.17.8ReadyNAS kernel source linked above. Is it possible for me (or Infrant) to compile a new kernel to run on my NV+? I'd like to get you guys as much detailed debugging info as possible.
