
Re: Memory leak?

Posted: November 19th, 2011, 5:24 am
by sander
To show memory usage, I tried heapy (after a "sudo apt-get install python-guppy" and removing the assert in line 705 of /usr/lib/pymodules/python2.7/guppy/heapy/Part.py ), and the result is below. You can see that the ints alone take 68% of memory, and the 177 lists an additional 23% (adding up to 91%).

heapy shows what kinds of objects are using the most memory. I don't know if heapy can show the names of those variables.

I tried to put a regular heapy dump into SABnzbd. Since bpsmeter prints its output regularly, I tried putting it there, but that caused SAB's download speed to drop to zero. And I had to add a "from guppy import hpy" there too (having it in SABnzbd.py alone was not enough), so does bpsmeter.py have its own scope? And if so, won't it see the real heap?

@shypike: how can I make SABnzbd call heapy every x minutes?

Code: Select all

from guppy import hpy
h = hpy()
print h.heap()
a = [1,2,3]

for x in range(1000000):
	# print x
	a.append(x)

heapdump = h.heap()
print heapdump

Code: Select all

sander@R540:~/heapy-spul$ time python test.py 
Partition of a set of 25934 objects. Total size = 3307064 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  11830  46   923512  28    923512  28 str
     1   5848  23   471784  14   1395296  42 tuple
     2    346   1   271600   8   1666896  50 dict (no owner)
     3     69   0   213624   6   1880520  57 dict of module
     4   1632   6   208896   6   2089416  63 types.CodeType
     5    177   1   197016   6   2286432  69 dict of type
     6   1595   6   191400   6   2477832  75 function
     7    199   1   177008   5   2654840  80 type
     8    124   0   135328   4   2790168  84 dict of class
     9   1044   4    83520   3   2873688  87 __builtin__.wrapper_descriptor
<90 more rows. Type e.g. '_.more' to view.>
Partition of a set of 1025810 objects. Total size = 35432688 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 1000092  97 24002208  68  24002208  68 int
     1    177   0  8151624  23  32153832  91 list
     2  11832   1   923640   3  33077472  93 str
     3   5847   1   471720   1  33549192  95 tuple
     4    352   0   273280   1  33822472  95 dict (no owner)
     5     69   0   213624   1  34036096  96 dict of module
     6   1632   0   208896   1  34244992  97 types.CodeType
     7    177   0   197016   1  34442008  97 dict of type
     8   1594   0   191280   1  34633288  98 function
     9    199   0   177008   0  34810296  98 type
<90 more rows. Type e.g. '_.more' to view.>

real	0m14.177s
user	0m14.020s
sys	0m0.080s
sander@R540:~/heapy-spul$


Re: Memory leak?

Posted: November 19th, 2011, 5:34 am
by shypike
If "int" objects are filling the pool, that's a clear signal of a failing garbage collector.
Python is notorious for not cleaning up circular references, but ints cannot reference other objects.

You can do periodic stuff in SABnzbd's main loop.
Look for "### 30 sec polling tasks" in the SABnzbd.py file.
Or maybe even better:
The function check_all_tasks() in sabnzbd/__init__.py
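For anyone who wants to see the cycle issue in isolation: ints indeed can't form cycles, but container objects can, and the stdlib gc module exists precisely to reclaim them. A minimal sketch (plain stdlib, nothing SABnzbd-specific):

```python
import gc

# Two lists referencing each other form a reference cycle that
# reference counting alone can never free; the cycle detector must run.
gc.collect()            # start from a clean slate
a = []
b = [a]
a.append(b)
del a, b                # refcounts stay > 0 because of the cycle
freed = gc.collect()    # returns the number of unreachable objects found
print(freed >= 2)       # both lists were reclaimed by the collector
```

If the collector is disabled (or the cycle contains objects it can't handle), such garbage accumulates instead.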

Re: Memory leak?

Posted: November 19th, 2011, 5:50 am
by sander
Please note: the above was my own test program creating a list of ints. It's not SAB.

Here's my nice setup with SABnzbd, thanks to http://www.toofishes.net/blog/using-gup ... ory-leaks/ : you can put just a few lines in SAB, and then send heapy commands from another terminal.

In SABnzbd.py:

Code: Select all

import platform
import time

import guppy
from guppy.heapy import Remote
Remote.on()
Then, in another terminal, start poking at (peeking into?) heapy inside SABnzbd.py.
At the start:

Code: Select all

sander@R540:~/heapy-spul$ python -c "from guppy import hpy;hpy().monitor()"
<Monitor> 
*** Connection 1 opened ***
<Monitor> lc
CID PID   ARGV
  1 11024 ['./SABnzbd.py']
<Monitor> sc 1
Remote connection 1. To return to Monitor, type <Ctrl-C> or .<RETURN>
<Annex> int
Remote interactive console. To return to Annex, type '-'.
>>> hp.heap()
Partition of a set of 114837 objects. Total size = 16230176 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  56606  49  4176656  26   4176656  26 str
     1   2150   2  2431168  15   6607824  41 unicode
     2  24396  21  2178248  13   8786072  54 tuple
     3    307   0  1234888   8  10020960  62 dict of module
     4    530   0  1098416   7  11119376  69 dict (no owner)
     5   6854   6   877312   5  11996688  74 types.CodeType
     6   6726   6   807120   5  12803808  79 function
     7    701   1   629144   4  13432952  83 type
     8    700   1   604576   4  14037528  86 dict of type
     9    284   0   310688   2  14348216  88 dict of class
<351 more rows. Type e.g. '_.more' to view.>
>>> 

Two minutes after uploading a NZB pointing to 33 GB download:

Code: Select all

<Monitor> sc 1
Remote connection 1. To return to Monitor, type <Ctrl-C> or .<RETURN>
<Annex> int
Remote interactive console. To return to Annex, type '-'.
>>> hp.heap()
Partition of a set of 185700 objects. Total size = 33919656 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  10506   6 11010288  32  11010288  32 dict of sabnzbd.nzbstuff.Article
     1  77089  42  7204920  21  18215208  54 str
     2   2324   1  2475200   7  20690408  61 unicode
     3  24549  13  2191080   6  22881488  67 tuple
     4    756   0  1887456   6  24768944  73 dict (no owner)
     5  12802   7  1240272   4  26009216  77 list
     6    309   0  1239288   4  27248504  80 dict of module
     7   6895   4   882560   3  28131064  83 types.CodeType
     8   6767   4   812040   2  28943104  85 function
     9  10506   6   756432   2  29699536  88 sabnzbd.nzbstuff.Article
<364 more rows. Type e.g. '_.more' to view.>
>>> 

So that looks like a start to finding out what SAB is spending its memory on...

Re: Memory leak?

Posted: November 19th, 2011, 6:34 am
by finalius
Would it help if I made those code modifications on my system and triggered this memory situation again, or are you guys good? (I could do it tomorrow-ish)

Re: Memory leak?

Posted: November 19th, 2011, 7:17 am
by shypike
All help is welcome in cases like this.

Re: Memory leak?

Posted: November 19th, 2011, 7:55 am
by sander
Here some more snippets.

As said, I installed python-guppy on my Ubuntu system, and I patched Part.py.

Changes in SABnzbd.py:

import at the beginning:

Code: Select all

import platform
import time

from guppy import hpy
h = hpy()
heapdump = h.heap()
print "heapdump", heapdump

logging.info('Heap dump by heapy: \n %s', heapdump)

Then in the 30-sec polling part:

Code: Select all

        ### 30 sec polling tasks
        if timer > 9:
            timer = 0

            h = hpy()
            heapdump = h.heap()
            # print "heapdump", heapdump
            logging.info("please heapy, put something in the log ... Dump! Dump!")
            logging.info('Heap dump by heapy: \n%s', heapdump)

            # Keep Windows awake (if needed)

... which leads to a heap dump in the log every 16 minutes. See the contents below. First remarks:
Total size = 57 MB. That does not look very special to me.
"dict of sabnzbd.nzbstuff.Article" is the biggest memory consumer, but 26 MB does not look alarming to me.

My hard disk is full, so I can't do any big downloads.

If others can apply the same patch and test, we can see whether heapy reports growing memory usage (Total size), and if so, for which objects ...

Code: Select all

sander@R540:~/.sabnzbd/logs$ grep -A13  Heap sabnzbd.log
2011-11-19 12:31:00,782::INFO::[SABnzbd:1527] Heap dump by heapy: Partition of a set of 297945 objects. Total size = 57388744 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25065   8 26268120  46  26268120  46 dict of sabnzbd.nzbstuff.Article
     1  99113  33  8167184  14  34435304  60 str
     2   1618   1  3231664   6  37666968  66 dict (no owner)
     3   2844   1  2627496   5  40294464  70 unicode
     4  28260   9  2481856   4  42776320  75 tuple
     5  28471  10  2427704   4  45204024  79 list
     6    693   0  2322936   4  47526960  83 dict of sabnzbd.nzbstuff.NzbFile
     7  25065   8  1804680   3  49331640  86 sabnzbd.nzbstuff.Article
     8  56662  19  1359888   2  50691528  88 int
     9    324   0  1264224   2  51955752  91 dict of module
<395 more rows. Type e.g. '_.more' to view.>
2011-11-19 12:41:52,729::INFO::[__init__:840] Saving data for totals9.sab in /home/sander/.sabnzbd/admin/totals9.sab
--
2011-11-19 12:47:11,556::INFO::[SABnzbd:1527] Heap dump by heapy: Partition of a set of 300958 objects. Total size = 58092416 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25458   8 26679984  46  26679984  46 dict of sabnzbd.nzbstuff.Article
     1 100614  33  8365864  14  35045848  60 str
     2   1616   1  3261824   6  38307672  66 dict (no owner)
     3   2844   1  2627496   5  40935168  70 unicode
     4  28274   9  2482912   4  43418080  75 tuple
     5  28854  10  2456624   4  45874704  79 list
     6    693   0  2322936   4  48197640  83 dict of sabnzbd.nzbstuff.NzbFile
     7  25458   8  1832976   3  50030616  86 sabnzbd.nzbstuff.Article
     8  57056  19  1369344   2  51399960  88 int
     9    324   0  1264224   2  52664184  91 dict of module
<388 more rows. Type e.g. '_.more' to view.>
2011-11-19 12:47:47,443::INFO::[downloader:351] [email protected]:119: Initiating connection
--
2011-11-19 13:03:10,042::INFO::[SABnzbd:1527] Heap dump by heapy: Partition of a set of 303449 objects. Total size = 58489680 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25458   8 26679984  46  26679984  46 dict of sabnzbd.nzbstuff.Article
     1 103106  34  8763880  15  35443864  61 str
     2   1616   1  3262208   6  38706072  66 dict (no owner)
     3   2844   1  2627496   4  41333568  71 unicode
     4  28274   9  2482912   4  43816480  75 tuple
     5  28854  10  2455664   4  46272144  79 list
     6    693   0  2322936   4  48595080  83 dict of sabnzbd.nzbstuff.NzbFile
     7  25458   8  1832976   3  50428056  86 sabnzbd.nzbstuff.Article
     8  57055  19  1369320   2  51797376  89 int
     9    324   0  1264224   2  53061600  91 dict of module
<388 more rows. Type e.g. '_.more' to view.>
2011-11-19 13:04:58,629::INFO::[downloader:607] Thread [email protected]:119: timed out
--
2011-11-19 13:19:18,584::INFO::[SABnzbd:1527] Heap dump by heapy: Partition of a set of 307665 objects. Total size = 59478800 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25982   8 27229136  46  27229136  46 dict of sabnzbd.nzbstuff.Article
     1 105227  34  9064368  15  36293504  61 str
     2   1616   1  3311360   6  39604864  67 dict (no owner)
     3   2844   1  2627496   4  42232360  71 unicode
     4  29378  10  2495504   4  44727864  75 list
     5  28274   9  2482912   4  47210776  79 tuple
     6    693   0  2322936   4  49533712  83 dict of sabnzbd.nzbstuff.NzbFile
     7  25982   8  1870704   3  51404416  86 sabnzbd.nzbstuff.Article
     8  57578  19  1381872   2  52786288  89 int
     9    324   0  1264224   2  54050512  91 dict of module
<388 more rows. Type e.g. '_.more' to view.>
2011-11-19 13:30:12,142::INFO::[downloader:610] Thread [email protected]:119: server closed connection
sander@R540:~/.sabnzbd/logs$ 


Re: Memory leak?

Posted: November 19th, 2011, 8:05 am
by shypike
I assume there were jobs in the queue?
The sizeable "Article" dictionary is to be expected: each article in the queue has its own object
(when it's in memory, which it isn't necessarily).
So far I see nothing unexpected, assuming the queue is not empty.

I just finished a huge download on OSX.
(To test instability claims that I have never been able to reproduce.)
The activity manager says that SABnzbd is still holding on to 256M,
consistent with the peak of memory usage.
So it looks like this might happen on OSX too.
Strangely enough, the Windows (and probably the Leopard) versions do not have this behaviour.

Re: Memory leak?

Posted: November 19th, 2011, 8:22 am
by sander
Yes, I had put stuff in the queue: a few DVDs and a few BluRays. Reason: to test whether only stuff in the queue raises the memory usage.

BTW: the strange thing is that 'top' reports that SABnzbd is already using 596 MB of virtual and 149 MB of 'RES' (resident, i.e. physical) memory. Hopefully there is a correlation between top's and heapy's memory usage reports; if top goes up but heapy doesn't, I don't know what that means ...

Code: Select all

$ top 

top - 14:09:04 up 17:48,  6 users,  load average: 5.38, 3.11, 1.93
Tasks: 286 total,   7 running, 276 sleeping,   0 stopped,   3 zombie
Cpu(s): 61.6%us,  9.5%sy,  0.1%ni,  9.9%id, 18.5%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   3851204k total,  3746448k used,   104756k free,   375628k buffers
Swap:  4881404k total,   337752k used,  4543652k free,   394932k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                    
12497 sander    20   0  596m 149m 4276 R   98  4.0  19:32.29 SABnzbd.py                                                                                                 
 2771 felker    20   0  828m 538m 6024 R   83 14.3 322:04.62 npviewer.bin  
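To correlate top's RES column with heapy's totals without eyeballing two terminals, the resident size can also be read from /proc inside the process. A Linux-only sketch (not part of SABnzbd):

```python
def rss_kb():
    # Resident set size in kB, parsed from /proc/self/status (Linux only).
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])

print(rss_kb())  # kB of physical memory this process currently occupies
```

Logging this next to each heap dump would make it easy to spot the two reports diverging.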

Re: Memory leak?

Posted: November 19th, 2011, 9:12 am
by sander
@shypike: It looks like the heapy dump itself can take quite some time to generate (minutes?). During that time SABnzbd becomes unresponsive and its download speed drops.

Is there a way / place to do it asynchronously?

BTW: I changed the code a bit: an import check, and now a function call.

Code: Select all

import socket
import platform
import time

try:
    from guppy import hpy
except ImportError:
    print "No guppy / hpy / heapy found. Did you install it? On Ubuntu it's the package python-guppy"
    guppyOK = False
else:
    print "OK, importing guppy / hpy / heapy went OK"
    guppyOK = True


def HeapDump():
    if guppyOK:
        logging.info('Heap dump coming up ...')
        h = hpy()
        heapdump = h.heap()
        # print "heapdump", heapdump
        logging.info('Heap dump by heapy: \n %s', heapdump)

        # Log top's view of the process alongside heapy's
        topcmd = 'top -n1 | grep -i sabnzbd'
        topoutput = os.popen(topcmd).readline()
        logging.info('Output from top command: %s', topoutput)
        logging.info('Heap dump done.')
    else:
        logging.info('No heap dump, as heapy was not imported')

HeapDump()

Re: Memory leak?

Posted: November 19th, 2011, 11:33 am
by shypike
I'm not sure doing it asynchronously would help.
I can imagine that it stops everything to avoid shooting at a moving target.
But you could create a task in sabnzbd/scheduler.py for it.
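A generic way to run such a periodic task outside the main loop is a daemon timer thread. A sketch under the assumption that a plain threading.Timer is acceptable (sabnzbd/scheduler.py may offer something nicer):

```python
import threading
import time

def schedule_every(interval, func):
    """Call func immediately, then every `interval` seconds on a timer thread."""
    def tick():
        func()
        t = threading.Timer(interval, tick)
        t.daemon = True      # don't keep the process alive on shutdown
        t.start()
    tick()

# Demo with a tiny interval; a real heap dump would use minutes.
calls = []
schedule_every(0.01, lambda: calls.append(time.time()))
time.sleep(0.1)              # let a few ticks happen
print(len(calls) >= 2)
```

Note this only moves the work off the main loop; if h.heap() pauses the whole interpreter while it scans, a thread won't prevent the download stall.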

Re: Memory leak?

Posted: November 19th, 2011, 5:19 pm
by finalius
I've added the guppy code (it was a bit of a pain, as there is no guppy package for Python 2.7 in Debian?! I had to compile it myself). Anyway, memory profiling is running and SAB should be busy for the night. I'll post the logs/findings in the morning.

Re: Memory leak?

Posted: November 20th, 2011, 4:01 am
by finalius
Okay, the results are in. The downloads finished around 4 am, so SAB has been idling for a couple of hours. According to top, it uses 40% of 2 GB, so roughly 800 MB of RAM. To make a long story short, the heap dumps won't be of much help.

This is what the dump looked like while downloading. I've picked one of the highest memory readings while downloading, and it's still hardly over 250 MB. That kind of makes sense, as my article cache limit is set to 256M:

Code: Select all

2011-11-20 02:50:38,226::INFO::[SABnzbd:1474] Heap dump by heapy:
Partition of a set of 384312 objects. Total size = 255115240 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0 112317  29 174632720  68 174632720  68 str
     1  49539  13 51916872  20 226549592  89 dict of sabnzbd.nzbstuff.Article
     2   1194   0  7290864   3 233840456  92 dict (no owner)
     3  52070  14  4063408   2 237903864  93 list
     4  49539  13  3566808   1 241470672  95 sabnzbd.nzbstuff.Article
     5   2781   1  3006232   1 244476904  96 unicode
     6  24648   6  2202024   1 246678928  97 tuple
     7  66263  17  1590312   1 248269240  97 int
     8    305   0  1022360   0 249291600  98 dict of sabnzbd.nzbstuff.NzbFile
     9    311   0   990248   0 250281848  98 dict of module
<367 more rows. Type e.g. '_.more' to view.>
and this is what it looks like while idling:

Code: Select all

2011-11-20 09:55:41,570::INFO::[SABnzbd:1474] Heap dump by heapy:
Partition of a set of 242100 objects. Total size = 51347776 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  25468  11 26690464  52  26690464  52 dict of sabnzbd.nzbstuff.Article
     1  79570  33  6064360  12  32754824  64 str
     2   2576   1  2961888   6  35716712  70 unicode
     3    974   0  2598992   5  38315704  75 dict (no owner)
     4  24645  10  2201776   4  40517480  79 tuple
     5  27575  11  2155064   4  42672544  83 list
     6  25468  11  1833696   4  44506240  87 sabnzbd.nzbstuff.Article
     7    311   0   990248   2  45496488  89 dict of module
     8   6641   3   850048   2  46346536  90 types.CodeType
     9   6473   3   776760   2  47123296  92 function
<365 more rows. Type e.g. '_.more' to view.>
According to guppy, consumption is down to 51 MB; according to top we're at 800 MB.
I don't really know Python unfortunately (I code in Java and PHP), but this seems like a garbage collector issue. Maybe it can be triggered manually or something like that? However, it could also be that guppy is missing something.

Re: Memory leak?

Posted: November 20th, 2011, 8:43 am
by shypike
Thanks for your tests.
It certainly looks like a garbage collector issue.
SABnzbd uses circular references, which makes life difficult for the garbage collector.
However, when running on Windows this issue does not occur.
On OSX it's less bad, but still worse than on Windows.
I am working on eliminating circular references for a future release.
On the other hand, it could be related to memory fragmentation and possibly the
way Python's memory handling maps to the OS's memory handling.
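One standard way to eliminate such cycles is to make the back-reference weak, so plain reference counting can free the objects without any gc pass. A generic sketch with hypothetical stub classes (not SABnzbd's actual Article/NzbFile code):

```python
import weakref

class NzbFileStub(object):
    pass

class ArticleStub(object):
    def __init__(self, nzf):
        # Weak back-reference: the article no longer keeps its file alive,
        # so no reference cycle is created.
        self._nzf = weakref.ref(nzf)

    def nzf(self):
        return self._nzf()    # None once the file has been freed

f = NzbFileStub()
a = ArticleStub(f)
alive = a.nzf() is f          # True while f still has strong references
del f
gone = a.nzf() is None        # True: f was freed by refcounting alone
print(alive, gone)
```

The trade-off is that code following the weak reference must handle the None case.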

Re: Memory leak?

Posted: November 20th, 2011, 9:52 am
by shypike
Found this info about memory management in Python:
Today, there are two cases when malloc returns memory to a typical
Unix system (in particular, with Linux's malloc):
a) if the malloc'd block is small (below page size), it is allocated
from the brk heap, where it can only be returned if the last page of
that heap is completely free, and
b) if the block is large (multiple pages), it gets returned to the
system right away.

Case b) can only happen if the C malloc uses mmap to allocate large
blocks. For Python, case b) is realistic, in the sense that most
allocations go to Python's obmalloc, and those allocate from C malloc
in chunks of 256kiB (IIRC). So when such an arena is completely free
(not a single Python object allocated on it anymore), it can get
returned to the system.

In some cases, Python also allocates smaller blocks from C malloc; in
this case, a) will trigger. So as long as something at the end of the
heap stays allocated, C malloc will not return anything from the brk
heap to the system.
I assume that somehow this works a bit differently on Windows and OSX.
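Case b) can be observed directly on a Linux/glibc system: a single large allocation is mmap'd by malloc and handed straight back to the OS on free, so RSS drops immediately. A rough sketch (behaviour depends on the libc and its mmap threshold, so treat the numbers as indicative only):

```python
def rss_kb():
    # Resident set size in kB, from /proc/self/status (Linux only).
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])

before = rss_kb()
big = bytearray(100 * 1024 * 1024)   # one large block -> mmap'd by malloc
for i in range(0, len(big), 4096):
    big[i] = 1                       # touch every page so it counts in RSS
grown = rss_kb()
del big                              # glibc munmaps the block right away
shrunk = rss_kb()
print(grown - before > 90 * 1024)    # ~100 MB showed up in RSS
print(grown - shrunk > 90 * 1024)    # ... and went back to the OS on free
```

Case a) is the opposite: millions of small objects on the brk heap (or in obmalloc arenas pinned by a few survivors) stay resident even after being freed, which would explain top reporting 800 MB while heapy sees 51 MB.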