High wakeups/second count (0.7.16)

Get help with all aspects of SABnzbd
Forum rules
Help us help you:
  • Are you using the latest stable version of SABnzbd? Downloads page.
  • Tell us what system you run SABnzbd on.
  • Adhere to the forum rules.
  • Do you experience problems during downloading?
    Check your connection in Status and Interface settings window.
    Use Test Server in Config > Servers.
    We will probably ask you to do a test using only basic settings.
  • Do you experience problems during repair or unpacking?
    Enable +Debug logging in the Status and Interface settings window and share the relevant parts of the log here using [ code ] sections.
Post Reply
doctorow
Newbie
Newbie
Posts: 5
Joined: October 4th, 2013, 3:01 pm

High wakeups/second count (0.7.16)

Post by doctorow »

Hi,

I noticed very high wakeups per second for sabnzbd running in background. The process is completely idle (no downloads or whatsoever) and no client is connected to the webbrowser. Yet, the number of wakeups is around 50 per second on my system (for any other idle process on this machine the number is either 0.05 or 0.10).

I did a quick strace on the idle sabnzd process and noticed a crazy amount of polling:
[pid 3474] 20:00:05 poll([{fd=8, events=POLLIN}], 1, 1000 <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 9052}) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
[pid 3472] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 8000} <unfinished ...>
[pid 3472] 20:00:05 select(0, NULL, NULL, NULL, {3, 0} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3509] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3509] 20:00:05 select(0, NULL, NULL, NULL, {1, 0} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 32000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3504] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3504] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
[pid 3473] 20:00:05 <... select resumed> ) = 0 (Timeout)
[pid 3473] 20:00:05 select(0, NULL, NULL, NULL, {0, 50000} <unfinished ...>
etc.

I even tried to limit various options that could be the culprit here, including minimizing logging output, and making sure that any configurable polling number is at least 60s (such as queue file detecting) without any effect.

Any ideas? ???

Doc
doctorow
Newbie
Newbie
Posts: 5
Joined: October 4th, 2013, 3:01 pm

Re: High wakeups/second count (0.7.16)

Post by doctorow »

Btw, a "select(0, NULL, NULL, NULL,..." in python probably means it's in time.sleep(). It seems like there is a particular loop here responsible for calling the sleep function repeatedly within one second. I understand why that would be needed if sabnzbd was busy (for example the decoder has a sleep(0.001) within its decode class), but why should it be the case whensabnzbd is completely idle? Looks like a bug to me.
User avatar
shypike
Administrator
Administrator
Posts: 19774
Joined: January 18th, 2008, 12:49 pm

Re: High wakeups/second count (0.7.16)

Post by shypike »

I haven't seen that behaviour on any system yet.
There is some polling going on, but idle time should be below 5% cpu load.
Does restart solve this issue?
doctorow
Newbie
Newbie
Posts: 5
Joined: October 4th, 2013, 3:01 pm

Re: High wakeups/second count (0.7.16)

Post by doctorow »

Hi shypike,

it doesn't affect CPU usage as much; rather, it prevents the CPU from going into idle. Restarting wouldn't solve this issue.

You can use the powertop command to see the number of wakeups per second for every running process. For instance, I have nginx on the same box, and the number of wakeups for its process is zero (nginx is configured to use epoll). Output:

Code: Select all

**Overview of Software Power Consumers**, 

Usage,  Wakeups, GPU ops, Disk IO, GFX Wakeups, Category, Description, 
  1.2 ms/s,  45.2, , , , Process, /usr/bin/python -OO /usr/bin/sabnzbdplus --daemon --server 192.168.0.212:80 , 
134.0 us/s,   8.6, , , , Interrupt, [1] timer(softirq), 
120.3 us/s,   2.7, , , , Interrupt, [3] net_rx(softirq), 
  0.7 ms/s,   1.8, , , , Process, /usr/sbin/smbd -D , 
 53.0 us/s,   1.0, , , , Process, [events/1], 
 30.0 us/s,   1.0, , , , Process, [events/0], 
 22.7 us/s,   0.6, , , , Process, init [2] , 
437.8 us/s,  0.10, , , , Process, powertop --csv test , 
438.3 us/s, , , , , Interrupt, [0] timer/0, 
 14.7 us/s,  0.15, , , , Process, qmgr -l -t fifo -u , 
  1.7 us/s,  0.15, , , , Process, [sync_supers], 
  1.0 us/s,  0.15, , , , Process, [ksoftirqd/1], 
 59.7 us/s,  0.10, , , , Process, [kswapd0], 
 16.1 us/s,  0.10, , , , Process, /usr/sbin/nmbd -D , 
  3.4 us/s,  0.10, , , , Process, /usr/sbin/openvpn --writepid /var/run/openvpn.server.pid --daemon ovpn-server --cd /etc/openvpn --config /etc/openvpn/server.co, 
  1.4 us/s,  0.10, , , , Process, [flush-8:0], 
  9.8 us/s,  0.05, , , , Process, /usr/lib/postfix/master , 
  2.0 us/s,  0.05, , , , Process, [khungtaskd], 
  1.5 us/s,  0.05, , , , Process, [bdi-default], 
 47.1 us/s, , , , , Interrupt, [9] RCU(softirq), 
 38.5 us/s, , , , , Process, [kondemand/1], 
 34.6 us/s, , , , , Process, [kondemand/0], 
 15.8 us/s, , , , , Interrupt, [7] sched(softirq), 
 12.2 us/s, , , , , Interrupt, [27] eth1, 
  0.8 us/s, , , , , Interrupt, [8] hrtimer(softirq), 
  0.3 us/s, , , , , Interrupt, [2] net tx(softirq) 
You can see above how sabnzbd is the only process on this idle machine that is causing CPU wakeups although it is completely idle.

If you strace the sabnzbd process (including its children processes), you'll see the how active sabnzbd is when doing nothing. Hook into an existing process like this:

strace -v -f -t -s 500 -p <pid>

where <pid> is the process id of the main sabnzbd process.
User avatar
shypike
Administrator
Administrator
Posts: 19774
Joined: January 18th, 2008, 12:49 pm

Re: High wakeups/second count (0.7.16)

Post by shypike »

I see, haven't really looked at this before.
There is a pause_all scheduler event that should stop most activity.
I've put this on the long todo list for now.
doctorow
Newbie
Newbie
Posts: 5
Joined: October 4th, 2013, 3:01 pm

Re: High wakeups/second count (0.7.16)

Post by doctorow »

Thanks for looking into this.
idontknow
Newbie
Newbie
Posts: 2
Joined: September 4th, 2012, 6:31 am

Re: High wakeups/second count (0.7.16)

Post by idontknow »

This is a known problem with python 2.x: http://bugs.python.org/issue17748

Since locks do not support timeouts, when a thread is asked to wait with a timeout it will just repeatedly sleep() with a maximum time of 50 ms and check the lock manually.

In the PostProcessor thread there is a call to Queue.get() with a timeout:

Code: Select all

nzo = self.queue.get(timeout=3)
which causes the thread to wait with a timeout and hence the high wakeup count.

This is fixed in python 3.x but sabnzbdplus does not support it. In 2.x it can be worked around by not specifying a timeout when calling Queue.get() and doing the timeout in another thread.
User avatar
shypike
Administrator
Administrator
Posts: 19774
Joined: January 18th, 2008, 12:49 pm

Re: High wakeups/second count (0.7.16)

Post by shypike »

Python 3 isn't supported because it would require making the code Python 2.7/3.x only.
This would exclude quite a few embedded systems.
(Also it's a lot of work of course.)
I'll look into using your suggestion.
Thanks for the tip!
doctorow
Newbie
Newbie
Posts: 5
Joined: October 4th, 2013, 3:01 pm

Re: High wakeups/second count (0.7.16)

Post by doctorow »

idontknow, thanks for the pointer!!
User avatar
shypike
Administrator
Administrator
Posts: 19774
Joined: January 18th, 2008, 12:49 pm

Re: High wakeups/second count (0.7.16)

Post by shypike »

I've implemented a fix, well hopefully anyway.
Post Reply