help with dup detection and unwanted extensions.
Posted: October 15th, 2016, 8:33 pm
by diamaunt
Hi, I'm running 1.1.0 and facing a vexing issue. There's some pinhead posting crap with extensions like .1.2.txt, .workflow, .wflow, .xlsx, .plist, .exe (all in the same rar with an mkv)... I've put "workflow, wflow, exe, com" in the unwanted extensions field, with the action set to 'abort', and those are still coming through... help?
Also, I've set 'Detect duplicate episodes in series' to 'pause', and those are also still being downloaded and dropped onto disk.
Is there some magic I need to do that I'm missing?
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 3:31 am
by sander
Strange.
are the unwanted extensions in the rar file, or in a zip in the rar file?
can you post a link to such an NZB?
Can you generate a SABnzbd log via the Status (= wrench) symbol, and find these lines:
Code:
unwanted_extensions = xxxx, exe, com, ppt
action_on_unwanted_extensions = 1
Post what you've got
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 11:28 am
by diamaunt
Code:
unwanted_extensions = workflow, wflow, exe, com
action_on_unwanted_extensions = 2
here's a sample nzb:
https://www.dropbox.com/s/jgqfadv4r0k7l ... zb.gz?dl=0
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 11:43 am
by safihre
I checked and unfortunately this poster is smart and posts in RAR5 format.
Our unwanted-extensions check can (for now) only check files in RAR3 format. We're working on RAR5 support.
Why don't you set up a cleanup list to automatically remove these types of files after extraction? Or does the video not even work?
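For reference, the older and newer RAR formats can be told apart by their leading signature bytes. A minimal Python sketch of that check follows. The function names are made up for illustration and this is not SABnzbd's code; it also assumes the signature sits at the very start of the file, which is not the case for self-extracting archives.

```python
from typing import Optional

RAR4_MAGIC = b"Rar!\x1a\x07\x00"      # signature used by RAR 1.5 up to 4.x
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"  # signature used by RAR 5.0 and later


def rar_format(header: bytes) -> Optional[str]:
    """Return "rar5", "rar4", or None based on the leading signature bytes."""
    if header.startswith(RAR5_MAGIC):
        return "rar5"
    if header.startswith(RAR4_MAGIC):
        return "rar4"
    return None


def rar_format_of_file(path: str) -> Optional[str]:
    """Read just enough bytes from the file to classify it (hypothetical helper)."""
    with open(path, "rb") as fh:
        return rar_format(fh.read(len(RAR5_MAGIC)))
```

A checker that only understands the older format could use something like this to log a clear "RAR5, cannot inspect" message instead of silently letting the job through.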
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 1:01 pm
by sander
@safihre You mean RarFile() is not able to read RAR5 formatted files, right? I had a look at rarfile.py ... wow, what a lot of code. Can't we let the unrar binary take care of that?
(Oh, I now see https://github.com/markokr/rarfile/issues/13 )
@diamaunt do you see lines in the format 'RAR file %s cannot be inspected.' in your sabnzbd.log, so for example:
Code:
2016-10-16 19:51:12,527::DEBUG::[assembler:349] RAR file /home/sander/Downloads/incomplete/blabla.S13E02/mDTKIZeyrzyPzqQrwmHV.part36.rar cannot be inspected.
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 2:12 pm
by safihre
I checked his example NZB; it indeed says "cannot be inspected".
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 2:47 pm
by sander
@safihre: I got it working with a few lines of code in assembler.py. Result:
Code:
WARNING: In "bla.bla.S13E02" unwanted extension in RAR file. Unwanted file is niet lezen hoor!!/QoQ-REVERSE/Reverse.MAC.2014/QoQ-reverse.workflow/Contents/document.wflow
Code is based on
Does that "lb" work on Windows?
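The patch itself is not shown in this thread. As a rough illustration of the idea only (not sander's actual code), a sketch that asks the unrar binary for a bare listing with its "lb" command and checks the names against the unwanted-extensions list might look like this. The function names are hypothetical; it assumes Python 3.7+ and an "unrar" executable on the PATH, and it does not handle encrypted archives.

```python
import os
import subprocess
from typing import Iterable, List


def unwanted_files(names: Iterable[str], unwanted_exts: Iterable[str]) -> List[str]:
    """Return the names whose final extension is in unwanted_exts (case-insensitive)."""
    blocked = {ext.strip().lower().lstrip(".") for ext in unwanted_exts}
    hits = []
    for name in names:
        ext = os.path.splitext(name)[1].lower().lstrip(".")
        if ext and ext in blocked:
            hits.append(name)
    return hits


def list_rar_contents(rar_path: str, unrar: str = "unrar") -> List[str]:
    """Run `unrar lb` (bare listing) and return one archive member name per line."""
    result = subprocess.run(
        [unrar, "lb", rar_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if unrar reports a failure
    )
    return [line for line in result.stdout.splitlines() if line.strip()]
```

Usage would be along the lines of unwanted_files(list_rar_contents(path), ["workflow", "wflow", "exe", "com"]); a non-empty result means the job should be paused or aborted, depending on the configured action.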
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 3:11 pm
by diamaunt
@safihre: the video is an unwanted duplicate, usually posted a few days (or a week or two) after I got a good copy.
Also, "Detect duplicate episodes in series" is not working for me: it shows that it's set to 'pause', but it doesn't pause anything.
Re: help with dup detection and unwanted extensions.
Posted: October 16th, 2016, 3:14 pm
by diamaunt
@sander: yes, I do see a number of those "cannot be inspected" lines:
:/f1/news/logs$ grep -c "cannot be inspected" *
cherrypy.log:0
sabnzbd.error.log:0
sabnzbd.log:0
sabnzbd.log.1:72
sabnzbd.log.2:0
sabnzbd.log.3:144
sabnzbd.log.4:128
sabnzbd.log.5:238
Re: help with dup detection and unwanted extensions.
Posted: October 17th, 2016, 1:20 am
by safihre
I will finish the upgrade of our built-in RAR reader today.
Yes, we could do it using the external unrar command, but then we would have to write a parser for the unrar output, since we also want it to report on encryption and on all the files inside the rar.
Regarding the duplicate detection, SABnzbd can only do so much. We can detect truly identical NZBs (exact same content) or ones with an identical name. Or we can try to extract the title of a show using smart regular expressions. But if we misinterpret the name or episode number of the show, it won't work.
Luckily, a tool like Sonarr is made for this and will use NZB indexers to make sure you don't go crazy; maybe try that?
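To make the "name/season/episode" idea concrete, here is a minimal Python sketch of regex-based duplicate detection. The pattern and the function name are assumptions made up for illustration; SABnzbd's real expressions are not shown in this thread and are likely more elaborate.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: "<name>.S13E02" or "<name>.13x02",
# with ".", "_", "-" or a space as separators.
_EPISODE_RE = re.compile(
    r"^(?P<name>.+?)[ ._-]+"
    r"(?:s(?P<s1>\d{1,2})[ ._-]?e(?P<e1>\d{1,3})|(?P<s2>\d{1,2})x(?P<e2>\d{1,3}))"
    r"(?:[ ._-]|$)",
    re.IGNORECASE,
)


def series_key(job_name: str) -> Optional[Tuple[str, int, int]]:
    """Return (normalized show name, season, episode), or None if no episode tag is found."""
    m = _EPISODE_RE.match(job_name)
    if m is None:
        return None
    # Collapse separators so "Bla_Bla" and "bla.bla" compare equal.
    name = re.sub(r"[ ._-]+", " ", m.group("name")).strip().lower()
    season = int(m.group("s1") or m.group("s2"))
    episode = int(m.group("e1") or m.group("e2"))
    return (name, season, episode)
```

Two jobs count as duplicates when their keys are equal. This also shows the weakness described above: "bla.bla.S13E02" and "bla.bla.2016.S13E02" produce different names ("bla bla" versus "bla bla 2016"), so the second one would not be flagged.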
Re: help with dup detection and unwanted extensions.
Posted: October 17th, 2016, 1:21 am
by safihre
But maybe it is a bug in SABnzbd. Do you have the NZB of the original download and the one it should have marked as a duplicate? You can send them to
[email protected]
Re: help with dup detection and unwanted extensions.
Posted: October 17th, 2016, 8:44 am
by diamaunt
All I know is that I spotted the relatively new feature: "Detect duplicate episodes in series, Detect identical episodes in series (based on "name/season/episode")" and turned that on, set to 'pause' and I keep getting duplicates dropped into the same directory.
I'll email you some nzbs.
Re: help with dup detection and unwanted extensions.
Posted: October 19th, 2016, 2:41 am
by diamaunt
@sander: any chance of sharing those few lines?
Re: help with dup detection and unwanted extensions.
Posted: October 19th, 2016, 3:07 am
by sander
Certainly! A few checks:
- you run SABnzbd from source, so on Linux or Unix?
- you know how to edit a source code file?
Re: help with dup detection and unwanted extensions.
Posted: October 19th, 2016, 10:55 am
by diamaunt
Yes, I'm running SABnzbd from the tar source on Solaris 11.3.
No problems at all editing the source.