4.2.0Beta1 Smart Duplicate detection

goldeneyes
Newbie
Posts: 5
Joined: March 30th, 2011, 4:41 am

Post by goldeneyes »

I am noticing that when grabbing multiple episodes of a (standard) series, SABnzbd marks every episode after the first as an alternative, even though the season and episode numbers are unique. It looks like the smart matching for the queue items is causing the confusion.

Code:

2023-12-02 15:05:24,083::DEBUG::[nzbstuff:1929] Duplicate checking NZB The Great British Bake Off - 2x05 - Pies avi-xpost (md5sum=ffbd690f3083400e6b3f59fa820f4b86)
2023-12-02 15:05:24,086::DEBUG::[nzbstuff:1932] Duplicate in history: False
2023-12-02 15:05:24,087::DEBUG::[nzbstuff:1936] Duplicate in backup: False
2023-12-02 15:05:24,088::DEBUG::[nzbstuff:1939] Duplicate in queue: False
2023-12-02 15:05:24,109::DEBUG::[nzbstuff:1944] Smart duplicate checking (The Great British Bake Off - 2x05 - Pies avi-xpost): the great british bake off//
2023-12-02 15:05:24,111::DEBUG::[nzbstuff:1947] Duplicate in history: False
2023-12-02 15:05:24,112::DEBUG::[nzbstuff:1950] Duplicate in queue: True
safihre
Administrator
Posts: 5366
Joined: April 30th, 2015, 7:35 am

Re: 4.2.0Beta1 Smart Duplicate detection

Post by safihre »

It seems it's unable to detect the season and episode numbers. That would unfortunately have also failed on 4.1.0 if any were in history, as the same detection is applied.
It does highlight another bug: it shouldn't create a duplicate matching key when just the show is known but the episode and season are not. I'll fix that.

Thanks for testing and reporting!
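
A minimal sketch of the guard described above, not SABnzbd's actual code: only build a series key when show, season and episode are all known. The function name `series_duplicate_key` is hypothetical, and the `show/season/episode` key layout is an assumption inferred from the `the great british bake off//` key in the log.

```python
from typing import Optional


def series_duplicate_key(
    show: Optional[str], season: Optional[int], episode: Optional[int]
) -> Optional[str]:
    """Return a 'show/season/episode' key, or None if any part is missing.

    Hypothetical helper for illustration; the key layout is an assumption.
    """
    if not show or season is None or episode is None:
        # Incomplete parse: a key such as "show//" would match every
        # episode of the show, so produce no key at all.
        return None
    return f"{show.strip().lower()}/{season}/{episode}"


print(series_duplicate_key("The Great British Bake Off", 2, 5))
print(series_duplicate_key("The Great British Bake Off", None, None))
```

With a guard like this, a job whose season and episode could not be parsed is simply skipped by the smart check instead of colliding with every other episode of the same show.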
If you like our support, check our special newsserver deal or donate at: https://sabnzbd.org/donate
wilberfan
Newbie
Posts: 6
Joined: September 3rd, 2011, 9:44 am

Re: 4.2.0Beta1 Smart Duplicate detection

Post by wilberfan »

I noticed this issue last night for the first time. Is there a manual override, or any way to 'force' a download? I'm running inside Docker on a Synology setup. Is it safe to downgrade until the issue is resolved?

[edit] I created a new container from stable, so all seems well for now. ;D

Re: 4.2.0Beta1 Smart Duplicate detection

Post by goldeneyes »

You can change the smart detection to tag instead of pause in the settings, under the Switches tab.
djones
Newbie
Posts: 7
Joined: May 4th, 2023, 3:43 pm

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

I was happy to see duplicates finally getting some attention in the latest beta build, after so many years of waiting and seeing "just use Sonarr" as the reply whenever the issue was raised. The reality is that lots of RSS-fetched content, for example, falls outside what Sonarr handles.

I'm hoping the feature will continue to be fleshed out so that some basic user control (overrides, for example) is provided for Smart Duplicate detection, since right now I guess the feature is a black box. I *have* read how GuessIt works, as well as the Wiki page for SABnzbd's duplicate detection.

More immediately, I'm hoping that Smart Duplicate detection bypassing the pre-queue script gets addressed. I had a lot of renaming logic in a pre-queue Python script to clean up messy filenames, precisely so they would be easier to de-dupe (visually and manually) later.
Last edited by djones on December 4th, 2023, 9:26 pm, edited 2 times in total.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by safihre »

You can bypass by setting the job to Force, either when adding or when the job is in the queue.

@djones: What if we run the duplicate detection again after the pre-queue result, only if it changed the name?
That seems reasonable.
Or what else would you like to do from the pre-queue script that isn't possible?
Indeed GuessIt is a black box, also to me. It just works, or sometimes it doesn't...
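
A rough sketch of that idea, under stated assumptions: the function `recheck_after_prequeue` and the `is_duplicate` callback are hypothetical names, not SABnzbd internals. The duplicate check is only repeated when the pre-queue script actually returned a different name.

```python
from typing import Callable, Optional


def recheck_after_prequeue(
    original_name: str,
    prequeue_name: Optional[str],
    is_duplicate: Callable[[str], bool],
    first_result: bool,
) -> bool:
    """Re-run the duplicate check only when the pre-queue script renamed the job."""
    if not prequeue_name or prequeue_name == original_name:
        # No name returned, or the same name: keep the first verdict.
        return first_result
    return is_duplicate(prequeue_name)


# Toy stand-in for the queue/history lookup (an assumption for this demo).
queue_names = {"show.s01e01.1080p"}


def check(name: str) -> bool:
    return name.lower() in queue_names


print(recheck_after_prequeue("Show.S01E01.1080p-Obfuscated", "Show.S01E01.1080p", check, False))  # True
print(recheck_after_prequeue("Show.S01E02.1080p", None, check, False))  # False
```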

Re: 4.2.0Beta1 Smart Duplicate detection

Post by safihre »

@goldeneyes I see that I made a stupid mistake in smart duplicate detection, so it never works for shows. Very stupid.
You can download the fixed release here in a few minutes (does require a GitHub account).
Will release a new beta soon, just need to add automated testing so this stupid mistake doesn't happen again.

https://github.com/sabnzbd/sabnzbd/acti ... 7099019691

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

safihre wrote: December 5th, 2023, 4:49 am You can bypass by setting the job to Force, either when adding or when the job is in the queue.

@djones: What if we run the duplicate detection again after the pre-queue result, only if it changed the name?
That seems reasonable.
Or what else would you like to do from the pre-queue script that isn't possible?
Indeed GuessIt is a black box, also to me. It just works, or sometimes it doesn't...
That's an interesting idea. Can smart duplicate checking somehow always be active for the queue items, rather than checking only once when a new job comes in? Or maybe create a user option for how often to check (on a timer, let's say).

For your reference, here is a shortened version of the pre-queue script I was using, "shortened" meaning my actual script has 90+ rename/substitute lines:

prequeue.py

Code:

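import re
import sys

try:
    # The script expects exactly 19 values in sys.argv (script name plus 18 job fields).
    # Any other count makes this unpack raise ValueError.
    (scriptname, nzbname, postprocflags, category, script, prio, downloadsize,
     grouplist, showname, season, episodenumber, episodename, is_proper,
     resolution, decade, year, month, day, job_type) = sys.argv
    downloadsize = int(downloadsize)
except ValueError:
    sys.exit(1)    # exit with 1 causes SABnzbd to ignore the output of this script

# Collapse runs of dots, then normalise or strip common tags (case-insensitive).
# Dots in the patterns are escaped so they match a literal "." and not any character.
fwp = nzbname.replace('...', '.').replace('..', '.')
fwp = re.sub(r'(?i)\.4k', '.2160p', fwp)
fwp = re.sub(r'(?i)-Obfuscated$', '', fwp)
fwp = re.sub(r'(?i)\.READ\.NFO', '', fwp)
fwp = re.sub(r'(?i)\.com\.', '.', fwp)
fwp = re.sub(r'(?i)\.par2', '', fwp)

print("1")    # Accept
print(fwp)    # New job name
print()       # The remaining five output lines are left empty
print()
print()
print()
print()
# 0 means OK
sys.exit(0)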
Last edited by djones on December 10th, 2023, 3:56 pm, edited 1 time in total.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

One more request for feature add/change: a field to define Smart Duplicate detection bypass keywords.

Currently there is simply a checkbox "Allow proper releases" with PROPER, REAL or REPACK words hardcoded. Perhaps it could be changed to "Allow duplicate bypass" and a field for user defined keywords - and PROPER, REAL, REPACK could be pre-populated in the field just for continuity.

For example, I'd define "2160p" as a bypass keyword, because currently my RSS feeds will download a 1080p version of a video, and when a 2160p version enters the queue afterward it's marked as a duplicate.
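
To make the request concrete, here is a small sketch of what a user-defined bypass list could look like. It is purely illustrative: `has_bypass_keyword` is a made-up name, and this is plain token matching, not how SABnzbd or GuessIt detects PROPER/REAL/REPACK today.

```python
import re
from typing import Iterable


def has_bypass_keyword(job_name: str, keywords: Iterable[str]) -> bool:
    """True if any keyword appears in the job name as a whole token."""
    # Split on anything that is not a letter or digit, so "2160p" matches
    # in "Some.Video.2160p.WEB" but "REAL" does not match "Realistic".
    tokens = {t.lower() for t in re.split(r"[^A-Za-z0-9]+", job_name) if t}
    return any(k.lower() in tokens for k in keywords)


bypass = ["PROPER", "REAL", "REPACK", "2160p"]
print(has_bypass_keyword("Some.Video.2160p.WEB", bypass))  # True
print(has_bypass_keyword("Some.Video.1080p.WEB", bypass))  # False
```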
Last edited by djones on December 10th, 2023, 3:45 pm, edited 1 time in total.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by safihre »

@djones: In the new 4.2.0RC1 release I implemented that if the pre-queue script supplies a new job name, the Duplicate Detection is re-analysed.
Regarding your other request: that is really something that tools like Sonarr/Radarr are made for, they allow exactly such things.
The integration with TV/Movie databases makes them so much better at getting only 1 release even if the names don't really match, or updating a 1080p to 4K once it becomes available.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

safihre wrote: December 10th, 2023, 3:45 pm @djones: In the new 4.2.0RC1 release I implemented that if the pre-queue script supplies a new job name, the Duplicate Detection is re-analysed.
My issue is that the pre-queue py script I supplied has not worked since 4.2.0Beta1: the job name substitutions are no longer happening. It's unclear to me how to troubleshoot this. I've looked in the logs, but I'm not experienced enough to know what to look for.
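
One thing that may be worth ruling out, offered as a guess rather than a diagnosis: the script unpacks exactly 19 values from `sys.argv`. If any other number of arguments is passed, Python raises `ValueError`, the `except` branch exits with 1, and the output is ignored without a visible error. The sketch below reads only the job name positionally and reports the problem on stderr. The helper name `read_job_name` is hypothetical, and whether stderr ends up in the SABnzbd log is not confirmed here.

```python
import sys
from typing import List, Optional


def read_job_name(argv: List[str]) -> Optional[str]:
    """Return the job name (argv[1]) without assuming how many arguments follow."""
    if len(argv) < 2:
        print(
            f"pre-queue: expected a job name, got {len(argv) - 1} arguments",
            file=sys.stderr,
        )
        return None
    return argv[1]


# Works with the 19 values the original script expects ...
print(read_job_name(["prequeue.py", "Some.Job.Name"] + [""] * 17))
# ... and with fewer, where a fixed 19-name tuple unpack would raise ValueError.
print(read_job_name(["prequeue.py", "Some.Job.Name", "3", "tv"]))
```

Both calls print `Some.Job.Name`. Logging `len(sys.argv)` from the real script is a quick way to see whether the argument count is what the unpack expects.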
safihre wrote: December 10th, 2023, 3:45 pm Regarding your other request: that is really something that tools like Sonarr/Radarr are made for, they allow exactly such things.
The integration with TV/Movie databases makes them so much better at getting only 1 release even if the names don't really match, or updating a 1080p to 4K once it becomes available.
Understood. There's lots of non-TV/Movie content that Sonarr/Radarr don't handle, or I wouldn't be bothering to ask. VR content, for example. All I'm suggesting is a whitelist field instead of PROPER etc. being hardcoded.
Last edited by djones on December 10th, 2023, 4:43 pm, edited 1 time in total.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

Another idea to enhance user control of Smart Duplicate detection: take file size into account. Again, this is for the vast exabytes of Usenet content that Sonarr/Radarr do not support.

A dropdown in Switches with options: Keep smallest size, Keep largest size, etc

I always want the largest version because it's usually the highest resolution, and would consider anything smaller a dupe. Other people like the smallest version of a video for their mobile device, etc.
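
A sketch of how such a size policy could behave, assuming the jobs have already been judged to be the same content. The `Job` class, `pick_job_to_keep` and the policy names are all made up for illustration; nothing like this exists in SABnzbd as far as this thread shows.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    size_bytes: int


def pick_job_to_keep(candidates: List[Job], policy: str = "largest") -> Job:
    """Choose which job to keep among jobs judged to be the same content."""
    if not candidates:
        raise ValueError("no candidates")
    if policy == "largest":
        return max(candidates, key=lambda j: j.size_bytes)
    if policy == "smallest":
        return min(candidates, key=lambda j: j.size_bytes)
    raise ValueError(f"unknown policy: {policy}")


jobs = [Job("Video.1080p", 4_000_000_000), Job("Video.2160p", 15_000_000_000)]
print(pick_job_to_keep(jobs, "largest").name)   # Video.2160p
print(pick_job_to_keep(jobs, "smallest").name)  # Video.1080p
```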

Re: 4.2.0Beta1 Smart Duplicate detection

Post by safihre »

djones wrote: December 10th, 2023, 3:48 pm
safihre wrote: December 10th, 2023, 3:45 pm @djones: In the new 4.2.0RC1 release I implemented that if the pre-queue script supplies a new job name, the Duplicate Detection is re-analysed.
My issue is the pre-queue py script I supplied does not function anymore since 4.2.0Beta1, the jobname substitutions are no longer happening. Unclear how to troubleshoot, I've looked in the logs but not experienced enough to know what to look for.
Can you try 4.2.0RC1? I cannot reproduce this, my modified name from the pre-queue script is used.
If you enable Debug logging you can send the logs to me at [email protected]

Regarding the special keywords for override: the proper/real/etc. check is done by GuessIt, so it is not us checking specific keywords ourselves.

Re: 4.2.0Beta1 Smart Duplicate detection

Post by djones »

safihre wrote: December 11th, 2023, 2:39 am Can you try 4.2.0RC1? I cannot reproduce this, my modified name from the pre-queue script is used.
Yep, emailed you the logs. In the meantime, I worked around it by rewriting the Python script in PowerShell. It's jank since it has to be spawned by a Windows batch file, but it works (4.2.0RC1).
Scarfaro
Newbie
Posts: 10
Joined: February 28th, 2022, 5:28 am

Re: 4.2.0Beta1 Smart Duplicate detection

Post by Scarfaro »

The Smart Duplicate detection unfortunately ignores nzbs that are in subfolders. (NZB-Backup-Folder)

e.g. /incoming/nzb/M/Movie.gz

Could you please fix this?
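
For illustration only, this is roughly what a recursive scan of the backup folder could look like. The function names and the list of file endings are assumptions, and this is not how SABnzbd currently reads the folder.

```python
import tempfile
from pathlib import Path
from typing import Set

ARCHIVE_SUFFIXES = (".gz", ".bz2", ".nzb")  # assumed backup file endings


def job_name_from_file(path: Path) -> str:
    """Strip the assumed suffixes from the file name and lower-case the rest."""
    name = path.name
    stripped = True
    while stripped:
        stripped = False
        for suffix in ARCHIVE_SUFFIXES:
            if name.lower().endswith(suffix):
                name = name[: -len(suffix)]
                stripped = True
    return name.lower()


def backup_job_names(backup_dir: Path) -> Set[str]:
    """Collect job names from every file below backup_dir, at any depth."""
    return {job_name_from_file(p) for p in backup_dir.rglob("*") if p.is_file()}


# Demo with a temporary folder laid out like /incoming/nzb/M/Movie.gz
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "M").mkdir()
    (root / "M" / "Movie.gz").write_bytes(b"")
    print(backup_job_names(root))  # {'movie'}
```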
Last edited by Scarfaro on December 23rd, 2023, 8:52 am, edited 1 time in total.