
Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 6:44 am
by vdown
This is the first time I've ever seen this happen, and it only happens if you use the sample clean-up functionality. Apologies if this is a known bug, I couldn't find it in the forums.

If a video file name ends with an upper case S, the file is deleted as a sample file, e.g.:-

**********.*******.720p.WEB-DL.DD5.1.H.264-S

This happened on Ubuntu 14.10 with 0.7.20 and OSX 10.10 with 0.8.0Alpha2 and the main 1.3GB file was the only file in the RAR archive.

Log:-

2015-06-25 12:23:01,297::INFO::[postproc:356] unpack_magic finished on **********.*******.720p.WEB-DL.DD5.1.H.264-S
2015-06-25 12:23:01,299::INFO::[postproc:772] Removing unwanted sample file /Users/downloads**********.*******.720p.WEB-DL.DD5.1.H.264-S/**********.*******.720p.WEB-DL.DD5.1.H.264-S.mkv
2015-06-25 12:23:01,765::INFO::[postproc:539] Cleaning up **********.*******.720p.WEB-DL.DD5.1.H.264-S (keep_basic=False)

Re: Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 8:38 am
by sander
"Removing unwanted sample file" is printed by the function remove_samples(), which uses 'sample_match' as its regexp criterion:

Code:

sample_match = r'((^|[\W_])sample\d*[\W_])|(-s\.\w+$)' # something-sample.avi something-s.avi
So let's play with that regexp:

Code:

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:38)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> sample_match = r'((^|[\W_])sample\d*[\W_])|(-s\.\w+$)' # something-sample.avi something-s.avi
>>> RE_SAMPLE = re.compile(sample_match, re.I)

>>> RE_SAMPLE.search('hello there.avi')
>>> RE_SAMPLE.search('hello there-s.avi')
<_sre.SRE_Match object at 0xb7485200>
>>> RE_SAMPLE.search('hello there-S.avi')
<_sre.SRE_Match object at 0xb7485250>
>>> RE_SAMPLE.search('hello there-SOB.avi')
>>> RE_SAMPLE.search('hello thereS.avi')
So indeed a "-s" before the extension is considered to be a sample file; see the regexp code "(-s\.\w+$)".

As it is explicitly programmed, there must be a reason for that. Hopefully Shypike knows that reason.

You as an Ubuntu user can of course remove that piece of the regexp from the file sabnzbd/constants.py source code.
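
A minimal sketch of what that edit could look like, assuming the only change is dropping the "-s" alternative from the pattern quoted above. The names WITHOUT_DASH_S and is_sample are illustrative only and are not taken from sabnzbd/constants.py:

```python
import re

# Pattern exactly as quoted earlier in this post.
ORIGINAL = r'((^|[\W_])sample\d*[\W_])|(-s\.\w+$)'

# Hypothetical edited pattern: the "(-s\.\w+$)" alternative is removed,
# so only names containing a delimited "sample" still match.
WITHOUT_DASH_S = r'((^|[\W_])sample\d*[\W_])'


def is_sample(name, pattern):
    """Return True if the file name matches the given sample pattern."""
    return re.search(pattern, name, re.I) is not None


if __name__ == '__main__':
    for name in ('hello there-S.avi', 'hello there-sample.avi', 'hello there.avi'):
        print(name, is_sample(name, ORIGINAL), is_sample(name, WITHOUT_DASH_S))
```

With the shortened pattern, a release name ending in "-S" is no longer flagged, while "something-sample.avi" still is.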

HTH

Re: Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 1:08 pm
by ALbino
I'm glad you posted this, as I had the same thing happen yesterday (probably the same file). I've never seen it before either, so it was a first for me as well.

Re: Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 1:13 pm
by shypike
sander wrote: So indeed a "-s" before the extension is considered to be a sample file; see the regexp code "(-s\.\w+$)".
As it is explicitly programmed, there must be a reason for that. Hopefully Shypike knows that reason.
Added at a time when a lot of stuff like "my movie-s.avi" was posted as being a sample.
Maybe that should be removed again.
To be honest, I think the whole sample removal wasn't such a good idea.
In 0.8.0 the "don't download samples" option has been removed, because it leads to problems
when posters put the sample file(s) in the same par-set as the payload.

So what should we do? Remove the "-s" thing?

Re: Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 2:03 pm
by ALbino
This is the first time I've ever seen it, so I would either just leave it as it is and call it a fluke, or I would just remove only the -s thing. Everything else seems to work fine 99% of the time.

Re: Final file detected as a sample and deleted in error.

Posted: June 25th, 2015, 2:38 pm
by safihre
Remove just the -s, there's probably still people out there using this!

Re: Final file detected as a sample and deleted in error.

Posted: June 26th, 2015, 2:50 am
by an3k
I also would say "remove the -s", and once you're back and we all have more time we can improve the "remove sample" code so that it deletes samples only if they are not the only (video) file. As I told you I'm not only interested in maintaining Plush but also in helping you guys with the code. Btw. when are you coming back? :)
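
A rough sketch of the "only if it's not the only video file" idea, not actual SABnzbd code. The function name, the VIDEO_EXTS set, and the use of the pattern without "-s" are all assumptions made for illustration:

```python
import os
import re

# Assumed pattern: the quoted sample_match with the "-s" alternative removed.
RE_SAMPLE = re.compile(r'((^|[\W_])sample\d*[\W_])', re.I)

# Assumed list of video extensions; a real list would likely be longer.
VIDEO_EXTS = {'.mkv', '.avi', '.mp4', '.m2ts'}


def samples_safe_to_remove(filenames):
    """Return the sample-looking video files, or nothing if removing
    them would leave the job without any video file at all."""
    videos = [f for f in filenames
              if os.path.splitext(f)[1].lower() in VIDEO_EXTS]
    samples = [f for f in videos if RE_SAMPLE.search(os.path.basename(f))]
    if len(samples) == len(videos):
        # Every video looks like a sample: keep them all rather than
        # delete what may be the actual payload.
        return []
    return samples
```

With this guard, a job containing a single video file never loses it, whatever its name looks like.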

Re: Final file detected as a sample and deleted in error.

Posted: July 2nd, 2015, 2:22 pm
by shypike
Removed -s in Alpha3.

Re: Final file detected as a sample and deleted in error.

Posted: July 4th, 2015, 6:35 am
by zoggy
i still see a ton of stuff that use -s for their samples.. but ideally one should not rely on a name.. look at the runtime of the file and decide off that..

Re: Final file detected as a sample and deleted in error.

Posted: July 4th, 2015, 7:53 am
by shypike
zoggy wrote: look at the runtime of the file and decide off that..
Can you explain this further?

Re: Final file detected as a sample and deleted in error.

Posted: July 4th, 2015, 2:05 pm
by ALbino
I think maybe he means do a mediainfo on the files and if one is only a minute in length and the other is longer, then declare that smaller one the sample and remove it.

Re: Final file detected as a sample and deleted in error.

Posted: July 4th, 2015, 6:39 pm
by zoggy
TV rules:
Samples:
- REQUIRED!
- 50-70 seconds in length and in a separate folder marked as Sample
Bluray:
- SAMPLE
S1) A sample roughly 60 seconds in length is required.
It must be cut from the movie m2ts file and not encoded separately.
BDr:
- SAMPLE
S1) A sample of 1 minute, to a maximum size of 250 MB, cut from the
release is required in a 'Sample' directory. A .m2ts file must
be cut and used as sample. You cannot rename a rar and use it as
the sample.
uk-tv:
- Sample:
- All releases must include a sample of length :50-1:10 cut
older sd xvid rule:
[ Samples ]
9.1) It is required to include a 50-70 seconds sample for each release.
etc etc.. anyways so if you just had a default threshold of 90secs, if you see that the runtime is less than that.. pretty safe to assume it's a sample.
several apps have been using this method for a while and it works great... since rarely is there a movie or tv show that would be 90 secs or less.. even some of the smallest tv shows are still 3mins+ (youtube/one off things).
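
roughly, the check could look like the sketch below. it assumes the runtime in seconds has already been read by some metadata tool (that part is not shown), and the function name and the "keep the file when the runtime is unknown" policy are made up for illustration:

```python
# Default threshold from the suggestion above: anything shorter than
# 90 seconds is treated as a sample.
SAMPLE_MAX_SECONDS = 90


def is_probable_sample(runtime_seconds, max_seconds=SAMPLE_MAX_SECONDS):
    """Return True when a known runtime is below the threshold.

    runtime_seconds may be None if the duration could not be read.
    Assumed policy: never flag a file as a sample on missing data.
    """
    if runtime_seconds is None:
        return False
    return runtime_seconds < max_seconds
```

a 60 second clip would be flagged, a 45 minute episode would not, and a file with an unreadable runtime is left alone.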

sonarr uses 90 sec or 70mb as its thresholds = https://github.com/Sonarr/Sonarr/blob/d ... tSample.cs

in sb, we used hachoir metadata to get the info.. sadly sb just never does anything with it though...
http://hachoir3.readthedocs.org/metadata.html

im about to run off to work but i can provide more info later