What is the easiest way to measure retention? I'm considering doing up a little chart of some usenet providers, + my own home provider
and I'm interested in measuring it myself rather than going with (possibly nonexistent/outdated) info published on websites.
thanks for any ideas!
How to measure retention?
Re: How to measure retention?
See what the host claims their retention is, attempt to download something that old, then continue going either up or down in days until you find the edge.
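The "go up or down until you find the edge" idea can be done as a bisection, which keeps the number of test downloads small. What follows is only a sketch: `has_article` is a hypothetical probe (not something from SABnzbd) that you would have to implement yourself, for example by trying to fetch a known post of roughly that age, and the code assumes that anything younger than a retrievable post is also retrievable.

```python
from typing import Callable


def find_retention_edge(has_article: Callable[[int], bool],
                        claimed_days: int,
                        max_days: int = 5000) -> int:
    """Return the largest age (in days) for which the probe still succeeds.

    has_article(age) is a hypothetical probe: True if a post of that age
    can still be retrieved from the server under test.
    """
    if has_article(claimed_days):
        # The claim holds, so the edge lies somewhere above it.
        lo, hi = claimed_days, max_days + 1
    else:
        # The claim does not hold, so the edge lies below it.
        lo, hi = 0, claimed_days
    # Invariant: lo is treated as retrievable, hi as not retrievable.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if has_article(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Simulated server that keeps posts for 700 days but claims 800.
edge = find_retention_edge(lambda age: age <= 700, claimed_days=800)
print(edge)
```

A real server will rarely show such a clean edge, so treat the result as an estimate.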
Testing this isn't fun, but good luck!
Re: How to measure retention?
I've been thinking about the same. Methods to do it:
Passively:
1) Analyze SAB's log files; SAB knows the age of a post, and knows which servers don't have the post ("missing").
(BTW: I don't see positive mentions in the log file ... I don't know if you can turn that on. Even "+Debug" does not show positive hits.)
So ... analyze "missing" lines like:
Code: Select all
Thread [email protected]:119: Article [email protected] missing
Combined with the age of the post, you now have an indication that the retention of news.lightningusenet.com is lower than the age of the post. If all SAB users together (meaning: SAB itself, with the user's consent) posted this anonymously, we would get a good overview.
Special attention should be paid to "missing from all servers": it could mean the post is older than the retention of all servers you have enabled, *or* the post is simply not there at all, as in "never posted". In the latter case, it should be left out of the retention determination.
@Shypike: is there a way to get positive hits into the log file (and not only "missing")? Or is it already there, and am I overlooking it? With positive hits, you get a better overview of what's going on.
Or, actively:
2) Post a 10MB file each day (or week), with a unique per-day (or per-week) identifier like "retention-20110401.bin". Then, each day (or week), try to retrieve old posts from different newsservers. Post the results.
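As an illustration of the passive method in 1), here is a rough sketch of collecting "missing" lines per server and leaving out articles that are missing from every enabled server. The line layout `Thread <n>@<server>:<port>: Article <id> missing` is an assumption based on the example line above, and the server names and article IDs in the sample are made up.

```python
import re
from collections import defaultdict

# Assumed layout of a "missing" line; adjust it to what your sabnzbd.log shows.
MISSING_RE = re.compile(
    r"Thread \d+@(?P<server>[^:\s]+):\d+: Article (?P<article>\S+) missing"
)


def missing_by_server(log_lines):
    """Map each server name to the set of article IDs reported missing."""
    missing = defaultdict(set)
    for line in log_lines:
        match = MISSING_RE.search(line)
        if match:
            # Strip angle brackets so IDs compare equal to those in an NZB.
            missing[match.group("server")].add(match.group("article").strip("<>"))
    return missing


def drop_missing_everywhere(missing, enabled_servers):
    """Remove articles missing on all enabled servers (possibly never posted)."""
    if not enabled_servers:
        return {server: set(ids) for server, ids in missing.items()}
    everywhere = set.intersection(
        *(missing.get(server, set()) for server in enabled_servers)
    )
    return {server: ids - everywhere for server, ids in missing.items()}


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    "Thread 1@news.example.com:119: Article <a@poster.example> missing",
    "Thread 2@news.example.net:119: Article <a@poster.example> missing",
    "Thread 1@news.example.com:119: Article <b@poster.example> missing",
    "some unrelated log line",
]

found = missing_by_server(SAMPLE_LOG)
usable = drop_missing_everywhere(found, ["news.example.com", "news.example.net"])
print(usable)
```

Note that with only one server enabled, every missing article counts as "missing from all servers", so nothing usable remains; that matches the caveat above.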
Last edited by sander on April 1st, 2011, 2:16 am, edited 1 time in total.
Please don't send me unrequested PM's; the forum is the best way to communicate.
If someone helps you, please reply to that help.
If you like our support, check our special newsserver deal or donate at: https://sabnzbd.org/donate
Re: How to measure retention?
@shypike:
Question 1:
Can I log a "positive" hit (i.e. article successfully downloaded) somewhere in the SAB source? I tried adding "logging.error('SJ-code')" in downloader.py around line 566, but it prints many lines per article ...
Question 2:
The date of the post is not in sabnzbd.log, or is it? Do I need to search the .NZB.gz in the nzb-backup-dir to find the date?
Re: How to measure retention?
Q1: The decoder already logs positive hits. Or do you want the specific server?
Q2: That's right, the age isn't logged.
Don't go overboard with this.
You'll never get an exact figure anyway.
The actual deletion policy is very likely to be determined by available space
instead of the age of the article.
Re: How to measure retention?
In the meantime I've written a proof of concept. It first parses the NZB-backup directory to store all article IDs and their post dates in a hash. Then it parses the sabnzbd.log* files looking for "missing", and constructs the statistics based on that.
The result is here:
Code: Select all
sander@lifebook:~/.sabnzbd$ python retention-determination.py | sort | uniq -c
1242 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 217
79 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 33
36 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 836
13 newsserver X retention probably less than Y days: reader.ipv6.xsnews.nl 836
sander@lifebook:~/.sabnzbd$
So:
newszilla6's retention is somewhat less than 33 days
ipv6.xsnews' retention is somewhat less than 836 days
With more downloads (each with a different age), the retention info will become more specific. And indeed there is probably no fixed cut-off date, but this will give an indication of the retention. Even the completeness will become clear.
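The proof-of-concept script itself is not shown in the thread. A minimal sketch of its first step (article ID to post date) might look like the following, assuming the usual NZB layout where each `<file>` element carries a `date` attribute (Unix timestamp) and each `<segment>` element contains a message ID. The function names and the sample NZB are made up for illustration.

```python
import gzip
import os
import time
import xml.etree.ElementTree as ET


def _local(tag):
    """Drop an XML namespace prefix such as '{http://...}file'."""
    return tag.rsplit("}", 1)[-1]


def article_dates(nzb_bytes):
    """Return {article_id: post_timestamp} for one NZB document."""
    dates = {}
    for elem in ET.fromstring(nzb_bytes).iter():
        if _local(elem.tag) != "file":
            continue
        posted = int(elem.get("date", 0))
        for seg in elem.iter():
            if _local(seg.tag) == "segment" and seg.text:
                dates[seg.text.strip().strip("<>")] = posted
    return dates


def load_backup_dir(path):
    """Collect article dates from every .nzb.gz file in a directory."""
    dates = {}
    for name in os.listdir(path):
        if name.lower().endswith(".nzb.gz"):
            with gzip.open(os.path.join(path, name), "rb") as handle:
                dates.update(article_dates(handle.read()))
    return dates


def age_in_days(posted, now=None):
    """Age of a post in whole days, relative to now (defaults to current time)."""
    now = time.time() if now is None else now
    return int((now - posted) // 86400)


# Made-up NZB content, for illustration only.
SAMPLE_NZB = b"""<nzb xmlns="http://www.newzbin.com/DTD/2003/nzb">
  <file poster="someone" date="1300000000" subject="example (1/2)">
    <segments>
      <segment bytes="100" number="1">part1@poster.example</segment>
      <segment bytes="100" number="2">part2@poster.example</segment>
    </segments>
  </file>
</nzb>"""

dates = article_dates(SAMPLE_NZB)
print(dates)
```

Looking up each "missing" article ID from the log in this dictionary then gives the age that goes into the "retention probably less than Y days" lines.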
Last edited by sander on April 1st, 2011, 9:04 am, edited 1 time in total.
Re: How to measure retention?
shypike wrote: Q1: the decoder already logs positive hits or do you want the specific server?
How can I see a positive article hit? As far as I can see, with a clean download there's no article id mentioned in the log file at all.
I do see "oot.105.1/oot.105-diff.r02", but that's the file, not the article id, right?
EDIT:
The server name would be very nice, as it would be a positive signal about retention, but if not, I have to rely on "missing". I just want to make sure I'm not looking at a non-posted article, and blaming a specific newsserver for that.
Last edited by sander on April 1st, 2011, 9:16 am, edited 1 time in total.
Re: How to measure retention?
Interesting:
Code: Select all
sander@lifebook:~/.sabnzbd$ python retention-determination.py | sort | uniq -c
2 newsserver X retention probably less than Y days: newsreader3.eweka.nl 710
1242 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 217
79 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 33
11 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 541
829 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 710
36 newsserver X retention probably less than Y days: newszilla6.xs4all.nl 836
83 newsserver X retention probably less than Y days: reader.ipv6.xsnews.nl 710
14 newsserver X retention probably less than Y days: reader.ipv6.xsnews.nl 836
sander@lifebook:~/.sabnzbd$
From the above, I think I can deduce that
1) I tried to download posts with ages of 33, 217, 541, 710 and 836 days (and maybe younger, but that's not logged)
2) newszilla6's retention is probably below 33 days
3) ipv6.xsnews apparently can handle posts of 33, 217 and 541 days (and not of 710 and 836 days). So retention is between 541 and 710 days
4) eweka has only 2 missing 710-day articles, and *no* missing 836-day articles. I would say retention is above 836 days ... Interesting.
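The reasoning in 2) to 4) can be written down as a small rule: the oldest tested age without any misses is a lower bound, and the youngest missing age above that is an upper bound. The sketch below applies it to the counts from the output above. `retention_bounds` is a made-up helper name, and the rule relies on the assumption that every server was actually tried for every age, which the log does not confirm because positive hits are not logged per server.

```python
def retention_bounds(tested_ages, missing_counts):
    """Return (lower, upper) retention bounds in days for one server.

    missing_counts maps a post age in days to the number of missing articles.
    None means there is no bound on that side.
    """
    clean = [age for age in tested_ages if missing_counts.get(age, 0) == 0]
    lower = max(clean, default=None)
    # Misses younger than a clean age are treated as incompleteness, not retention.
    older_misses = [
        age for age, count in missing_counts.items()
        if count > 0 and (lower is None or age > lower)
    ]
    upper = min(older_misses, default=None)
    return lower, upper


TESTED_AGES = [33, 217, 541, 710, 836]

# Missing-article counts per age, taken from the output above.
MISSING = {
    "newsreader3.eweka.nl": {710: 2},
    "newszilla6.xs4all.nl": {217: 1242, 33: 79, 541: 11, 710: 829, 836: 36},
    "reader.ipv6.xsnews.nl": {710: 83, 836: 14},
}

bounds = {server: retention_bounds(TESTED_AGES, counts)
          for server, counts in MISSING.items()}
for server, (lower, upper) in bounds.items():
    print(server, "lower:", lower, "upper:", upper)
```

With these numbers the rule gives the same picture as the list above: newszilla6 below 33 days, ipv6.xsnews between 541 and 710 days, and eweka above 836 days.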
Re: How to measure retention?
bluenote wrote: What is the easiest way to measure retention? I'm considering doing up a little chart of some usenet providers, + my own home provider
and I'm interested in measuring it myself rather than going with (possibly nonexistent/outdated) info published on websites.
thanks for any ideas!
@bluenote: Which operating system do you use: Linux, Mac or Windows?
Re: How to measure retention?
Hey sander
that's some impressive stuff you've got going there:)
I use Windows though ..
thx