Re: Autodownload script

Posted: February 18th, 2009, 11:01 pm
by bnevets27
Thanks for the heads up on the new thread. Definitely need some organization of these scripts. It's hard to keep track of updates and tweaks to existing scripts.

I tried version 0.88 too but get the same error: "FAILED PROCESSING".

Should I be posting here or in the new thread from now on??

Also, how do I get an output log? I can watch it run in the command prompt, but I have a feeling I'm going to need to output a log. Bear with me, this is the first time I've worked with scripts.

More info I forgot to include, I'm using nzbs.org if that matters.

Re: Autodownload script

Posted: February 18th, 2009, 11:30 pm
by rollingeyeball
New thread :P copy paste those questions there too.

Re: Autodownload script

Posted: February 19th, 2009, 4:16 am
by slippers_123
Thanks bnevets27!  I will have a play this weekend.  Many thanks.

Re: Autodownload script

Posted: February 19th, 2009, 12:24 pm
by BrandonG777
Ok, here's my contribution to this thread. Mine is for newzbin. The changes: it maintains a history file, verifies that the movie hasn't been downloaded previously by checking both the history file and the dirs for existing movies, looks for a watch file instead of watch dirs, and supports a username and password for sabnzbd. This is my first time dealing with Python, so there are probably better ways to handle these things, but it appears to be working.

Code: Select all

#!/usr/bin/python

import urllib
import xml.dom.minidom
import re
import time
import os
import sys

##########
# MOVIES #
##########

################################### change to your own values #####################################

# set this to the location of your downloads.  the first entry must be your live download area
movie_dirs=[ "/mnt/disk1/share/nzb/Movies", "/mnt/disk1/share/Movies" ]

# history file (you must create a blank file first)
history_file="/home/brandon/.movies.dat"

# a file holding strings to match against movies that you want no matter what score they get
watch_file="/mnt/disk1/share/nzb/watchfile.txt"

# your SABnzbd address and credentials
my_SAB="http://hostname:8000"
username="brandon"
password="******"

# the RSS link to your search
url="newzbinurl"

################################### change to your own values #####################################

def removeIllegalChars(origstring):
	if sys.platform == "win32":
		# raw string so that the backslash itself is replaced as well as ? and ,
		newstring=re.sub(r"[\\?,]", ".", origstring)
		newstring=re.sub("[:]", ";", newstring)
	else:
		newstring=re.sub("[/?,]", ".", origstring)
#	print origstring+" becomes "+newstring
	return newstring
	

movie_list = []
for dir in movie_dirs:
	for movie in os.listdir(dir):
		movie_list.append(movie)
# Read History File
filein = open(history_file,"r")
for line in filein.readlines():
	# drop the trailing newline so entries compare cleanly
	movie_list.append(line.rstrip("\n"))
filein.close()
print str(len(movie_list))+" movies already downloaded"

watch_list = []
watch_filein = open(watch_file,"r")
for line in watch_filein.readlines():
	# drop the trailing newline so it does not end up in the printed message
	watch_list.append(line.rstrip("\n"))
watch_filein.close()
print "watching for "+str(len(watch_list))+" movies"

dom=xml.dom.minidom.parse(urllib.urlopen(url))
# go through each entry returned from the above newzbin search
for node in dom.getElementsByTagName("item"):
	
	report_title=node.getElementsByTagName("title")[0].childNodes[0].data
	print time.strftime("%Y-%m-%d %H:%M:%S")+": "+ report_title

	optimized_movie_title = report_title.split(" (")[0]
	if re.match(re.compile("[0-9][0-9][0-9][0-9]\)"), report_title):
		if re.match(re.compile("[0-9][0-9][0-9][0-9]\)"), report_title.split(" (")[1]):
			movie_year = report_title.split(" (")[1].split(")")[0]
	#optimized_movie_title = movie_title

	legal_movie_title = removeIllegalChars(report_title)
						
	more_info=node.getElementsByTagName("report:moreinfo")[0].childNodes[0].data
	msgid=node.getElementsByTagName("report:id")[0].childNodes[0].data
	
	# is there an imdb link?  if there is, follow it and check the score
	if re.compile("http://www\.imdb\.com.*").match(more_info):
		
		sock = urllib.urlopen(more_info)
		htmlSource = sock.read()                     
		sock.close()
		
		if re.compile("<b>[7-9]\.[0-9]/10</b>").search(htmlSource) and re.compile(">[0-9]*,[0-9][0-9][0-9] votes<").search(htmlSource):
			download = 1
		else:
			download = 0

		# doesn't get a good imdb score, but am I watching for it?
		if download == 0:
			for watchfile in watch_list:
				if re.match(optimized_movie_title, watchfile):
					print optimized_movie_title+" is not very good but according to watch keyword \""+watchfile+"\" you want it - you're the boss!"
					download = 1
					break

			
		if download == 1:
			# it's worth a download but do we have it already?
			regex_movie_title_string = removeIllegalChars(optimized_movie_title)
			regex_movie_title = re.compile(regex_movie_title_string)
			regex_movie_year = re.compile(".*\([0-9][0-9][0-9][0-9]\).*")
			print "Checking to see if we have : \"" + regex_movie_title_string+"\"",
			for movie_file in movie_list:
				if re.match(regex_movie_title, movie_file) and re.match(regex_movie_year ,movie_file):
					print " - yes"
					download = 0
					break

		
			if download == 1:
				print " - no"

				# we want it, we don't have it, but is it in the queue already?
				queue=xml.dom.minidom.parse(urllib.urlopen(my_SAB+"/sabnzbd/api?mode=qstatus&output=xml&ma_username="+username+"&ma_password="+password))
				for job in queue.getElementsByTagName("job"):
					filename=job.getElementsByTagName("filename")[0].childNodes[0].data
					regex_filename_string = removeIllegalChars(optimized_movie_title)
					regex_filename = re.compile(regex_filename_string)
					if re.match(regex_filename, filename):
						download = 0

			if download == 1:
				print time.strftime("%Y-%m-%d %H:%M:%S")+": ++++++ added "+ optimized_movie_title+" ++++++"
				# this will prevent the same movie being downloaded twice in this session
				#os.makedirs(movie_dirs[0]+"\\"+legal_movie_title)
				# add this title to our view of what movies we have
				movie_list.append(legal_movie_title)
				
				# Append movie to history file
				fileout = open(history_file, "a")
				fileout.write( legal_movie_title+ "\n" )
				fileout.close()
				urllib.urlopen(my_SAB+"/sabnzbd/api?mode=addid&name="+msgid+"&ma_username="+username+"&ma_password="+password)
				

dom.unlink()

Re: Autodownload script

Posted: February 22nd, 2009, 4:35 am
by rollingeyeball
Brandon, is there a bug in

Code: Select all

if re.compile("<b>[7-9]\.[0-9]/10</b>").search(htmlSource) and re.compile(">[0-9]*,[0-9][0-9][0-9] votes<").search(htmlSource):
Because for example, it downloaded a 7.8 rated movie, but not an 8.8 rated one.

Also, what are the size / quality limits?

I'm going to try this with the regex from binhex' script and see how she blows :D

EDIT: My bad. <1000 Votes..
I might change it to 600 or something, since I'm aiming at docos, and I doubt they'd be voted on the same way.
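
For anyone wanting to change the vote cutoff without editing the regex, here is a minimal sketch (Python 3, not part of Brandon's script) that pulls the rating and the vote count out as numbers and compares them to plain thresholds. It assumes the page still carries the `<b>7.8/10</b>` and `>12,345 votes<` fragments that the regex above looks for; `parse_rating_and_votes` and `is_good_enough` are hypothetical names.

```python
import re

# Patterns assume the HTML fragments targeted by the regex in the script above.
RATING_RE = re.compile(r"<b>([0-9]\.[0-9])/10</b>")
VOTES_RE = re.compile(r">([0-9][0-9,]*) votes<")


def parse_rating_and_votes(html):
    """Return (rating, votes) as numbers, or None if either is missing."""
    rating = RATING_RE.search(html)
    votes = VOTES_RE.search(html)
    if not rating or not votes:
        return None
    return float(rating.group(1)), int(votes.group(1).replace(",", ""))


def is_good_enough(html, min_rating=7.0, min_votes=1000):
    """Compare the parsed numbers against plain, easy-to-change thresholds."""
    parsed = parse_rating_and_votes(html)
    if parsed is None:
        return False
    rating, votes = parsed
    return rating >= min_rating and votes >= min_votes


# Made-up page fragment: a high rating but fewer than 1000 votes.
sample = "<b>8.8/10</b> <a href='ratings'>850 votes</a>"
print(is_good_enough(sample))                 # default cutoff of 1000 votes
print(is_good_enough(sample, min_votes=600))  # lower cutoff, e.g. for docos
```

With numeric thresholds, dropping the cutoff to 600 is a one-argument change rather than a rewrite of the pattern.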

EDIT2: Fixed, I think.
My question on size remains though. I want only xvid, ~700mb movies, but I think this downloads everything?
Even though the rss was xvid only I see 8gb items here.. hmmm.
I feel an edit3 coming on.

EDIT3:
Ohkai. Mostly sorted I thought. Added some stuff from binhex' to fetch only 2000+ movies
Some issues though..

#1

Code: Select all

Traceback (most recent call last):
  File "E:\Python26\brandonmovies.py", line 76, in <module>
    more_info=node.getElementsByTagName("report:moreinfo")[0].childNodes[0].data
IndexError: list index out of range
It gets a fair way through the feed, but stops at a particular one. I can't see anything of relevance that's different with that one (yet).
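
One likely cause (a guess, not confirmed against the feed): the item that stops the script has no `report:moreinfo` element, so `getElementsByTagName(...)` returns an empty list and `[0]` raises the `IndexError`. A minimal Python 3 sketch of a guard, using a made-up two-item feed; `first_text` is a hypothetical helper and the namespace URI is a placeholder.

```python
import xml.dom.minidom

# Made-up feed: the second item has no report:moreinfo element.
SAMPLE_FEED = """<rss xmlns:report="http://example.invalid/report">
<channel>
<item><title>Movie A (2008)</title>
<report:moreinfo>http://www.imdb.com/title/tt0000001/</report:moreinfo></item>
<item><title>Movie B (2009)</title></item>
</channel></rss>"""


def first_text(node, tag):
    """Return the text of the first <tag> child, or None if absent or empty."""
    elements = node.getElementsByTagName(tag)
    if not elements or not elements[0].childNodes:
        return None
    return elements[0].childNodes[0].data


dom = xml.dom.minidom.parseString(SAMPLE_FEED)
results = []
for item in dom.getElementsByTagName("item"):
    title = first_text(item, "title")
    more_info = first_text(item, "report:moreinfo")
    if more_info is None:
        # Skip this item instead of crashing the whole run.
        results.append((title, "skipped"))
        continue
    results.append((title, more_info))
dom.unlink()
print(results)
```

The same guard would apply to `report:id`, which is read with the same indexing pattern in the script.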

#2
Any ideas from anyone to only get true docos?
The doco attribute on newzbin appears to be like .. well - it was on tombraider..
I do not want tombraider.

Re: Autodownload script

Posted: February 27th, 2009, 11:01 am
by bnevets27
How is the watch file supposed to be formatted? Is it just a straight list of movie names?
Like:

movie1
movie2
etc.....
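
Going by Brandon's script, which reads the watch file line by line and runs `re.match(title, line)` against each one, a straight list with one title per line looks right, and each line has to begin with the title as it appears on newzbin. A sketch of that reading in Python 3; `load_watch_list` and `is_watched` are hypothetical names, and a literal prefix comparison stands in for the script's regex match.

```python
import os
import tempfile


def load_watch_list(path):
    """Read one title per line, dropping newlines and blank lines."""
    with open(path, "r") as handle:
        return [line.strip() for line in handle if line.strip()]


def is_watched(title, watch_list):
    """True if any watch line begins with the title (literal comparison)."""
    return any(entry.startswith(title) for entry in watch_list)


# Demo with a throwaway watch file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("movie1\nmovie2\n\n")
watch_list = load_watch_list(tmp.name)
os.remove(tmp.name)

print(watch_list)
print(is_watched("movie2", watch_list))
print(is_watched("movie3", watch_list))
```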

A really nice complement to the watch file would be a Firefox plugin (maybe possible through a userscript?) where you could right-click anywhere on an IMDB page and, in the context menu, click on something like "send to watch file". This would be a lot easier than typing up a file.

EDIT:
Well, it looks like if you register with IMDB you can create a list called "my movies". If a script could scrape that list, I think this would be VERY neat. You can make the list public if that makes it easier to scrape. I'll keep looking to see if something similar has been done already. But if this could sneak its way into this script..... :)

Re: Autodownload script

Posted: February 27th, 2009, 10:31 pm
by pilGrim
I am back after a few weeks of travel. 

For those of you not using newzbin, here is an update of the MOVIE DOWNLOAD script minimad posted for me sometime last week.  The Tomato Rating score was not working right, and I simplified the code that checks for year and Tomato Rating, eliminating the regex so the logic is easy to see and change.  The regex stuff gets complicated.  Next I will change the way the IMDB rating works, eliminating the regex there too, so the IMDB score and number of votes are simple to change for those not familiar with regex.

Very little original programming here, just a lot of blending from previous scripts.  Thanks to all who contributed to non-newzbin scripts in previous postings.

Features are:

- Works with nzbs.org, best to create a rss feed and then process by feedburner
- Checks to see if movie gets a > 75 score with > 1,000 votes from IMDB or gets > 75 rating from rotten tomatoes
- Year check, currently set to: is the movie newer than 1995? (easy to change, no regex)
- Do you already have the movie?  checks your folder list to see if there is a folder with movie name, uses IMDB name
- Are you looking for the movie?  watch list, just create a folder in a movie watch directory, will download even if score is less than 75 or older than year check
- is movie in queue already?
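
The checks in that list might combine roughly like this. This is a sketch in Python 3, not the pastebin script itself: `Candidate`, `should_download` and all field names are made up for illustration, the IMDB score is assumed to be on the same 0-100 scale as the Tomato rating, and the order of the checks is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one feed entry."""
    title: str
    year: int
    imdb_score: int      # assumed 0-100 scale, e.g. 7.8/10 -> 78
    imdb_votes: int
    tomato_rating: int   # 0-100


def should_download(movie, owned, queued, watched,
                    min_score=75, min_votes=1000, min_year=1995):
    """Apply the checks from the feature list in one place."""
    if movie.title in owned or movie.title in queued:
        return False  # already have it, or it is already in the queue
    if movie.title in watched:
        return True   # watch list overrides both score and year
    if movie.year <= min_year:
        return False  # not newer than the cutoff year
    good_imdb = movie.imdb_score > min_score and movie.imdb_votes > min_votes
    good_tomato = movie.tomato_rating > min_score
    return good_imdb or good_tomato


new_hit = Candidate("New Hit", 2008, 82, 25000, 60)
old_classic = Candidate("Old Classic", 1972, 92, 400000, 98)

print(should_download(new_hit, owned=set(), queued=set(), watched=set()))
print(should_download(old_classic, owned=set(), queued=set(), watched=set()))
print(should_download(old_classic, owned=set(), queued=set(),
                      watched={"Old Classic"}))
```

Keeping the thresholds as plain keyword arguments is what makes the year and score easy to change without touching any regex.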

Script is located here: http://pastebin.com/ff6429c3

Enjoy.

pilGrim

Re: Autodownload script

Posted: April 15th, 2009, 6:43 am
by jdraughn
Ok, the links to the scripts are both gone and skimming through the 14 pages of replies it's looking like people took the original scripts, tweaked them and posted their versions. Is that what happened? Can the author or a moderator edit the original post with new links or remove the dead links or something?

Re: Autodownload script

Posted: February 23rd, 2010, 3:43 am
by MrAcid
I'd love to try this script but it seems to be gone..?

Anybody who cares to share? Thanks a lot in advance!

Re: Autodownload script

Posted: February 23rd, 2010, 11:22 am
by shypike
The pastebin link still works.

Re: Autodownload script

Posted: February 24th, 2010, 2:50 pm
by rollingeyeball
MrAcid, if you search here for 'MediaRover', it's a similar script that is actively maintained, and makes this obsolete, in my opinion.
- For TV shows anyway.

Re: Autodownload script

Posted: February 26th, 2010, 8:26 am
by binhex
quick heads up guys, for anybody creating automated scripts which hit imdb they are now actively blocking them, you can of course get around this by faking your agent string :-)

Re: Autodownload script

Posted: February 28th, 2010, 6:11 am
by doubledrat
binhex wrote: quick heads up guys, for anybody creating automated scripts which hit imdb they are now actively blocking them, you can of course get around this by faking your agent string :-)
thanks for the heads-up.  for the lazy amongst us ;) just what do we need to do to fake the agent string in our python code?

Re: Autodownload script

Posted: March 1st, 2010, 4:20 pm
by shypike
(Simplified) excerpt from SABnzbd:

Code: Select all

            opener = urllib.FancyURLopener({})
            opener.prompt_user_passwd = None
            opener.addheaders = []
            opener.addheader('User-Agent', 'SABnzbd+/%s' % sabnzbd.version.__version__)
            opener.addheader('Accept-encoding','gzip')
            fn, header = opener.retrieve(url)
Have fun.
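
For Python 3, where `urllib.FancyURLopener` is no longer the usual route, the same idea can be sketched with `urllib.request`. The agent string and `build_request` are placeholders for illustration, not anything taken from SABnzbd, and the request is only built here, not sent.

```python
import urllib.request

# Placeholder agent string; substitute whatever identifies your client.
USER_AGENT = "Mozilla/5.0 (compatible; example-script/0.1)"


def build_request(url, user_agent=USER_AGENT):
    """Build a request that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.imdb.com/")
# Request stores header names capitalized, hence "User-agent" here.
print(req.get_header("User-agent"))

# To actually fetch the page:
# with urllib.request.urlopen(req) as response:
#     html = response.read()
```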