[Windows] Post-Processing Movie scraper

Posted: June 18th, 2012, 6:22 pm
by PatientZro
Disclaimers:
I am not a programmer, I tinker. I am also not responsible for content that this program is used against. This program will only pull information from IMDB to your local system based on whatever folder name you pass it.

With that out of the way, I have written an automatic movie information scraper post-processing script/program for sabnzbd. This program will take the full path/folder name of the movie you want scraped, search for it on IMDB, pull up the first link then scrape the information into local files that can be used with whichever media center software you use (but it's really only been tested against XBMC). Once the information has been scraped, it can also send an update command to your XBMC library so that your new addition can be added right away.

It does not currently scrape fanart. If you want fanart, you'll have to pull it in manually for now...I am looking at how I can grab fanart in a way that works for the majority of users.

An added bonus feature is that if you get a movie in CD1/CD2 format, the program can optionally use mencoder to join the two files together into a single file.
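For anyone curious what a CD1/CD2 join with mencoder can look like, here is a minimal sketch. It is an assumption about one common way to do it (stream copy of audio and video into one output file), not necessarily what this program runs. The function name and file names are made up for illustration, and the sketch only builds the command line rather than executing it.

```python
def mencoder_join_command(parts, output):
    """Build a mencoder command line that joins several files by stream copy.

    Hypothetical helper: '-oac copy' and '-ovc copy' tell mencoder to copy
    the audio and video streams without re-encoding, '-o' names the output.
    """
    if len(parts) < 2:
        raise ValueError("need at least two parts to join")
    return ["mencoder", "-oac", "copy", "-ovc", "copy", "-o", output, *parts]


# Example file names are placeholders, not taken from the original post.
command = mencoder_join_command(["movie.cd1.avi", "movie.cd2.avi"], "movie.avi")
print(" ".join(command))
```

The resulting list could be handed to subprocess.run on a system where mencoder is installed and on the PATH.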

Due to how SAB does its file cleanup, you will need to ensure that SAB is leaving .nfo files alone. I have coded in an option to also keep the scene .nfo files if you wish, but by default it will remove them.

It runs based on the folder name given to it, either by SAB or by you from a command line (in case your folder isn't named properly and you need to modify it before it will scrape info).

Command line usage is:
sabnzbmoviescraper "<full path to movie folder>"
e.g.: sabnzbmoviescraper "d:\movies\Batman (1989)"

It will do a search in IMDB for "Batman+(1989)" and hit the first returned result. From here it will scrape all the info it can.
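The step from folder path to search term can be sketched like this. It is only an illustration in Python, not the program's actual code: the function name is made up, and keeping the brackets unescaped is an assumption chosen to reproduce the "Batman+(1989)" form mentioned above.

```python
import ntpath
from urllib.parse import quote_plus


def folder_to_query(folder_path):
    """Turn a full Windows folder path into a '+'-joined search term.

    Hypothetical helper: takes the last path component and URL-encodes it,
    leaving '(' and ')' as they are.
    """
    # ntpath handles backslash-separated paths regardless of the host OS.
    name = ntpath.basename(folder_path.rstrip("\\/"))
    return quote_plus(name, safe="()")


print(folder_to_query(r"d:\movies\Batman (1989)"))
```

A trailing backslash on the path is stripped first, so the folder name is still picked up.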

If you find it is not scraping the information, do a search yourself at IMDB for the folder name and hit the search button (do not use the quick results that pop up) and see if it actually returns the correct movie as the first result.

blah blah blah, developed in my own time, for my own usage, but found many requests for such a program, so I'm sharing it. Blah blah blah, take no responsibility for its usage or content scraped...

I may at times update this with various functions (I would like to have a -r switch to point the program to the root of the movie folder collection and it'll recursively scrape the info as needed...) but those are a story for another day, when I have more time.

Updated on June 25th, 2012
Download it here:
http://www56.zippyshare.com/v/27266789/file.html

Edit the included cfg file, put it and the program in your SAB scripts folder, then set SAB to run it.

*change log*
- Fixed a potential infinite wait loop if the browser object doesn't open the IMDB page
- Thanks to Eejit, it now looks for a scene NFO and tries to pull the IMDB link from there before searching


If you like what it does, feel free to drop a couple $ to the sugar/caffeine/nicotine donations jar here:
https://www.paypal.com/cgi-bin/webscr?c ... HV976TBGGE

Enjoy,

Patient Zro

Re: [Windows] Post-Processing Movie scraper

Posted: June 20th, 2012, 7:21 am
by sander
Hi,

I've no Windows, so I can't test it: can your script handle things like "d:\movies\Batman (1989) PROMO_XVID_900 MB_DD 2.0_(Ingebakken) Subs Made By XYZ"?

Maybe by recursively shortening the directory name until a hit is found on IMDB?
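A rough sketch of that shortening idea, in Python for illustration only. The function names are made up, and the lookup argument is a stand-in for whatever actually queries IMDB; here it is any callable that returns None when nothing is found.

```python
def shortened_names(folder_name):
    """Yield the folder name, then drop one trailing word at a time."""
    words = folder_name.split()
    for end in range(len(words), 0, -1):
        yield " ".join(words[:end])


def first_hit(folder_name, lookup):
    """Return (candidate, result) for the longest candidate that lookup() finds.

    'lookup' is a hypothetical search function supplied by the caller.
    """
    for candidate in shortened_names(folder_name):
        result = lookup(candidate)
        if result is not None:
            return candidate, result
    return None


# Stand-in for a real search: a dict with one known title.
fake_index = {"Batman (1989)": "hit"}
print(first_hit("Batman (1989) PROMO_XVID_900 MB", fake_index.get))
```

Trying the longest name first keeps as much of the title as possible, but the first hit is not guaranteed to be the right movie.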

Re: [Windows] Post-Processing Movie scraper

Posted: June 20th, 2012, 9:35 pm
by PatientZro
Hey Sander,

Thanks for confirming the idea I was considering...and for bringing up another possibility (second set of brackets).

My other option/thought was if it can't pull up a movie with the full folder name, maybe start with just the first word and check all the results that are returned for the best match...initial testing with recursively shortening two examples gave me results after removing one or two items, but they weren't the right results. I am trying to strip out some of the more common scene words (BDRip, xvid, HD, etc) so that does help, but still a bit wonky.
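Stripping scene words before searching could look something like the sketch below. It is an illustration under assumptions, not the program's code: the word list only holds the three examples named above, and splitting on dots and underscores as well as spaces is a guess at how release folders tend to be named.

```python
import re

# Only the examples from the post; a real list would need to be longer.
SCENE_WORDS = {"bdrip", "xvid", "hd"}


def strip_scene_words(folder_name):
    """Drop known scene tags from a folder name, comparing case-insensitively."""
    tokens = re.split(r"[\s._]+", folder_name)
    kept = [t for t in tokens if t and t.lower() not in SCENE_WORDS]
    return " ".join(kept)


print(strip_scene_words("Batman (1989) BDRip XviD HD"))
```

Matching whole tokens rather than substrings avoids chopping letters out of real title words.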

Patient Zro

Re: [Windows] Post-Processing Movie scraper

Posted: June 22nd, 2012, 1:36 pm
by Eejit
Just a thought, but a lot of movies, though not all, contain the IMDB link in the info. Could this also be parsed and checked against the folder name?

Re: [Windows] Post-Processing Movie scraper

Posted: June 25th, 2012, 5:08 pm
by PatientZro
That's a great idea Eejit....now have it pulling up the nfo from the release and scanning it for an IMDB link.
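For reference, scanning an nfo for an IMDB link can be sketched as below. This is an illustration, not the program's code: the function names are made up, the pattern assumes the usual imdb.com/title/tt... link form, and reading the file as cp437 is an assumption based on how scene nfo files are commonly encoded.

```python
import re
from pathlib import Path

# Matches the title id in links such as http://www.imdb.com/title/tt0000000/
IMDB_LINK = re.compile(r"imdb\.com/title/(tt\d+)", re.IGNORECASE)


def imdb_id_from_text(text):
    """Return the first IMDB title id found in text, or None."""
    match = IMDB_LINK.search(text)
    return match.group(1) if match else None


def imdb_id_from_folder(folder):
    """Check every .nfo file in folder and return the first IMDB id found."""
    for nfo in sorted(Path(folder).glob("*.nfo")):
        text = nfo.read_text(encoding="cp437", errors="replace")
        found = imdb_id_from_text(text)
        if found:
            return found
    return None


# tt0000000 is a placeholder id, not a real title.
print(imdb_id_from_text("Info: http://www.imdb.com/title/tt0000000/"))
```

If no nfo contains a link, the function returns None and the caller can fall back to searching by folder name.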

Really need some faster internets so I can test this with SAB more....but it seems to be working 98% some of the time now. Seems one of my changes broke things...but I'm also thinking it may be my internets connection... *sigh*

Patient Zro

Re: [Windows] Post-Processing Movie scraper

Posted: June 25th, 2012, 5:12 pm
by Eejit
That's brilliant PZ,
Unfortunately I haven't had a chance to test out your code as I'm away from home but should be back from work in a few days time.

Re: [Windows] Post-Processing Movie scraper

Posted: June 27th, 2012, 6:35 am
by lewc
This is looking really useful! I currently have movies downloaded, then use Ember Media Manager to get this kind of data before moving them to a completed directory that XBMC reads from.

A few suggestions for 'new features' :

- Option to scrape fanart as well
- Choose thumbnail provider, EMM seems to have more / higher quality thumbnails
- Option to move the completed directory and contents once all the info has been fetched (not a major issue as I guess you could just tell SABnzbd to move the completed download to the correct directory and then run the script there)

Brilliant work though, any help you need let us know!