Simple Python IMDB Check Help

pilGrim
Full Member
Posts: 116
Joined: January 9th, 2009, 7:28 pm
Location: Hong Kong, SAR China

Post by pilGrim »

Hi All,

Well, this should be simple, but I am a complete noob in Python. I am getting stuck in the following situation:

I set a variable:  IMDB_Votes = 1,000

I pull out the current votes from the IMDB HTML Page: imdbwebvotes = re.search(">([0-9]*,*[0-9][0-9][0-9]) votes= imdb_vote):


I think my definition of the imdbwebvotes variable is wrong, because when the number is high (xxx,xxx) I get this error: invalid literal for int() with base 10: '177,362'

The code does work when the number is low (x,xxx).

Can someone point me in the right direction on how to make this work? Is the problem in the code that turns the variable into an integer, or in the actual creation of the imdbwebvotes variable?

Many thanks in advance!!

pilgrim
Camelot
Jr. Member
Posts: 64
Joined: August 18th, 2008, 6:23 am

Post by Camelot »

I would think that the regex should be:
">([0-9]+,?[0-9]{0,3}) votes<"
Instead :)
I changed the first digit to being "1 or more" instead of "0 or more", I made the comma "zero or one occurrences", and instead of repeating the last digit pattern 3 times, we say we want 0-3 of that type. I am sure this regex could be improved upon, as it won't deal with cases where the number is: 1,000,000
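
One possible way to cover the 1,000,000 case is to allow any number of ",ddd" groups after the leading digits. This is only a sketch: the pattern, the `find_votes_text` helper, and the sample HTML strings are my own assumptions, not the real IMDB markup. It also assumes the count always uses comma separators, so a bare 1234 would not match.

```python
import re

# Hypothetical pattern: 1-3 leading digits, then zero or more ",ddd" groups.
VOTES_RE = re.compile(r">([0-9]{1,3}(?:,[0-9]{3})*) votes<")

def find_votes_text(html):
    """Return the matched vote count as text (commas included), or None."""
    match = VOTES_RE.search(html)
    return match.group(1) if match else None

# Made-up HTML fragments, just to exercise the pattern.
print(find_votes_text("<a>563 votes</a>"))
print(find_votes_text("<a>177,362 votes</a>"))
print(find_votes_text("<a>1,000,000 votes</a>"))
```

The result is still a string with commas in it, so it needs cleaning before int() will accept it.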

My thoughts on your error. You say you set:

Code: Select all

IMDB_Votes = 1,000
If you try doing this in the Python interpreter, you end up with a tuple containing 1 and 0:
(1, 0)
This is obviously not what you wanted; instead you want

Code: Select all

IMDB_Votes = 1000
Which will be the number one thousand.
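
A quick way to see the difference for yourself is to print both forms. The variable names `as_tuple` and `as_int` are just for illustration:

```python
# The comma makes Python build a tuple: the values 1 and 000 (which is 0).
as_tuple = 1,000
print(type(as_tuple).__name__, as_tuple)

# Without the comma it is a plain integer.
as_int = 1000
print(type(as_int).__name__, as_int)
```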

I think the same is going wrong with the second number (the one retrieved from IMDB), but Python is screaming at you because you are explicitly converting it to an int, and Python doesn't know how to do that when the string is not all digits. You need to strip out any non-digit characters, and then create an int.
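
As a rough sketch of that idea, a regex substitution can drop everything that is not a digit before the conversion. The `to_int` helper name is made up for this example, and it assumes the text contains at least one digit (an empty result would still make int() raise ValueError):

```python
import re

def to_int(text):
    """Remove every non-digit character, then convert what is left."""
    digits = re.sub(r"[^0-9]", "", text)
    return int(digits)

print(to_int("177,362"))  # prints 177362
```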

Hope that at least points you in the right direction!

Post by pilGrim »

Many thanks Camelot,

That helps a lot. Any simple ideas on how to strip the comma out of the string that represents the number of votes?

I did set up the code to just skip anything with fewer than 1000 votes, but I wanted to be able to adjust that threshold with a variable.

Cheers,

pilgrim

Post by Camelot »

hey,

This should remove the commas

Code: Select all

# Example value as scraped from the page (a string, commas included)
imdbwebvotes = '6,563'
# Split on the commas, then glue the pieces back together without them
numbersSplit = imdbwebvotes.split(',')
numberStringWithoutCommas = "".join(numbersSplit)
# The string is now all digits, so int() accepts it
real_number = int(numberStringWithoutCommas)
print("%d" % real_number)
# prints: 6563
This assumes that whatever is returned from imdb is a string (which it probably will be)
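
If you want the threshold in a variable, something along these lines might do. Treat it as a sketch: `MIN_VOTES` and `has_enough_votes` are names I made up, and it assumes you pass in the matched text itself. Note that re.search gives back a match object (or None), so the text would come from its .group(1), not from the search result directly.

```python
MIN_VOTES = 1000  # hypothetical threshold, adjust to taste

def has_enough_votes(votes_text, minimum=MIN_VOTES):
    """Strip the commas, convert to int, and compare with the threshold."""
    votes = int(votes_text.replace(",", ""))
    return votes >= minimum

print(has_enough_votes("6,563"))                    # True
print(has_enough_votes("563"))                      # False
print(has_enough_votes("177,362", minimum=200000))  # False
```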

Hope that helps

Post by pilGrim »

Hi Camelot, many many thanks.  Will try this out later today.

cheers,

pilgrim

Post by pilGrim »

Hi Camelot,

Worked like a charm.  Big thanks for the help.

Cheers!

Post by Camelot »

no worries :)