Hi All,
Well, this should be simple, but I am a complete noob in Python. I am stuck on the following situation:
I set a variable: IMDB_Votes = 1,000
I pull out the current votes from the IMDB HTML Page: imdbwebvotes = re.search(">([0-9]*,*[0-9][0-9][0-9]) votes= imdb_vote):
I think my definition of the imdbwebvotes variable is wrong, because when the number is high (xxx,xxx) I get this error: invalid literal for int() with base 10: '177,362'
The code does work when the number is low (x,xxx).
Can someone point me in the right direction on how to make this work? Is the problem in the code that turns the variable into an integer, or in the actual creation of the imdbwebvotes variable?
Many thanks in advance!!
pilgrim
Simple Python IMDB Check Help
Re: Simple Python IMDB Check Help
I would think that the regex should be:
">([0-9]+,?[0-9]{0,3}) votes<"
instead.
I changed the first digit group to "1 or more" instead of "0 or more", made the comma "zero or one occurrence", and instead of repeating the last digit pattern 3 times, said we want 0-3 of that type. I am sure this regex could be improved upon, as it won't deal with cases where the number is 1,000,000.
My thoughts on your error: you say you set:
Code: Select all
IMDB_Votes = 1,000
If you try doing this in the Python interpreter, you end up with a tuple containing 1 and 0:
(1, 0)
This is obviously not what you wanted; instead you want:
Code: Select all
IMDB_Votes = 1000
which will be the number one thousand.
I think the same is going wrong with the second number (the one retrieved from IMDB), but Python is screaming at you because you are specifically casting it to an int, and Python doesn't know how to do that when the string is not all digits. You need to strip out any non-digit characters, and then create an int.
Hope that at least points you in the right direction!
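For anyone who wants to check both points, here is a rough sketch. It first shows what the comma does to the assignment, then tries one possible pattern that also copes with numbers like 1,000,000. The names VOTES_RE and find_votes_text and the sample HTML fragment are made up for illustration, and the real IMDB markup may well look different.

```python
import re

# A comma in an assignment builds a tuple, not a number.
votes_tuple = 1,000   # same as (1, 0)
votes_int = 1000      # the number one thousand

# Hypothetical pattern: 1 to 3 leading digits, then any number of ",ddd" groups.
VOTES_RE = re.compile(r">([0-9]{1,3}(?:,[0-9]{3})*) votes<")


def find_votes_text(html):
    """Return the matched vote count as a string (commas included), or None."""
    match = VOTES_RE.search(html)
    return match.group(1) if match else None


# Made-up fragment for illustration only.
sample = '<a href="ratings">1,000,000 votes</a>'
print(votes_tuple)
print(find_votes_text(sample))
```

The returned string still contains the commas, so it has to be cleaned up before it is passed to int().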
Re: Simple Python IMDB Check Help
Many thanks Camelot,
That helps a lot. Any simple ideas on how to strip the comma out of the string that represents the number of votes?
I did set up the code to just skip anything with fewer than 1000 votes, but wanted to be able to adjust beyond that with a variable.
Cheers,
pilgrim
Re: Simple Python IMDB Check Help
hey,
This should remove the commas:
Code: Select all
imdbwebvotes = '6,563'
# Split on the commas, then glue the pieces back together
numbersSplit = imdbwebvotes.split(',')
numberStringWithoutCommas = ""
for num in numbersSplit:
    numberStringWithoutCommas = "%s%s" % (numberStringWithoutCommas, num)
real_number = int(numberStringWithoutCommas)
print("%d" % real_number)
which prints:
6563
This assumes that whatever is returned from IMDB is a string (which it probably will be).
Hope that helps
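A shorter way to get the same result, as a sketch only: str.replace can drop the commas in one call, and the cleaned number can then be compared against the threshold. The helper name parse_votes and the sample strings are made up for illustration; IMDB_Votes is the threshold variable from the first post, written without a comma.

```python
def parse_votes(text):
    """Convert a string such as '177,362' to an int by dropping the commas."""
    return int(text.replace(",", ""))


IMDB_Votes = 1000  # threshold from the thread, written without a comma

# Sample strings for illustration; real values would come from the regex match.
for raw in ["563", "6,563", "177,362", "1,000,000"]:
    votes = parse_votes(raw)
    status = "keep" if votes >= IMDB_Votes else "skip"
    print("%s -> %d (%s)" % (raw, votes, status))
```

This still assumes the input is a string of digits and commas only; anything else will raise the same ValueError from int().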
Re: Simple Python IMDB Check Help
Hi Camelot, many many thanks. Will try this out later today.
cheers,
pilgrim
Re: Simple Python IMDB Check Help
Hi Camelot,
Worked like a charm. Big thanks for the help.
Cheers!
Re: Simple Python IMDB Check Help
no worries