
What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 4:19 am
by Usenet
I have some issues when trying to use the "Add by file path" api.
The documentation states:

Code:

api?mode=addlocalfile&name=full/local/path/to/file.ext
Adding the path "C:\home\users\Örn\the.nzb", encoded as UTF-8 and then URL-encoded:

Code:

api?mode=addlocalfile&name=C%3A%5Chome%5Cusers%5C%C3%96rn%5Cthe.nzb

Fails with a "no file exists" error.
However, if I use the URL-encoded Latin-1 byte for Ö (%D6, which is also its Unicode code point), it works.

Code:

api?mode=addlocalfile&name=C%3A%5Chome%5Cusers%5C%D6rn%5Cthe.nzb
Is this the intended behavior?
Any python snippet is appreciated :D
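
For reference, the two query strings above differ only in how Ö is turned into bytes before percent-encoding. A minimal sketch in Python 3 (the thread itself uses Python 2's urllib/urllib2), using the standard urllib.parse.quote_plus; the variable names are only illustrative:

```python
from urllib.parse import quote_plus

path = "C:\\home\\users\\Örn\\the.nzb"

# UTF-8 turns Ö into two bytes (0xC3 0x96), so two escapes appear.
utf8_name = quote_plus(path, encoding="utf-8")

# Latin-1 turns Ö into a single byte (0xD6), so one escape appears.
latin1_name = quote_plus(path, encoding="latin-1")

print("api?mode=addlocalfile&name=" + utf8_name)
print("api?mode=addlocalfile&name=" + latin1_name)
```

The first line printed corresponds to the request that failed, the second to the one that worked.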

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 6:13 am
by shypike
The calling program is supposed to add headers specifying the encoding.
What does your Python snippet look like?

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 6:21 am
by Usenet
shypike wrote: The calling program is supposed to add headers specifying the encoding.
Aha. I use this:

Code:

import urllib2

def _sabResponse(self, url):
    # Fetch the API url and return 'ok' or the raw response text.
    try:
        req = urllib2.Request(url)
        response = urllib2.urlopen(req)
    except Exception:
        responseMessage = "unable to load url: " + url
    else:
        log = response.read()
        response.close()
        if "ok" in log:
            responseMessage = 'ok'
        else:
            responseMessage = log
    return responseMessage

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 6:31 am
by shypike
If you send UTF-8 to urllib2, it won't know that it is UTF-8 (it might consider it to be Latin-1).
When you send it Unicode, it will know and convert that to UTF-8 and send the proper headers along.

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 7:42 am
by Usenet
In this case I build the url by

Code:

url = self.baseurl + "mode=addlocalfile&name=" + urllib.quote_plus(local_file_name.encode('utf-8'))
since urllib.quote_plus doesn't like unicode characters.
How would I URL-encode the URL before sending it to urllib2?

BTW, thanks for the help!

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 10:55 am
by shypike
The problem is that quote_plus doesn't handle UTF-8 very well.
You should encode in Latin-1 instead; it maps closely enough to Unicode to work for this case,
at least as long as Latin-1 covers the characters you need.
When you get byte-string values from system calls, assume that they are Latin-1.
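
The advice above could be sketched roughly as follows, again in Python 3 rather than the Python 2 used in the thread. The function name build_addlocalfile_url and the base URL in the usage line are hypothetical; returning None for paths outside Latin-1 is just one possible way to handle the "when Latin-1 covers your needs" caveat:

```python
from urllib.parse import quote_plus


def build_addlocalfile_url(base_url, local_file_name):
    """Return the addlocalfile URL, or None if the path does not fit in Latin-1.

    base_url is assumed to already end in "api?" (or "...&"), as in the thread.
    """
    try:
        # Encode to Latin-1 bytes first; quote_plus then escapes each byte.
        raw = local_file_name.encode("latin-1")
    except UnicodeEncodeError:
        # The path contains characters Latin-1 cannot represent.
        return None
    return base_url + "mode=addlocalfile&name=" + quote_plus(raw)


# Hypothetical host and port, for illustration only.
print(build_addlocalfile_url("http://localhost:8080/api?",
                             "C:\\home\\users\\Örn\\the.nzb"))
```

A caller would need its own fallback for paths that return None, since this sketch does not attempt any other encoding.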

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 11:38 am
by Usenet
Thanks! Something like what you do in https://github.com/sabnzbd/sabnzbd/blob ... ing.py#L45 would work then.

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 1:08 pm
by shypike
Not exactly.
The Python function will only accept 8-bit strings, with the implicit
assumption that they are Latin-1.
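
To illustrate why that implicit assumption matters, here is a small Python 3 sketch of what a receiver gets if it decodes the escaped bytes as Latin-1. That the server in the thread decodes this way is an assumption based on the results reported above, not something confirmed from its source:

```python
from urllib.parse import unquote

# Bytes produced from UTF-8, then read back as Latin-1: two wrong characters.
from_utf8 = unquote("%C3%96rn", encoding="latin-1")

# Bytes produced from Latin-1, then read back as Latin-1: the original name.
from_latin1 = unquote("%D6rn", encoding="latin-1")

print(repr(from_utf8))
print(repr(from_latin1))
```

Under that assumption, the UTF-8 request names a directory that does not exist on disk, which would be consistent with the "no file exists" error.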

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 3:11 pm
by Usenet
Aha, thanks, *sigh* I really hate these different encodings. I'll dig further.

Re: What encoding when using the "Add by file path" API?

Posted: March 8th, 2012, 4:32 pm
by Usenet
After reading a bit more, it seems the standard for a GET request is somewhat vague on this. The encoding specified in the urllib2 request is only for the response.
Anyway, encoding to Latin-1 does the trick.