Windows10 - accents issue with post processing script [Solved]

ErikBrown
Release Testers
Posts: 135
Joined: December 20th, 2009, 1:25 am
Location: Home

Post by ErikBrown »

I am running SABnzbd 2.3.9 on Windows 10 and have written a post-processing script (a console .exe application) in VB.NET using Visual Studio. It all works fine except when I download something using an NZB with an accented character in its name. In that case the NZB name (and job directory) that my script receives from SABnzbd via the command-line parameters has the accented character garbled.

For instance, when I download something using an NZB with Joël or Börjlind in the name, the accented character is shown correctly everywhere within SABnzbd. But when my script receives the NZB name via the command-line parameters, it shows the name as JoÃ«l or BÃ¶rjlind.

In case it is relevant, I am using US English as my Windows language, both on my SABnzbd server and my development PC. Any idea why this goes wrong and how I can fix this?
Last edited by ErikBrown on April 23rd, 2020, 5:22 am, edited 1 time in total.

sander
Release Testers
Posts: 6910
Joined: January 22nd, 2008, 2:22 pm

Re: Windows10 - accents issue with post processing script

Post by sander »

It has to do with the encoding. Keyword: Unicode / UTF-8. I don't know how that works on Windows.

safihre
Administrator
Posts: 3595
Joined: April 30th, 2015, 7:35 am
Location: Switzerland

Re: Windows10 - accents issue with post processing script

Post by safihre »

Indeed, your receiving application needs to parse the data as UTF-8.
You can also try reading the filename from the environment variables that SABnzbd sets for scripts (see the documentation).
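
As a rough illustration of the environment-variable route, here is a minimal Python sketch (Python rather than VB.NET, purely to keep the example short). The variable names used here (SAB_FILENAME, SAB_COMPLETE_DIR, SAB_CAT) and the helper sab_env are assumptions for illustration; check the SABnzbd scripts documentation for the names your version actually sets.

```python
import os


def sab_env(name, environ=None):
    """Return the value of SAB_<NAME> from the environment, or None if unset.

    `environ` defaults to os.environ; passing a dict makes the helper testable.
    """
    env = os.environ if environ is None else environ
    return env.get("SAB_" + name.upper())


if __name__ == "__main__":
    # Assumed variable names; verify them against the SABnzbd documentation.
    for key in ("filename", "complete_dir", "cat"):
        print(key, "=", sab_env(key))
```

Whether the values arrive with intact accents still depends on how the receiving runtime reads the environment, so this is something to test rather than a guaranteed fix.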

ErikBrown
Release Testers
Posts: 135
Joined: December 20th, 2009, 1:25 am
Location: Home

Re: Windows10 - accents issue with post processing script

Post by ErikBrown »

Hi Safihre and Sander,

Thanks for the reply. But there appears to be more to this, since my pre-queue script receives the accented characters correctly via the command-line parameters while the post-processing script does not. As a sample, see below the sections of the debug logs from my pre-queue and post-processing script applications that show the received command-line parameters (I have removed part of the NZB name):

The script name is C:\SABnzbd\Scripts\SABnzbdPreQueueScript.exe
Parameter 1 = Søren xxxxxxxxxxxxxxxxx
Parameter 2 =
Parameter 3 = books
Parameter 4 = SABnzbdPostProcScript.exe
Parameter 5 = 1
Parameter 6 = 2588346
Parameter 7 = alt.binaries.nl
Parameter 8 =
Parameter 9 =
Parameter 10 =
Parameter 11 =

The script name is C:\SABnzbd\Scripts\SABnzbdPostProcScript.exe
Parameter 1 = H:\SABnzbd\Books\Søren xxxxxxxxxxxxxxxxx
Parameter 2 = Søren xxxxxxxxxxxxxxx.nzb
Parameter 3 = Søren xxxxxxxxxxxxxxx
Parameter 4 =
Parameter 5 = books
Parameter 6 = alt.binaries.nl
Parameter 7 = 0
Parameter 8 =

Could it be that SABnzbd converts the code page correctly before passing the NZB name on the command line to the pre-queue script, but not for the post-processing script?

I will try to see if the environment variables will give me a different result.

Edit
See below the entries for the scripts in the SABnzbd debug log. They indeed show that the scripts are called with different code pages for the NZB name. Surprisingly, though, the code pages are the opposite of what the logs of my scripts suggest: the SABnzbd log shows the pre-queue script being called with the wrong code page while the pre-queue script's own log shows a correct name, and the other way around for the post-processing script.

2020-04-20 12:08:16,744::INFO::[newsunpack:2335] Running pre-queue script [u'C:\\SABnzbd\\Scripts\\SABnzbdPreQueueScript.exe', u'S\xf8ren xxxxxxxxxxxxxxxxxxx', '', u'books', u'SABnzbdPostProcScript.exe', u'1', u'2588346', u'alt.binaries.nl', '', '', '', '']
2020-04-20 12:08:16,884::INFO::[newsunpack:2345] Pre-queue script returns 0 and output=

2020-04-20 12:08:16,884::INFO::[newsunpack:2359] Pre-Q accepts Søren xxxxxxxxxxxxxxxxxxx



2020-04-20 12:08:21,588::INFO::[newsunpack:169] Running external script C:\SABnzbd\Scripts\SABnzbdPostProcScript.exe(H:\SABnzbd\Books\Søren xxxxxxxxxxxxxxxxxxx, Søren xxxxxxxxxxxxxxxxxxx.nzb, Søren xxxxxxxxxxxxxxxxxxx, , books, alt.binaries.nl, 0, )


Erik

jcfp
Release Testers
Posts: 930
Joined: February 7th, 2008, 12:45 pm

Re: Windows10 - accents issue with post processing script

Post by jcfp »

ErikBrown wrote (April 20th, 2020, 2:42 am):
"See below the entries for the scripts in the SABnzbd debug log. They indeed show that the scripts are called with different code pages for the NZB name."

Not quite. The apparent difference is caused by how the logging in SABnzbd displays the variables. For the pre-queue script, it shows the internal Python list structure used to build the command (essentially an array of Unicode strings). The \xf8 in u'S\xf8ren' means the Unicode character at code point (hex) F8, i.e. ø. For the post-processing script, on the other hand, the variables are printed one by one in SABnzbd's code, which results in a human-readable text representation for strings such as "Søren".
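
The two display styles can be reproduced in a few lines of Python. This is only a sketch: the u'...' prefixes in the log suggest SABnzbd 2.3.9 ran on Python 2, whereas the snippet below assumes Python 3 and uses the built-in ascii() to get a similar escaped view of a list.

```python
names = ["Søren"]

# Escaped view of the whole list, comparable to the pre-queue log line.
# ascii() replaces every non-ASCII character with a \x, \u or \U escape.
escaped = ascii(names)

# Human-readable view, comparable to the post-processing log line,
# where each value is formatted as plain text.
readable = ", ".join(names)

# Both views describe the same character: code point 0xF8 is "ø".
same_char = chr(0xF8)

print(escaped)
print(readable)
```

Either way the string held in memory is identical; only its printed representation differs.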

Your script and/or the Windows OS may well not use Unicode by default, though.

ErikBrown
Release Testers
Posts: 135
Joined: December 20th, 2009, 1:25 am
Location: Home

Re: Windows10 - accents issue with post processing script

Post by ErikBrown »

Thanks for the feedback. I managed to get both the pre-queue and post-processing scripts to work. For anybody else who struggles with the UTF-8 character encoding used by SABnzbd while writing a script in VB.NET, the following is what I found:

The command-line parameters from SABnzbd to the pre-queue script are already in UTF-16 (which is what Windows uses), so there is no need to convert them. But the command-line parameters from SABnzbd to the post-processing script are in UTF-8, so these need to be converted from UTF-8 to UTF-16. I use the following routine to read the command-line parameters and do the character conversion:

Private Function GetUTF8CommandLineParam(CommandLineParamNumber As Integer) As String
    ' Gets the command line parameter with the provided index and re-decodes it as UTF-8.
    ' Encoding.Default (the system ANSI code page on .NET Framework) turns the
    ' mis-decoded string back into its raw bytes; those bytes are then read as UTF-8.
    Dim ParamBytes As Byte() = System.Text.Encoding.Default.GetBytes(Environment.GetCommandLineArgs(CommandLineParamNumber))
    Dim ParamString As String = System.Text.Encoding.UTF8.GetString(ParamBytes)

    Return ParamString
End Function
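
For readers who want to see the byte-level mechanics of that conversion, here is a small Python sketch of the same round trip. It assumes the ANSI code page is Windows-1252 (typical for a US English system, but an assumption here), and the helper names garble and repair are made up for illustration.

```python
def garble(text, ansi="cp1252"):
    """Encode text as UTF-8, then misread those bytes in an ANSI code page."""
    return text.encode("utf-8").decode(ansi)


def repair(text, ansi="cp1252"):
    """Undo garble(): recover the raw bytes, then decode them as UTF-8."""
    return text.encode(ansi).decode("utf-8")


if __name__ == "__main__":
    for name in ("Joël", "Börjlind", "Søren"):
        broken = garble(name)
        print(ascii(name), "->", ascii(broken), "->", ascii(repair(broken)))
```

The round trip is not guaranteed to be lossless: a few byte values have no character in Python's cp1252 codec, so some names would raise a decoding error in garble() rather than come back garbled.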

The output from the pre-queue script to SABnzbd via the console needs to be in UTF-8. By default the console output of a VB.NET application is not UTF-8 (it follows the console's current code page). The following line sets the console output of the VB.NET script to UTF-8:

Console.OutputEncoding = System.Text.Encoding.UTF8
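
The same idea, sketched in Python for comparison: wrap the output stream so that everything written to it is encoded as UTF-8. The helper name utf8_writer is hypothetical, and the example writes to an in-memory buffer instead of the real console so the resulting bytes can be inspected.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8."""
    return io.TextIOWrapper(
        binary_stream, encoding="utf-8", newline="\n", write_through=True
    )


# Demonstration against an in-memory buffer standing in for stdout.
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write("Søren\n")
writer.flush()
payload = buffer.getvalue()  # "ø" is encoded as the two bytes C3 B8
```

In a real Python script, calling sys.stdout.reconfigure(encoding="utf-8") (Python 3.7 or later) would be the closer counterpart to setting Console.OutputEncoding.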

safihre
Administrator
Posts: 3595
Joined: April 30th, 2015, 7:35 am
Location: Switzerland

Re: Windows10 - accents issue with post processing script [Solved]

Post by safihre »

Aaah, interesting! I did not know at all that Windows uses UTF-16. Never saw it before.
