Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks

12 Mar 2024

Hi Michael,
On 12/03/2024 11:02, Michael Tremer wrote:
...
Thank you.
I merged this for now so that we can fix this problem quickly.
However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.
That makes a lot of sense. It would also mean that the problem of 
umlauts etc would be fixed everywhere that cleanhtml is used rather than 
needing to fix every invocation of cleanhtml.
I will look at putting something together for that.
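As a rough idea of what that could look like, here is a minimal sketch. The name cleanhtml_decoded is hypothetical and this is not the real Header::cleanhtml, which may do more than entity encoding. It only assumes that the input arrives as UTF-8 bytes and that the Encode and HTML::Entities modules are available.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical helper, NOT the real Header::cleanhtml: decode the UTF-8
# bytes into characters first, then encode unsafe characters as entities.
sub cleanhtml_decoded {
	my ($text) = @_;
	return '' unless defined $text;

	# Assumption: $text still holds the raw UTF-8 bytes from the browser.
	$text = decode("UTF-8", $text);

	return encode_entities($text);
}

print cleanhtml_decoded("Freifunk M\xC3\xBCnchen e.V."), "\n";
```

If the decode moved into the shared function like this, any caller that already decodes its input itself would presumably need that call removed, otherwise the text would be decoded twice.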
...
I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that Perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them, so there should not be any difference when we re-read the configuration files, unless some parts of the code specify an encoding explicitly.
We do receive UTF-8 from the browser. The problem seems to be that the 
HTML::Entities::encode_entities command doesn't work with UTF-8 but with 
ISO-8859-1 encoding. I read this the other day while searching on this 
topic to understand how to overcome it, but I can no longer find the source.
The fix is not encoding the text from the browser remark box into UTF-8 
but decoding it from UTF-8. Once the text is in the files then it is fine.
Of course my reasoning for doing the decoding may or may not be right, 
so I am always open to alternative suggestions.
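To illustrate the behaviour described above, here is a small standalone sketch, not taken from dns.cgi. It assumes the remark arrives as a string of raw UTF-8 bytes and that the HTML::Entities module is installed. The variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# The remark as assumed to arrive from the browser: UTF-8 bytes.
# The "ü" is the two bytes 0xC3 0xBC.
my $raw = "M\xC3\xBCnchen";

# Without decoding, each byte is treated as one ISO-8859-1 character,
# so the two bytes become two separate entities.
my $mangled = encode_entities($raw);

# Decoding first turns the two bytes into the single character U+00FC,
# which is then encoded as one entity.
my $fixed = encode_entities(decode("UTF-8", $raw));

print "$mangled\n";   # M&Atilde;&frac14;nchen
print "$fixed\n";     # M&uuml;nchen
```

A browser renders the first result as the mangled "MÃ¼nchen" and the second as the intended "München".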
Regards,
Adolf.
...
-Michael
...
On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka@ipfire.org> wrote:

If Freifunk München e.V. is entered as a remark it gets converted to
 Freifunk MÃ¼nchen e.V.
This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
 and the HTML::Entities::encode_entities command that is run on that remark text does
 not work with UTF-8 text.
If the UTF-8 text in the remark is decoded before running through the cleanhtml command
 then the characters with diacritical marks are correctly shown.
Have tested out the fix on a remark with a range of different characters with
 diacritical marks and all of the ones tested were displayed correctly with the fix while
 in the original form they were mangled.

Fixes: Bug#12395
Tested-by: Adolf Belka <adolf.belka@ipfire.org>
Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>

html/cgi-bin/dns.cgi | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
index 0a34d3fd6..eb6f908d5 100644
--- a/html/cgi-bin/dns.cgi
+++ b/html/cgi-bin/dns.cgi
@@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
# Go further if there was no error.
if ( ! $errormessage) {
# Check if a remark has been entered.


# decode the UTF-8 text so that characters with diacritical marks such as
# umlauts are treated correctly by the following cleanhtml command
$cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});

# run the REMARK text through cleanhtml to ensure all unsafe html characters
# are correctly encoded to their html entities

$cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
my %dns_servers = ();
2.44.0
-- 
Sent from my laptop

    
