public inbox for development@lists.ipfire.org
 help / color / mirror / Atom feed
* [PATCH v2 1/2] dns.cgi: Revert "dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks"
@ 2024-03-11 12:19 Adolf Belka
  2024-03-11 12:19 ` [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks Adolf Belka
  0 siblings, 1 reply; 7+ messages in thread
From: Adolf Belka @ 2024-03-11 12:19 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 861 bytes --]

- This reverts commit 7c6ff5ff12331a53f416080a44c8d6145e78bfac
- That commit removed the cleanhtml command which is not advised, based on feedback from
   Michael Tremer from other patch submissions as it creates a potential security problem.

Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
---
 html/cgi-bin/dns.cgi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
index f3dd5c7a9..0a34d3fd6 100644
--- a/html/cgi-bin/dns.cgi
+++ b/html/cgi-bin/dns.cgi
@@ -141,6 +141,8 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
 
 	# Go further if there was no error.
 	if ( ! $errormessage) {
+		# Check if a remark has been entered.
+		$cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
 
 		my %dns_servers = ();
 		my $id;
-- 
2.44.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-11 12:19 [PATCH v2 1/2] dns.cgi: Revert "dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks" Adolf Belka
@ 2024-03-11 12:19 ` Adolf Belka
  2024-03-12 10:02   ` Michael Tremer
  0 siblings, 1 reply; 7+ messages in thread
From: Adolf Belka @ 2024-03-11 12:19 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]

- If Freifunk München e.V. is entered as a remark it gets converted to
   Freifunk München e.V.
- This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
   and the HTML::Entities::encode_entities command that is run on that remark text does
   not work with UTF-8 text.
- If the UTF-8 text in the remark is decoded before running through the cleanhtml command
   then the characters with diacritical marks are correctly shown.
- Have tested out the fix on a remark with a range of different characters with
   diacritical marks and all of the ones tested were displayed correctly with the fix while
   in the original form they were mangled.

Fixes: Bug#12395
Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
---
 html/cgi-bin/dns.cgi | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
index 0a34d3fd6..eb6f908d5 100644
--- a/html/cgi-bin/dns.cgi
+++ b/html/cgi-bin/dns.cgi
@@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
 	# Go further if there was no error.
 	if ( ! $errormessage) {
 		# Check if a remark has been entered.
+
+		# decode the UTF-8 text so that characters with diacritical marks such as
+		# umlauts are treated correctly by the following cleanhtml command
+		$cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
+
+		# run the REMARK text through cleanhtml to ensure all unsafe html characters
+		# are correctly encoded to their html entities
 		$cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
 
 		my %dns_servers = ();
-- 
2.44.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-11 12:19 ` [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks Adolf Belka
@ 2024-03-12 10:02   ` Michael Tremer
  2024-03-12 12:27     ` Adolf Belka
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Tremer @ 2024-03-12 10:02 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 2469 bytes --]

Thank you.

I merged this for now so that we can fix this problem quickly.

However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.

I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them and so there should not be any difference when we re-read the configuration files. Unless some parts of the code specify any kind of encoding.

-Michael

> On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
> 
> - If Freifunk München e.V. is entered as a remark it gets converted to
>   Freifunk München e.V.
> - This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
>   and the HTML::Entities::encode_entities command that is run on that remark text does
>   not work with UTF-8 text.
> - If the UTF-8 text in the remark is decoded before running through the cleanhtml command
>   then the characters with diacritical marks are correctly shown.
> - Have tested out the fix on a remark with a range of different characters with
>   diacritical marks and all of the ones tested were displayed correctly with the fix while
>   in the original form they were mangled.
> 
> Fixes: Bug#12395
> Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
> Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
> ---
> html/cgi-bin/dns.cgi | 7 +++++++
> 1 file changed, 7 insertions(+)
> 
> diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
> index 0a34d3fd6..eb6f908d5 100644
> --- a/html/cgi-bin/dns.cgi
> +++ b/html/cgi-bin/dns.cgi
> @@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
> # Go further if there was no error.
> if ( ! $errormessage) {
> # Check if a remark has been entered.
> +
> + # decode the UTF-8 text so that characters with diacritical marks such as
> + # umlauts are treated correctly by the following cleanhtml command
> + $cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
> +
> + # run the REMARK text through cleanhtml to ensure all unsafe html characters
> + # are correctly encoded to their html entities
> $cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
> 
> my %dns_servers = ();
> -- 
> 2.44.0
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-12 10:02   ` Michael Tremer
@ 2024-03-12 12:27     ` Adolf Belka
  2024-03-12 14:56       ` Michael Tremer
  0 siblings, 1 reply; 7+ messages in thread
From: Adolf Belka @ 2024-03-12 12:27 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 3452 bytes --]

Hi Michael,

On 12/03/2024 11:02, Michael Tremer wrote:
> Thank you.
> 
> I merged this for now so that we can fix this problem quickly.
> 
> However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.
That makes a lot of sense. It would also mean that the problem of 
umlauts etc would be fixed everywhere that cleanhtml is used rather than 
needing to fix every invocation of cleanhtml.

I will look at putting something together for that.

> 
> I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them and so there should not be any difference when we re-read the configuration files. Unless some parts of the code specify any kind of encoding.
We do receive UTF-8 from the browser. The problem seems to be that the 
HTML::Entities::encode_entities command doesn't work with UTF-8 but with 
ISO-8859-1 encoding. I can't find where I found this the other day when 
I was searching on this topic to understand how to overcome it.

The fix is not encoding the text from the browser remark box into UTF-8 
but decoding it from UTF-8. Once the text is in the files then it is fine.

Of course my reasoning for doing the decoding may or may not be right, 
so I am always open to alternative suggestions.

Regards,

Adolf.
> 
> -Michael
> 
>> On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>
>> - If Freifunk München e.V. is entered as a remark it gets converted to
>>    Freifunk München e.V.
>> - This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
>>    and the HTML::Entities::encode_entities command that is run on that remark text does
>>    not work with UTF-8 text.
>> - If the UTF-8 text in the remark is decoded before running through the cleanhtml command
>>    then the characters with diacritical marks are correctly shown.
>> - Have tested out the fix on a remark with a range of different characters with
>>    diacritical marks and all of the ones tested were displayed correctly with the fix while
>>    in the original form they were mangled.
>>
>> Fixes: Bug#12395
>> Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
>> Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
>> ---
>> html/cgi-bin/dns.cgi | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
>> index 0a34d3fd6..eb6f908d5 100644
>> --- a/html/cgi-bin/dns.cgi
>> +++ b/html/cgi-bin/dns.cgi
>> @@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
>> # Go further if there was no error.
>> if ( ! $errormessage) {
>> # Check if a remark has been entered.
>> +
>> + # decode the UTF-8 text so that characters with diacritical marks such as
>> + # umlauts are treated correctly by the following cleanhtml command
>> + $cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
>> +
>> + # run the REMARK text through cleanhtml to ensure all unsafe html characters
>> + # are correctly encoded to their html entities
>> $cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
>>
>> my %dns_servers = ();
>> -- 
>> 2.44.0
>>
> 

-- 
Sent from my laptop

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-12 12:27     ` Adolf Belka
@ 2024-03-12 14:56       ` Michael Tremer
  2024-03-13 22:05         ` Adolf Belka
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Tremer @ 2024-03-12 14:56 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 4447 bytes --]

Hello,

> On 12 Mar 2024, at 12:27, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
> 
> Hi Michael,
> 
> On 12/03/2024 11:02, Michael Tremer wrote:
>> Thank you.
>> I merged this for now so that we can fix this problem quickly.
>> However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.
> That makes a lot of sense. It would also mean that the problem of umlauts etc would be fixed everywhere that cleanhtml is used rather than needing to fix every invocation of cleanhtml.
> 
> I will look at putting something together for that.

This is a common problem with any kind of software. I suppose a lot of people are used to it and leave out any special characters where possible. I generally tend to comment in English because it’s shorter, doesn’t have the special character issue and allows customers to hire any staff where German isn’t their first language for example. For international customers I am using English anyways. Hence I never really ran into this problem.

>> I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them and so there should not be any difference when we re-read the configuration files. Unless some parts of the code specify any kind of encoding.
> We do receive UTF-8 from the browser. The problem seems to be that the HTML::Entities::encode_entities command doesn't work with UTF-8 but with ISO-8859-1 encoding. I can't find where I found this the other day when I was searching on this topic to understand how to overcome it.

Ah, yeah that would make sense. So basically we store it correctly but once we pass it through cleanhtml() we mess it up. Okay. That is good to know.

So we just need to encode/decode back to UTF-8.

> The fix is not encoding the text from the browser remark box into UTF-8 but decoding it from UTF-8. Once the text is in the files then it is fine.

We should always us UTF-8 everywhere. We might have some problems with Chinese or so (no idea really), but for 99% of our user-base UTF-8 should work fine.

> Of course my reasoning for doing the decoding may or may not be right, so I am always open to alternative suggestions.

I believe you found the right path :)

-Michael

> Regards,
> 
> Adolf.
>> -Michael
>>> On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>> 
>>> - If Freifunk München e.V. is entered as a remark it gets converted to
>>>   Freifunk München e.V.
>>> - This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
>>>   and the HTML::Entities::encode_entities command that is run on that remark text does
>>>   not work with UTF-8 text.
>>> - If the UTF-8 text in the remark is decoded before running through the cleanhtml command
>>>   then the characters with diacritical marks are correctly shown.
>>> - Have tested out the fix on a remark with a range of different characters with
>>>   diacritical marks and all of the ones tested were displayed correctly with the fix while
>>>   in the original form they were mangled.
>>> 
>>> Fixes: Bug#12395
>>> Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>> Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>> ---
>>> html/cgi-bin/dns.cgi | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>> 
>>> diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
>>> index 0a34d3fd6..eb6f908d5 100644
>>> --- a/html/cgi-bin/dns.cgi
>>> +++ b/html/cgi-bin/dns.cgi
>>> @@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
>>> # Go further if there was no error.
>>> if ( ! $errormessage) {
>>> # Check if a remark has been entered.
>>> +
>>> + # decode the UTF-8 text so that characters with diacritical marks such as
>>> + # umlauts are treated correctly by the following cleanhtml command
>>> + $cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
>>> +
>>> + # run the REMARK text through cleanhtml to ensure all unsafe html characters
>>> + # are correctly encoded to their html entities
>>> $cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
>>> 
>>> my %dns_servers = ();
>>> -- 
>>> 2.44.0
>>> 
> 
> -- 
> Sent from my laptop



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-12 14:56       ` Michael Tremer
@ 2024-03-13 22:05         ` Adolf Belka
  2024-03-15 10:53           ` Michael Tremer
  0 siblings, 1 reply; 7+ messages in thread
From: Adolf Belka @ 2024-03-13 22:05 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 5406 bytes --]

Hi Michael,

On 12/03/2024 15:56, Michael Tremer wrote:
> Hello,
> 
>> On 12 Mar 2024, at 12:27, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>
>> Hi Michael,
>>
>> On 12/03/2024 11:02, Michael Tremer wrote:
>>> Thank you.
>>> I merged this for now so that we can fix this problem quickly.
>>> However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.
>> That makes a lot of sense. It would also mean that the problem of umlauts etc would be fixed everywhere that cleanhtml is used rather than needing to fix every invocation of cleanhtml.
>>
>> I will look at putting something together for that.
> 
> This is a common problem with any kind of software. I suppose a lot of people are used to it and leave out any special characters where possible. I generally tend to comment in English because it’s shorter, doesn’t have the special character issue and allows customers to hire any staff where German isn’t their first language for example. For international customers I am using English anyways. Hence I never really ran into this problem.
> 
>>> I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them and so there should not be any difference when we re-read the configuration files. Unless some parts of the code specify any kind of encoding.
>> We do receive UTF-8 from the browser. The problem seems to be that the HTML::Entities::encode_entities command doesn't work with UTF-8 but with ISO-8859-1 encoding. I can't find where I found this the other day when I was searching on this topic to understand how to overcome it.
> 
> Ah, yeah that would make sense. So basically we store it correctly but once we pass it through cleanhtml() we mess it up. Okay. That is good to know.
> 
> So we just need to encode/decode back to UTF-8.
Two things with the previous patch I did.

Firstly I forgot to add

use Encode

at the top of the dns.cgi file. I had it in the vm test system but forgot to copy it across to the patch I created.

Secondly, I decode  the UTF-8 so cleanhtml works correctly with the text but I didn't then encode it back to UTF-8 after doing the html::entities cleaning as you have indicated would be the right approach.

I tested that today in my vm testbed and I confirm it works correctly, keeping the text with the correct html entity values but also then ensures that the text is back in UTF-8 format.

As the patch set from this original email has been merged into next I will submit a separate patch to add the use Encode and the encode back to UTF-8 lines for adding to next.

Regards,

Adolf.
> 
>> The fix is not encoding the text from the browser remark box into UTF-8 but decoding it from UTF-8. Once the text is in the files then it is fine.
> 
> We should always us UTF-8 everywhere. We might have some problems with Chinese or so (no idea really), but for 99% of our user-base UTF-8 should work fine.
> 
>> Of course my reasoning for doing the decoding may or may not be right, so I am always open to alternative suggestions.
> 
> I believe you found the right path :)
> 
> -Michael
> 
>> Regards,
>>
>> Adolf.
>>> -Michael
>>>> On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>
>>>> - If Freifunk München e.V. is entered as a remark it gets converted to
>>>>    Freifunk München e.V.
>>>> - This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
>>>>    and the HTML::Entities::encode_entities command that is run on that remark text does
>>>>    not work with UTF-8 text.
>>>> - If the UTF-8 text in the remark is decoded before running through the cleanhtml command
>>>>    then the characters with diacritical marks are correctly shown.
>>>> - Have tested out the fix on a remark with a range of different characters with
>>>>    diacritical marks and all of the ones tested were displayed correctly with the fix while
>>>>    in the original form they were mangled.
>>>>
>>>> Fixes: Bug#12395
>>>> Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>>> Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>>> ---
>>>> html/cgi-bin/dns.cgi | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
>>>> index 0a34d3fd6..eb6f908d5 100644
>>>> --- a/html/cgi-bin/dns.cgi
>>>> +++ b/html/cgi-bin/dns.cgi
>>>> @@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
>>>> # Go further if there was no error.
>>>> if ( ! $errormessage) {
>>>> # Check if a remark has been entered.
>>>> +
>>>> + # decode the UTF-8 text so that characters with diacritical marks such as
>>>> + # umlauts are treated correctly by the following cleanhtml command
>>>> + $cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
>>>> +
>>>> + # run the REMARK text through cleanhtml to ensure all unsafe html characters
>>>> + # are correctly encoded to their html entities
>>>> $cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
>>>>
>>>> my %dns_servers = ();
>>>> -- 
>>>> 2.44.0
>>>>
>>
>> -- 
>> Sent from my laptop
> 
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks
  2024-03-13 22:05         ` Adolf Belka
@ 2024-03-15 10:53           ` Michael Tremer
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Tremer @ 2024-03-15 10:53 UTC (permalink / raw)
  To: development

[-- Attachment #1: Type: text/plain, Size: 5804 bytes --]

Hello,

> On 13 Mar 2024, at 22:05, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
> 
> Hi Michael,
> 
> On 12/03/2024 15:56, Michael Tremer wrote:
>> Hello,
>>> On 12 Mar 2024, at 12:27, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>> 
>>> Hi Michael,
>>> 
>>> On 12/03/2024 11:02, Michael Tremer wrote:
>>>> Thank you.
>>>> I merged this for now so that we can fix this problem quickly.
>>>> However I was wondering whether we should consider making the decode statement a part of the “cleanhtml” function.
>>> That makes a lot of sense. It would also mean that the problem of umlauts etc would be fixed everywhere that cleanhtml is used rather than needing to fix every invocation of cleanhtml.
>>> 
>>> I will look at putting something together for that.
>> This is a common problem with any kind of software. I suppose a lot of people are used to it and leave out any special characters where possible. I generally tend to comment in English because it’s shorter, doesn’t have the special character issue and allows customers to hire any staff where German isn’t their first language for example. For international customers I am using English anyways. Hence I never really ran into this problem.
>>>> I am still unsure why this is happening in the first place. We should be receiving UTF-8 from the browser, and I believe that perl doesn’t natively store things in UTF-8. That is however not a problem, because it should read files the same way it wrote them and so there should not be any difference when we re-read the configuration files. Unless some parts of the code specify any kind of encoding.
>>> We do receive UTF-8 from the browser. The problem seems to be that the HTML::Entities::encode_entities command doesn't work with UTF-8 but with ISO-8859-1 encoding. I can't find where I found this the other day when I was searching on this topic to understand how to overcome it.
>> Ah, yeah that would make sense. So basically we store it correctly but once we pass it through cleanhtml() we mess it up. Okay. That is good to know.
>> So we just need to encode/decode back to UTF-8.
> Two things with the previous patch I did.
> 
> Firstly I forgot to add
> 
> use Encode

I try to always explicitly name the function that I call from a module. Like &Encode::encode(…) instead of just “encode()”. I am not sure whether that changes anything else but my feeling that this function belongs to a certain module.

> at the top of the dns.cgi file. I had it in the vm test system but forgot to copy it across to the patch I created.
> 
> Secondly, I decode  the UTF-8 so cleanhtml works correctly with the text but I didn't then encode it back to UTF-8 after doing the html::entities cleaning as you have indicated would be the right approach.
> 
> I tested that today in my vm testbed and I confirm it works correctly, keeping the text with the correct html entity values but also then ensures that the text is back in UTF-8 format.
> 
> As the patch set from this original email has been merged into next I will submit a separate patch to add the use Encode and the encode back to UTF-8 lines for adding to next.
> 
> Regards,
> 
> Adolf.
>>> The fix is not encoding the text from the browser remark box into UTF-8 but decoding it from UTF-8. Once the text is in the files then it is fine.
>> We should always us UTF-8 everywhere. We might have some problems with Chinese or so (no idea really), but for 99% of our user-base UTF-8 should work fine.
>>> Of course my reasoning for doing the decoding may or may not be right, so I am always open to alternative suggestions.
>> I believe you found the right path :)
>> -Michael
>>> Regards,
>>> 
>>> Adolf.
>>>> -Michael
>>>>> On 11 Mar 2024, at 12:19, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>> 
>>>>> - If Freifunk München e.V. is entered as a remark it gets converted to
>>>>>   Freifunk München e.V.
>>>>> - This is because cleanhtml is used on the UTF-8 remark text before saving it to the file
>>>>>   and the HTML::Entities::encode_entities command that is run on that remark text does
>>>>>   not work with UTF-8 text.
>>>>> - If the UTF-8 text in the remark is decoded before running through the cleanhtml command
>>>>>   then the characters with diacritical marks are correctly shown.
>>>>> - Have tested out the fix on a remark with a range of different characters with
>>>>>   diacritical marks and all of the ones tested were displayed correctly with the fix while
>>>>>   in the original form they were mangled.
>>>>> 
>>>>> Fixes: Bug#12395
>>>>> Tested-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>>>> Signed-off-by: Adolf Belka <adolf.belka(a)ipfire.org>
>>>>> ---
>>>>> html/cgi-bin/dns.cgi | 7 +++++++
>>>>> 1 file changed, 7 insertions(+)
>>>>> 
>>>>> diff --git a/html/cgi-bin/dns.cgi b/html/cgi-bin/dns.cgi
>>>>> index 0a34d3fd6..eb6f908d5 100644
>>>>> --- a/html/cgi-bin/dns.cgi
>>>>> +++ b/html/cgi-bin/dns.cgi
>>>>> @@ -142,6 +142,13 @@ if (($cgiparams{'SERVERS'} eq $Lang::tr{'save'}) || ($cgiparams{'SERVERS'} eq $L
>>>>> # Go further if there was no error.
>>>>> if ( ! $errormessage) {
>>>>> # Check if a remark has been entered.
>>>>> +
>>>>> + # decode the UTF-8 text so that characters with diacritical marks such as
>>>>> + # umlauts are treated correctly by the following cleanhtml command
>>>>> + $cgiparams{'REMARK'} = decode("UTF-8", $cgiparams{'REMARK'});
>>>>> +
>>>>> + # run the REMARK text through cleanhtml to ensure all unsafe html characters
>>>>> + # are correctly encoded to their html entities
>>>>> $cgiparams{'REMARK'} = &Header::cleanhtml($cgiparams{'REMARK'});
>>>>> 
>>>>> my %dns_servers = ();
>>>>> -- 
>>>>> 2.44.0
>>>>> 
>>> 
>>> -- 
>>> Sent from my laptop



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-15 10:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-11 12:19 [PATCH v2 1/2] dns.cgi: Revert "dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks" Adolf Belka
2024-03-11 12:19 ` [PATCH v2 2/2] dns.cgi: Fixes bug#12395 - German umlauts not correctly displayed in remarks Adolf Belka
2024-03-12 10:02   ` Michael Tremer
2024-03-12 12:27     ` Adolf Belka
2024-03-12 14:56       ` Michael Tremer
2024-03-13 22:05         ` Adolf Belka
2024-03-15 10:53           ` Michael Tremer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox