Hi Michael,
I think I know how to solve the problem.
I tested out using HTML::Entities::encode_entities in a very simple Perl program and found I got the same type of entity encoding as in the WUI CGI pages.
However, if I treated the string of characters as utf8 then the HTML::Entities::encode_entities gave the results expected.
So I need to figure out how to treat the remark strings as utf8, and hopefully that should fix the problem. At least I can now see a path forward on this issue that keeps the protection of the cleanhtml command while also allowing characters with diacritical marks, other alphabets such as Cyrillic, and characters like the German eszett (ß), all of which currently get mangled.
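For reference, a minimal sketch of the kind of test described above. It assumes the remark arrives as raw UTF-8 bytes; the variable names are illustrative only and this is not the code from the WUI pages.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Raw UTF-8 bytes for "München" (assumed form of the input): ü is 0xC3 0xBC.
my $bytes = "M\xC3\xBCnchen";

# Encoding the byte string escapes each byte as if it were a Latin-1 character.
my $from_bytes = encode_entities($bytes);

# Decoding to a character string first yields a single entity for ü.
my $chars      = decode('UTF-8', $bytes);
my $from_chars = encode_entities($chars);

print "$from_bytes\n";
print "$from_chars\n";
```

The first line printed is M&Atilde;&frac14;nchen, which a browser renders as the mangled "MÃ¼nchen"; the second is M&uuml;nchen, which renders correctly.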
Will let you know how I get on.
I will also create patches later for the WUI CGI pages for the Firewall Groups and for WIO, as they do not use the cleanhtml command at all, yet they also have many Remark entries. I will also check the other WUI pages that don't use the cleanhtml command to see if they have remarks etc. that should use it.
Regards,
Adolf.
On 06/03/2024 23:23, Adolf Belka wrote:
Hi Michael,
On 06/03/2024 22:28, Michael Tremer wrote:
Hello Adolf,
I believe that I cannot merge these patches.
Then you need to also look back at the dns.cgi patch for the bug fix due to german umlauts being changed. The acceptance of that patch is what made me create these patches as they all had the same problem with remarks as well. If this can't be accepted as is then that patch needs to be reverted.
https://git.ipfire.org/?p=ipfire-2.x.git;a=commit;h=7c6ff5ff12331a53f416080a...
The reason is simply that it would create a stored cross-site scripting attack vector, because someone could store some <script> tags with some JS which will be executed if another admin opens the same page.
That is why we escape the content so that if there are any special characters like <> for HTML tags they won’t be interpreted by the browser.
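A minimal sketch of that escaping, calling HTML::Entities::encode_entities directly rather than the cleanhtml wrapper. The remark value is a made-up example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical remark that an admin might try to store.
my $remark = '<script>alert(1)</script>';

# With no second argument, encode_entities escapes <, >, &, quotes,
# control characters and high-bit characters.
my $escaped = encode_entities($remark);

print "$escaped\n";
```

This prints &lt;script&gt;alert(1)&lt;/script&gt;, which the browser shows as text instead of running it.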
We might have problems where we accidentally call the cleanhtml function twice which should show garbage. We might also have a problem where the function is not giving us the output that we want.
Which strings have been causing problems? Just German umlauts like “äöü”?
It is any character that has an accent or other diacritical mark.
I just tried entering the following into the remark section for a dns server entry as an example Ä ã ö â á à
and the remark was changed to Ä ã ö â á Ã
and if I edit the entry but don't change the remark the new characters above get changed again into Ä ã ö â á ÃÂ
I would have expected that running cleanhtml would produce characters that are considered safe after one pass. Instead, the encoding seems to create characters that cleanhtml itself treats as unsafe and encodes again on the next run, and the results of that are again considered unsafe.
If cleanhtml needs to stay in use for the remark entries as well, then I have no idea how to get German umlauts and other accented characters shown correctly. They are all high-bit (non-ASCII) characters, and cleanhtml encodes those by default because it considers them unsafe. So I would either need some help on how to deal with it, or maybe someone else needs to pick up the original bug #12395.
The only thing I found is that cleanhtml calls the escape function, which calls HTML::Entities::encode_entities, and that command can take an additional option defining the characters that are considered unsafe. I would then need some guidance on which characters are considered unsafe and whether that set applies to all invocations of cleanhtml. I.e. should a modified cleanhtml with the extra option for HTML::Entities::encode_entities only be used for some of the cleanhtml calls, and if so, would that be only the remark/comment entries?
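To make the question concrete, here is a rough sketch of what passing that extra option could look like. The unsafe set used here (<, >, &, and both quote characters) is only an assumption for illustration, not a recommendation, and the remark text is made up. Whether such a set is sufficient everywhere cleanhtml is used is exactly the open question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Hypothetical remark as raw UTF-8 bytes, containing an umlaut and some markup.
my $bytes  = "Freifunk M\xC3\xBCnchen <b>e.V.</b>";
my $remark = decode('UTF-8', $bytes);

# The second argument lists the characters to treat as unsafe.
# This particular set is an assumption made for the example.
my $safe = encode_entities($remark, q{<>&"'});

# Output layer so the decoded ü is written back out as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "$safe\n";
```

With this set the markup is still escaped (&lt;b&gt;e.V.&lt;/b&gt;) while the ü is left as a plain character.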
Regards, Adolf.
-Michael
On 5 Mar 2024, at 19:44, Adolf Belka adolf.belka@ipfire.org wrote:
- Using cleanhtml on Remarks means that all characters with
  diacritical marks, such as umlauts or grave accents, get encoded
  into other characters.
- If Freifunk München e.V. is entered as a remark it gets converted to
  Freifunk MÃ¼nchen e.V.
- cleanhtml is only removed from Remarks or Comment fields. In other
  places it has been left in place.
Tested-by: Adolf Belka adolf.belka@ipfire.org
Signed-off-by: Adolf Belka adolf.belka@ipfire.org
 html/cgi-bin/connscheduler.cgi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/html/cgi-bin/connscheduler.cgi b/html/cgi-bin/connscheduler.cgi
index cc78cbc1b..817247cc4 100644
--- a/html/cgi-bin/connscheduler.cgi
+++ b/html/cgi-bin/connscheduler.cgi
@@ -2,7 +2,7 @@
 ###############################################################################
 # #
 # IPFire.org - A linux based firewall #
-# Copyright (C) 2007 Michael Tremer & Christian Schmidt #
+# Copyright (C) 2007-2024 IPFire Team info@ipfire.org #
 # #
 # This program is free software: you can redistribute it and/or modify #
 # it under the terms of the GNU General Public License as published by #
@@ -186,7 +186,7 @@ if ( ($cgiparams{'ACTION'} eq 'add') || ($cgiparams{'ACTION'} eq 'update') )
 	$CONNSCHED::config[$i]{'DAYSTYPE'} = lc($cgiparams{'ACTION_DAYSTYPE'});
 	$CONNSCHED::config[$i]{'DAYS'} = $l_days;
 	$CONNSCHED::config[$i]{'WEEKDAYS'} = $l_weekdays;
-	$CONNSCHED::config[$i]{'COMMENT'} = &Header::cleanhtml($cgiparams{'ACTION_COMMENT'});
+	$CONNSCHED::config[$i]{'COMMENT'} = $cgiparams{'ACTION_COMMENT'};
 	&CONNSCHED::WriteConfig;
 	}
-- 
2.44.0