From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Question regarding IPsec N2N throughput with and without IPS
Date: Tue, 28 Jan 2020 17:18:28 +0000
Message-ID: <9AA652E3-AA22-4421-A9C5-604FC104D4A0@ipfire.org>
In-Reply-To: <79195991-f51e-44b5-5377-aec0f512432f@ipfire.org>

Hi,

> On 24 Jan 2020, at 17:35, Peter Müller wrote:
>
> Hello Michael,
>
>>>>>
>>>>> Hello Stefan, hello list (CC'ed),
>>>>>
>>>>> are you aware of any IPS bottlenecks regarding IPsec N2N throughput?
>>>>
>>>> No, not at all.
>>>>
>>>> There is actually not much the IPS can do with the ESP packets. I would assume it just passes them through.
>>> I am not sure about this as the decapsulated clear text packages show up on
>>> ppp0/red0, too - perhaps these pass Suricata twice?
>>
>> Yes, after they are being decrypted, they are passing through suricata again.
> I see.
>>
>> You can try to add a RETURN rule that matches the private IP address ranges so that you can skip this step and find out if this is causing any problems.
>>
>>>>> I finally (!) managed to get an IPsec connection between OpenBSD 6.6 (OpenIKED)
>>>>> and IPFire 2.23 Core Update 139 working.
>>>>
>>>> Yay \o/.
>>> Well, actually not. PSK works, but certificate based authentication does not
>>> due to bug #12276 (OpenSSL does not use subjectAltNames from CSRs and is unable
>>> to be safely forced to do so - you will have to supply an additional configuration
>>> file to make CSR signing with subjectAltNames work which is a _very_ ugly thing
>>> to do if no CSR configuration is available on the signing machine).
>>>
>>> Furthermore, Dead Peer Detection is tricky as OpenIKED does not support it
>>> - the OpenBSD people use ifstated instead, but restarting single IPsec connections
>>> is likely impossible at the moment -, so we can only rely on Strongswan here. :-/
>>
>> Oh OpenBSD. I have no idea why you are doing this to yourself.
>>
>>> Side note: OpenIKED does not support AES-GCM for ESP (see configuration below).
>>
>> That’s bad because it is fast and secure.
> Sorry, I meant IKE here, not ESP, so it is not too bad. However, I still wonder why
> they do not implement AES-GCM for IKE if they have already done this work for ESP.

Interesting question.

>>
>>>>
>>>> Please do not forget to add the relevant documentation to the IPFire wiki.
>>> I will do so as soon this thing works stable and all corresponding IPFire bugs
>>> have been fixed. :-)
>>
>> Great!
>>
>>>>
>>>>> During throughput tests, where
>>>>> I downloaded a 1 GByte test file from the machine via the IPsec tunnel,
>>>>> a rather large throughput difference with and without IPS enabled on RED
>>>>> has come to my attention:
>>>>>
>>>>> With IPS enabled on RED, the download starts at ~ 2.5 MByte/sec. and
>>>>> continually decreases to ~ 580 kByte/sec. - 800 kByte/sec., which is
>>>>> even lower than OpenVPN performance. Without IPS enabled on RED, throughput
>>>>> is 4.0 MByte/sek. on average - running the IPS on other interfaces does
>>>>> not change this behaviour, neither does enabling monitoring mode.
>>>>
>>>> How is CPU load?
>>> From the collectd graphs I can recall, load average (1 minute) was about 1.1 .
>>
>> That is high. Please try disabling the Spectre/Meltdown mitigations just for the fun of it. It looks like something is blocking the processor from doing any actual work.
> Here are the throughput results (put into a list to avoid confusion) from another
> measurement series:
>
> Operating system mode | Average throughput when downloading a 1 GByte test file | Average load during downloading
> - CPU mitigations disabled
>   - IPS enabled     1.2 MByte/sec.    0.56 / 0.40 / 0.21
>   - IPS disabled    4.5 MByte/sec.    0.01 / 0.02 / 0.16
> - CPU mitigations enabled
>   - IPS enabled     1.0 MByte/sec.    0.67 / 0.49 / 0.20
>   - IPS disabled    4.4 MByte/sec.    0.13 / 0.14 / 0.07
>
> Not surprisingly, the IPS makes the difference here. :-) The connection to the
> VPS seems to be more stable now, perhaps yesterday was a bad time to measure
> due to the DE-CIX Frankfurt/Main linecard outage...

Well, this makes sense then.

Could you please try to change Suricata’s runmode to “autofp”? This is currently set to “worker”:

  https://git.ipfire.org/?p=ipfire-2.x.git;a=blob;f=config/suricata/suricata.yaml;h=af9cb75a9737c76232388bb8e0b4ede9b155bb9a;hb=HEAD#l321

> CPU vulnerability information with mitigations disabled:
>> [root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
>> /sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
>> /sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
>> /sys/devices/system/cpu/vulnerabilities/mds:Vulnerable; SMT vulnerable
>> /sys/devices/system/cpu/vulnerabilities/meltdown:Vulnerable
>> /sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
>> /sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
>> /sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable, IBPB: disabled, STIBP: disabled
>> /sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
>
> Kernel command line with CPU vulnerability mitigations disabled for reference purposes:
>> [root@maverick ~]# cat /proc/cmdline
>> BOOT_IMAGE=/vmlinuz-4.14.154-ipfire root=UUID=[REDACTED] ro panic=10 noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off

So that is a clear no then. I do not know why I thought this could have an impact…

-Michael

>>
>>>> Did you see any retransmissions?
>>> I am currently unable to test this with IPsec but transferring the same test file
>>> via SCP (which results approximately in the same throughput) shows some duplicate
>>> ACKs, but much less than those we have seen on the similar bug investigated last
>>> year.
>>
>> Ideally there should not be any.
>>
>>>>
>>>>> This suspiciously sounds like the issue we have had with Suricata last
>>>>> year - as far as I am concerned that was fully fixed. Are you aware of
>>>>> any other similar issue that could cause this massive throughput loss?
>>>>
>>>> No. You can try Suricata 5 which has been posted to the list today.
>>> I would like to do so - is there a nightly/pre-testing ISO available?
>>
>> Not yet. Next is still on core140.
>>
>> I am hoping for next week. Arne?
> Good to hear - thank you both. :-)
>>
>>>>
>>>>> Anyway, thank you in advance for any help and hints. :-)
>>>>>
>>>>> Kernel and CPU information of the OpenBSD machine:
>>>>>> openbsd# uname -a
>>>>>> OpenBSD openbsd 6.6 GENERIC.MP#4 amd64
>>>>>> openbsd# sysctl hw.model hw.machine hw.ncpu
>>>>>> hw.model=Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
>>>>>> hw.machine=amd64
>>>>>> hw.ncpu=2
>>>>>
>>>>> Content of /etc/iked.conf on the OpenBSD machine:
>>>>>> set fragmentation
>>>>>>
>>>>>> ikev2 "[REDACTED]" active esp \
>>>>>> from 10.xxx.xxx.2/24 to 10.xxx.xxx.0/24 \
>>>>>> local [REDACTED] peer [REDACTED] \
>>>>>> ikesa auth hmac-sha2-512 enc aes-256 prf hmac-sha2-512 group curve25519 \
>>>>>> childsa enc aes-256-gcm group curve25519 \
>>>>>> srcid [REDACTED] dstid [REDACTED] \
>>>>>> ikelifetime 3h \
>>>>>> lifetime 1h
>>>>>
>>>>> Kernel and CPU information of the IPFire machine:
>>>>>> [root@maverick ~]# uname -a
>>>>>> Linux maverick 4.14.154-ipfire #1 SMP Fri Nov 15 07:27:41 GMT 2019 x86_64 Intel(R) Celeron(R) CPU N3150 @ 1.60GHz GenuineIntel GNU/Linux
>>>>
>>>> This is a really small Atom processor AFAIK. Could we struggle with Meltdown/Spectre mitigations here? Just to rule it out, can you boot the kernel with them disabled?
>>> Hm, I do not think we need to worry about them too much as the N3150 is not vulnerable
>>> to some CPU vulnerabilities:
>>>> [root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
>>>> /sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
>>>> /sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
>>>> /sys/devices/system/cpu/vulnerabilities/mds:Mitigation: Clear CPU buffers; SMT disabled
>>>> /sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
>>>> /sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
>>>> /sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: usercopy/swapgs barriers and __user pointer sanitization
>>>> /sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: disabled, RSB filling
>>>> /sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
>>
>> PTI might be it. But that should not only affect IPsec traffic then, but all the rest as well.
> Yes, I think so, too.
>
> Thanks, and best regards,
> Peter Müller
>
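
For reference, the relative cost of the IPS in the second measurement series can be worked out with a few lines of Python. This is only a rough sketch: the throughput figures are the ones from the list in this thread, while the dictionary layout and all names are made up for illustration.

```python
# Throughput figures (MByte/sec.) copied from the measurement series in this thread.
# The dictionary layout and all names are illustrative assumptions.
MEASUREMENTS = {
    "mitigations disabled": {"ips_enabled": 1.2, "ips_disabled": 4.5},
    "mitigations enabled": {"ips_enabled": 1.0, "ips_disabled": 4.4},
}


def loss_percent(with_ips: float, without_ips: float) -> float:
    """Return the throughput lost to the IPS, in percent of the rate without IPS."""
    return (1.0 - with_ips / without_ips) * 100.0


for mode, rates in MEASUREMENTS.items():
    loss = loss_percent(rates["ips_enabled"], rates["ips_disabled"])
    print(f"{mode}: {loss:.1f} % lower with IPS enabled")
```

In both cases roughly three quarters of the throughput is lost with the IPS enabled, whereas toggling the CPU mitigations changes the rate without IPS by only 0.1 MByte/sec.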