From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Müller
To: development@lists.ipfire.org
Subject: Re: Question regarding IPsec N2N throughput with and without IPS
Date: Fri, 24 Jan 2020 17:35:00 +0000
Message-ID: <79195991-f51e-44b5-5377-aec0f512432f@ipfire.org>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0969541673427614153=="
List-Id:

--===============0969541673427614153==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello Michael,

>>>>
>>>> Hello Stefan, hello list (CC'ed),
>>>>
>>>> are you aware of any IPS bottlenecks regarding IPsec N2N throughput?
>>>
>>> No, not at all.
>>>
>>> There is actually not much the IPS can do with the ESP packets. I would assume it just passes them through.
>> I am not sure about this as the decapsulated clear text packets show up on
>> ppp0/red0, too - perhaps these pass Suricata twice?
>
> Yes, after they are being decrypted, they are passing through suricata again.

I see.

>
> You can try to add a RETURN rule that matches the private IP address ranges so that you can skip this step and find out if this is causing any problems.
>
>>>> I finally (!) managed to get an IPsec connection between OpenBSD 6.6 (OpenIKED)
>>>> and IPFire 2.23 Core Update 139 working.
>>>
>>> Yay \o/.
>> Well, actually not. PSK works, but certificate based authentication does not
>> due to bug #12276 (OpenSSL does not use subjectAltNames from CSRs and is unable
>> to be safely forced to do so - you will have to supply an additional configuration
>> file to make CSR signing with subjectAltNames work which is a _very_ ugly thing
>> to do if no CSR configuration is available on the signing machine).
>>
>> Furthermore, Dead Peer Detection is tricky as OpenIKED does not support it
>> - the OpenBSD people use ifstated instead, but restarting single IPsec connections
>> is likely impossible at the moment -, so we can only rely on Strongswan here. :-/
>
> Oh OpenBSD. I have no idea why you are doing this to yourself.
>
>> Side note: OpenIKED does not support AES-GCM for ESP (see configuration below).
>
> That’s bad because it is fast and secure.

Sorry, I meant IKE here, not ESP, so it is not too bad. However, I still wonder why
they do not implement AES-GCM for IKE if they have already done this work for ESP.

>
>>>
>>> Please do not forget to add the relevant documentation to the IPFire wiki.
>> I will do so as soon as this thing works stable and all corresponding IPFire bugs
>> have been fixed. :-)
>
> Great!
>
>>>
>>>> During throughput tests, where
>>>> I downloaded a 1 GByte test file from the machine via the IPsec tunnel,
>>>> a rather large throughput difference with and without IPS enabled on RED
>>>> has come to my attention:
>>>>
>>>> With IPS enabled on RED, the download starts at ~ 2.5 MByte/sec. and
>>>> continually decreases to ~ 580 kByte/sec. - 800 kByte/sec., which is
>>>> even lower than OpenVPN performance. Without IPS enabled on RED, throughput
>>>> is 4.0 MByte/sec. on average - running the IPS on other interfaces does
>>>> not change this behaviour, neither does enabling monitoring mode.
>>>
>>> How is CPU load?
>> From the collectd graphs I can recall, load average (1 minute) was about 1.1.
>
> That is high. Please try disabling the Spectre/Meltdown mitigations just for the fun of it. It looks like something is blocking the processor from doing any actual work.

Here are the throughput results (put into a list to avoid confusion) from another
measurement series:

Operating system mode          Average throughput when      Average load
                               downloading a 1 GByte        during downloading
                               test file
- CPU mitigations disabled
  - IPS enabled                1.2 MByte/sec.               0.56 / 0.40 / 0.21
  - IPS disabled               4.5 MByte/sec.               0.01 / 0.02 / 0.16
- CPU mitigations enabled
  - IPS enabled                1.0 MByte/sec.               0.67 / 0.49 / 0.20
  - IPS disabled               4.4 MByte/sec.               0.13 / 0.14 / 0.07

Not surprisingly, the IPS makes the difference here. :-) The connection to the
VPS seems to be more stable now, perhaps yesterday was a bad time to measure due
to the DE-CIX Frankfurt/Main linecard outage...

CPU vulnerability information with mitigations disabled:
> [root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
> /sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
> /sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
> /sys/devices/system/cpu/vulnerabilities/mds:Vulnerable; SMT vulnerable
> /sys/devices/system/cpu/vulnerabilities/meltdown:Vulnerable
> /sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
> /sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
> /sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable, IBPB: disabled, STIBP: disabled
> /sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected

Kernel command line with CPU vulnerability mitigations disabled for reference purposes:
> [root@maverick ~]# cat /proc/cmdline
> BOOT_IMAGE=/vmlinuz-4.14.154-ipfire root=UUID=[REDACTED] ro panic=10 noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off

>
>>> Did you see any retransmissions?
>> I am currently unable to test this with IPsec but transferring the same test file
>> via SCP (which results approximately in the same throughput) shows some duplicate
>> ACKs, but much less than those we have seen on the similar bug investigated last
>> year.
>
> Ideally there should not be any.
>
>>>
>>>> This suspiciously sounds like the issue we have had with Suricata last
>>>> year - as far as I am concerned that was fully fixed. Are you aware of
>>>> any other similar issue that could cause this massive throughput loss?
>>>
>>> No. You can try Suricata 5 which has been posted to the list today.
>> I would like to do so - is there a nightly/pre-testing ISO available?
>
> Not yet. Next is still on core140.
>
> I am hoping for next week. Arne?

Good to hear - thank you both. :-)

>
>>>
>>>> Anyway, thank you in advance for any help and hints. :-)
>>>>
>>>> Kernel and CPU information of the OpenBSD machine:
>>>>> openbsd# uname -a
>>>>> OpenBSD openbsd 6.6 GENERIC.MP#4 amd64
>>>>> openbsd# sysctl hw.model hw.machine hw.ncpu
>>>>> hw.model=Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
>>>>> hw.machine=amd64
>>>>> hw.ncpu=2
>>>>
>>>> Content of /etc/iked.conf on the OpenBSD machine:
>>>>> set fragmentation
>>>>>
>>>>> ikev2 "[REDACTED]" active esp \
>>>>> from 10.xxx.xxx.2/24 to 10.xxx.xxx.0/24 \
>>>>> local [REDACTED] peer [REDACTED] \
>>>>> ikesa auth hmac-sha2-512 enc aes-256 prf hmac-sha2-512 group curve25519 \
>>>>> childsa enc aes-256-gcm group curve25519 \
>>>>> srcid [REDACTED] dstid [REDACTED] \
>>>>> ikelifetime 3h \
>>>>> lifetime 1h
>>>>
>>>> Kernel and CPU information of the IPFire machine:
>>>>> [root@maverick ~]# uname -a
>>>>> Linux maverick 4.14.154-ipfire #1 SMP Fri Nov 15 07:27:41 GMT 2019 x86_64 Intel(R) Celeron(R) CPU N3150 @ 1.60GHz GenuineIntel GNU/Linux
>>>
>>> This is a really small Atom processor AFAIK. Could we struggle with Meltdown/Spectre mitigations here?
>>> Just to rule it out, can you boot the kernel with them disabled?
>> Hm, I do not think we need to worry about them too much as the N3150 is not vulnerable
>> to some CPU vulnerabilities:
>>> [root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
>>> /sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
>>> /sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
>>> /sys/devices/system/cpu/vulnerabilities/mds:Mitigation: Clear CPU buffers; SMT disabled
>>> /sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
>>> /sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
>>> /sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: usercopy/swapgs barriers and __user pointer sanitization
>>> /sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: disabled, RSB filling
>>> /sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
>
> PTI might be it. But that should not only affect IPsec traffic then, but all the rest as well.

Yes, I think so, too.

Thanks, and best regards,
Peter Müller

--===============0969541673427614153==--
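
For reference, the measurement series above can be turned into a relative throughput loss per operating mode. This is only a small sketch of the arithmetic, using the four averages reported in this mail; the names `measurements`, `relative_loss` and `losses` are made up for illustration and are not part of any IPFire tooling.

```python
# Average throughput in MByte/sec., copied from the measurement series
# in this mail (download of a 1 GByte test file via the IPsec tunnel).
measurements = {
    "CPU mitigations disabled": {"ips_enabled": 1.2, "ips_disabled": 4.5},
    "CPU mitigations enabled": {"ips_enabled": 1.0, "ips_disabled": 4.4},
}


def relative_loss(with_ips, without_ips):
    """Return the share of throughput lost when the IPS is enabled."""
    return 1.0 - with_ips / without_ips


# Hypothetical helper mapping: operating mode -> relative loss (0.0 to 1.0).
losses = {
    mode: relative_loss(values["ips_enabled"], values["ips_disabled"])
    for mode, values in measurements.items()
}

for mode, loss in losses.items():
    print(f"{mode}: {loss:.0%} lower throughput with IPS enabled")
```

With the numbers above this comes out at roughly 73% (mitigations disabled) and 77% (mitigations enabled), which matches the observation that the IPS, not the CPU mitigations, accounts for most of the difference.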