Hello Michael,
Hello Stefan, hello list (CC'ed),
are you aware of any IPS bottlenecks regarding IPsec N2N throughput?
No, not at all.
There is actually not much the IPS can do with the ESP packets. I would assume it just passes them through.
I am not sure about this, as the decapsulated clear text packets show up on ppp0/red0, too. Perhaps these pass Suricata twice?
Yes, after they have been decrypted, they pass through Suricata again.
I see.
You can try to add a RETURN rule that matches the private IP address ranges so that you can skip this step and find out if this is causing any problems.
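To illustrate which traffic such a rule would exempt, here is a minimal Python sketch. It assumes that "private IP address ranges" means the three RFC 1918 IPv4 networks; the names `PRIVATE_NETWORKS` and `would_skip_ips` are hypothetical and not part of IPFire or Suricata. It only models the match condition and is not a firewall rule.

```python
import ipaddress

# Assumption: "private IP address ranges" refers to the RFC 1918 IPv4 networks.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def would_skip_ips(source: str) -> bool:
    """Return True if a packet from `source` would match the RETURN rule
    and therefore bypass the IPS (hypothetical helper for illustration)."""
    address = ipaddress.ip_address(source)
    return any(address in network for network in PRIVATE_NETWORKS)


if __name__ == "__main__":
    for candidate in ("10.1.2.3", "192.168.0.10", "8.8.8.8"):
        print(candidate, would_skip_ips(candidate))
```

Decrypted tunnel traffic from a 10.x.x.x network would match, while the ESP packets between the public peer addresses would not.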
I finally (!) managed to get an IPsec connection between OpenBSD 6.6 (OpenIKED) and IPFire 2.23 Core Update 139 working.
Yay \o/.
Well, actually not. PSK works, but certificate-based authentication does not, due to bug #12276. OpenSSL does not use subjectAltNames from CSRs and cannot safely be forced to do so. You have to supply an additional configuration file to make CSR signing with subjectAltNames work, which is a _very_ ugly thing to do if no CSR configuration is available on the signing machine.
Furthermore, Dead Peer Detection is tricky as OpenIKED does not support it (the OpenBSD people use ifstated instead, but restarting single IPsec connections is likely impossible at the moment), so we can only rely on strongSwan here. :-/
Oh OpenBSD. I have no idea why you are doing this to yourself.
Side note: OpenIKED does not support AES-GCM for ESP (see configuration below).
That’s bad because it is fast and secure.
Sorry, I meant IKE here, not ESP, so it is not too bad. However, I still wonder why they do not implement AES-GCM for IKE if they have already done this work for ESP.
Please do not forget to add the relevant documentation to the IPFire wiki.
I will do so as soon as this works reliably and all corresponding IPFire bugs have been fixed. :-)
Great!
During throughput tests, where I downloaded a 1 GByte test file from the machine via the IPsec tunnel, a rather large throughput difference with and without IPS enabled on RED has come to my attention:
With IPS enabled on RED, the download starts at ~ 2.5 MByte/sec. and continually decreases to between ~ 580 kByte/sec. and 800 kByte/sec., which is even lower than OpenVPN performance. Without IPS enabled on RED, throughput is 4.0 MByte/sec. on average. Running the IPS on other interfaces does not change this behaviour, and neither does enabling monitoring mode.
How is CPU load?
As far as I can recall from the collectd graphs, the load average (1 minute) was about 1.1.
That is high. Please try disabling the Spectre/Meltdown mitigations just for the fun of it. It looks like something is blocking the processor from doing any actual work.
Here are the throughput results (put into a list to avoid confusion) from another measurement series:
Operating system mode: average throughput when downloading a 1 GByte test file, average load during downloading

- CPU mitigations disabled
    - IPS enabled:  1.2 MByte/sec., load 0.56 / 0.40 / 0.21
    - IPS disabled: 4.5 MByte/sec., load 0.01 / 0.02 / 0.16
- CPU mitigations enabled
    - IPS enabled:  1.0 MByte/sec., load 0.67 / 0.49 / 0.20
    - IPS disabled: 4.4 MByte/sec., load 0.13 / 0.14 / 0.07
Not surprisingly, the IPS makes the difference here. :-) The connection to the VPS seems to be more stable now, perhaps yesterday was a bad time to measure due to the DE-CIX Frankfurt/Main linecard outage...
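To put the averages quoted above into proportion, here is a small Python sketch that computes the relative throughput loss caused by the IPS. The function name `relative_loss` and the dictionary layout are my own choices for illustration; the figures are the ones from this measurement series.

```python
def relative_loss(without_ips: float, with_ips: float) -> float:
    """Return the throughput lost by enabling the IPS, in percent."""
    return (without_ips - with_ips) / without_ips * 100.0


# Average throughput in MByte/sec.: (IPS disabled, IPS enabled)
measurements = {
    "CPU mitigations disabled": (4.5, 1.2),
    "CPU mitigations enabled": (4.4, 1.0),
}

if __name__ == "__main__":
    for mode, (ips_off, ips_on) in measurements.items():
        loss = relative_loss(ips_off, ips_on)
        print(f"{mode}: {loss:.0f} % less throughput with IPS enabled")
```

In both modes the IPS costs roughly three quarters of the throughput, whereas toggling the CPU mitigations changes the result by only 0.1 to 0.2 MByte/sec.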
CPU vulnerability information with mitigations disabled:
[root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/mds:Vulnerable; SMT vulnerable
/sys/devices/system/cpu/vulnerabilities/meltdown:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable, IBPB: disabled, STIBP: disabled
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
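As a convenience for comparing such listings between boots, here is a minimal Python sketch that parses `grep .` output of this form and reports which entries are still marked "Vulnerable". The helper names `parse_vulnerabilities` and `still_vulnerable` are hypothetical, and the sample text is an excerpt of the listing above.

```python
def parse_vulnerabilities(grep_output: str) -> dict:
    """Map each vulnerability name to its status string.

    Expects lines of the form '<path>:<status>' as printed by
    `grep . /sys/devices/system/cpu/vulnerabilities/*`.
    """
    statuses = {}
    for line in grep_output.splitlines():
        path, separator, status = line.partition(":")
        if not separator:
            continue  # skip lines without a 'path:status' structure
        name = path.rsplit("/", 1)[-1]
        statuses[name] = status.strip()
    return statuses


def still_vulnerable(statuses: dict) -> list:
    """Return the sorted names whose status starts with 'Vulnerable'."""
    return sorted(
        name for name, status in statuses.items()
        if status.startswith("Vulnerable")
    )


SAMPLE = """\
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/mds:Vulnerable; SMT vulnerable
/sys/devices/system/cpu/vulnerabilities/meltdown:Vulnerable
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Vulnerable, IBPB: disabled, STIBP: disabled
"""

if __name__ == "__main__":
    print(still_vulnerable(parse_vulnerabilities(SAMPLE)))
```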
Kernel command line with CPU vulnerability mitigations disabled for reference purposes:
[root@maverick ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.14.154-ipfire root=UUID=[REDACTED] ro panic=10 noibrs noibpb nopti nospectre_v2 nospectre_v1 l1tf=off nospec_store_bypass_disable no_stf_barrier mds=off mitigations=off
Did you see any retransmissions?
I am currently unable to test this with IPsec, but transferring the same test file via SCP (which results in approximately the same throughput) shows some duplicate ACKs, though far fewer than we saw with the similar bug investigated last year.
Ideally there should not be any.
This sounds suspiciously like the issue we had with Suricata last year. As far as I am aware, that was fully fixed. Are you aware of any other similar issue that could cause this massive throughput loss?
No. You can try Suricata 5 which has been posted to the list today.
I would like to do so - is there a nightly/pre-testing ISO available?
Not yet. Next is still on core140. I am hoping for next week. Arne?
Good to hear - thank you both. :-)
Anyway, thank you in advance for any help and hints. :-)
Kernel and CPU information of the OpenBSD machine:
openbsd# uname -a
OpenBSD openbsd 6.6 GENERIC.MP#4 amd64
openbsd# sysctl hw.model hw.machine hw.ncpu
hw.model=Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
hw.machine=amd64
hw.ncpu=2
Content of /etc/iked.conf on the OpenBSD machine:
set fragmentation
ikev2 "[REDACTED]" active esp \
    from 10.xxx.xxx.2/24 to 10.xxx.xxx.0/24 \
    local [REDACTED] peer [REDACTED] \
    ikesa auth hmac-sha2-512 enc aes-256 prf hmac-sha2-512 group curve25519 \
    childsa enc aes-256-gcm group curve25519 \
    srcid [REDACTED] dstid [REDACTED] \
    ikelifetime 3h \
    lifetime 1h
Kernel and CPU information of the IPFire machine:
[root@maverick ~]# uname -a
Linux maverick 4.14.154-ipfire #1 SMP Fri Nov 15 07:27:41 GMT 2019 x86_64 Intel(R) Celeron(R) CPU N3150 @ 1.60GHz GenuineIntel GNU/Linux
This is a really small Atom processor AFAIK. Could we struggle with Meltdown/Spectre mitigations here? Just to rule it out, can you boot the kernel with them disabled?
Hm, I do not think we need to worry about them too much, as the N3150 is not affected by several of the CPU vulnerabilities:
[root@maverick ~]# grep . /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/itlb_multihit:Not affected
/sys/devices/system/cpu/vulnerabilities/l1tf:Not affected
/sys/devices/system/cpu/vulnerabilities/mds:Mitigation: Clear CPU buffers; SMT disabled
/sys/devices/system/cpu/vulnerabilities/meltdown:Mitigation: PTI
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass:Not affected
/sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: usercopy/swapgs barriers and __user pointer sanitization
/sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: disabled, RSB filling
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort:Not affected
PTI might be it. But that should not only affect IPsec traffic then, but all the rest as well.
Yes, I think so, too.
Thanks, and best regards, Peter Müller