From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: fireperf results
Date: Tue, 16 Feb 2021 19:07:49 +0000

Hello,

> On 16 Feb 2021, at 18:50, Adolf Belka (ipfire-dev) wrote:
>
> Hi Michael,
>
> Daniel asked if I was running suricata and I was.

That email must have got lost. But of course this explains it.

> Removing that made everything much better. Now with -P 1 to -P 1000 I was getting ~950Mb/s. With -P 10000 I got only ~550Mb/s.
>
> On 16/02/2021 17:16, Michael Tremer wrote:
>> Hello Adolf,
>> This is very surprising to me. I am almost shocked.
>> Maybe some of my assumptions are wrong, but if this is the actual throughput of this piece of hardware, I do not find it enough.
>>> On 16 Feb 2021, at 12:44, Adolf Belka (ipfire-dev) wrote:
>>>
>>> Hi All,
>>>
>>> Following are the fireperf results I obtained:
>>>
>>> server: IPFire 2.25 - Core Update 153; Intel Celeron CPU J1900 @ 1.99GHz x4; I211 Gigabit Network Connection
>> You have a small processor here with a rather high clock rate. Four cores at 2 GHz is quite something.
>> However, it is a Celeron processor, and that means it is a bit more stripped down than others - usually in caches and pipeline throughput. It might be that that is what bites you really badly here.
>> You have a better than average NIC. The Intel network controllers are not bad, although the i2xx series is not the top of their line.
>> Could you please send the output of "cat /proc/interrupts" so that we can see how many queues they have?
>
> -bash-5.0$ cat /proc/interrupts
>               CPU0       CPU1       CPU2       CPU3
>   0:            40          0          0          0  IO-APIC    2-edge       timer
>   1:             3          0          0          0  IO-APIC    1-edge       i8042
>   4:           430          0          0          0  IO-APIC    4-edge       ttyS0
>   8:            55          0          0          0  IO-APIC    8-fasteoi    rtc0
>   9:             0          0          0          0  IO-APIC    9-fasteoi    acpi
>  12:             4          0          0          0  IO-APIC   12-edge       i8042
>  18:             0          0          0          0  IO-APIC   18-fasteoi    i801_smbus
>  91:       2959648          0          0          0  PCI-MSI  311296-edge    ahci[0000:00:13.0]
>  92:        292544          0          0          0  PCI-MSI  327680-edge    xhci_hcd
>  93:             1          0          0          0  PCI-MSI 2097152-edge    orange0
>  94:        332181          0          0          0  PCI-MSI 2097153-edge    orange0-rx-0
>  95:         94258          0          0          0  PCI-MSI 2097154-edge    orange0-rx-1
>  96:        328866          0          0          0  PCI-MSI 2097155-edge    orange0-tx-0
>  97:        169838          0          0          0  PCI-MSI 2097156-edge    orange0-tx-1
>  98:             1          0          0          0  PCI-MSI 3670016-edge    red0
>  99:       9795304          0          0          0  PCI-MSI 3670017-edge    red0-rx-0
> 100:         94258          0          0          0  PCI-MSI 3670018-edge    red0-rx-1
> 101:       9574443          0          0          0  PCI-MSI 3670019-edge    red0-tx-0
> 102:       1067926          0          0          0  PCI-MSI 3670020-edge    red0-tx-1
> 103:             1          0          0          0  PCI-MSI 4194304-edge    green0
> 104:      15302199          0          0          0  PCI-MSI 4194305-edge    green0-rx-0
> 105:         94259          0          0          0  PCI-MSI 4194306-edge    green0-rx-1
> 106:      13422909          0          0          0  PCI-MSI 4194307-edge    green0-tx-0
> 107:       4977558          0          0          0  PCI-MSI 4194308-edge    green0-tx-1
> 108:             1          0          0          0  PCI-MSI 4718592-edge    blue0
> 109:         97391          0          0          0  PCI-MSI 4718593-edge    blue0-rx-0
> 110:         94259          0          0          0  PCI-MSI 4718594-edge    blue0-rx-1
> 111:         94259          0          0          0  PCI-MSI 4718595-edge    blue0-tx-0
> 112:        137222          0          0          0  PCI-MSI 4718596-edge    blue0-tx-1
> NMI:           638        468        287        294  Non-maskable interrupts
> LOC:      18102811   13305397   18242209   25513971  Local timer interrupts
> SPU:             0          0          0          0  Spurious interrupts
> PMI:           638        468        287        294  Performance monitoring interrupts
> IWI:          9208         21          2         26  IRQ work interrupts
> RTR:             0          0          0          0  APIC ICR read retries
> RES:        563980     301713     552880     579914  Rescheduling interrupts
> CAL:        184217     137668     310137     256984  Function call interrupts
> TLB:        170395     122144     132849     103440  TLB shootdowns
> TRM:             0          0          0          0  Thermal event interrupts
> THR:             0          0          0          0  Threshold APIC interrupts
> DFR:             0          0          0          0  Deferred Error APIC interrupts
> MCE:             0          0          0          0  Machine check exceptions
> MCP:           605        605        605        605  Machine check polls
> HYP:             0          0          0          0  Hypervisor callback interrupts
> ERR:             1
> MIS:             0
> PIN:             0          0          0          0  Posted-interrupt notification event
> NPI:             0          0          0          0  Nested posted-interrupt event
> PIW:             0          0          0          0  Posted-interrupt wakeup event
>
>
>>> client: Arch Linux; Intel Core i5-8400 CPU @ 2.80GHz 6 core; 1GBit NIC
>>>
>>> Server:
>>> fireperf -s -P 10000 -p 63000:63010
>>>
>>> Client:
>>> fireperf -c -P 1 -x -p 63000:63010 -> 100 - 3000 cps, strongly fluctuating. After a couple of minutes the client cps went down to 0 and stayed there. I had to stop fireperf and restart the terminal to get it working again.
>>>
>>> fireperf -c -P 10 -x -p 63000:63010 -> 250 - 500 cps, fluctuating
>>>
>>> fireperf -c -P 100 -x -p 63000:63010 -> 220 - 1000 cps, fluctuating
>>>
>>> fireperf -c -P 1000 -x -p 63000:63010 -> 1200 - 2500 cps, fluctuating
>>>
>>> fireperf -c -P 10000 -x -p 63000:63010 -> 0 - 7000 cps, hugely fluctuating
>> From the beginning you have quite a large fluctuation here. Some is normal, but this is a lot. It seems that the system is overloaded from the very beginning.
>> I have not done experiments with lots of different hardware (I usually used the same), but Daniel has, and we normally see the systems being very idle with only one connection at a time. There isn't too much for the CPU to do except waiting.
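
Looking at the interrupt counters above: every single interrupt, including all four queues of every NIC, is being serviced by CPU0 only. The other three cores never see any of them. That alone would explain one core being maxed out while the rest sit mostly idle.

As a rough sketch (the /proc/irq interface is standard Linux; the IRQ numbers are taken from your output above, so adjust them if they change), the red0 queues could be pinned to different cores by hand:

  # Which CPUs may service IRQ 99 (red0-rx-0)? The value is a hex CPU bitmask.
  cat /proc/irq/99/smp_affinity

  # Spread the four red0 queue IRQs across the four cores
  # (bitmasks: 1=CPU0, 2=CPU1, 4=CPU2, 8=CPU3)
  echo 1 > /proc/irq/99/smp_affinity    # red0-rx-0 -> CPU0
  echo 2 > /proc/irq/100/smp_affinity   # red0-rx-1 -> CPU1
  echo 4 > /proc/irq/101/smp_affinity   # red0-tx-0 -> CPU2
  echo 8 > /proc/irq/102/smp_affinity   # red0-tx-1 -> CPU3

Running irqbalance would achieve something similar automatically.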
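
Also, the client cps dropping to zero after a couple of minutes looks to me like ephemeral port exhaustion - with tens of thousands of short-lived connections, the client ends up with sockets stuck in TIME_WAIT and runs out of local ports until they age out. That is only an assumption on my part, but it is easy to verify while it happens:

  # The summary shows how many sockets are sitting in TIME_WAIT
  ss -s

  # The range of local ports available for outgoing connections
  sysctl net.ipv4.ip_local_port_range

If the timewait figure is close to the size of that port range, that is the culprit.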
>>> In all cases the CPU utilisation was quite low on both IPFire and the Arch Linux desktop.
>> Not surprising on the desktop side, because there wasn't a lot of stuff to do.
>>> I then repeated the above tests removing the -x option so I could see the data bandwidth.
>>>
>>> fireperf -c -P 1 -p 63000:63010 -> 225Mb/s - 1 core at 100%, rest around 30% to 40%
>> This is the most surprising part.
>> The IPFire Mini Appliance for example only has 1 GHz of clock and it doesn't have any problems transmitting a whole gigabit per second of data. This system has double the clock speed and the same NIC (or at least a very similar one).
>>> fireperf -c -P 10 -p 63000:63010 -> 185Mb/s - similar to above
>>>
>>> fireperf -c -P 100 -p 63000:63010 -> 210Mb/s - similar to above
>> The bandwidth should have increased here. That means we know that the bottleneck is not the network, but something else.
>> The one core that is maxed out is to a good extent the fireperf process generating packets. The rest is overhead from the OS, network stack and NIC driver - which feels way too high to me.
>>> fireperf -c -P 1000 -p 63000:63010 -> 370 - 450Mb/s - 2 cores at 100%, rest at 30% to 40%
>>>
>>> fireperf -c -P 10000 -p 63000:63010 -> 400Mb/s - 1Gb/s - 2 cores at 100%, rest at 40% to 50%
>> It looks like you have more than one receive queue.
>> Did you actually achieve the 10k connections?
>>> I recently got my glass fibre gigabit connection installed. The supplier hooked his laptop directly to the media converter and got around 950Mb/s.
>> Could you test "speedtest-cli" and see what that reports?
>
> After turning off suricata I ran speedtest-cli on IPFire and got ~850Mb/s
> I ran speedtest-cli on my Arch desktop and got ~840Mb/s
> I ran speedtest++ on my Arch desktop and got ~930Mb/s
> I ran my ISP's speedtest and got ~970Mb/s

Yeah, so it seems that the ISP speed test is doing something "different". I am not sure if we can trust them. I definitely wouldn't trust my ISP.

~840 MBit/s is still about 100 MBit/s away from the maximum, which is roughly 940 MBit/s once you take away the overhead from Ethernet and IP (the arithmetic is sketched below).

970 MBit/s is technically not possible for an IP connection. It might work out if you include the Ethernet headers in the maths.

How did the CPU load change? The two idle cores suggest that the hardware still wasn't running at full capacity.

> So all showing similar-ish values and around where they should be. So on my hardware suricata is making a very big difference. I now have to decide if that is worth it or not for my situation.

Very good question. Probably not best discussed here, but in general I would like to have this discussion.

An IPS is absolutely worth it, and everyone should have it. Unfortunately it doesn't run on small hardware, and therefore we need to be very careful about what we recommend and what we compare with each other.

Discussions on the forum always ended up with people buying the cheapest stuff that they could get - which is of course a rational thing to do. However, I am simply running the IPS wherever I go, and that means a Raspberry Pi is not useful for more than a megabit a second. It is difficult to predict IPS throughput, though, because it depends on so many factors. It would be nice if we could find a reproducible way to benchmark this - at least somewhat accurately.
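
For reference, this is where the ~940 MBit/s ceiling comes from (assuming a 1500-byte MTU, 20 bytes each of IP and TCP headers, and 12 bytes of TCP timestamp options, which Linux enables by default):

  # Bytes on the wire per full-sized frame:
  #   7 (preamble) + 1 (SFD) + 14 (Ethernet) + 1500 (payload) + 4 (FCS) + 12 (inter-frame gap) = 1538
  # TCP payload per frame:
  #   1500 - 20 (IP) - 20 (TCP) - 12 (timestamps) = 1448
  echo "scale=1; 1000 * 1448 / 1538" | bc
  941.4

So roughly 941 MBit/s of TCP payload is all that fits through a gigabit link; anything above that number must be counting headers as payload.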
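
On the reproducible benchmark: even a simple loop over the connection counts, run once with the IPS enabled and once with it disabled, would be a start. A rough sketch with the same fireperf flags as above - using "timeout" from coreutils to bound each run is my assumption, fireperf may well have a native option for that:

  for p in 1 10 100 1000 10000; do
      echo "=== ${p} parallel connections ==="
      # Snapshot the socket statistics mid-run, to see whether
      # we actually reached ${p} established connections
      ( sleep 30; ss -s ) &
      timeout 60 fireperf -c -P "${p}" -p 63000:63010
  done

Comparing the two runs would at least give us numbers that are comparable across different hardware.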
-Michael

>
>>> Using the same speed test as he used, but going through my IPFire hardware, I get around 225Mb/s.
>>>
>>> Although my hardware has four Intel I211 Gigabit NICs, I have suspected that their performance is limited by the processor.
>> It sounds like it. Let's see what more information we can gather and hopefully find it.
>> Can you run powertop alongside the benchmark and see what it says?
>> -Michael
>>> Regards,
>>>
>>> Adolf.