From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Enable eBPF XDP/TC kernel feature for IPFire
Date: Wed, 24 Apr 2024 17:28:24 +0200
Message-ID: <96AFD249-D5FB-4D0D-A11B-889AB99F3225@ipfire.org>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============8714043493086650382=="
List-Id:

--===============8714043493086650382==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

> On 19 Apr 2024, at 02:17, Vincent Li wrote:
>
> On Thu, Apr 18, 2024 at 2:13 PM Michael Tremer wrote:
>>
>> Hello Vincent,
>>
>>> On 18 Apr 2024, at 16:21, Vincent Li wrote:
>>>
>>> On Thu, Apr 18, 2024 at 1:57 AM Michael Tremer wrote:
>>>>
>>>> Hello,
>>>>
>>>>> On 17 Apr 2024, at 23:36, Vincent Li wrote:
>>>>>
>>>>> On Wed, Apr 17, 2024 at 9:07 AM Michael Tremer wrote:
>>>>>>
>>>>>> Hello Vincent,
>>>>>>
>>>>>>> On 10 Apr 2024, at 19:01, Vincent Li wrote:
>>>>>>>
>>>>>>> On Wed, Apr 10, 2024 at 8:17 AM Peter Müller wrote:
>>>>>>>>
>>>>>>>> Hello Vincent,
>>>>>>>>
>>>>>>>> thank you for your e-mail and the proposal.
>>>>>>>>
>>>>>>>>> Hi Adolf,
>>>>>>>>>
>>>>>>>>> Please see my reply inline
>>>>>>>>>
>>>>>>>>> On Wed, Apr 10, 2024 at 2:04 AM Adolf Belka wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Vincent,
>>>>>>>>>>
>>>>>>>>>> I am not very familiar at all with this type of stuff but one thing that I noticed is that in the image you provided a link to, the XDP section has a line labelled XDP_TX which completely bypasses the whole Netfilter section which doesn't seem to be a good idea to me.
>>>>>>>>>>
>>>>>>>>> XDP_TX is to redirect the packet out after processing the packet at
>>>>>>>>> XDP stage, yes, netfilter will not see these packets.
>>>>>>>>> for example for DDoS SYN flood attack scenario, when the SYN packet
>>>>>>>>> is received, XDP program can generate SYN+ACK with syncookie and send
>>>>>>>>> the SYN+ACK out, netfilter/Linux tcp stack knows nothing about it,
>>>>>>>>> which actually saves host CPU cycles to process the SYN in
>>>>>>>>> netfilter/TCP stack, which is actually good thing.
>>>>>>>>>
>>>>>>>>> Also, XDP_DROP, XDP_PASS, XDP_TX action is depending on the XDP
>>>>>>>>> program attached to the network interface, so it is the XDP program
>>>>>>>>> author decide what to do with the packet, if no XDP program attached
>>>>>>>>> to the network interface, everything works as usual, no interference
>>>>>>>>> from XDP.
>>>>>>>>
>>>>>>>> If my understanding of this is correct, then this would lead to the exact
>>>>>>>> opposite of what IPFire is designed to do. Rather than having packets
>>>>>>>> processed below any level of operating system influence, the objective of
>>>>>>>> IPFire in particular and firewalls in general is to control network traffic,
>>>>>>>> which inherently requires thorough visibility on it.
>>>>>>>
>>>>>>> Kernel still has the traffic statistics processed by XDP program and
>>>>>>> store in eBPF maps so the user space program can query and view. you
>>>>>>> can still view XDP as part of the firewall except it processes packets
>>>>>>> early at the driver layer for efficiency.
>>>>>>
>>>>>> I would like to understand what your need is to use XDP.
>>>>>>
>>>>>> As Peter has stated, your system will pass packets with this, but 90% of the features that IPFire has won’t work any more:
>>>>>>
>>>>>> * Connection Tracking won’t be up to date
>>>>>> * QoS won’t be able to categorise packets correctly and won’t be able to do its job
>>>>>> * The IPS won’t be able to inspect any data
>>>>>>
>>>>>
>>>>> I think I did not explain the XDP use case clearly, for now, most XDP
>>>>> use cases and particularly my use case is to do DDoS protection at
>>>>> the earliest packet receiving path in high efficiency since XDP works
>>>>> inside the network driver. the things you mentioned above would work
>>>>> fine even if the packets go through XDP program, yes XDP program can
>>>>> drop, modify, reflect the packet, but in non-DDoS packet scenario, XDP
>>>>> simply passes the packet to Linux as if nothing happened. so it is up
>>>>> to the XDP program logic, also even if ipfire has the kernel feature
>>>>> enabled, users still need to attach XDP program to the interface, if
>>>>> no XDP program is attached to the interface, nothing is in the way to
>>>>> stop packet flowing to IPfire filtering.
>>>>> Again this diagram is great to describe the packet path
>>>>> https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg,
>>>>> so the XDP_PASS action from XDP program is to pass packet to Linux as
>>>>> usual. from the same diagram, you also see "AF_PACKET", right? that is
>>>>> where tcpdump taking place to capture packet for network
>>>>> troubleshooting, but tcpdump would not stop all the scenario above
>>>>> you mentioned working, actually tcpdump is based on classic BPF
>>>>> technology, the XDP/eBPF is to extend classic BPF technology, it not
>>>>> only can clone (packet capture), but can also drop, redirect.
>>>>> Please read more about eBPF/XDP in general if you would like, many online
>>>>> resources explain better than me :) > Those three are just a few, but
>>>>> they are commonly used features of IPFire and without them, it would
>>>>> not be what it is.
>>>>
>>>> Thank you. I am very familiar with how Netfilter works, including BPF and XDP.
>>>>
>>>> I think your diagram just proves my point when I say that everything is going to bypass the OS. That is the long arrow at the bottom.
>>>>
>>> I think you refer to the XDP_TX action to bypass the whole OS, for
>>> IPFire, this is not recommended
>>
>> I think it is safe to assume that this is the core feature of XDP and what most people mean when they refer to it.
>>
>>>> At least I am assuming that you are interested in forwarding packets instead of going the “XDP_PASS” route, because for that you don’t need XDP?
>>>>
>>>
>>> For IPFire, XDP_PASS action is recommended because we don't want to
>>> bypass the OS, we only want to drop DDoS packet at the driver, the
>>> good packet passes through the OS as usual.
>>
>> You did not make it clear at all that your goal is to implement SYNPROXY with BPF.
>>
>> You still do not want to bypass the OS, you just want to bypass Netfilter. So let’s maybe try to be more clear with what we are referring to so that we will save many roundtrips.
>>
>
> I assumed everyone knows what I was talking about in my first email
> with the https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf,
> apparently I failed :)

Well, I don’t think it is very efficient to ask people to read through a long presentation to find the thing that you want. You should briefly explain it in your own words and link further references.

>>>> I think you still haven’t explained what your goal is on a lower level.
>>>> DoS protection is incredibly broad and that does not strictly require XDP. What kind of XDP program are you interested in using?
>>>>
>>> use XDP to stop DoS has low overhead, save OS cycles, I have IPFire
>>> KVM instance, if I run TCP SYN flood the IPFire, the IPFire ssh
>>> session, WebUI session become sluggish and unresponsive, but with XDP,
>>> it is almost like nothing happening, IPFire is responsive all the time
>>> during flood attack.
>>
>> Could you describe your test scenario more, please?
>>
>> The Linux kernel is already using SYN cookies whenever it cannot keep up with processing all SYN packets that it receives. That can be configured with net.ipv4.tcp_max_syn_backlog which is set to 256 on my system. That should be low enough to trigger the mechanism.
>>
> yes, that is for protection for services listening on the Linux host itself.

Ah okay, I wasn’t aware that we were talking about port forwardings.

>> Using a KVM-virtualised machine is probably not the best way to deploy this in production as interrupts are expensive and you are competing for compute power with other machines on the host. But I am sure this is just your test environment.
>>
> This is my hobby project so I used KVM, but also I used a virtual
> machine for the client starting SYN flood, so it is comparable. If I
> can afford a physical server with 10/25G NIC and a client machine with
> the same capability, the result will likely be the same.

I highly doubt that as virtual machines are not very good at interrupt handling.

>>> these are the XDP program I am interested in using
>>> https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
>>
>> Okay, so this does not DoS protect in a strict sense. It just moves the part of SYNPROXY into the driver. I believe this is your primary goal here.
>>
> right, it does what SYNPROXY does, but way more efficiently, page 24
> of above xdp pdf link shows result.

It does, but it also does less.

>
>>> https://github.com/NLnetLabs/XDPeriments
>>>
>>> I had ported them to xdp-tools
>>> https://github.com/vincentmli/xdp-tools/tree/vli-xdp-synproxy so
>>> xdp-tools loader program could attach multiple XDP programs to red0
>>> interface to stop various DDoS attack
>>
>> Thank you for sharing this.
>>
>>>
>>>>>>
>>>>>>>> As far as I am aware, IPFire is currently able to handle 25 GBit/sec. on
>>>>>>>> the right hardware, and SYN flooding attacks are not a major threat to
>>>>>>>> IPFire users, given that we have historically implemented some fine-tuning
>>>>>>>> to make such attacks less viable.
>>>>>>>
>>>>>>> DDoS attacks to IPFire users do not happen now does not mean it will
>>>>>>> not happen in the future, SYN flood is just one scenario, so better be
>>>>>>> prepared than sorry later :) One IPFire user had asked for help
>>>>>>> https://community.ipfire.org/t/filter-out-ddos-attacks-anyone-can-help-me-please/11046/43
>>>>>>
>>>>>> Is this only about SYN flooding?
>>>>>> for layer 4 TCP DDoS, the most common scenario is SYN flooding, ACK flooding, RST flooding. all these flooding can be stopped by SYN cookie that is already built in the Linux TCP host stack, but IPFire is a middle box firewall, the packet destination endpoint is not IPFire, but the host/green network port forwarded by IPFire, so that is where the netfilter SYNPROXY module plays in, I don't see SYNPROXY module being referenced anywhere in IPFire, so even without XDP, I still recommend IPFire provides user option to use SYNPROXY for TCP SYN/ACK/RST flood attack
>>>>>>> I have studied IPFire, I do not see relevant SYN flooding or DDoS
>>>>>>> tuning, where is it? netfilter with SYNPROXY module? or the TCP stack
>>>>>>> syncookie implementation, or suricata ddos rules...etc?
>>>>>>> keep in mind all these are handled in software, no hardware acceleration.
>>>>>>
>>>>>> Yes, IPFire runs in software. We cannot use hardware acceleration because it is designed to pass packets and not to do what we are doing here.
>>>>> hardware acceleration probably is not the right word here for XDP
>>>>> because XDP is actually still in the software driver, not inside the
>>>>> hardware, though there is one hardware vendor that supports running
>>>>> XDP byte code inside the hardware itself for true hardware
>>>>> acceleration, but that is not common. again XDP does not interfere
>>>>> with the IPfire filter except in DDoS scenarios, users can have the
>>>>> option to drop the packet early in the network driver without
>>>>> consuming IPfire CPU/memory resource. >
>>>>
>>>> But do you have any kind of system out there that is under constant fire and the OS cannot cope? What kind of packet rates or bandwidth are we talking about?
>>>
>>> I don't have IPFire in production since I am just starting to know
>>> about the IPFire project. My day time job is enterprise network
>>> engineer supporting fortune 500 enterprise customer with our
>>> enterprise product (BIG-IP) which handles 10G/25G/40G or even 100G
>>> throughput with FPGA, we often has enterprise customer under DDoS
>>> attack, and sometime my day time job is to simulate such high
>>> bandwidth attack in lab to see if enterprise product handles well or
>>> not. I don't think IPFire can handle such flood attack since I know
>>> the limitation of netfilter ( with more than 20 years of working with
>>> Linux networking :))
>>
>> Do you have any figures how much your test environment can handle now compared to a stock IPFire without your changes?
>>
> I don't, but I think the xdp pdf link I referred to should answer, if
> the performance is not significant, the kernel community would not
> accept such feature.

I don’t think that that is a true statement.
Not everything is about performance.
And just because something works in a lab with a purpose-built kernel, that does not mean that this is true for IPFire. I am sure that in all sorts of tests, there is no NAT involved, which is quite likely always involved in a standard IPFire setup - that is how port forwardings work. Therefore it would be good to have real figures.

>>>>>> IPFire uses SYN cookies by default for all incoming connections. We currently do not use the SYNPROXY module, but that is simply because there has not been any demand for it. If this suits your use-case I would rather implement that than XDP.
>>>>> home users very unlikely would have this demand because there isn't
>>>>> much gain for attackers who would use DDoS to attack home users. For
>>>>> small/medium size businesses, attackers could start DDoS because
>>>>> business could be impacted and lose profit when under DDoS attack,
>>>>> many businesses choose cloud DDoS providers if they could not afford
>>>>> DDoS protection devices. XDP DDoS protection on IPFire provides DDoS
>>>>> protection on inexpensive commodity hardware. There are already a lot
>>>>> of open source XDP programs out there, including XDP SYNCookie from
>>>>> Linux kernel source, I have ported it to xdp-tool repo and ported
>>>>> xdp-tool to IPfire.
>>>>
>>>> Maybe share this code on the list here so that people understand better what you are looking for.
>>>>
>>> see above link I shared
>>>
>>>> I am still not sure what kind of changes you are asking us to make.
>>>>
>>> actually the asking is minimum, turn on the kernel config feature
>>> for eBPF for networking
>>
>> You say as a minimum, but you have not asked this before. Usually we handle this in the way that people send an email with a description of their feature and if there is generally an option to include this in the distribution. If agreed, then you can work on some code and send it to this list.
>>
> yep, should have asked this in the beginning.

Yes, you should also have sent your code that you have been working on and sent a link to your YouTube channel (https://www.youtube.com/@BPFireOS) where you explain your feature in detail. That would have helped a lot to understand where you were actually going with this instead of us trying to fill in the blanks.

>>> CONFIG_BPF_SYSCALL=y
>>> CONFIG_DEBUG_INFO=y
>>> CONFIG_DEBUG_INFO_BTF=y
>>> CONFIG_DEBUG_INFO_DWARF4=y
>>> CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
>>
>> We don’t need any of those debugging symbols in production. They are actually really large and will make the kernel slower.
>>
> the debug symbols is required during build time for kernel image, but
> can be stripped after BTF is generated for kernel image

How much extra build time does generating the debug information add?

>> I have just posted a suggestion to the list as we are going to ship a fresh kernel with the next Core Update:
>>
>> https://lists.ipfire.org/hyperkitty/list/development(a)lists.ipfire.org/thread/QYG5SEVSEK53KKW3KAGTPQBC4S654BQW/
>>
>>> I have a working fork of IPFire
>>> https://github.com/vincentmli/BPFire/tree/bpfire, the discussion here
>>> is I want to share that great technology with the IPFire community and
>>> contribute that to IPFire.
>>
>> I was entirely unaware that you have written any code here. And I was also entirely unaware what your goal was. Thankfully we are on the same page now.
>>
>> Since I didn’t know that you have written some code, I implemented a classic SYN proxy using Netfilter this afternoon:
>>
>> https://lists.ipfire.org/hyperkitty/list/development(a)lists.ipfire.org/thread/5PUYYFTQBIOIRGIIV55PYSA5LJ5S3OVP/
>>
>> I believe that this does what you want to do, although it does not use BPF/XDP.
>> However, it has the advantage that it can be enabled on a per rule basis and does not have to be globally enabled for all incoming connection to that host. How could BPF/XDP be integrated into this without losing that functionality?
>>
> enabled kernel config feature for XDP does not mean all incoming
> traffic will be processed by XDP, the incoming traffic will only be
> processed if there is XDP program bytecode/machine code attached to
> the incoming network interface, so it is up to what individual XDP
> program does
>
> for example XDP program below, it does nothing, simply pass every
> packet/connection to the netfilter/OS
>
> SEC("xdp")
> int xdp_pass(struct xdp_md *ctx)
> {
>         return XDP_PASS;
> }
>
> another XDP program example to drop packet to port 5555
>
> SEC("xdp")
> int xdp_drop(struct xdp_md *ctx)
> {
>         /* ... code to parse the raw packet header ... */
>         if (tcp->dest == bpf_htons(5555)) /* dest is in network byte order */
>                 return XDP_DROP;
>
>         return XDP_PASS; /* everything else goes to netfilter/OS */
> }
>
> so you could attach none or many XDP programs to the network
> interface, the XDP program developers have to write the XDP program
> based on what users want.
>
> The XDP synproxy program is more complicated than the above example,
> but it also only does filtering on a per port basis and configurable
> by user.
>
> for example snippet of the code:
>
> /* Pass to upper stack if port requires no syncookie handling */
> if (!check_port_allowed(bpf_ntohs(hdr->tcp->dest)))
>         return XDP_PASS;
>
> so for example if user only want port 80 syn flood protected by the
> XDP syncookie program , user can add port 80 to the ebpf map that
> function check_port_allowed looks up ( in my fork I already added
> IPFire UI option for user to do that), for all other ports, XDP
> program does nothing about it, simply pass it to netfilter/OS.

Well, this does answer my question then… This solution cannot be enabled individually per rule, but only globally for all traffic that hits the firewall on a certain port.
That is not exactly selling this.

-Michael

>
>> Best,
>> -Michael
>>
>>>
>>> Vincent
>>>
>>>
>>>> -Michael
>>>>
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>>> Why not give IPFire users the options when the options already exist
>>>>>>> in the IPFire kernel?
>>>>>>>
>>>>>>>>
>>>>>>>> Therefore, I - personally - neither see the necessity nor benefit of pursuing
>>>>>>>> this proposal at this time.
>>>>>>>>
>>>>>>>> Thanks, and best regards,
>>>>>>>> Peter Müller
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I don't understand what the difference is between XDP_PASS and XDP_TX but I would expect that nothing should be allowed to bypass the netfilter section unless it is being dropped or rejected already by the XDP process.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> XDP_PASS is to pass the packet to netfilter/TCP stack as usual after
>>>>>>>>> XDP program packet processing, XDP_TX is to redirect the packet back
>>>>>>>>> out through the same network interface after XDP program packet
>>>>>>>>> processing.
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>>
>>>>>>>>>> Adolf.
>>>>>>>>>>
>>>>>>>>>> On 09/04/2024 19:36, Vincent Li wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I have been working on enabling eBPF XDP/TC kernel feature for IPFire,
>>>>>>>>>>> please refer to
>>>>>>>>>>> https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg
>>>>>>>>>>> for where XDP fit in Linux network datapath, XDP will not interfere
>>>>>>>>>>> with existing IPFire firewall rules. XDP is especially good at DDoS
>>>>>>>>>>> packet filtering at high speed, see
>>>>>>>>>>> https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf
>>>>>>>>>>>
>>>>>>>>>>> I think we only need to enable XDP/TC network filtering capability
>>>>>>>>>>> without eBPF tracing capability which some users are concerned about
>>>>>>>>>>> potential host security information leaks.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know what you think, thanks!
>>>>>>>>>>>
>>>>>>>>>>> Vincent

--===============8714043493086650382==--
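For reference, the port-based drop that the thread describes only in outline (parse the raw headers, compare the TCP destination port against 5555, drop on a match, pass everything else) could be modelled roughly as below. This is a plain userspace C sketch, not an XDP program: the names verdict, classify_frame and make_tcp_frame are hypothetical and appear nowhere in the thread, the verdict values are chosen to match XDP_DROP and XDP_PASS from <linux/bpf.h>, and a real program would instead take a struct xdp_md, bounds-check against data_end for the verifier, and be loaded with SEC("xdp").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins; the numeric values match
 * XDP_DROP and XDP_PASS in enum xdp_action from <linux/bpf.h>. */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2 };

#define ETH_HDR_LEN     14      /* dst MAC + src MAC + EtherType */
#define IPV4_MIN_HDR    20
#define TCP_MIN_HDR     20
#define ETHERTYPE_IPV4  0x0800
#define IPPROTO_TCP_NUM 6
#define BLOCKED_PORT    5555    /* port used in the example from the thread */

/* Read a big-endian (network byte order) 16-bit field. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Decide what to do with one Ethernet frame of `len` bytes.
 * Anything that is not a well-formed IPv4/TCP frame is passed on. */
static enum verdict classify_frame(const uint8_t *frame, size_t len)
{
    if (len < ETH_HDR_LEN + IPV4_MIN_HDR)
        return VERDICT_PASS;
    if (read_be16(frame + 12) != ETHERTYPE_IPV4)
        return VERDICT_PASS;

    const uint8_t *ip = frame + ETH_HDR_LEN;
    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;  /* IPv4 header length in bytes */
    if ((ip[0] >> 4) != 4 || ihl < IPV4_MIN_HDR)
        return VERDICT_PASS;
    if (ip[9] != IPPROTO_TCP_NUM)
        return VERDICT_PASS;
    if (len < ETH_HDR_LEN + ihl + TCP_MIN_HDR)
        return VERDICT_PASS;

    const uint8_t *tcp = ip + ihl;
    /* The destination port sits at offset 2 of the TCP header. */
    if (read_be16(tcp + 2) == BLOCKED_PORT)
        return VERDICT_DROP;
    return VERDICT_PASS;
}

/* Build a minimal, otherwise zeroed IPv4/TCP frame for experiments.
 * Returns the frame length (54 bytes). */
static size_t make_tcp_frame(uint8_t *buf, size_t cap, uint16_t dport)
{
    const size_t len = ETH_HDR_LEN + IPV4_MIN_HDR + TCP_MIN_HDR;
    assert(cap >= len);
    memset(buf, 0, len);
    buf[12] = 0x08;                           /* EtherType: IPv4 (0x0800) */
    buf[13] = 0x00;
    buf[ETH_HDR_LEN] = 0x45;                  /* version 4, IHL 5 */
    buf[ETH_HDR_LEN + 9] = IPPROTO_TCP_NUM;   /* protocol: TCP */
    buf[ETH_HDR_LEN + IPV4_MIN_HDR + 2] = (uint8_t)(dport >> 8);
    buf[ETH_HDR_LEN + IPV4_MIN_HDR + 3] = (uint8_t)(dport & 0xff);
    return len;
}
```

Calling classify_frame on a frame built with make_tcp_frame(buf, sizeof buf, 5555) yields VERDICT_DROP, while a frame for port 80, a non-TCP frame, or a truncated frame yields VERDICT_PASS. The port is read byte by byte in network order, which is the same concern behind the bpf_ntohs() call in the synproxy snippet quoted in the thread.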