From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer
To: development@lists.ipfire.org
Subject: Re: Enable eBPF XDP/TC kernel feature for IPFire
Date: Thu, 25 Apr 2024 12:08:40 +0200
Message-ID:
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5019376184643852941=="
List-Id:

--===============5019376184643852941==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello,

Please make sure you keep the list copied.

> On 24 Apr 2024, at 20:55, Vincent Li wrote:
>
> On Wed, Apr 24, 2024 at 8:28 AM Michael Tremer wrote:
>>
>> Hello,
>>
>>> On 19 Apr 2024, at 02:17, Vincent Li wrote:
>>>
>>> On Thu, Apr 18, 2024 at 2:13 PM Michael Tremer wrote:
>>>>
>>>> Hello Vincent,
>>>>
>>>>> On 18 Apr 2024, at 16:21, Vincent Li wrote:
>>>>>
>>>>> On Thu, Apr 18, 2024 at 1:57 AM Michael Tremer wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>>> On 17 Apr 2024, at 23:36, Vincent Li wrote:
>>>>>>>
>>>>>>> On Wed, Apr 17, 2024 at 9:07 AM Michael Tremer wrote:
>>>>>>>>
>>>>>>>> Hello Vincent,
>>>>>>>>
>>>>>>>>> On 10 Apr 2024, at 19:01, Vincent Li wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Apr 10, 2024 at 8:17 AM Peter Müller wrote:
>>>>>>>>>>
>>>>>>>>>> Hello Vincent,
>>>>>>>>>>
>>>>>>>>>> thank you for your e-mail and the proposal.
>>>>>>>>>>
>>>>>>>>>>> Hi Adolf,
>>>>>>>>>>>
>>>>>>>>>>> Please see my reply inline
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 10, 2024 at 2:04 AM Adolf Belka wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Vincent,
>>>>>>>>>>>>
>>>>>>>>>>>> I am not very familiar at all with this type of stuff but one thing that I noticed is that in the image you provided a link to, the XDP section has a line labelled XDP_TX which completely bypasses the whole Netfilter section which doesn't seem to be a good idea to me.
>>>>>>>>>>>>
>>>>>>>>>>> XDP_TX is to redirect the packet out after processing the packet at
>>>>>>>>>>> XDP stage, yes, netfilter will not see these packets.
>>>>>>>>>>> for example for DDoS SYN flood attack scenario, when the SYN packet
>>>>>>>>>>> is received, XDP program can generate SYN+ACK with syncookie and send
>>>>>>>>>>> the SYN+ACK out, netfilter/Linux tcp stack knows nothing about it,
>>>>>>>>>>> which actually saves host CPU cycles to process the SYN in
>>>>>>>>>>> netfilter/TCP stack, which is actually good thing.
>>>>>>>>>>>
>>>>>>>>>>> Also, XDP_DROP, XDP_PASS, XDP_TX action is depending on the XDP
>>>>>>>>>>> program attached to the network interface, so it is the XDP program
>>>>>>>>>>> author decide what to do with the packet, if no XDP program attached
>>>>>>>>>>> to the network interface, everything works as usual, no interference
>>>>>>>>>>> from XDP.
>>>>>>>>>>
>>>>>>>>>> If my understanding of this is correct, then this would lead to the exact
>>>>>>>>>> opposite of what IPFire is designed to do. Rather than having packets
>>>>>>>>>> processed below any level of operating system influence, the objective of
>>>>>>>>>> IPFire in particular and firewalls in general is to control network traffic,
>>>>>>>>>> which inherently requires thorough visibility on it.
>>>>>>>>>
>>>>>>>>> Kernel still has the traffic statistics processed by XDP program and
>>>>>>>>> store in eBPF maps so the user space program can query and view. you
>>>>>>>>> can still view XDP as part of the firewall except it processes packets
>>>>>>>>> early at the driver layer for efficiency.
>>>>>>>>
>>>>>>>> I would like to understand what your need is to use XDP.
>>>>>>>>
>>>>>>>> As Peter has stated, your system will pass packets with this, but 90% of the features that IPFire has won’t work any more:
>>>>>>>>
>>>>>>>> * Connection Tracking won’t be up to date
>>>>>>>> * QoS won’t be able to categorise packets correctly and won’t be able to do its job
>>>>>>>> * The IPS won’t be able to inspect any data
>>>>>>>>
>>>>>>>
>>>>>>> I think I did not explain the XDP use case clearly, for now, most XDP
>>>>>>> use cases and particularly my use case is to do DDoS protection at
>>>>>>> the earliest packet receiving path in high efficiency since XDP works
>>>>>>> inside the network driver. the things you mentioned above would work
>>>>>>> fine even if the packets go through XDP program, yes XDP program can
>>>>>>> drop, modify, reflect the packet, but in non-DDoS packet scenario, XDP
>>>>>>> simply passes the packet to Linux as if nothing happened. so it is up
>>>>>>> to the XDP program logic, also even if ipfire has the kernel feature
>>>>>>> enabled, users still need to attach XDP program to the interface, if
>>>>>>> no XDP program is attached to the interface, nothing is in the way to
>>>>>>> stop packet flowing to IPfire filtering.
>>>>>>> Again this diagram is great to describe the packet path
>>>>>>> https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg,
>>>>>>> so the XDP_PASS action from XDP program is to pass packet to Linux as
>>>>>>> usual. from the same diagram, you also see "AF_PACKET", right? that is
>>>>>>> where tcpdump taking place to capture packet for network
>>>>>>> troubleshooting, but tcpdump would not stop all the scenario above
>>>>>>> you mentioned working, actually tcpdump is based on classic BPF
>>>>>>> technology, the XDP/eBPF extends classic BPF technology, it not
>>>>>>> only can clone (packet capture), but can also drop, redirect.
>>>>>>> Please read more about eBPF/XDP in general if you would like, many online
>>>>>>> resources explain better than me :)
>>>>>>>> Those three are just a few, but they are commonly used features of IPFire and without them, it would not be what it is.
>>>>>>
>>>>>> Thank you. I am very familiar with how Netfilter works, including BPF and XDP.
>>>>>>
>>>>>> I think your diagram just proves my point when I say that everything is going to bypass the OS. That is the long arrow at the bottom.
>>>>>>
>>>>> I think you refer to the XDP_TX action to bypass the whole OS, for
>>>>> IPFire, this is not recommended
>>>>
>>>> I think it is safe to assume that this is the core feature of XDP and what most people mean when they refer to it.
>>>>
>>>>>> At least I am assuming that you are interested in forwarding packets instead of going the “XDP_PASS” route, because for that you don’t need XDP?
>>>>>>
>>>>>
>>>>> For IPFire, XDP_PASS action is recommended because we don't want to
>>>>> bypass the OS, we only want to drop DDoS packet at the driver, the
>>>>> good packet passes through the OS as usual.
>>>>
>>>> You did not make it clear at all that your goal is to implement SYNPROXY with BPF.
>>>>
>>>> You still do not want to bypass the OS, you just want to bypass Netfilter. So let’s maybe try to be more clear with what we are referring to so that we will save many roundtrips.
>>>>
>>>
>>> I assumed everyone knows what I was talking about in my first email
>>> with the https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf,
>>> apparently I failed :)
>>
>> Well, I don’t think it is very efficient to ask people to read through a long presentation to find the thing that you want. You should briefly explain it in your own words and link further references.
>>
>>>>>> I think you still haven’t explained what your goal is on a lower level. DoS protection is incredibly broad and that does not strictly require XDP. What kind of XDP program are you interested in using?
>>>>>>
>>>>> use XDP to stop DoS has low overhead, save OS cycles, I have IPFire
>>>>> KVM instance, if I run TCP SYN flood the IPFire, the IPFire ssh
>>>>> session, WebUI session become sluggish and unresponsive, but with XDP,
>>>>> it is almost like nothing happening, IPFire is responsive all the time
>>>>> during flood attack.
>>>>
>>>> Could you describe your test scenario more, please?
>>>>
>>>> The Linux kernel is already using SYN cookies whenever it cannot keep up with processing all SYN packets that it receives. That can be configured with net.ipv4.tcp_max_syn_backlog which is set to 256 on my system. That should be low enough to trigger the mechanism.
>>>>
>>> yes, that is for protection for services listening on the Linux host itself.
>>
>> Ah okay, I wasn’t aware that we were talking about port forwardings.
>>
>>>> Using a KVM-virtualised machine is probably not the best way to deploy this in production as interrupts are expensive and you are competing for compute power with other machines on the host. But I am sure this is just your test environment.
>>>>
>>> This is my hobby project so I used KVM, but also I used a virtual
>>> machine for the client starting SYN flood, so it is comparable. If I
>>> can afford a physical server with 10/25G NIC and a client machine with
>>> the same capability, the result will likely be the same.
>>
>> I highly doubt that as virtual machines are not very good at interrupt handling.
>>
>>>>> these are the XDP program I am interested in using
>>>>> https://elixir.bootlin.com/linux/latest/source/tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
>>>>
>>>> Okay, so this does not DoS protect in a strict sense.
>>>> It just moves the part of SYNPROXY into the driver. I believe this is your primary goal here.
>>>>
>>> right, it does what SYNPROXY does, but way more efficiently, page 24
>>> of above xdp pdf link shows result.
>>
>> It does, but it also does less.
>>
>>>
>>>>> https://github.com/NLnetLabs/XDPeriments
>>>>>
>>>>> I had ported them to xdp-tools
>>>>> https://github.com/vincentmli/xdp-tools/tree/vli-xdp-synproxy so
>>>>> xdp-tools loader program could attach multiple XDP programs to red0
>>>>> interface to stop various DDoS attack
>>>>
>>>> Thank you for sharing this.
>>>>
>>>>>
>>>>>>>>
>>>>>>>>>> As far as I am aware, IPFire is currently able to handle 25 GBit/sec. on
>>>>>>>>>> the right hardware, and SYN flooding attacks are not a major threat to
>>>>>>>>>> IPFire users, given that we have historically implemented some fine-tuning
>>>>>>>>>> to make such attacks less viable.
>>>>>>>>>
>>>>>>>>> DDoS attacks to IPFire users do not happen now does not mean it will
>>>>>>>>> not happen in the future, SYN flood is just one scenario, so better be
>>>>>>>>> prepared than sorry later :) One IPFire user had asked for help
>>>>>>>>> https://community.ipfire.org/t/filter-out-ddos-attacks-anyone-can-help-me-please/11046/43
>>>>>>>>
>>>>>>>> Is this only about SYN flooding?
>>>>>>> for layer 4 TCP DDoS, the most common scenario is SYN flooding, ACK flooding, RST flooding.
>>>>>>> all these flooding can be stopped by SYN cookie that is already built in the Linux TCP host stack, but IPFire is a middle box firewall, the packet destination endpoint is not IPFire, but the host/green network port forwarded by IPFire, so that is where the netfilter SYNPROXY module plays in, I don't see SYNPROXY module being referenced anywhere in IPFire, so even without XDP, I still recommend IPFire provides user option to use SYNPROXY for TCP SYN/ACK/RST flood attack
>>>>>>>>> I have studied IPFire, I do not see relevant SYN flooding or DDoS
>>>>>>>>> tuning, where is it? netfilter with SYNPROXY module? or the TCP stack
>>>>>>>>> syncookie implementation, or suricata ddos rules...etc? keep in mind
>>>>>>>>> all these are handled in software, no hardware acceleration.
>>>>>>>>
>>>>>>>> Yes, IPFire runs in software. We cannot use hardware acceleration because it is designed to pass packets and not to do what we are doing here.
>>>>>>> hardware acceleration probably is not the right word here for XDP
>>>>>>> because XDP is actually still in the software driver, not inside the
>>>>>>> hardware, though there is one hardware vendor that supports running
>>>>>>> XDP byte code inside the hardware itself for true hardware
>>>>>>> acceleration, but that is not common. again XDP does not interfere
>>>>>>> with the IPfire filter except in DDoS scenarios, users can have the
>>>>>>> option to drop the packet early in the network driver without
>>>>>>> consuming IPfire CPU/memory resource.
>>>>>>
>>>>>> But do you have any kind of system out there that is under constant fire and the OS cannot cope? What kind of packet rates or bandwidth are we talking about?
>>>>>
>>>>> I don't have IPFire in production since I am just starting to know
>>>>> about the IPFire project.
>>>>> My day time job is enterprise network
>>>>> engineer supporting fortune 500 enterprise customer with our
>>>>> enterprise product (BIG-IP) which handles 10G/25G/40G or even 100G
>>>>> throughput with FPGA, we often have enterprise customer under DDoS
>>>>> attack, and sometime my day time job is to simulate such high
>>>>> bandwidth attack in lab to see if enterprise product handles well or
>>>>> not. I don't think IPFire can handle such flood attack since I know
>>>>> the limitation of netfilter ( with more than 20 years of working with
>>>>> Linux networking :))
>>>>
>>>> Do you have any figures how much your test environment can handle now compared to a stock IPFire without your changes?
>>>>
>>> I don't, but I think the xdp pdf link I referred to should answer, if
>>> the performance is not significant, the kernel community would not
>>> accept such feature.
>>
>> I don’t think that that is a true statement. Not everything is about performance.
>>
>> And if something works in a lab with a for-purpose built kernel, that does not exactly mean that this is true for IPFire. I am sure that in all sorts of tests, there is no NAT involved which quite likely is always involved in a standard IPFire setup - that is how port forwardings work. Therefore it would be good to have real figures.
>
> I will see if I can do some testing on real hardware with NAT port forwarding
>
>>
>>>>>>>> IPFire uses SYN cookies by default for all incoming connections. We currently do not use the SYNPROXY module, but that is simply because there has not been any demand for it. If this suits your use-case I would rather implement that than XDP.
>>>>>>> home users very unlikely would have this demand because there isn't
>>>>>>> much gain for attackers who would use DDoS to attack home users.
>>>>>>> For small/medium size businesses, attackers could start DDoS because
>>>>>>> business could be impacted and lose profit when under DDoS attack,
>>>>>>> many businesses choose cloud DDoS providers if they could not afford
>>>>>>> DDoS protection devices. XDP DDoS protection on IPFire provides DDoS
>>>>>>> protection on inexpensive commodity hardware. There are already a lot
>>>>>>> of open source XDP programs out there, including XDP SYNCookie from
>>>>>>> Linux kernel source, I have ported it to xdp-tool repo and ported
>>>>>>> xdp-tool to IPfire.
>>>>>>
>>>>>> Maybe share this code on the list here so that people understand better what you are looking for.
>>>>>>
>>>>> see above link I shared
>>>>>
>>>>>> I am still not sure what kind of changes you are asking us to make.
>>>>>>
>>>>> actually the asking is minimum, turn on the kernel config feature
>>>>> for eBPF for networking
>>>>
>>>> You say as a minimum, but you have not asked this before. Usually we handle this in the way that people send an email with a description of their feature and if there is generally an option to include this in the distribution. If agreed, then you can work on some code and send it to this list.
>>>>
>>> yep, should have asked this in the beginning.
>>
>> Yes, you should also have sent your code that you have been working on and sent a link to your YouTube channel (https://www.youtube.com/@BPFireOS) where you explain your feature in detail. That would have helped a lot to understand where you were actually going with this instead of us trying to fill in the blanks.
>>
>>>>> CONFIG_BPF_SYSCALL=y
>>>>> CONFIG_DEBUG_INFO=y
>>>>> CONFIG_DEBUG_INFO_BTF=y
>>>>> CONFIG_DEBUG_INFO_DWARF4=y
>>>>> CONFIG_BPF_UNPRIV_DEFAULT_OFF=y
>>>>
>>>> We don’t need any of those debugging symbols in production. They are actually really large and will make the kernel slower.
>>>>
>>> the debug symbols is required during build time for kernel image, but
>>> can be stripped after BTF is generated for kernel image
>>
>> How much extra build time does generating the debug information add?
>>
>
> I did not measure with and without debug info, my impression is it is
> not significantly long that I would have noticed, I can definitely
> measure.
>
>>>> I have just posted a suggestion to the list as we are going to ship a fresh kernel with the next Core Update:
>>>>
>>>> https://lists.ipfire.org/hyperkitty/list/development(a)lists.ipfire.org/thread/QYG5SEVSEK53KKW3KAGTPQBC4S654BQW/
>>>>
>>>>> I have a working fork of IPFire
>>>>> https://github.com/vincentmli/BPFire/tree/bpfire, the discussion here
>>>>> is I want to share that great technology with the IPFire community and
>>>>> contribute that to IPFire.
>>>>
>>>> I was entirely unaware that you have written any code here. And I was also entirely unaware what your goal was. Thankfully we are on the same page now.
>>>>
>>>> Since I didn’t know that you have written some code, I implemented a classic SYN proxy using Netfilter this afternoon:
>>>>
>>>> https://lists.ipfire.org/hyperkitty/list/development(a)lists.ipfire.org/thread/5PUYYFTQBIOIRGIIV55PYSA5LJ5S3OVP/
>>>>
>>>> I believe that this does what you want to do, although it does not use BPF/XDP. However, it has the advantage that it can be enabled on a per rule basis and does not have to be globally enabled for all incoming connection to that host. How could BPF/XDP be integrated into this without losing that functionality?
>>>>
>>> enabled kernel config feature for XDP does not mean all incoming
>>> traffic will be processed by XDP, the incoming traffic will only be
>>> processed if there is XDP program bytecode/machine code attached to
>>> the incoming network interface, so it is up to what individual XDP
>>> program does
>>>
>>> for example XDP program below, it does nothing, simply pass every
>>> packet/connection to the netfilter/OS
>>>
>>> SEC("xdp")
>>> int xdp_pass(struct xdp_md *ctx)
>>> {
>>>     return XDP_PASS;
>>> }
>>>
>>> another XDP program example to drop packet to port 5555
>>>
>>> SEC("xdp")
>>> int xdp_drop(struct xdp_md *ctx)
>>> {
>>>     /* ... code to parse the raw packet header ... */
>>>     if (tcp->dest == bpf_htons(5555)) /* tcp->dest is in network byte order */
>>>         return XDP_DROP;
>>>
>>>     return XDP_PASS;
>>> }
>>>
>>> so you could attach none or many XDP programs to the network
>>> interface, the XDP program developers have to write the XDP program
>>> based on what users want.
>>>
>>> The XDP synproxy program is more complicated than the above example,
>>> but it also only does filtering on a per port basis and configurable
>>> by user.
>>>
>>> for example snippet of the code:
>>>
>>> /* Pass to upper stack if port requires no syncookie handling */
>>> if (!check_port_allowed(bpf_ntohs(hdr->tcp->dest)))
>>>     return XDP_PASS;
>>>
>>> so for example if user only want port 80 syn flood protected by the
>>> XDP syncookie program , user can add port 80 to the ebpf map that
>>> function check_port_allowed looks up ( in my fork I already added
>>> IPFire UI option for user to do that), for all other ports, XDP
>>> program does nothing about it, simply pass it to netfilter/OS.
>>
>> Well, this does answer my question then… This solution cannot be enabled individually per rule, but only globally for all traffic that hits the firewall on a certain port. That is not exactly selling this.
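
The port-5555 drop example quoted above leaves the header parsing elided. A minimal sketch of that parsing follows, written as plain user-space C so that it compiles and runs without libbpf or kernel headers. It is an illustration only, not the xdp_synproxy code: verdict_for_frame, read_be16 and the VERDICT_* constants are hypothetical names (the constants merely reuse the numeric values of XDP_DROP and XDP_PASS), and only untagged Ethernet frames carrying IPv4/TCP are handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the numbers match XDP_DROP (1) and XDP_PASS (2). */
enum verdict { VERDICT_DROP = 1, VERDICT_PASS = 2 };

#define ETH_HDR_LEN    14
#define ETHERTYPE_IPV4 0x0800
#define PROTO_TCP      6
#define BLOCKED_PORT   5555   /* port taken from the quoted example */

/* Read a big-endian (network byte order) 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Decide what to do with one raw Ethernet frame. Every header access is
 * preceded by a bounds check, the same discipline the eBPF verifier
 * enforces on real XDP programs.
 */
static enum verdict verdict_for_frame(const uint8_t *data, size_t len)
{
    if (len < ETH_HDR_LEN)
        return VERDICT_PASS;
    if (read_be16(data + 12) != ETHERTYPE_IPV4)
        return VERDICT_PASS;              /* not IPv4: leave it to the stack */

    const uint8_t *ip = data + ETH_HDR_LEN;
    size_t ip_avail = len - ETH_HDR_LEN;
    if (ip_avail < 20 || (ip[0] >> 4) != 4)
        return VERDICT_PASS;

    size_t ihl = (size_t)(ip[0] & 0x0f) * 4;  /* IPv4 header length in bytes */
    if (ihl < 20 || ip_avail < ihl)
        return VERDICT_PASS;
    if (ip[9] != PROTO_TCP)
        return VERDICT_PASS;
    if ((read_be16(ip + 6) & 0x1fff) != 0)
        return VERDICT_PASS;              /* later fragment: no TCP header */

    if (ip_avail - ihl < 20)              /* need a minimal TCP header */
        return VERDICT_PASS;
    const uint8_t *tcp = ip + ihl;

    /* The destination port occupies bytes 2..3 of the TCP header. */
    return read_be16(tcp + 2) == BLOCKED_PORT ? VERDICT_DROP : VERDICT_PASS;
}
```

Calling verdict_for_frame() on a 54-byte frame with EtherType 0x0800, IP protocol 6 and TCP destination port 5555 yields VERDICT_DROP; any other port, or a frame too short to hold the headers, yields VERDICT_PASS. A real XDP program would perform the same bounds checks against ctx->data and ctx->data_end.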
>>
> I am not sure I understand what you mean by per rule, you mean iptable
> rules for iptables SYNPROXY module or just iptable rules. for XDP
> SYNPROXY to work, it also does require setup iptable rule for each
> port that requires XDP SYNPROXY.

In the web UI, you can define firewall rules. Those.

They might generate more than one iptables rule, but generally that is what I am interested in. You can combine SYNPROXY with other features that we have like country filtering and so on. So there are huge benefits there.

The XDP approach that you have been taking seems to take over the entire port no matter where the packet is coming from which makes this solution less flexible.

-Michael

>> -Michael
>>
>>>
>>>> Best,
>>>> -Michael
>>>>
>>>>>
>>>>> Vincent
>>>>>
>>>>>
>>>>>> -Michael
>>>>>>
>>>>>>>>
>>>>>>>> -Michael
>>>>>>>>
>>>>>>>>> Why not give IPFire users the options when the options already exist
>>>>>>>>> in the IPFire kernel?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Therefore, I - personally - neither see the necessity nor benefit of pursuing
>>>>>>>>>> this proposal at this time.
>>>>>>>>>>
>>>>>>>>>> Thanks, and best regards,
>>>>>>>>>> Peter Müller
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> I don't understand what the difference is between XDP_PASS and XDP_TX but I would expect that nothing should be allowed to bypass the netfilter section unless it is being dropped or rejected already by the XDP process.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> XDP_PASS is to pass the packet to netfilter/TCP stack as usual after
>>>>>>>>>>> XDP program packet processing, XDP_TX is to redirect the packet back
>>>>>>>>>>> out through the same network interface after XDP program packet
>>>>>>>>>>> processing.
>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Adolf.
>>>>>>>>>>>>
>>>>>>>>>>>> On 09/04/2024 19:36, Vincent Li wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have been working on enabling eBPF XDP/TC kernel feature for IPFire,
>>>>>>>>>>>>> please refer to
>>>>>>>>>>>>> https://upload.wikimedia.org/wikipedia/commons/3/37/Netfilter-packet-flow.svg
>>>>>>>>>>>>> for where XDP fit in Linux network datapath, XDP will not interfere
>>>>>>>>>>>>> with existing IPFire firewall rules. XDP is especially good at DDoS
>>>>>>>>>>>>> packet filtering at high speed, see
>>>>>>>>>>>>> https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think we only need to enable XDP/TC network filtering capability
>>>>>>>>>>>>> without eBPF tracing capability which some users are concerned about
>>>>>>>>>>>>> potential host security information leaks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please let me know what you think, thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vincent

--===============5019376184643852941==--
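
The difference discussed in this message, between selecting traffic by destination port only and selecting it per firewall rule, can be sketched in C as follows. This is a hedged illustration rather than IPFire or kernel code: port_selected mirrors the idea behind the quoted check_port_allowed() lookup using a plain array instead of an eBPF map, while struct rule and rule_matches are hypothetical and stand in for a rule that also constrains the source network.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Port-only selection: every packet to a listed port is selected,
 * regardless of where it comes from. */
static bool port_selected(const uint16_t *ports, size_t n, uint16_t dport)
{
    for (size_t i = 0; i < n; i++)
        if (ports[i] == dport)
            return true;
    return false;
}

/* Hypothetical per-rule selection: source prefix plus destination port. */
struct rule {
    uint32_t src_net;     /* IPv4 network address, host byte order */
    uint8_t  prefix_len;  /* 0..32 */
    uint16_t dport;       /* destination port, host byte order */
};

static bool rule_matches(const struct rule *r, uint32_t src, uint16_t dport)
{
    if (r->prefix_len > 32)
        return false;     /* malformed rule never matches */

    /* A /0 prefix matches every source; avoid an undefined 32-bit shift. */
    uint32_t mask = r->prefix_len == 0
                        ? 0
                        : 0xffffffffu << (32 - r->prefix_len);

    return dport == r->dport && (src & mask) == (r->src_net & mask);
}
```

With the ports 80 and 443 listed, port_selected() accepts port 80 from any source. A rule for 198.51.100.0/24 and port 80 matches a packet from 198.51.100.7 but not one from 203.0.113.5, which is the extra selectivity a per-rule approach offers.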