From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adam Jaremko
To: development@lists.ipfire.org
Subject: Re: Advice with adding RPS/RFS for PPP connections
Date: Wed, 03 Jun 2020 11:27:16 -0400
Message-ID:
In-Reply-To: <220C4081-4ED0-41F8-B1E1-E6D643388C49@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5597212724104286481=="
List-Id:

--===============5597212724104286481==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Thanks Michael for the input; I feel I'm headed in the right direction now.

> On Tue, 2 Jun 2020 at 03:52, Michael Tremer wrote:
>
> Hi Adam,
>
> > On 2 Jun 2020, at 02:38, Adam Jaremko wrote:
> >
> >> On Mon, 1 Jun 2020 at 05:36, Michael Tremer wrote:
> >>
> >> Hello Adam,
> >>
> >> Thank you for getting in touch.
> >>
> >>> On 29 May 2020, at 21:18, Adam Jaremko wrote:
> >>>
> >>> I've been doing RPS/RFS in my own shell scripts for some time and have also implemented primitive support in the web UI, but I'm very much a Perl novice since my last real usage was in the '90s.
> >>
> >> What did you change in your scripts?
> >
> > For my basic implementation I've added new parameters to
> > /var/ipfire/ppp/settings called RPS={on|off} and RPS_CPUS=.
> > Added a new script to /etc/rc.d/init.d/networking/red.up/ called
> > 01-rps which reads and acts upon the values by setting the following:
> >
> > /sys/class/net/${RED_DEV}/queues/rx-*/rps_cpus
> > /sys/class/net/${RED_DEV}/queues/rx-*/rps_flow_cnt
> > /sys/class/net/${RED_DEV}/queues/tx-*/xps_cpus
>
> This makes sense.
>
> > As for the WUI, I modified /srv/web/ipfire/cgi-bin/pppsetup.cgi to
> > get output from /usr/bin/nproc to display a list of checkboxes (to
> > represent the bitmask) and convert the mask to hex on write. It's
> > verbatim to what is documented at
>
> However, I do not really understand why you want to give the user a choice about this?
>
> Is it not best to always load-balance across all processors?
> That can be automatically detected and will still work even after the user changes hardware. It will also be zero-configuration in the first place.

I see your point, and yes, zero configuration makes absolute sense.

> > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/performance_tuning_guide/network-rps
> > https://www.suse.com/support/kb/doc/?id=000018430
> >
> > I understand it's a little more involved with NUMA systems, which I
> > don't address, but I've found a script or two in my research, such as
> >
> > https://stackoverflow.com/questions/30618524/setting-receive-packet-steering-rps-for-32-cores/49544150#49544150
> >
> >>> First and foremost, would it be relevant to add such support in an official capacity?
> >>
> >> Generally I would say yes. Does it come with any downsides?
> >>
> >> I am not aware that anyone ran into resource issues here, because PPP connections are usually rather slow (up to 100 MBit/s). But we would probably gain some throughput by better utilisation of the load-balancing on the IPS, etc.
> >
> > I generalized PPP, but in my case it's PPPoE on symmetrical gigabit, and any form of encapsulation is often not handled by RSS algorithms, with the exception of VLANs (PPPoE via VLAN is part of that exception). And as I'm sure you're aware, network processing was being delegated to only CPU0.
>
> Yes, that can happen. I would argue that the network interfaces make a difference here, but generally CPU0 will become a bottleneck. Normally that does not matter too much since, as I said, bandwidth over PPP sessions is usually not high enough to even saturate a moderate processor. But encapsulation is expensive, and a Gigabit is a lot of traffic to push!
>
> >>> Second, are there any guidelines for accessing the filesystem from the scripts?
> >>> I ask because my non-thorough browsing of the code only revealed the use of readhash, whereas I would need to poke and prod at procfs to work with complex CPU affinity per queue.
> >>
> >> What are you doing here? Are you assigning a processor to a queue using that script?
> >
> > I just realized I said procfs where I meant sysfs. The basic
> > implementation is explained above, but in my own scripts I am assigning
> > a single proc to each rx/tx queue and irq, which is where I would like to
> > move the WUI part forward. I would like to represent each queue with
> > its own CPU bitmask.
>
> We have some code in IPFire 3 that potentially automatically does what you want to do:
>
> https://git.ipfire.org/?p=network.git;a=blob;f=src/functions/functions.interrupts;h=83a57b35145888fad075f1e4ea58832c81789967;hb=HEAD
>
> The main part is here, which is called when a new network interface is plugged in (it is all hotplugging here):
>
> https://git.ipfire.org/?p=network.git;a=blob;f=src/functions/functions.device;hb=ea4abb82bc6e613ddebd6235f792dd5bbbc469c9#l1007
>
> It will then search for a processor that is not very busy and assign all queues of the NIC to the least busy processor core, "least busy" being the one with the fewest queues assigned.
>
> Can you re-use some of this code?

Wonderful to see that IPFire 3 already has it all in place; I'll definitely re-use the code.

> -Michael
>
> >>> I can submit a review patch of my primitive implementation (non-NUMA), wherein the same CPU affinity is used across each queue, using nproc and some checkboxes, just to get feedback and a start in the right direction.
> >>>
> >>> Thanks,
> >>> AJ
> >>
> >> -Michael

-- 
AJ

--===============5597212724104286481==--
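As a footnote, the red.up hook discussed in this thread could be sketched roughly as below. The RPS=/RPS_CPUS= settings names and the sysfs paths are taken from the messages above; the `cpus_to_mask` helper and the 4096 flow count are illustrative assumptions, not actual IPFire code.

```shell
#!/bin/sh
# Illustrative sketch of a 01-rps red.up hook -- not the actual IPFire script.

# Convert a space-separated CPU list (e.g. "0 1 2 3") into the hex
# bitmask format that rps_cpus/xps_cpus expect (e.g. "f").
cpus_to_mask() {
	mask=0
	for cpu in "$@"; do
		mask=$(( mask | (1 << cpu) ))
	done
	printf "%x" "${mask}"
}

# Apply one mask to every rx/tx queue of a device, writing the three
# sysfs files listed in the thread. The rps_flow_cnt value of 4096 is
# an assumed per-queue flow table size, not a value from the thread.
apply_rps() {
	dev="$1" mask="$2"
	for rx in "/sys/class/net/${dev}/queues/"rx-*; do
		echo "${mask}" > "${rx}/rps_cpus"
		echo 4096 > "${rx}/rps_flow_cnt"
	done
	for tx in "/sys/class/net/${dev}/queues/"tx-*; do
		echo "${mask}" > "${tx}/xps_cpus"
	done
}
```

For example, `apply_rps "${RED_DEV}" "$(cpus_to_mask 0 1 2 3)"` would steer the red interface's queues across CPUs 0-3, matching the zero-configuration "all processors" approach suggested above.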