On Mon, 1 Jun 2020 at 05:36, Michael Tremer <michael.tremer@ipfire.org> wrote:
Hello Adam,
Thank you for getting in touch.
On 29 May 2020, at 21:18, Adam Jaremko <adam.jaremko+ipfire@gmail.com> wrote:
I've been doing RPS/RFS in my own shell scripts for some time and have also implemented primitive support in the web UI, but I'm very much a Perl novice since my last real usage was in the '90s.
What did you change in your scripts?
For my basic implementation I've added new parameters to /var/ipfire/ppp/settings: RPS={on|off} and RPS_CPUS=<CPU MASK>. I also added a new script, 01-rps, to /etc/rc.d/init.d/networking/red.up/ which reads those values and acts on them by writing to the following (a sketch of the hook follows the list):
/sys/class/net/${RED_DEV}/queues/rx-*/rps_cpus
/sys/class/net/${RED_DEV}/queues/rx-*/rps_flow_cnt
/sys/class/net/${RED_DEV}/queues/tx-*/xps_cpus
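Roughly, the hook looks something like this (a sketch only; the readhash invocation and the flow-table sizes are illustrative, not lifted verbatim from my script):

    #!/bin/sh
    # /etc/rc.d/init.d/networking/red.up/01-rps (sketch)
    # RED_DEV is assumed to be provided by the red.up environment.
    eval $(/usr/local/bin/readhash /var/ipfire/ppp/settings)

    [ "${RPS}" = "on" ] || exit 0

    # RFS also needs the global flow table to be sized before the
    # per-queue counts take effect (32768 is just an example value).
    echo 32768 > /proc/sys/net/core/rps_sock_flow_entries

    for rxq in /sys/class/net/${RED_DEV}/queues/rx-*; do
        echo "${RPS_CPUS}" > "${rxq}/rps_cpus"
        echo 4096 > "${rxq}/rps_flow_cnt"
    done

    for txq in /sys/class/net/${RED_DEV}/queues/tx-*; do
        echo "${RPS_CPUS}" > "${txq}/xps_cpus"
    done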
As for the WUI, I modified /srv/web/ipfire/cgi-bin/pppsetup.cgi to get output from /usr/bin/nproc, display a list of checkboxes (to represent the bitmask), and convert the mask to hex on write. It follows what is documented at
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/htm...
https://www.suse.com/support/kb/doc/?id=000018430
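The conversion itself is trivial; in shell it would be something like this (a sketch; the CGI does the equivalent in Perl, and the example indices are arbitrary):

    # Turn the checked CPU indices into the hex bitmask that rps_cpus
    # expects, e.g. CPUs 0, 2 and 3 -> "d".
    mask=0
    for cpu in 0 2 3; do
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "${mask}"    # prints: d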
I understand it's a little more involved on NUMA systems, which I don't address, but I've found a script or two in my research, such as
https://stackoverflow.com/questions/30618524/setting-receive-packet-steering...
First and foremost, would it be relevant to add such support in an official capacity?
Generally I would say yes. Does it come with any downsides?
I am not aware of anyone running into resource issues here, because PPP connections are usually rather slow (up to 100 MBit/s). But we would probably gain some throughput through better utilisation of the load-balancing in the IPS, etc.
I generalized to PPP, but in my case it's PPPoE on symmetrical gigabit. Any form of encapsulation is often not handled by RSS algorithms, with the exception of VLANs (PPPoE via VLAN is part of that exception). And as I'm sure you're aware, network processing was being delegated to CPU0 only.
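A quick way to see this on a live system, just as an illustration:

    # Per-CPU NET_RX softirq counters; without RPS, only the CPU0
    # column climbs under load on an encapsulated link.
    watch -n1 'grep NET_RX /proc/softirqs'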
Second, are there any guidelines for accessing the filesystem from the scripts? I ask because my non-thorough browsing of the code only revealed the use of readhash, whereas I would need to poke and prod at procfs to work with complex CPU affinity per queue.
What are you doing here? Are you assigning a processor to a queue using that script?
I just realized I said procfs where I meant sysfs. The basic implementation is explained above, but in my own scripts I am assigning a single CPU to each rx/tx queue and IRQ, which is where I would like to take the WUI next. I would like to represent each queue with its own CPU bitmask (see the sketch below).
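Something along these lines (a sketch; NIC stands in for the Ethernet device underneath the PPPoE session, which is a hypothetical name, and the round-robin assignment is illustrative):

    # Spread the queues of the underlying NIC across CPUs, one CPU
    # per queue.
    NIC=eth0                    # hypothetical device name
    ncpus=$(nproc)
    cpu=0

    for rxq in /sys/class/net/${NIC}/queues/rx-*; do
        printf '%x\n' $(( 1 << cpu )) > "${rxq}/rps_cpus"
        cpu=$(( (cpu + 1) % ncpus ))
    done

    # Pin the NIC's IRQs the same way (IRQ numbers taken from
    # /proc/interrupts).
    for irq in $(awk -v dev="${NIC}" '$0 ~ dev { sub(":", "", $1); print $1 }' /proc/interrupts); do
        printf '%x\n' $(( 1 << cpu )) > "/proc/irq/${irq}/smp_affinity"
        cpu=$(( (cpu + 1) % ncpus ))
    done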
I can submit a review patch of my primitive (non-NUMA) implementation, wherein the same CPU affinity is used across every queue, using nproc and some checkboxes, just to get feedback and a start in the right direction.
Thanks, AJ
-Michael