From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka
To: development@lists.ipfire.org
Subject: Re: Feedback on problems with Core Update 168 Testing
Date: Thu, 12 May 2022 22:10:01 +0200
Message-ID:
In-Reply-To: <3C5B2854-D47F-4D92-B943-761BEC763191@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============1001555238831972539=="
List-Id:

--===============1001555238831972539==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hi Michael,

On 12/05/2022 14:53, Michael Tremer wrote:
> Hello,
>
>> On 12 May 2022, at 12:25, Adolf Belka wrote:
>>
>> Hi,
>>
>> On 12/05/2022 11:13, Michael Tremer wrote:
>>> Hello,
>>> Thanks for spending so much time on this. We definitely need to improve the general update experience since we sometimes seem to break people’s systems and it is not nice to re-install a firewall from scratch. It will take a while.
>>> So what I can say is that the kernel module issues come from when the running kernel is changed and the kernel is trying to load any modules that now have changed. This fails by design, because we sign our kernel modules. The key is randomly generated at build time and used to sign all modules and it is then thrown away. For each build, we are using a different, unique key that is not preserved.
>>> This means that although the kernel modules are of the same version, they cannot be loaded because the signature check fails. That might also explain why you are seeing so many ipset errors, because the kernel cannot load that module any more. However, we use so much ipset now, why isn’t the module loaded from before the update was started?
>>> The same goes for any network drivers. I assume you are using virtio or a generic e1000 network adapter which will have been initialised at boot time. The kernel should never unload the kernel module for that interface and load it again later.
>>> I have no idea what could have triggered that.
>> It is a generic e1000 network adapter that is being used by VirtualBox. Confirmed that on my working CU167 vm.
>>> No matter what though; after you reboot, the new kernel should be booted being able to load all modules it wants and the system should run absolutely fine. Can you confirm that that is at least the case?
>> No it is not the case. Yesterday when I booted the vm several times the same error messages occurred each time and I always had no network interfaces.
>
> Okay. That is indeed quite bad then.
>
> Since there is no kernel in 168, is there a chance that we broke the update to 167?

I don't know but as far as I have been concerned CU167 has worked well on my vm. Is there something I should look for on my CU167 vm?

>
>> Just now I tried booting my third attempt vm again and this time it ran okay and I ended up with all my networks assigned and I had network connection. An IP was assigned by dhcp to the red interface.
>
> Third time lucky isn’t good enough for me :)

Me neither :-)

>
>> I then rebooted again and this time the error messages were back. I rebooted the vm three more times and each time the error messages were there and no network interfaces.
>>
>> I checked with lsmod and the e1000 driver module was not loaded. Confirmed the e1000 driver directory was not present in /sys/bus/pci/drivers
>
> Have there been any modules loaded? If one loads, they all should load.

This is what I get with lsmod

Module                  Size  Used by
crct10dif_pclmul       16384  1
crc32_pclmul           16384  0
ghash_clmulni_intel    16384  0
serio_raw              20480  0
ohci_pci               20480  0
ata_generic            16384  0
pata_acpi              16384  0
video                  57344  0

On my running CU167 vm lsmod gives the following

Module                  Size  Used by
tun                    61440  2
nfnetlink_queue        28672  1
xt_NFQUEUE             16384  8
xt_MASQUERADE          20480  1
cfg80211             1036288  0
rfkill                 32768  1 cfg80211
8021q                  40960  0
garp                   16384  1 8021q
xt_set                 16384  260
ip_set_hash_net        49152  258
ip_set                 57344  2 xt_set,ip_set_hash_net
xt_hashlimit           20480  2
xt_multiport           20480  4
xt_policy              16384  5
xt_TCPMSS              16384  1
xt_conntrack           16384  7
xt_comment             16384  18
ipt_REJECT             16384  1
nf_reject_ipv4         16384  1 ipt_REJECT
xt_LOG                 20480  26
xt_limit               16384  25
xt_mark                16384  27
xt_connmark            16384  2
nf_log_syslog          24576  26
iptable_raw            16384  0
iptable_mangle         16384  1
iptable_filter         16384  1
vfat                   24576  1
fat                    90112  1 vfat
sch_cake               36864  4
intel_powerclamp       20480  0
psmouse               184320  0
pcspkr                 16384  0
i2c_piix4              28672  0
e1000                 163840  0
i2c_core              106496  2 psmouse,i2c_piix4
crct10dif_pclmul       16384  1
crc32_pclmul           16384  0
ata_generic            16384  0
pata_acpi              16384  0
ghash_clmulni_intel    16384  0
serio_raw              20480  0
ohci_pci               20480  0
video                  57344  0

The last 8 entries are the same but a whole load of others are missing.

Regards,
Adolf.

>
>> Tried modprobe e1000 but this came back with the following error
>> modprobe: ERROR: could not insert 'e1000': Key was rejected by service
>>
>> So out of 7 or 8 reboots the vm booted with the network drivers loaded once.
>>
>> Regards
>> Adolf
>>>> On 11 May 2022, at 20:08, Adolf Belka wrote:
>>>>
>>>> Hi All,
>>>>
>>>> On 11/05/2022 20:48, Jon Murphy wrote:
>>>>> I just did an update from CU 167 (stable) to CU 168 (testing) and I got the same build info:
>>>>> *IPFire 2.27 (x86_64) - Core Update 167 Development Build: master/c22d834c*
>>>>> Shouldn’t this be Core Update 168?
>>>> That was what I thought but I couldn't precisely remember what it was in the past with Testing Releases.
>>>>
>>>> I then downloaded the Core 168 iso from the nightlies build of master/latest/x86_64 and did an install on the First attempt vm.
>>>>
>>>> I got to the stage of entering the root and admin passwords. Entered what I usually do for the vm's and after pressing OK on the admin password screen I got a message box saying "Problem setting IPFire 'admin' user password". I pressed the OK button and got another message box saying "Initial setup was not entirely complete. You must ensure that Setup is properly finished by running setup again at the shell.". Pressing the OK button on that message box causes setup to restart but again after entering a valid admin password twice I got the same "Problem setting IPFire 'admin' user password" message box.
>>>>
>>>> I then tried installing CU168 from the same iso onto a clone of my running CU167 vm. Same result with "Problem setting IPFire 'admin' user password" message box.
>>>>
>>>> Then I created a completely new vm from scratch and tried installing CU168 Testing iso and again got "Problem setting IPFire 'admin' user password" message box.
>>> That seems to be a bug that should not be there. Did we recently update apache?
>>> -Michael
>>>>
>>>> Don't know if this is related to the problem with the missing interfaces but if not then it is another problem.
>>>>
>>>> I think I am going to raise a bug on this.
>>>>
>>>> Regards,
>>>> Adolf.
>>>>> Jon
>>>>>> On May 11, 2022, at 9:19 AM, Adolf Belka wrote:
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> On 11/05/2022 16:00, Adolf Belka wrote:
>>>>>>> Hi Leo,
>>>>>>>
>>>>>>> On 11/05/2022 15:26, Leo Hofmann wrote:
>>>>>>>> Hi Adolf,
>>>>>>>>
>>>>>>>> Pakfire always automatically reinstalls the current release before updating if you are in the testing branch. See this function in the Pakfire code:
>>>>>>> So that proves that I haven't been looking so closely on my upgrades in the past because I have never noticed that.
>>>>>>>> https://git.ipfire.org/?p=ipfire-2.x.git;a=blob;f=src/pakfire/lib/functions.pl;h=d4e338f23ae8ae97d6f18c6d8890d13463dc5d30;hb=refs/heads/next#l762
>>>>>>> Now I see it in the code. Thanks very much.
>>>>>>>>
>>>>>>>> But unfortunately this is all I can contribute, my test system updated to 168 without problems!
>>>>>>> Thanks for your help anyway. It's got rid of one of my questions.
>>>>>>>
>>>>>>> Just need to understand now why 2 out of 3 updates resulted in a complete loss of the interface assignments.
>>>>>>> Will wait to hear other inputs on this problem.
>>>>>>> Maybe will try running setup on the first try I had to confirm that I can re-assign the interfaces again.
>>>>>>>
>>>>>> Running setup does not help. None of the interfaces are available to be assigned to the red, green etc zones.
>>>>>>
>>>>>> I checked the interfaces in the vm setup and they are as previously defined and previously working on all earlier CU's of my vm.
>>>>>>
>>>>>> When rebooting I did see a fast scrolling message that seemed to say something like Invalid Kernel Argument but that is as much as I was able to see.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Adolf.
>>>>>>
>>>>>>> Regards,
>>>>>>> Adolf.
>>>>>>>
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Leo
>>>>>>>>
>>>>>>>> On 11.05.2022 at 15:11, Adolf Belka wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> I have tried to update my Core Update 167 vm machine three times now (using a clone) and have had several problems so I thought I would outline what has happened for further thoughts.
>>>>>>>>>
>>>>>>>>> Normally my updates go without any real problems. This is the first time where this is not the case.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> First try:-
>>>>>>>>>
>>>>>>>>> After running the update, which went quite quickly, the bottom of the screen had
>>>>>>>>>
>>>>>>>>> Core Update 167 Development Build: master/c22d834c
>>>>>>>>>
>>>>>>>>> I couldn't remember if this is what is expected or not for Testing releases.
>>>>>>>>>
>>>>>>>>> Rebooted the vm and on booting a large number of error messages scrolled across the console screen. My Virtualbox terminal for IPFire has no scroll capability so I can only write on things I saw as they flew past or what was on the screen when it stopped.
>>>>>>>>>
>>>>>>>>> Lots of ipset restore error but went by too fast to get more detail.
>>>>>>>>>
>>>>>>>>> Tried to access via the WUI and no access. Tried via SSH and no access.
>>>>>>>>>
>>>>>>>>> Checked the red0 directory and there were no files present. Tried restarting red0 and got the following message
>>>>>>>>>
>>>>>>>>> starting
>>>>>>>>> DUID 00:04:70:........
>>>>>>>>> red0: interface not found
>>>>>>>>>
>>>>>>>>> So I ran ip address show and it came up with only the lo interface. No red0, green0, blue0 or orange0.
>>>>>>>>>
>>>>>>>>> Looked in bootlog and this was filled with around 4000 lines all the same
>>>>>>>>>
>>>>>>>>> Loading of module with unavailable key is rejected
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Second try:-
>>>>>>>>>
>>>>>>>>> This time running the update took over 10 mins.
>>>>>>>>>
>>>>>>>>> Noticed on the pakfire screen that it first ran a Core Update 167 upgrade then it ran the Core Update 168. My IPFire vm was definitely on Core Update 167.
>>>>>>>>>
>>>>>>>>> /opt/pakfire/db/core/mine contained 167 beforehand.
>>>>>>>>>
>>>>>>>>> Before doing the reboot I accessed via ssh and saw that in the /var/log/pakfire/ directory the update-core-upgrade-167 file has been updated as well as there being a 168. Both files had the same last modified date/time.
>>>>>>>>>
>>>>>>>>> Rebooted.
>>>>>>>>> This time everything came back up okay but the bottom of the screen now just said Core Update 167 and there was no update-core-upgrade-168 file in the pakfire directory and the 167 version was back to its date of Apr 28
>>>>>>>>>
>>>>>>>>> Repository was set back to Stable.
>>>>>>>>>
>>>>>>>>> Changed again to Testing and it said again that there was an update from 167 to 168.
>>>>>>>>>
>>>>>>>>> Before running the pakfire directory had the following
>>>>>>>>>
>>>>>>>>> -rw-r--r-- 1 root root 1.8K Mar 31 23:26 update-core-upgrade-166.log
>>>>>>>>> -rw-r--r-- 1 root root 323K Apr 28 09:47 update-core-upgrade-167.log
>>>>>>>>>
>>>>>>>>> After running it had
>>>>>>>>>
>>>>>>>>> -rw-r--r-- 1 root root 1.8K Mar 31 23:26 update-core-upgrade-166.log
>>>>>>>>> -rw-r--r-- 1 root root 604K May 11 13:24 update-core-upgrade-167.log
>>>>>>>>> -rw-r--r-- 1 root root  175 May 11 13:24 update-core-upgrade-168.log
>>>>>>>>>
>>>>>>>>> back to Core Update 167 Development Build: master/c22d834c at bottom of screen.
>>>>>>>>>
>>>>>>>>> Rebooted.
>>>>>>>>>
>>>>>>>>> All interfaces present. Pakfire under System Status: says Core-Update-Level: 168 and the bottom of the screen showed Core Update 167 Development Build: master/c22d834c
>>>>>>>>>
>>>>>>>>> /opt/pakfire/db/core/mine has 168 in it
>>>>>>>>>
>>>>>>>>> I checked the WIO wio.pl file that had a bug fix and it was as expected for CU168.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Third try:-
>>>>>>>>>
>>>>>>>>> -rw-r--r-- 1 root root 323K Apr 28 09:47 update-core-upgrade-167.log
>>>>>>>>>
>>>>>>>>> Selected Testing and ran update from 167 to 168 which occurred very quickly.
>>>>>>>>>
>>>>>>>>> -rw-r--r-- 1 root root 604K May 11 14:00 update-core-upgrade-167.log
>>>>>>>>> -rw-r--r-- 1 root root  63K May 11 14:01 update-core-upgrade-168.log
>>>>>>>>>
>>>>>>>>> Here is the log file info for pakfire from messages.1.gz and messages which shows it downloading and upgrading CU167 first then doing CU168 as far as I can tell.
>>>>>>>>>
>>>>>>>>> May 11 13:59:30 ipfire pakfire: PAKFIRE INFO: IPFire Pakfire 2.27.1-x86_64 started!
>>>>>>>>> May 11 13:59:30 ipfire pakfire: CORE INFO: core-list.db is 27 seconds old. - DEBUG: noforce
>>>>>>>>> May 11 13:59:30 ipfire pakfire: CORE UPGR: Upgrading from release 166 to 168
>>>>>>>>> May 11 13:59:30 ipfire pakfire: DOWNLOAD STARTED: paks/core-upgrade-2.27-167.ipfire
>>>>>>>>> May 11 13:59:30 ipfire pakfire: MIRROR INFO: 2 servers found in list
>>>>>>>>> May 11 13:59:30 ipfire pakfire: DOWNLOAD INFO: Host: ipfire.earl-net.com (HTTPS) - File: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-167.ipfire
>>>>>>>>> May 11 13:59:30 ipfire pakfire: DOWNLOAD INFO: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-167.ipfire has size of 71914825 bytes
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD INFO: HTTP-Status-Code: 200 - 200 OK
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD INFO: File received. Start checking signature...
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD INFO: Signature of core-upgrade-2.27-167.ipfire is fine.
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD FINISHED: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-167.ipfire
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD STARTED: meta/meta-core-upgrade-168
>>>>>>>>> May 11 13:59:32 ipfire pakfire: MIRROR INFO: 2 servers found in list
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD INFO: Host: mirror1.ipfire.org (HTTPS) - File: pakfire2/2.27.1-x86_64/meta/meta-core-upgrade-168
>>>>>>>>> May 11 13:59:32 ipfire pakfire: DOWNLOAD INFO: pakfire2/2.27.1-x86_64/meta/meta-core-upgrade-168 has size of 1804 bytes
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD INFO: HTTP-Status-Code: 200 - 200 OK
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD INFO: File received. Start checking signature...
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD INFO: Signature of meta-core-upgrade-168 is fine.
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD FINISHED: pakfire2/2.27.1-x86_64/meta/meta-core-upgrade-168
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD STARTED: paks/core-upgrade-2.27-168.ipfire
>>>>>>>>> May 11 13:59:33 ipfire pakfire: MIRROR INFO: 2 servers found in list
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD INFO: Host: ipfire.earl-net.com (HTTPS) - File: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-168.ipfire
>>>>>>>>> May 11 13:59:33 ipfire pakfire: DOWNLOAD INFO: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-168.ipfire has size of 34355484 bytes
>>>>>>>>> May 11 13:59:34 ipfire pakfire: DOWNLOAD INFO: HTTP-Status-Code: 200 - 200 OK
>>>>>>>>> May 11 13:59:34 ipfire pakfire: DOWNLOAD INFO: File received. Start checking signature...
>>>>>>>>> May 11 13:59:34 ipfire pakfire: DOWNLOAD INFO: Signature of core-upgrade-2.27-168.ipfire is fine.
>>>>>>>>> May 11 13:59:34 ipfire pakfire: DOWNLOAD FINISHED: pakfire2/2.27.1-x86_64/paks/core-upgrade-2.27-168.ipfire
>>>>>>>>> May 11 13:59:34 ipfire pakfire: PAKFIRE UPGR: core-upgrade-167: Decrypting...
>>>>>>>>> May 11 13:59:34 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 13:59:34 ipfire pakfire: DECRYPT STARTED: core-upgrade-167
>>>>>>>>> May 11 13:59:35 ipfire pakfire: DECRYPT FINISHED: core-upgrade-167 - Status: 0
>>>>>>>>> May 11 13:59:35 ipfire pakfire: PAKFIRE UPGR: core-upgrade-167: Upgrading files and running post-upgrading scripts...
>>>>>>>>> May 11 14:00:31 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 14:00:31 ipfire pakfire: PAKFIRE UPGR: core-upgrade-167: Finished.
>>>>>>>>> May 11 14:00:31 ipfire pakfire: PAKFIRE UPGR: core-upgrade-168: Decrypting...
>>>>>>>>> May 11 14:00:31 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 14:00:31 ipfire pakfire: DECRYPT STARTED: core-upgrade-168
>>>>>>>>> May 11 14:00:32 ipfire pakfire: DECRYPT FINISHED: core-upgrade-168 - Status: 0
>>>>>>>>> May 11 14:00:32 ipfire pakfire: PAKFIRE UPGR: core-upgrade-168: Upgrading files and running post-upgrading scripts...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> May 11 14:01:10 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 14:01:10 ipfire pakfire: PAKFIRE UPGR: core-upgrade-168: Finished.
>>>>>>>>> May 11 14:01:10 ipfire pakfire: CORE INFO: core-list.db is 127 seconds old. - DEBUG: noforce
>>>>>>>>> May 11 14:01:10 ipfire pakfire: PAKFIRE RESV: wio: Resolving dependencies...
>>>>>>>>> May 11 14:01:10 ipfire pakfire: PAKFIRE UPGR: We are going to install all packages listed above.
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD STARTED: paks/wio-1.3.2-15.ipfire
>>>>>>>>> May 11 14:01:10 ipfire pakfire: MIRROR INFO: 2 servers found in list
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD INFO: Host: mirror1.ipfire.org (HTTPS) - File: pakfire2/2.27.1-x86_64/paks/wio-1.3.2-15.ipfire
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD INFO: pakfire2/2.27.1-x86_64/paks/wio-1.3.2-15.ipfire has size of 47216 bytes
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD INFO: HTTP-Status-Code: 200 - 200 OK
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD INFO: File received. Start checking signature...
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD INFO: Signature of wio-1.3.2-15.ipfire is fine.
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DOWNLOAD FINISHED: pakfire2/2.27.1-x86_64/paks/wio-1.3.2-15.ipfire
>>>>>>>>> May 11 14:01:10 ipfire pakfire: PAKFIRE UPGR: wio: Decrypting...
>>>>>>>>> May 11 14:01:10 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DECRYPT STARTED: wio
>>>>>>>>> May 11 14:01:10 ipfire pakfire: DECRYPT FINISHED: wio - Status: 0
>>>>>>>>> May 11 14:01:10 ipfire pakfire: PAKFIRE UPGR: wio: Upgrading files and running post-upgrading scripts...
>>>>>>>>> May 11 14:01:11 ipfire pakfire: CLEANUP: tmp
>>>>>>>>> May 11 14:01:11 ipfire pakfire: PAKFIRE UPGR: wio: Finished.
>>>>>>>>> May 11 14:01:11 ipfire pakfire: PAKFIRE INFO: Pakfire has finished. Closing.
>>>>>>>>>
>>>>>>>>> Rebooted
>>>>>>>>> Lots of ipset restore error but went by too fast to get more detail.
>>>>>>>>> Several repeats of the following two lines on the console screen:-
>>>>>>>>>
>>>>>>>>> iptables v1.8.7 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?)
>>>>>>>>> Perhaps iptables or your kernel needs to be upgraded.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Only lo interface present. All other interfaces red0, green0, blue0, orange0 not present.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have given up at this point.
>>>>>>>>>
>>>>>>>>> I have kept the clones of all three attempts so if you need any additional info from logs or files just let me know.
>>>>>>>>>
>>>>>>>>> If this is something obviously wrong that I am doing, please let me know. If not then I will probably raise a bug on this.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Adolf.
>>>>>>>>>
>

--===============1001555238831972539==--