From mboxrd@z Thu Jan 1 00:00:00 1970
From: Adolf Belka
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Wed, 10 Jul 2024 15:05:32 +0200
Message-ID:
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5103185923116179105=="
List-Id:

--===============5103185923116179105==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hi Michael,

On 10/07/2024 14:59, Adolf Belka wrote:
> Hi Michael,
>
> On 10/07/2024 12:33, Adolf Belka wrote:
>> Hi Michael,
>>
>> On 10/07/2024 11:57, Michael Tremer wrote:
>>> Hello again,
>>>
>>> I managed to (finally) build the toolchain with the updated system. So hopefully there should not be any more outstanding problems that I know of so far.
>>
>> I just did a git pull on your repo to my clone.
>>
>> Ran ./make.sh gettoolchain and it successfully downloaded the toolchain.
>>
>> Ran ./make.sh downloadsrc and it successfully tested everything.
>>
>> Ran ./make.sh clean and build and log directories were cleared out and removed. As far as I can tell it was successful.
>>
>> Currently running ./make.sh build. So up to this point everything going well. Will let you know how it goes.
>>
> It has got to building popt. In the normal build system this takes around 3 secs. Currently in the new build it is at nearly 2 hours. Even with an empty cache, that seems a long build time for popt, unless I am being too optimistic.
>
> I will let it keep going.
>
I just had a look at the log file and it looks like it completed popt but then is stuck on trying to leave the directory /usr/src/lfs. Here is the output from the log, nothing new is getting written to the log.

make[3]: Leaving directory '/usr/src/popt-1.19/po'
Making install in tests
make[3]: Entering directory '/usr/src/popt-1.19/tests'
make[4]: Entering directory '/usr/src/popt-1.19/tests'
make[4]: Nothing to be done for 'install-exec-am'.
make[4]: Nothing to be done for 'install-data-am'.
make[4]: Leaving directory '/usr/src/popt-1.19/tests'
make[3]: Leaving directory '/usr/src/popt-1.19/tests'
make[3]: Entering directory '/usr/src/popt-1.19'
make[4]: Entering directory '/usr/src/popt-1.19'
make[4]: Nothing to be done for 'install-exec-am'.
 /bin/mkdir -p '/usr/share/man/man3'
 /usr/bin/install -c -m 644 popt.3 '/usr/share/man/man3'
 /bin/mkdir -p '/usr/lib/pkgconfig'
 /usr/bin/install -c -m 644 popt.pc '/usr/lib/pkgconfig'
make[4]: Leaving directory '/usr/src/popt-1.19'
make[3]: Leaving directory '/usr/src/popt-1.19'
make[2]: Leaving directory '/usr/src/popt-1.19'
make[1]: Leaving directory '/usr/src/popt-1.19'
Updating linker cache...
Install done; saving file list to /usr/src/log/popt-1.19 ...
make: Leaving directory '/usr/src/lfs'

Regards,
Adolf.
>
> One thing I found. I am running the new build system while I have been running some package updates with the old system with its mount points. The two have each run without any impact on the other.
>
> Regards,
>
> Adolf.
>
>> Regards,
>> Adolf.
>>
>>>
>>> Best,
>>> -Michael
>>>
>>>> On 9 Jul 2024, at 22:29, Michael Tremer wrote:
>>>>
>>>> Hello Adolf,
>>>>
>>>> Thank you for testing this.
>>>>
>>>> There have indeed been plenty of problems there… I spent a lot of time on this today and hopefully fixed most of them.
>>>>
>>>> I cannot build the toolchain on my machine and I am not sure why yet, but a build with the packaged toolchain runs through.
>>>>
>>>> I have also spent some time on getting rid of the strip stage because it annoyed me how long it takes and creating the disk images as well as packages should be significantly faster now, too. I hope I didn’t introduce too many new bugs.
>>>>
>>>> Please let me know if you have more success now.
>>>>
>>>> Best,
>>>> -Michael
>>>>
>>>>> On 8 Jul 2024, at 20:34, Adolf Belka wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> On 08/07/2024 21:15, Adolf Belka wrote:
>>>>>> Hi Michael,
>>>>>>
>>>>>> On 08/07/2024 18:11, Michael Tremer wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>>>>>>>
>>>>>>> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>>>>>>>
>>>>>>> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>>>>>>>
>>>>>>> It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>>>>>>>
>>>>>>> So, what has changed?
>>>>>>>
>>>>>>> * The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
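A minimal sketch of the re-execution idea described in the bullet above, not the code from make.sh. The function names, the IN_NAMESPACE guard variable and the exact unshare invocation are assumptions made for illustration; only the use of the unshare command and the build/shell versus downloadsrc distinction come from the message.

```shell
#!/bin/bash
# Illustrative sketch only. needs_namespace, enter_namespace and
# IN_NAMESPACE are hypothetical names, not taken from make.sh.

# Decide whether a sub-command should run in its own mount namespace.
needs_namespace() {
    case "$1" in
        build|shell) return 0 ;;
        *)           return 1 ;;
    esac
}

# Re-execute the given command line inside a new mount namespace.
# The guard variable prevents an endless re-exec loop once we are
# already inside. Creating a mount namespace normally needs root.
enter_namespace() {
    if [ -n "${IN_NAMESPACE:-}" ]; then
        return 0
    fi
    IN_NAMESPACE=1 exec unshare --mount -- "$@"
}

# Intended use near the top of a script (left commented out so the
# sketch has no side effects when pasted into a shell):
#
#   if needs_namespace "$1"; then
#       enter_namespace "$0" "$@"
#   fi
```

Because the whole script is replaced by the exec call, every mount made afterwards lives only in the new namespace and disappears when the last process in it exits, which is the clean-up behaviour the message describes.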
>>>>>>>
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>>>>>>>
>>>>>>> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is cleaned up and everything is unmounted.
>>>>>>>
>>>>>>> * The function that prepares the build environment has been almost entirely rewritten:
>>>>>>>
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>>>>>>>
>>>>>>> It used to mount parts of the host system into the build environment which are needed to run anything. Those were /dev, /proc, /sys, etc…
>>>>>>>
>>>>>>> Instead, it now creates a new /dev mount point and creates a minimal amount of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too. ccache and the log directory are the only places that are writable.
>>>>>>>
>>>>>>> We mount a separate /tmp directory.
>>>>>>>
>>>>>>> * When we then build a package, we create more namespaces for each package. These isolate each build process from each other.
>>>>>>>
>>>>>>> Mostly, this is to detach from the host system. A new UTS namespace allows us to change the hostname in the build system without affecting the host and so on.
We do the same thing with a new time namespace.
>>>>>>>
>>>>>>> We do however create a new PID namespace which means that the build system no longer will see any processes running on the host system. That requires us to mount a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>>>>>>>
>>>>>>> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>>>>>>>
>>>>>>> * Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:
>>>>>>>
>>>>>>>    It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>>>>>>>
>>>>>>>    If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
>>>>>>>
>>>>>>> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>>>>>>>
>>>>>>> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”.
>>>>>>>
>>>>>>> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.
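The timer approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in make.sh: the function names, the use of SIGUSR1 as the “ping” and the short 0.2 second interval (chosen so the example finishes quickly; the message mentions once a second) are all hypothetical.

```shell
#!/bin/bash
# Illustrative sketch only. on_tick, start_timer, stop_timer and the
# choice of SIGUSR1 are hypothetical, not taken from make.sh.

ticks=0

# Handler: runs in the main shell each time the timer pings it.
# A real script would redraw the elapsed build time here.
on_tick() {
    ticks=$(( ticks + 1 ))
}
trap on_tick USR1

# Background timer: signals the parent shell at a fixed interval and
# stops on its own once the parent is gone (kill then fails).
start_timer() {
    local parent="$1" interval="$2"
    (
        while kill -USR1 "${parent}" 2>/dev/null; do
            sleep "${interval}"
        done
    ) &
    timer_pid=$!
}

stop_timer() {
    kill "${timer_pid}" 2>/dev/null
    wait "${timer_pid}" 2>/dev/null || true
}

start_timer "$$" 0.2

# Stand-in for the real work. The shell runs the trap handler
# between commands, so no polling loop is needed.
for _ in 1 2 3 4 5; do
    sleep 0.2
done

stop_timer
echo "timer events handled: ${ticks}"
```

Only one extra subshell is forked, and the main script never has to sleep just to keep the clock updated. Note that bash defers a trap until the current foreground command has finished, so the handler runs between commands rather than in the middle of one.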
>>>>>>>
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>>>>>>>
>>>>>>> We now only fork one extra sub shell and we have to handle the timer events which is a lot cheaper as well as more straightforward to code.
>>>>>>>
>>>>>>> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>>>>>>>
>>>>>>> * Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>>>>>>>
>>>>>>> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e. build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you to first of all let you know what I am up to; and secondly to ask to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that might not support all of this yet, or any other problems I cannot think of yet.
Please give me some feedback and send me all the bugs :)
>>>>>> I gave this a go but it didn't work.
>>>>>>
>>>>>> Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.
>>>>>>
>>>>>> Should I have started with a complete new clone of your repo? I might try that anyway just to see.
>>>>>>
>>>>> I created a completely new clone of your ipfire-2.x repo and then checked out the unshare branch to a branch called unshare in my local repo clone.
>>>>>
>>>>> gettoolchain gave the same issue, except that this time the toolchain directory ended up completely empty.
>>>>>
>>>>> downloadsrc had the same result.
>>>>>
>>>>> clean had nothing to clean up as it was a fresh clone.
>>>>>
>>>>> build then tried to build the toolchain and came up with this error, different from before.
>>>>>
>>>>> ./make.sh build
>>>>> chroot: failed to run command ‘env’: No such file or directory
>>>>> Full toolchain compilation
>>>>> stage1 [ FAIL ]
>>>>>
>>>>>     Jul  8 19:26:39: Building stage1 ====================================== Installing stage1 ...
>>>>>     mkdir -pv /tools_x86_64/lib
>>>>>     mkdir: cannot create directory '/tools_x86_64': File exists
>>>>>     make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x/log/stage1] Error 1
>>>>>
>>>>> ERROR: Building stage1 [ FAIL ]
>>>>>     Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_build.toolchain.log for errors if applicable [ FAIL ]
>>>>>
>>>>> so it wasn't as simple as doing a fresh git clone.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Adolf.
>>>>>
>>>>>
>>>>>> So I ran ./make.sh gettoolchain first, as I usually would.
>>>>>>
>>>>>> ./make.sh gettoolchain
>>>>>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
>>>>>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
>>>>>> b2sum: WARNING: 1 listed file could not be read
>>>>>>
>>>>>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.
>>>>>>
>>>>>>
>>>>>> Then ran ./make.sh downloadsrc
>>>>>>
>>>>>> Previous version ends with
>>>>>>
>>>>>> ***Verifying BLAKE2 checksum
>>>>>> all files BLAKE2 checksum match [ DONE]
>>>>>>
>>>>>> after zstd has been checked.
>>>>>>
>>>>>> New version stops at zstd entry.
>>>>>>
>>>>>>
>>>>>> ./make.sh clean gave the message Cleaning Build directory... but was completed very quickly.
>>>>>> Log and Build directories have not been cleaned out. The img and iso files are still present.
>>>>>>
>>>>>>
>>>>>> ./make.sh build gave message
>>>>>>
>>>>>> chroot: failed to run command ‘env’: No such file or directory
>>>>>>
>>>>>> and then did a full toolchain compilation which failed with gcc but log is >9000 lines.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Adolf.
>>>>>>
>>>>>>>
>>>>>>> Thank you for listening to this brain-dump.
>>>>>>>
>>>>>>> All the best,
>>>>>>> -Michael
>>>>>>>
>>>>>>>> On 3 Jul 2024, at 10:58, Michael Tremer wrote:
>>>>>>>>
>>>>>>>> Hello Adolf,
>>>>>>>>
>>>>>>>> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>>>>>>>>
>>>>>>>> I rebooted the machine and it is back up again.
>>>>>>>>
>>>>>>>> -Michael
>>>>>>>>
>>>>>>>>> On 2 Jul 2024, at 15:42, Adolf Belka wrote:
>>>>>>>>>
>>>>>>>>> Hi Michael and all,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>>>>>>>
>>>>>>>>> The build got to building gdb and then failed.
>>>>>>>>>
>>>>>>>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>>>>>>>
>>>>>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>>>>>
>>>>>>>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>>>>>>>>
>>>>>>>>> ------------------------------------------
>>>>>>>>>
>>>>>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>>>>>> PTY allocation request failed on channel 0
>>>>>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>>>>>>>
>>>>>>>>> The programs included with the Debian GNU/Linux system are free software;
>>>>>>>>> the exact distribution terms for each program are described in the
>>>>>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>>>>>
>>>>>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>>>>>> permitted by applicable law.
>>>>>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>>>>>>>
>>>>>>>>> ------------------------------------------
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>>
>>>>>>>>> Adolf.
>>>>>>>>>
>>>>>>>>> <_build.ipfire.gdb.log>
>>>>
>>>>
>>>

--===============5103185923116179105==--