From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Tremer <michael.tremer@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Wed, 10 Jul 2024 10:57:16 +0100
Message-ID: <3134CD7D-25D6-48FC-B72E-AEC881EF9357@ipfire.org>
In-Reply-To: <0D208105-6697-41E7-88FF-2DAAD6483158@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4569108461891801027=="
List-Id: <development.lists.ipfire.org>

--===============4569108461891801027==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hello again,

I managed to (finally) build the toolchain with the updated system. So hopefully there should not be any more outstanding problems that I know of so far.

Best,
-Michael

> On 9 Jul 2024, at 22:29, Michael Tremer <michael.tremer(a)ipfire.org> wrote:
>
> Hello Adolf,
>
> Thank you for testing this.
>
> There have indeed been plenty of problems there… I spent a lot of time on this today and hopefully fixed most of them.
>
> I cannot build the toolchain on my machine and I am not sure why yet, but a build with the packaged toolchain runs through.
>
> I have also spent some time on getting rid of the strip stage because it annoyed me how long it takes and creating the disk images as well as packages should be significantly faster now, too. I hope I didn’t introduce too many new bugs.
>
> Please let me know if you have more success now.
>
> Best,
> -Michael
>
>> On 8 Jul 2024, at 20:34, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>
>> Hi Michael,
>>
>> On 08/07/2024 21:15, Adolf Belka wrote:
>>> Hi Michael,
>>>
>>> On 08/07/2024 18:11, Michael Tremer wrote:
>>>> Hello,
>>>>
>>>> I have been spending a lot of time on this problem, because it has been bothering me for a long time. I also saw an opportunity to make more changes to the build system.
>>>>
>>>> Currently this is all a little bit WIP, but I hope that we can merge this into next as soon as the current update has been moved to master.
>>>>
>>>> I am referring to this branch which is currently based on next: https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=shortlog;h=refs/heads/unshare
>>>>
>>>> It makes use of the unshare command which creates new namespaces in Linux. That way, we can isolate the build system better from the host system and in case something goes wrong, there is less damage. We can also enforce some more rules…
>>>>
>>>> So, what has changed?
>>>>
>>>> * The make.sh script might re-execute itself into a new mount namespace when it is suitable. This happens for “make.sh build” and “make.sh shell”, but it does not happen for “make.sh downloadsrc” for example.
>>>>
>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2129
>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l2251
>>>>
>>>> * The new mount namespace means that we will no longer see any bind-mounts in the host system and we no longer need to umount anything ourselves which is where we occasionally wiped the entire hard drive of the host system. When the last process exits, the namespace is being cleaned up and everything is being umounted.
>>>>
>>>> * The function that prepares the build environment has been almost entirely rewritten:
>>>>
>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l426
>>>>
>>>> It used to mount parts of the host system into the build environment which are needed to run anything.
Those were /dev, /proc, /sys, etc…
>>>>
>>>> Instead, it now creates a new /dev mount point and creates a minimal amount of device nodes and symlinks. That way, we detach from the host system and no longer allow the build system access to the host’s filesystem and block devices. We also bind-mount the sources in read-only mode now, so that the build system cannot change anything in the source tree. On top of that, cache is read-only, too. ccache and the log directory are the only places that are writable.
>>>>
>>>> We mount a separate /tmp directory.
>>>>
>>>> * When we then build a package, we create more namespaces for each package. These isolate each build process from each other.
>>>>
>>>> Mostly, this is to detach from the host system. A new UTS namespace allows to change the hostname in the build system without affecting the host and so on. We do the same thing with a new time namespace.
>>>>
>>>> We do however create a new PID namespace which means that the build system no longer will see any processes running on the host system. That requires to mount a new instance of /proc with each package. This also has the effect that if the shell that we launched terminates (because the build is done) any background processes will be killed immediately.
>>>>
>>>> Last, we clone the mount namespace that we have created before so that no build command can modify what we set up earlier.
>>>>
>>>> * Since everything is now so decoupled, we gain a couple of new (maybe minor?) features:
>>>>
>>>> It is now possible to run “make.sh shell” while a build is running. That does not happen a lot, but we can do this now :)
>>>>
>>>> If the build crashes or the host system is being shut down while a build is running, there is nothing to clean up afterwards.
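
A minimal sketch of the per-package isolation described above, assuming util-linux unshare(1) recent enough to support --time. The helper name run_isolated and the DRY_RUN switch are hypothetical and are not taken from make.sh; the actual implementation is in the unshare branch linked earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; this is not the code from make.sh.
# Set DRY_RUN=1 to print the command line instead of running it.
run_isolated() {
    # --mount       private copy of the mount table
    # --uts         hostname changes stay inside the namespace
    # --time        separate clock offsets (needs a recent kernel and util-linux)
    # --pid --fork  new PID namespace; the forked child becomes PID 1 in it
    # --mount-proc  mount a fresh /proc that matches the new PID namespace
    set -- unshare --mount --uts --time --pid --fork --mount-proc -- "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Show the command that would be executed for "ps ax".
DRY_RUN=1 run_isolated ps ax
```

Running the command for real needs root, or additionally --user --map-root-user on systems that allow unprivileged user namespaces. Inside, ps ax should list only the processes of the new PID namespace, and once the command started by unshare exits, anything it left running in the background is killed along with it.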
>>>>
>>>> * I have garnished this all with a lot of code cleanup and I suppose I might have introduced some new bugs here or there :)
>>>>
>>>> * This is probably mostly around a new implementation of the timer that updates the build time. It has been annoying me a lot that it takes a long time to walk through all packages that have been built before to finally get to a package that we want to rebuild. Mostly this was all held up by a call of “sleep 0.1”.
>>>>
>>>> Since bash does not really do any concurrency, I had to be creative and replaced the busy-loop with a background process that is launched whenever it is needed and which will “ping” the main make.sh script once a second. That way, we can just run as usual, but regularly get interrupted to update the runtime.
>>>>
>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l361
>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=blob;f=make.sh;h=b952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=refs/heads/unshare#l834
>>>>
>>>> We now only fork one extra sub shell and we have to handle the timer events which is a lot cheaper as well as more straight-forward to code.
>>>>
>>>> * As there is no difference between the different stages any more (those stages that we inherited from LFS), I have merged them all into one.
>>>>
>>>> * Last but not least, I have created the option to build for multiple architectures on the same system. Since we can now mount the entire source tree into (many independent) build environments, we might as well… As discussed on the last call, this might not be the best option for ARM, but RISC-V builds at a decent speed even when emulated.
>>>>
>>>> The only thing that I needed to do for this is to suffix the build and log directories which are now called build_${ARCH}, i.e.
build_aarch64, build_x86_64, and so on. The packages/ directory is not changed yet, but that will have to happen as well. Most likely I want to merge this with the generated images, but I am not sure what to call this, yet. Happy to hear suggestions. result_x86_64? Just images_x86_64?
>>>>
>>>> ---
>>>>
>>>> I have run a build and this seems to be working just fine on my Debian machine. I am writing to all of you to first of all let you know what I am up to; and secondly to ask to give this a go on your systems. I think it should run just fine, as all the tools that I require should be available everywhere. However, there might be some older kernels that might not support all of this yet, or any other problems I cannot think of yet. Please give me some feedback and send me all the bugs :)
>>> I gave this a go but it didn't work.
>>>
>>> Not sure if I should have run the ./make.sh clean command on the old version before I pulled the unshare branch into my clone of your repo.
>>>
>>> Should I have started with a complete new clone of your repo? I might try that anyway just to see.
>>>
>> I created a completely new clone of your ipfire-2.x repo and then checked out the unshare branch to a branch called unshare in my local repo clone.
>>
>> gettoolchain gave the same issue, except that this time the toolchain directory ended up completely empty.
>>
>> downloadsrc had the same result.
>>
>> clean had nothing to clean up as it was a fresh clone.
>>
>> build then tried to build the toolchain and came up with this error, different from before.
>>
>> ./make.sh build
>> chroot: failed to run command ‘env’: No such file or directory
>> Full toolchain compilation
>> stage1 [ FAIL ]
>>
>> Jul 8 19:26:39: Building stage1 ====================================== Installing stage1 ...
>> mkdir -pv /tools_x86_64/lib
>> mkdir: cannot create directory '/tools_x86_64': File exists
>> make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x/log/stage1] Error 1
>>
>> ERROR: Building stage1 [ FAIL ]
>> Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_build.toolchain.log for errors if applicable [ FAIL ]
>>
>> so it wasn't as simple as doing a fresh git clone.
>>
>> Regards,
>>
>> Adolf.
>>
>>
>>> So I ran ./make.sh gettoolchain first, as I usually would.
>>>
>>> ./make.sh gettoolchain
>>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: No such file or directory
>>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED open or read
>>> b2sum: WARNING: 1 listed file could not be read
>>>
>>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with its b2 file.
>>>
>>>
>>> Then ran ./make.sh downloadsrc
>>>
>>> Previous version ends with
>>>
>>> ***Verifying BLAKE2 checksum
>>> all files BLAKE2 checksum match [ DONE]
>>>
>>> after zstd has been checked.
>>>
>>> New version stops at zstd entry.
>>>
>>>
>>> ./make.sh clean gave the message Cleaning Build directory... but was completed very quickly.
>>> Log and Build directories have not been cleaned out. The img and iso files are still present.
>>>
>>>
>>> ./make.sh build gave message
>>>
>>> chroot: failed to run command ‘env’: No such file or directory
>>>
>>> and then did a full toolchain compilation which failed with gcc but log is >9000 lines.
>>>
>>>
>>> Regards,
>>> Adolf.
>>>
>>>>
>>>> Thank you for listening to this brain-dump.
>>>>
>>>> All the best,
>>>> -Michael
>>>>
>>>>> On 3 Jul 2024, at 10:58, Michael Tremer <michael.tremer(a)ipfire.org> wrote:
>>>>>
>>>>> Hello Adolf,
>>>>>
>>>>> This happens occasionally that the buildsystem umounts /dev and then nothing will really work any more.
>>>>>
>>>>> I rebooted the machine and it is back up again.
>>>>>
>>>>> -Michael
>>>>>
>>>>>> On 2 Jul 2024, at 15:42, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>>>
>>>>>> Hi Michael and all,
>>>>>>
>>>>>>
>>>>>> I ran the arm builder with the 4.20.2 version of samba to test it out.
>>>>>>
>>>>>> The build got to building gdb and then failed.
>>>>>>
>>>>>> Interestingly, the nightly build of arm was successful with the same version of gdb.
>>>>>>
>>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>>
>>>>>> Another thing I found is that I just tried to go back into the arm builder. I successfully got into people.ipfire.org but then trying to scp into the arm builder failed with the following message.
>>>>>>
>>>>>> ------------------------------------------
>>>>>>
>>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>>> PTY allocation request failed on channel 0
>>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1.90-1 (2024-05-03) aarch64
>>>>>>
>>>>>> The programs included with the Debian GNU/Linux system are free software;
>>>>>> the exact distribution terms for each program are described in the
>>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>>
>>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>>> permitted by applicable law.
>>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permission denied
>>>>>>
>>>>>> ------------------------------------------
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Adolf.
>>>>>>
>>>>>> <_build.ipfire.gdb.log>
>
>

--===============4569108461891801027==--