From mboxrd@z Thu Jan  1 00:00:00 1970
From: Adolf Belka <adolf.belka@ipfire.org>
To: development@lists.ipfire.org
Subject: Re: Problem during building of samba on arm builder
Date: Wed, 10 Jul 2024 14:59:10 +0200
Message-ID: <be466f1c-a93e-4daa-94e1-936f2f7e8d37@ipfire.org>
In-Reply-To: <3afc4a38-0e9f-423d-9148-dfbdaf9fd181@ipfire.org>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3243148572050750598=="
List-Id: <development.lists.ipfire.org>

--===============3243148572050750598==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Hi Michael,

On 10/07/2024 12:33, Adolf Belka wrote:
> Hi Michael,
>
> On 10/07/2024 11:57, Michael Tremer wrote:
>> Hello again,
>>
>> I managed to (finally) build the toolchain with the updated system. So hop=
efully there should not be any more outstanding problems that I know of so fa=
r.
>
> I just did a git pull on your repo to my clone.
>
> Ran ./make.sh gettoolchain and it successfully downloaded the toolchain.
>
> Ran ./make.sh downloadsrc and it successfully tested everything.
>
> Ran ./make.sh clean and the build and log directories were cleared out and =
removed. As far as I can tell it was successful.
>
> Currently running ./make.sh build. So up to this point everything is going =
well. Will let you know how it goes.
>
It has got to building popt. In the normal build system this takes around 3 s=
ecs. Currently in the new build it is at nearly 2 hours. Even with an empty c=
ache, that seems a long build time for popt, unless I am being too optimistic.

I will let it keep going.


One thing I found. I am running the new build system while I have been runnin=
g some package updates with the old system with its mount points. The two hav=
e each run without any impact on the other.

Regards,

Adolf.

> Regards,
> Adolf.
>
>>
>> Best,
>> -Michael
>>
>>> On 9 Jul 2024, at 22:29, Michael Tremer <michael.tremer(a)ipfire.org> wro=
te:
>>>
>>> Hello Adolf,
>>>
>>> Thank you for testing this.
>>>
>>> There have indeed been plenty of problems there=E2=80=A6 I spent a lot of=
 time on this today and hopefully fixed most of them.
>>>
>>> I cannot build the toolchain on my machine and I am not sure why yet, but=
 a build with the packaged toolchain runs through.
>>>
>>> I have also spent some time on getting rid of the strip stage because it =
annoyed me how long it takes and creating the disk images as well as packages=
 should be significantly faster now, too. I hope I didn=E2=80=99t introduce t=
oo many new bugs.
>>>
>>> Please let me know if you have more success now.
>>>
>>> Best,
>>> -Michael
>>>
>>>> On 8 Jul 2024, at 20:34, Adolf Belka <adolf.belka(a)ipfire.org> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>> On 08/07/2024 21:15, Adolf Belka wrote:
>>>>> Hi Michael,
>>>>>
>>>>> On 08/07/2024 18:11, Michael Tremer wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have been spending a lot of time on this problem, because it has bee=
n bothering me for a long time. I also saw an opportunity to make more change=
s to the build system.
>>>>>>
>>>>>> Currently this is all a little bit WIP, but I hope that we can merge t=
his into next as soon as the current update has been moved to master.
>>>>>>
>>>>>> I am referring to this branch which is currently based on next: https:=
//git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dshortlog;h=3Drefs/heads/un=
share
>>>>>>
>>>>>> It makes use of the unshare command which creates new namespaces in Li=
nux. That way, we can isolate the build system better from the host system an=
d in case something goes wrong, there is less damage. We can also enforce som=
e more rules=E2=80=A6
>>>>>>
>>>>>> So, what has changed?
>>>>>>
>>>>>> * The make.sh script might re-execute itself into a new mount namespac=
e when it is suitable. This happens for =E2=80=9Cmake.sh build=E2=80=9D and =
=E2=80=9Cmake.sh shell=E2=80=9D, but it does not happen for =E2=80=9Cmake.sh =
downloadsrc=E2=80=9D for example.
>>>>>>
>>>>>> https://git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dblob;f=3Dmake=
.sh;h=3Db952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=3Drefs/heads/unshare#l2129
>>>>>> https://git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dblob;f=3Dmake=
.sh;h=3Db952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=3Drefs/heads/unshare#l2251
>>>>>>
>>>>>> * The new mount namespace means that we will no longer see any bind-mo=
unts in the host system and we no longer need to umount anything ourselves wh=
ich is where we occasionally wiped the entire hard drive of the host system. =
When the last process exits, the namespace is being cleaned up and everything=
 is being umounted.
>>>>>>
>>>>>> * The function that prepares the build environment has been almost ent=
irely rewritten:
>>>>>>
>>>>>> https://git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dblob;f=3Dmake=
.sh;h=3Db952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=3Drefs/heads/unshare#l426
>>>>>>
>>>>>> It used to mount parts of the host system into the build environment w=
hich are needed to run anything. Those were /dev, /proc, /sys, etc=E2=80=A6
>>>>>>
>>>>>> Instead, it now creates a new /dev mount point and creates a minimal a=
mount of device nodes and symlinks. That way, we detach from the host system =
and no longer allow the build system access to the host=E2=80=99s filesystem =
and block devices. We also bind-mount the sources in read-only mode now, so t=
hat the build system cannot change anything in the source tree. On top of tha=
t, cache is read-only, too. ccache and the log directory are the only places =
that are writable.
>>>>>>
>>>>>> We mount a separate /tmp directory.
>>>>>>
>>>>>> * When we then build a package, we create more namespaces for each pac=
kage. These isolate each build process from each other.
>>>>>>
>>>>>> Mostly, this is to detach from the host system. A new UTS namespace al=
lows us to change the hostname in the build system without affecting the host=
 and so on. We do the same thing with a new time namespace.
>>>>>>
>>>>>> We do however create a new PID namespace which means that the build sy=
stem no longer will see any processes running on the host system. That requir=
es mounting a new instance of /proc with each package. This also has the effe=
ct that if the shell that we launched terminates (because the build is done) =
any background processes will be killed immediately.
>>>>>>
>>>>>> Last, we clone the mount namespace that we have created before so that=
 no build command can modify what we set up earlier.
>>>>>>
>>>>>> * Since everything is now so decoupled, we gain a couple of new (maybe=
 minor?) features:
>>>>>>
>>>>>> =C2=A0=C2=A0 It is now possible to run =E2=80=9Cmake.sh shell=E2=80=9D=
 while a build is running. That does not happen a lot, but we can do this now=
 :)
>>>>>>
>>>>>> =C2=A0=C2=A0 If the build crashes or the host system is being shut dow=
n while a build is running, there is nothing to clean up afterwards.
>>>>>>
>>>>>> * I have garnished this all with a lot of code cleanup and I suppose I=
 might have introduced some new bugs here or there :)
>>>>>>
>>>>>> * This is probably mostly around a new implementation of the timer tha=
t updates the build time. It has been annoying me a lot that it takes a long =
time to walk through all packages that have been built before to finally get =
to a package that we want to rebuild. Mostly this was all held up by a call o=
f =E2=80=9Csleep 0.1=E2=80=9D.
>>>>>>
>>>>>> Since bash does not really do any concurrency, I had to be creative an=
d replaced the busy-loop with a background process that is launched whenever =
it is needed and which will =E2=80=9Cping=E2=80=9D the main make.sh script on=
ce a second. That way, we can just run as usual, but regularly get interrupte=
d to update the runtime.
>>>>>>
>>>>>> https://git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dblob;f=3Dmake=
.sh;h=3Db952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=3Drefs/heads/unshare#l361
>>>>>> https://git.ipfire.org/?p=3Dpeople/ms/ipfire-2.x.git;a=3Dblob;f=3Dmake=
.sh;h=3Db952627782a0d5ef4ac75f17315b689fcb3b4fe0;hb=3Drefs/heads/unshare#l834
>>>>>>
>>>>>> We now only fork one extra sub shell and we have to handle the timer e=
vents which is a lot cheaper as well as more straight-forward to code.
>>>>>>
>>>>>> * As there is no difference between the different stages any more (tho=
se stages that we inherited from LFS), I have merged them all into one.
>>>>>>
>>>>>> * Last but not least, I have created the option to build for multiple =
architectures on the same system. Since we can now mount the entire source tr=
ee into (many independent) build environments, we might as well=E2=80=A6 As d=
iscussed on the last call, this might not be the best option for ARM, but RIS=
C-V builds at a decent speed even when emulated.
>>>>>>
>>>>>> The only thing that I needed to do for this is to suffix the build and=
 log directories which are now called build_${ARCH}, i.e. build_aarch64, buil=
d_x86_64, and so on. The packages/ directory is not changed yet, but that wil=
l have to happen as well. Most likely I want to merge this with the generated=
 images, but I am not sure what to call this, yet. Happy to hear suggestions.=
 result_x86_64? Just images_x86_64?
>>>>>>
>>>>>> ---
>>>>>>
>>>>>> I have run a build and this seems to be working just fine on my Debian=
 machine. I am writing to all of you to first of all let you know what I am u=
p to; and secondly to ask to give this a go on your systems. I think it shoul=
d run just fine, as all the tools that I require should be available everywhe=
re. However, there might be some older kernels that might not support all of =
this, yet or any other problems I cannot think of yet. Please give me some fe=
edback and send me all the bugs :)
>>>>> I gave this a go but it didn't work.
>>>>>
>>>>> Not sure if I should have run the ./make.sh clean command on the old ve=
rsion before I pulled the unshare branch into my clone of your repo.
>>>>>
>>>>> Should I have started with a complete new clone of your repo? I might t=
ry that anyway just to see.
>>>>>
>>>> I created a completely new clone of your ipfire-2.x repo and then checke=
d out the unshare branch to a branch called unshare in my local repo clone.
>>>>
>>>> gettoolchain gave the same issue, except that this time the toolchain di=
rectory ended up completely empty.
>>>>
>>>> downloadsrc had the same result.
>>>>
>>>> clean had nothing to clean up as it was a fresh clone.
>>>>
>>>> build then tried to build the toolchain and came up with this error, dif=
ferent from before.
>>>>
>>>> ./make.sh build
>>>> chroot: failed to run command =E2=80=98env=E2=80=99: No such file or dir=
ectory
>>>> Full toolchain compilation
>>>> stage1 [ FAIL ]
>>>>
>>>> =C2=A0=C2=A0=C2=A0 Jul=C2=A0 8 19:26:39: Building stage1 =3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D Installing stage1 ...
>>>> =C2=A0=C2=A0=C2=A0 mkdir -pv /tools_x86_64/lib
>>>> =C2=A0=C2=A0=C2=A0 mkdir: cannot create directory '/tools_x86_64': File =
exists
>>>> =C2=A0=C2=A0=C2=A0 make: *** [stage1:50: /home/ahb/sandbox/ms/ipfire-2.x=
/log/stage1] Error 1
>>>>
>>>> ERROR: Building stage1 [ FAIL ]
>>>> =C2=A0=C2=A0=C2=A0 Check /home/ahb/sandbox/ms/ipfire-2.x/log_x86_64/_bui=
ld.toolchain.log for errors if applicable [ FAIL ]
>>>>
>>>> so it wasn't as simple as doing a fresh git clone.
>>>>
>>>> Regards,
>>>>
>>>> Adolf.
>>>>
>>>>
>>>>> So I ran ./make.sh gettoolchain first, as I usually would.
>>>>>
>>>>> ./make.sh gettoolchain
>>>>> b2sum: cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: =
No such file or directory
>>>>> cache/toolchains/ipfire-2.29-toolchain-20240521-x86_64.tar.zst: FAILED =
open or read
>>>>> b2sum: WARNING: 1 listed file could not be read
>>>>>
>>>>> ipfire-2.29-toolchain-20240210-x86_64.tar.zst is present together with =
its b2 file.
>>>>>
>>>>>
>>>>> Then ran ./make.sh downloadsrc
>>>>>
>>>>> Previous version ends with
>>>>>
>>>>> ***Verifying BLAKE2 checksum
>>>>> all files BLAKE2 checksum match [ DONE]
>>>>>
>>>>> after zstd has been checked.
>>>>>
>>>>> New version stops at zstd entry.
>>>>>
>>>>>
>>>>> ./make.sh clean gave the message Cleaning Build directory... but was co=
mpleted very quickly.
>>>>> Log and Build directories have not been cleaned out. The img and iso fi=
les are still present.
>>>>>
>>>>>
>>>>> ./make.sh build gave message
>>>>>
>>>>> chroot: failed to run command =E2=80=98env=E2=80=99: No such file or di=
rectory
>>>>>
>>>>> and then did a full toolchain compilation which failed with gcc but log=
 is >9000 lines.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>
>>>>>>
>>>>>> Thank you for listening to this brain-dump.
>>>>>>
>>>>>> All the best,
>>>>>> -Michael
>>>>>>
>>>>>>> On 3 Jul 2024, at 10:58, Michael Tremer <michael.tremer(a)ipfire.org>=
 wrote:
>>>>>>>
>>>>>>> Hello Adolf,
>>>>>>>
>>>>>>> This happens occasionally that the buildsystem umounts /dev and then =
nothing will really work any more.
>>>>>>>
>>>>>>> I rebooted the machine and it is back up again.
>>>>>>>
>>>>>>> -Michael
>>>>>>>
>>>>>>>> On 2 Jul 2024, at 15:42, Adolf Belka <adolf.belka(a)ipfire.org> wrot=
e:
>>>>>>>>
>>>>>>>> Hi Michael and all,
>>>>>>>>
>>>>>>>>
>>>>>>>> I ran the arm builder with the 4.20.2 version of samba to test it ou=
t.
>>>>>>>>
>>>>>>>> The build got to building gdb and then failed.
>>>>>>>>
>>>>>>>> Interestingly, the nightly build of arm was successful with the same=
 version of gdb.
>>>>>>>>
>>>>>>>> The build log for gdb is attached. The actual error is at line 618.
>>>>>>>>
>>>>>>>> Another thing I found is that I just tried to go back into the arm b=
uilder. I successfully got into people.ipfire.org but then trying to ssh into=
 the arm builder failed with the following message.
>>>>>>>>
>>>>>>>> ------------------------------------------
>>>>>>>>
>>>>>>>> ssh bonnietwin(a)arm64-01.zrh.ipfire.org
>>>>>>>> PTY allocation request failed on channel 0
>>>>>>>> Linux arm64-01.zrh.ipfire.org 6.1.0-21-cloud-arm64 #1 SMP Debian 6.1=
.90-1 (2024-05-03) aarch64
>>>>>>>>
>>>>>>>> The programs included with the Debian GNU/Linux system are free soft=
ware;
>>>>>>>> the exact distribution terms for each program are described in the
>>>>>>>> individual files in /usr/share/doc/*/copyright.
>>>>>>>>
>>>>>>>> Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
>>>>>>>> permitted by applicable law.
>>>>>>>> /etc/profile.d/Z99-cloud-locale-test.sh: line 14: /dev/null: Permiss=
ion denied
>>>>>>>>
>>>>>>>> ------------------------------------------
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Adolf.
>>>>>>>>
>>>>>>>> <_build.ipfire.gdb.log>
>>>
>>>
>>

--===============3243148572050750598==--