Hi Michael
Thanks for the clear explanation. And indeed I can imagine sync locking up on a stale NFS mount when it still needs data written to it..I didn't consider that as I don't have NFS mounts on my ipfire instance, but other users may indeed have. It does however fits in my "hardware problem" scenario, as sync won't get an ACK from the stale NFS mount.. Only networking errors may be more frequent of a problem than actual hardware failure, increasing the chances that sync indeed locks up.
And it is as you say, a journaling fs should not have a problem when no manual syncs are performed during 'crucial' operations.
The usb flash drive story is new for me, but now that I read your explanation, I now understand why I sometimes have trouble with corrupt usb sticks where I just wrote a boot image to and then restart the machine to boot from that stick to find out it doesn't want to boot due to a corrupt root partition. Now I know exactly why this sometimes happens :-)
Always happy to learn! Thanks
Regards Robin
Michael Tremer schreef op wo 31-01-2024 om 10:54 [+0000]:
Hello Robin,
On 30 Jan 2024, at 19:50, Robin Roevens robin.roevens@disroot.org wrote:
Hi Michael, all
First, this mail does not contain critique, I'm only trying to learn, if I'm wrong. But I think your remark about fixing hardware is not correct?
In my understanding, sync synchronizes the not-yet-written data currently in kernel managed cache memory (hence in volatile system memory) to permanent storage.
Yes.
Normally storage hardware also has its own cache, which I presume you are referring to, but I don't think sync cares or even knows about that and that is/should be handled within the hardware itself. As soon as storage hardware returns an ACK on a write operation, sync is happy. Not actually guaranteeing any physical write, only that all to-be-written data is actually delivered to the storage hardware.
Well, this is all a little bit more complicated, because it massively depends on the type of hardware we are talking about…
A classic example for a piece of hardware with a cache is a RAID controller. When we call sync(), the kernel sends all data to that RAID controller and as soon as the last operation has been confirmed, sync() is considered done. That does not mean that any of that data has actually been written to any physical storage device. RAID controllers have some temporary caches that are persistent and so that data will not be lost even if there was a power outage immediately after the sync().
Then there is plain old hard drives. They traditionally did not use to be very smart. So if the kernel wanted to write something, it would have been written straight to disk and once that was done, sync() would have returned.
Those two scenarios are not a problem at all.
Then there is the more problematic type of hardware which is usually (cheap?) flash storage. Those devices very often have a faster cache and then very slow persistent storage. To pretend that a device is much faster than it actually is, blocks are being written to a volatile cache and later written to the persistent storage, but there is no power source in case of a sudden loss of power.
We discovered this with a regular reboot where we also call sync to make sure that everything is properly written to disk and then we make the system reboot. Those devices confirm that everything is written when it isn’t but then the system cycles power and the last blocks are gone. That will result in a corrupt file system and those systems will perform a filesystem check (and usually repair too) at the next boot.
The visual way to see this phenomenon is a USB stick with a USB light. Calling sync() will return but the light will continue flashing because the device is actually busy, but has told the OS that it is already done. And hence we have this problem...
I do agree that a (full) sync there is probably overkill. The content of the config will be available to the collectd daemon whether it is actually written on disk or still in cache, so no sync needed. If the server crashes/powercycles/... on that point, I think a possible commented line that should be uncommented in that config is then probably of the least concern.
On top of all of this comes that the “flash image” that we ship used to come with no journal. In a journaling file system, a loss of power would either result in the old version of that file, or the new one - depending on how far the writes have come. Neither of that is a problem in our scenario. Without the journal, it is quite likely to have a broken file.
On the other hand, a sync that blocks forever, is probably more of an indicator that the hardware should be fixed, as this would mean sync doesn't get all the ACK's it expects.
It’s not hardware. The system I was dealing with had an NFS volume mounted which was unavailable for a moment. That caused the entire dial-in sequence to hang because collectd was waiting for a sync() to finish just after reconnecting.
So, calling sync() is generally a very bad idea. It works around bugs that should not be there in the first place, but is just creating some fancy race condition where you increase chances that no data will be lost, but it will never be anywhere acceptable for me.
Ergo, don’t buy cheap flash. Use a journaling file system. If your flash dies because you are writing to it it was not fit for purpose.
As you can see I am very annoyed about these things because over all those years they have cost us so much time to debug and “fix” and it simply is not worth it…
Please correct me if I'm wrong.
I hope this clears it up :)
-Michael
Regards Robin
Michael Tremer schreef op di 30-01-2024 om 18:01 [+0000]:
Calling a global sync operation manually is generally a bad idea as it can block for forever. If people have storage that does not retain anything that is being written to it, they need to fix their hardware.
Signed-off-by: Michael Tremer michael.tremer@ipfire.org
src/initscripts/system/collectd | 3 --- 1 file changed, 3 deletions(-)
diff --git a/src/initscripts/system/collectd b/src/initscripts/system/collectd index bb8a2f54f..56b799d56 100644 --- a/src/initscripts/system/collectd +++ b/src/initscripts/system/collectd @@ -146,9 +146,6 @@ case "$1" in sed -i -e "s|^#LoadPlugin swap|LoadPlugin swap|g" /etc/collectd.conf fi
- # sync after config update...
- sync
if [ $(date +%Y) -gt 2011 ]; then boot_mesg "Starting Collection daemon..." /usr/sbin/collectd -C /etc/collectd.conf -- 2.39.2
-- Dit bericht is gescanned op virussen en andere gevaarlijke inhoud door MailScanner en lijkt schoon te zijn.