These changes are mainly meant for upcoming squid 3.5.xx:
- Raised 'while-loop'-time for stopping squid to 120 seconds.
- Changed the 'while-loop' from using 'statusproc' to 'squid -k check', since 'statusproc' didn't find the still running '(squid-1)'-process. Thus, 'squid' hadn't enough time to close the index. This led to a damaged 'swap.state'-file and "Rebuilding storage in /var/log/cache" always ended with "(dirty log)".
- Added some 'boot_mesg' lines in stop-routine.
- Changed the 'flush'-command to delete the entire 'swap.state'-file - it will be automatically rebuild during the next start.
Best, Matthias
Signed-off-by: Matthias Fischer matthias.fischer@ipfire.org --- src/initscripts/init.d/squid | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/src/initscripts/init.d/squid b/src/initscripts/init.d/squid index abed90a..5d9521b 100644 --- a/src/initscripts/init.d/squid +++ b/src/initscripts/init.d/squid @@ -108,22 +108,28 @@ case "$1" in # Wait until all redirectors have been stopped. wait
- # If squid is still running, wait up to 30 seconds + # If squid is still running, wait up to 120 seconds # before we go on to kill it. - counter=30 - - while [ ${counter} -gt 0 ]; do - statusproc /usr/sbin/squid >/dev/null && break; + n=0 + while /usr/sbin/squid -k check && [ $n -lt 120 ]; do sleep 1 - counter=$(( ${counter} - 1)) + n=$(( ${n} + 1 )) done
- # Kill squid service, if still running. - killproc /usr/sbin/squid >/dev/null + # If (squid-1) is still running, kill all squid processes + # and delete damaged '/var/log/cache/swap.state'. + if ( ps ax | grep "(squid-1)$" ); then + killproc /usr/sbin/squid >/dev/null + /bin/rm -f /var/log/cache/swap.state + boot_mesg "(squid-1) was killed after 120 seconds." + else + boot_mesg "(squid-1) exited normally, shutdown OK." + fi + fi
- # Trash remain pid file from squid. + # Delete remaining pid file from squid if it STILL exists. rm -rf /var/run/squid.pid - fi + ;;
restart) @@ -143,8 +149,7 @@ case "$1" in
flush) $0 stop - echo > /var/log/cache/swap.state - chown squid.squid /var/log/cache/swap.state + /bin/rm -f /var/log/cache/swap.state sleep 1 $0 start ;;
Hi,
thanks for working on this.
I still have a few questions and remarks:
On Sat, 2016-05-14 at 19:52 +0200, Matthias Fischer wrote:
These changes are mainly meant for upcoming squid 3.5.xx:
- Raised 'while-loop'-time for stopping squid to 120 seconds.
Is there any objective for this value or is this just an estimate?
I assume what you are trying to do here is to find a timeout that is high enough that even slow storage will have enough time to write back the cache from memory.
I have seen systems where closing and reopening the cache takes 15-20 minutes.
- Changed the 'while-loop' from using 'statusproc' to 'squid -k check',
since 'statusproc' didn't find the still running '(squid-1)'-process. Thus, 'squid' hadn't enough time to close the index. This led to a damaged 'swap.state'-file and "Rebuilding storage in /var/log/cache" always ended with "(dirty log)".
- Added some 'boot_mesg' lines in stop-routine.
That debugging output should not be shown to the user. Just show [ OK ] or [ FAIL ] at the end of the line.
- Changed the 'flush'-command to delete the entire 'swap.state'-file - it
will be automatically rebuild during the next start.
I get that it is intentional to delete this file when it is damaged. It is however the journal of the cache index and when deleted will require squid to rebuild the entire cache from the all files in the cache. That will certainly take a long time (especially on those systems that are likely to lose their swap.state because of slow storage).
We should *always* avoid this when ever we can.
Generally I take this as a workaround for now since squid does not notice when the swap.state file was damaged and continues working with corrupt data. Would you like to file a bug report and see if the squid developers can find a way to automatically detect when squid is reading a damaged file and discard it then? I guess that would be the safest solution.
Best, Matthias
Signed-off-by: Matthias Fischer matthias.fischer@ipfire.org
src/initscripts/init.d/squid | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/src/initscripts/init.d/squid b/src/initscripts/init.d/squid index abed90a..5d9521b 100644 --- a/src/initscripts/init.d/squid +++ b/src/initscripts/init.d/squid @@ -108,22 +108,28 @@ case "$1" in # Wait until all redirectors have been stopped. wait
# If squid is still running, wait up to 30 seconds
# If squid is still running, wait up to 120 seconds
# before we go on to kill it.
counter=30
while [ ${counter} -gt 0 ]; do
statusproc /usr/sbin/squid >/dev/null &&
break;
n=0
while /usr/sbin/squid -k check && [ $n -lt 120 ]; do
sleep 1
counter=$(( ${counter} - 1))
n=$(( ${n} + 1 ))
done
So the status output with the dots is gone now. Can we add a notice to the status line that says something like "Shutting down squid (this may take up to a few minutes).
# Kill squid service, if still running.
killproc /usr/sbin/squid >/dev/null
# If (squid-1) is still running, kill all squid
processes
# and delete damaged '/var/log/cache/swap.state'.
if ( ps ax | grep "(squid-1)$" ); then
Can we use pgrep here?
killproc /usr/sbin/squid >/dev/null
/bin/rm -f /var/log/cache/swap.state
Do not use the full path. "rm" should be enough.
boot_mesg "(squid-1) was killed after 120
seconds."
else
boot_mesg "(squid-1) exited normally,
shutdown OK."
fi
fi
# Trash remain pid file from squid.
# Delete remaining pid file from squid if it STILL
exists. rm -rf /var/run/squid.pid
fi
;; restart) @@ -143,8 +149,7 @@ case "$1" in flush) $0 stop
echo > /var/log/cache/swap.state
chown squid.squid /var/log/cache/swap.state
/bin/rm -f /var/log/cache/swap.state
See above for "rm".
I am also wondering what this was supposed to do. These lines are executed when you click the button to clear the cache on the web user interface. However, the cache is not really cleared.
The cache is rebuilt, but not cleared. Shouldn't we remove all data in the directory and reinitialize everything with squid -z?
Here is some documentation about that:
http://wiki.squid-cache.org/SquidFaq/ClearingTheCache
Just removing the swap.state file still leaves all the data on disk and my assumption when clearing the cache is that I get rid of the data in it. Either to clear up space, because it was corrupted, etc...
sleep 1 $0 start ;;
On 15.05.2016 12:37, Michael Tremer wrote:
Hi,
Hi,
thanks for working on this.
No problem... ;-)
I still have a few questions and remarks:
On Sat, 2016-05-14 at 19:52 +0200, Matthias Fischer wrote:
These changes are mainly meant for upcoming squid 3.5.xx:
- Raised 'while-loop'-time for stopping squid to 120 seconds.
Is there any objective for this value or is this just an estimate?
This value is the most used default. I didn't find a further explanation.
I assume what you are trying to do here is to find a timeout that is high enough that even slow storage will have enough time to write back the cache from memory.
Yep.
I have seen systems where closing and reopening the cache takes 15-20 minutes.
Yep - I know that those systems exist. But if we don't want to end up waiting forever, we have to set a border. 120 seconds is the most found default. Make your choice. ;-)
I set it to 360 seconds now - see below.
- Changed the 'while-loop' from using 'statusproc' to 'squid -k
check', since 'statusproc' didn't find the still running '(squid-1)'-process. Thus, 'squid' hadn't enough time to close the index. This led to a damaged 'swap.state'-file and "Rebuilding storage in /var/log/cache" always ended with "(dirty log)".
- Added some 'boot_mesg' lines in stop-routine.
That debugging output should not be shown to the user. Just show [ OK ] or [ FAIL ] at the end of the line.
Solved. See below.
- Changed the 'flush'-command to delete the entire
'swap.state'-file - it will be automatically rebuild during the next start.
I get that it is intentional to delete this file when it is damaged. It is however the journal of the cache index and when deleted will require squid to rebuild the entire cache from the all files in the cache. That will certainly take a long time (especially on those systems that are likely to lose their swap.state because of slow storage).
We should *always* avoid this when ever we can.
ACK. This is why I altered the loop to 360 seconds. For now.
Generally I take this as a workaround for now since squid does not notice when the swap.state file was damaged and continues working with corrupt data. Would you like to file a bug report and see if the squid developers can find a way to automatically detect when squid is reading a damaged file and discard it then? I guess that would be the safest solution.
Yep. For me its a workaround, too. I'll try to file a bug report and see what happens...
Best, Matthias
Signed-off-by: Matthias Fischer matthias.fischer@ipfire.org --- src/initscripts/init.d/squid | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/src/initscripts/init.d/squid b/src/initscripts/init.d/squid index abed90a..5d9521b 100644 --- a/src/initscripts/init.d/squid +++ b/src/initscripts/init.d/squid @@ -108,22 +108,28 @@ case "$1" in # Wait until all redirectors have been stopped. wait
# If squid is still running, wait up to 30 seconds + # If
squid is still running, wait up to 120 seconds # before we go on to kill it. - counter=30 - - while [ ${counter} -gt 0 ]; do - statusproc /usr/sbin/squid >/dev/null && break; + n=0 + while /usr/sbin/squid -k check && [ $n -lt 120 ]; do sleep 1 - counter=$(( ${counter} - 1)) + n=$(( ${n} + 1 )) done
So the status output with the dots is gone now. Can we add a notice to the status line that says something like "Shutting down squid (this may take up to a few minutes).
Working on it. See below.
# Kill squid service, if still running. - killproc
/usr/sbin/squid >/dev/null + # If (squid-1) is still running, kill all squid processes + # and delete damaged '/var/log/cache/swap.state'. + if ( ps ax | grep "(squid-1)$" ); then
Can we use pgrep here?
Yes. Done. See below.
killproc /usr/sbin/squid >/dev/null + /bin/rm -f
/var/log/cache/swap.state
Do not use the full path. "rm" should be enough.
Done.
boot_mesg "(squid-1) was killed after 120 seconds." + else +
boot_mesg "(squid-1) exited normally, shutdown OK." + fi + fi
# Trash remain pid file from squid. + # Delete remaining pid
file from squid if it STILL exists. rm -rf /var/run/squid.pid - fi + ;;
restart) @@ -143,8 +149,7 @@ case "$1" in
flush) $0 stop - echo > /var/log/cache/swap.state - chown squid.squid /var/log/cache/swap.state + /bin/rm -f /var/log/cache/swap.state
See above for "rm".
Done as above. ;-)
I am also wondering what this was supposed to do. These lines are executed when you click the button to clear the cache on the web user interface. However, the cache is not really cleared.
The cache is rebuilt, but not cleared. Shouldn't we remove all data in the directory and reinitialize everything with squid -z?
Here is some documentation about that:
http://wiki.squid-cache.org/SquidFaq/ClearingTheCache
Just removing the swap.state file still leaves all the data on disk and my assumption when clearing the cache is that I get rid of the data in it. Either to clear up space, because it was corrupted, etc... ...
This would be consistent (correct translation?).
What about this:
... # If squid is still running, wait up to 360 seconds # before we go on to kill it and delete cache structure. boot_mesg "Shutting down squid (this may take up to a few minutes)\n" n=0 while [ /usr/sbin/squid -k check > /dev/null 2>&1 ] && [ $n -lt 360 ]; do sleep 1 n=$(( ${n} + 1 )) done
# If (squid-1) is still running after 360 seconds, # kill all squid processes and delete damaged cache structure. if ( pgrep -fl "(squid-1)" > /dev/null 2>&1 ); then killproc /usr/sbin/squid >/dev/null rm -r /var/log/squid/cache/* boot_mesg -n "You should not be reading this warning.\n" boot_mesg -n "Some squid-processes had to be killed after 360 seconds,\n" boot_mesg -n "so index file and cache contents had to be deleted.\n" boot_mesg -n "Therefore, the complete cache structure must be rebuild\n" boot_mesg "during next start." echo_warning else boot_mesg "All squid processes exited normally." echo_ok boot_mesg "" fi fi
# Delete remaining pid file from squid if it still exists. rm -rf /var/run/squid.pid
;;
Best, Matthias
On Sun, 2016-05-15 at 16:57 +0200, Matthias Fischer wrote:
On 15.05.2016 12:37, Michael Tremer wrote:
Hi,
Hi,
thanks for working on this.
No problem... ;-)
I still have a few questions and remarks:
On Sat, 2016-05-14 at 19:52 +0200, Matthias Fischer wrote:
These changes are mainly meant for upcoming squid 3.5.xx:
- Raised 'while-loop'-time for stopping squid to 120 seconds.
Is there any objective for this value or is this just an estimate?
This value is the most used default. I didn't find a further explanation.
I assume what you are trying to do here is to find a timeout that is high enough that even slow storage will have enough time to write back the cache from memory.
Yep.
I have seen systems where closing and reopening the cache takes 15-20 minutes.
Yep - I know that those systems exist. But if we don't want to end up waiting forever, we have to set a border. 120 seconds is the most found default. Make your choice. ;-)
I set it to 360 seconds now - see below.
- Changed the 'while-loop' from using 'statusproc' to 'squid -k
check', since 'statusproc' didn't find the still running '(squid-1)'-process. Thus, 'squid' hadn't enough time to close the index. This led to a damaged 'swap.state'-file and "Rebuilding storage in /var/log/cache" always ended with "(dirty log)".
- Added some 'boot_mesg' lines in stop-routine.
That debugging output should not be shown to the user. Just show [ OK ] or [ FAIL ] at the end of the line.
Solved. See below.
- Changed the 'flush'-command to delete the entire
'swap.state'-file - it will be automatically rebuild during the next start.
I get that it is intentional to delete this file when it is damaged. It is however the journal of the cache index and when deleted will require squid to rebuild the entire cache from the all files in the cache. That will certainly take a long time (especially on those systems that are likely to lose their swap.state because of slow storage).
We should *always* avoid this when ever we can.
ACK. This is why I altered the loop to 360 seconds. For now.
Generally I take this as a workaround for now since squid does not notice when the swap.state file was damaged and continues working with corrupt data. Would you like to file a bug report and see if the squid developers can find a way to automatically detect when squid is reading a damaged file and discard it then? I guess that would be the safest solution.
Yep. For me its a workaround, too. I'll try to file a bug report and see what happens...
Please connect that to the one on our bugtracker.
Best, Matthias
Signed-off-by: Matthias Fischer matthias.fischer@ipfire.org --- src/initscripts/init.d/squid | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/src/initscripts/init.d/squid b/src/initscripts/init.d/squid index abed90a..5d9521b 100644 --- a/src/initscripts/init.d/squid +++ b/src/initscripts/init.d/squid @@ -108,22 +108,28 @@ case "$1" in # Wait until all redirectors have been stopped. wait
# If squid is still running, wait up to 30
seconds + # If squid is still running, wait up to 120 seconds # before we go on to kill it. - counter=30 - - wh ile [ ${counter} -gt 0 ]; do - statusproc /usr/sbin/squid >/dev/null && break; + n =0 + while /usr/sbin/squid -k check && [ $n -lt 120 ]; do sleep 1 - counter=$(( ${counter} - 1)) + n=$(( ${n} + 1 )) done
So the status output with the dots is gone now. Can we add a notice to the status line that says something like "Shutting down squid (this may take up to a few minutes).
Working on it. See below.
killproc# Kill squid service, if still running. -
/usr/sbin/squid >/dev/null + # If (squid-1) is still running, kill all squid processes + # and delete damaged '/var/log/cache/swap.state'. + if ( ps ax | grep "(squid-1)$" ); then
Can we use pgrep here?
Yes. Done. See below.
killproc /usr/sbin/squid >/dev/null + /bin/rm -f
/var/log/cache/swap.state
Do not use the full path. "rm" should be enough.
Done.
boot_mesg "(squid-1) was killed after 120
seconds." + else + boot_mesg "(squid-1) exited normally, shutdown OK." + fi + fi
# Delete remaining pid# Trash remain pid file from squid. +
file from squid if it STILL exists. rm -rf /var/run/squid.pid - fi + ;;
restart) @@ -143,8 +149,7 @@ case "$1" in
flush) $0 stop - echo > /var/log/cache/swap.state - chown squid.squid /var/log/cache/swap.state + /bin/rm -f /var/log/cache/swap.state
See above for "rm".
Done as above. ;-)
I am also wondering what this was supposed to do. These lines are executed when you click the button to clear the cache on the web user interface. However, the cache is not really cleared.
The cache is rebuilt, but not cleared. Shouldn't we remove all data in the directory and reinitialize everything with squid -z?
Here is some documentation about that:
http://wiki.squid-cache.org/SquidFaq/ClearingTheCache
Just removing the swap.state file still leaves all the data on disk and my assumption when clearing the cache is that I get rid of the data in it. Either to clear up space, because it was corrupted, etc... ...
This would be consistent (correct translation?).
What about this:
... # If squid is still running, wait up to 360 seconds # before we go on to kill it and delete cache structure. boot_mesg "Shutting down squid (this may take up to a few minutes)\n" n=0 while [ /usr/sbin/squid -k check > /dev/null 2>&1 ] && [ $n -lt 360 ]; do sleep 1 n=$(( ${n} + 1 )) done
# If (squid-1) is still running after 360 seconds, # kill all squid processes and delete damaged cache structure. if ( pgrep -fl "(squid-1)" > /dev/null 2>&1 ); then killproc /usr/sbin/squid >/dev/null rm -r /var/log/squid/cache/* boot_mesg -n "You should not be reading this warning.\n" boot_mesg -n "Some squid-processes had to be killed after 360 seconds,\n" boot_mesg -n "so index file and cache contents had to be deleted.\n" boot_mesg -n "Therefore, the complete cache structure must be rebuild\n" boot_mesg "during next start." echo_warning else boot_mesg "All squid processes exited normally." echo_ok boot_mesg "" fi fi
# Delete remaining pid file from squid if it still exists. rm -rf /var/run/squid.pid
;;
Best, Matthias