Subject: Re: Large Suricata cache directory.
From: Michael Tremer
Date: Thu, 18 Dec 2025 16:12:07 +0100
To: Adolf Belka
Cc: "IPFire: Development-List"

Hello,

> On 16 Dec 2025, at 13:45, Adolf Belka wrote:
>
> Hi Michael,
>
> On 16/12/2025 11:30, Michael Tremer wrote:
>> Hello Adam,
>>> On 15 Dec 2025, at 19:29, Adam Gibbons wrote:
>>>
>>> Hi Michael, Adolf,
>>>
>>> Yes, Adolf has patched the second issue. I've tested this and backups dropped from 826 MB to around 10 MB. Thank you, Adolf, for that.
>> This sounds like a very reasonable size for a backup file.
>>> If the upstream Suricata fix lands soon, we may not need to do anything further. Regarding the proposed `find` command, my only concern is partial cache removal. When I removed the entire cache, Suricata regenerated it cleanly on startup. I'm not sure how it behaves when only some of the cache is missing, as I've only tested removing all of it (`rm -rf /var/cache/suricata/sgh/*`). Perhaps it would be cleaner and potentially safer to just purge the cache entirely?
>> I suppose that Suricata will just re-generate anything that is missing from the cache. It would only take a couple of seconds at startup.
>> You can simply test this by removing half the files and restarting Suricata. If it fails to come up, I would consider this a bug that we should report upstream.
>> However, it would surprise me if it were implemented that way.
>> -Michael
>>> Thanks,
>>> Adam
>>>
>>>
>>> On 15 December 2025 16:54:51 GMT, Michael Tremer wrote:
>>> Hello Adam,
>>>
>>> Thank you for raising this here.
>>>
>>> We seem to have two different issues as far as I can see:
>>>
>>> 1) The directory just keeps growing
>>>
>>> 2) It is being backed up and completely blows up the size of the backup
>>>
>>> No. 2 has been fixed by Adolf. It is however quite interesting that we had anything from /var/cache in the backup at all. The intention was to have a valid set of rules available as soon as a backup is restored, but I think there is very little value in this. The rules are probably long expired and will be re-downloaded anyway.
>>>
>>> We also keep a number of other large lists on disk that we are not backing up, so I would propose that we remove /var/cache/suricata from the backup entirely. Until we have made a decision on this, I have merged Adolf's patch.
>>>
>>> Regarding No. 1, it is indeed a problem that Suricata does not clean this up itself. We could add a simple command that looks a bit like this:
>>>
>>> find /var/cache/suricata/ -type f -atime +7 -delete
>
> This doesn't remove anything.
>
> find /var/cache/suricata/sgh/ -type f gives a list of all files in the suricata directory and the sgh one.
>
> However, find /var/cache/suricata/sgh/ -type f -atime +7 gives an empty result.

This could be. The command tries to find any files that have not been read in 7 days. Since Suricata should be reloaded every once in a while, this should actually be sufficient. Maybe we want 14 or even 30 days.

> Maybe that is because I have restarted suricata after changing some selected rules entries.
>
> find /var/cache/suricata/sgh/ -type f -mtime +7 -delete took the sgh size down from 660 MB to 127 MB and suricata still worked.
This would remove any files that were created before the 7-day threshold (Suricata never actually modifies them, I believe). If the signatures have not changed, we would be removing files that are still needed.

> I think we should do the trimming of files in the sgh directory only.

Absolutely. I have no idea why I removed that last part from my comment. That wasn't intentional.

> If we do it for the whole suricata directory, it will also remove the tarballs of rulesets that are selected as providers but have none of their rules enabled. When such a ruleset gets an update, its last-updated date is replaced by N/A. If the provider is then enabled, suricata will run through its update and report that it has completed, but if you then go to customise the rules you will find no entries for that provider, because the tarball has been removed.
>
> So for selected but disabled providers you would have to go to the provider page and force an update so that the tarball is re-downloaded; then you can enable the provider and its rules will be available to select on the customise page.

We should never remove any downloaded data, even if it is a bit older. It is better to have some signatures than no signatures if something goes wrong during the update process.

-Michael

> Regards,
>
> Adolf.
>
>>>
>>> This would delete all of the cached files that have not been accessed in the last seven days.
>>>
>>> On my system, the entire directory is 1.4 GiB in size and the command would remove 500 MiB.
>>>
>>> Happy to read your thoughts.
>>>
>>> -Michael
>>>
>>> On 12 Dec 2025, at 16:49, Adam Gibbons wrote:
>>>
>>> Hi all,
>>>
>>> As discussed on the forum
>>> https://community.ipfire.org/t/re-large-backupfile/15346
>>> it appears that Suricata's new cache optimisation feature is creating a large number of files under
>>> `/var/cache/suricata/sgh/`, which in some cases causes backup files to grow to 800+ MB.
>>>
>>> @Adolf has confirmed that this directory probably should not be included in backups, as it is automatically regenerated, and I believe he mentioned he is working on a patch to exclude it from the backup.
>>>
>>> However, in the meantime, this directory continues to grow over time. The upstream Suricata patches to automatically clean or maintain the cache have not yet been merged, although they may be soon:
>>>
>>> https://github.com/OISF/suricata/pull/13850
>>> https://github.com/OISF/suricata/pull/14400
>>>
>>> To me this represents a disk-space exhaustion risk on systems with limited storage. Perhaps we should consider disabling Suricata's new cache optimisation feature until automatic cache cleanup/maintenance is available upstream and included.
>>>
>>> Thanks,
>>> Adam
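
[Editor's note] The thread converges on: trim only files inside sgh/ by modification time, never touch the ruleset tarballs one level up, and use a threshold longer than 7 days. A minimal sketch of that idea, assuming the paths discussed in the thread; `trim_sgh_cache` and its defaults are illustrative names, not existing IPFire code:

```shell
#!/bin/sh
# Sketch of the cleanup discussed above: delete only regular files in
# the sgh/ sub-directory that are older than a threshold. The ruleset
# tarballs in /var/cache/suricata itself are deliberately left alone,
# so selected-but-disabled providers keep their downloaded data.

trim_sgh_cache() {
    # $1: cache directory (default is the path from the thread)
    # $2: age threshold in days (the thread suggests 14 or 30 rather than 7)
    dir="${1:-/var/cache/suricata/sgh}"
    days="${2:-14}"

    # -mtime rather than -atime: Adolf found that -atime matched nothing
    # after a restart, and the cached files are written once and then
    # only read, so modification time is the more reliable signal.
    find "$dir" -type f -mtime "+$days" -delete
}
```

Run daily (e.g. from a cron job), this would keep the directory bounded; per Michael's expectation above, Suricata should simply regenerate any missing cache entries on its next restart.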