Subject: Re: Let's launch our own blocklists...
From: Michael Tremer
To: Matthias Fischer
Cc: development@lists.ipfire.org
Date: Sun, 25 Jan 2026 14:40:42 +0000
Message-Id: <3F9DD0AE-2193-4612-80D2-7FF20DC09AB3@ipfire.org>
In-Reply-To: <45479d5a-309f-4d74-881f-03566e2dd2fe@ipfire.org>

Hello Matthias,

Nice catch! I fixed it here and added the missing “;”: https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=775561e322ceed43e255e5547bd76047b9f8a40b

If you go to the provider settings there is a button to force a ruleset update which should give you the fixed version. Please let me know if this works.

Best,
-Michael

> On 24 Jan 2026, at 23:41, Matthias Fischer wrote:
>
> On 23.01.2026 17:39, Michael Tremer wrote:
>> Hello Matthias,
>
> Hi Michael,
>
>> Thank you very much for testing IPFire DBL.
>
> No problem - I have news:
>
> After taking a closer look at the IPS system logs, unfortunately I found
> some parsing errors:
>
> 'suricata' complains about missing ";".
>
> ***SNIP***
> ...
> 00:32:40 suricata: [13343] -- Including configuration file
> /var/ipfire/suricata/suricata-used-rulesfiles.yaml.
> 00:32:40 suricata: [13343] -- no terminating ";" found
> 00:32:40 suricata: [13343] -- error parsing signature "drop
> dns any any -> any any (msg:"IPFire DBL [Advertising] Blocked DNS
> Query"; dns.query; domain; dataset:isset,ads,type string,load
> datasets/ads.txt; classtype:policy-violation; priority:3; sid:983041;
> rev:1; reference:url,https://www.ipfire.org/dbl/ads; metadata:dbl
> ads.dbl.ipfire.org)" from file /var/lib/suricata/ipfire_dnsbl-ads.rules
> at line 72
> 00:32:40 suricata: [13343] -- no terminating ";" found
> ...
> ***SNAP***
>
> I tried, but didn't find the right place for any missing ";".
>
> Can "anyone" confirm?
>
> Best
> Matthias
>
>>> On 23 Jan 2026, at 15:02, Matthias Fischer wrote:
>>>
>>> On 22.01.2026 12:33, Michael Tremer wrote:
>>>> Hello everyone,
>>>
>>> Hi,
>>>
>>> short feedback from me:
>>>
>>> - I activated both the suricata (IPFire DBL - Domain Blocklist) - and
>>> the URLfilter lists from 'dbl.ipfire.org'.
>>
>> This is an interesting case. What I didn’t manage to test yet is what happens when Suricata blocks the connection first. If URL Filter sees a domain that is being blocked it will either send you an error page if you are using HTTP, or simply close the connection if it is HTTPS. However, when Suricata comes first in the chain (and it will), it might close the connection before URL Filter has received the request. In the case of HTTPS this does not make any difference because the connection will be closed, but in the HTTP case you won’t see an error page any more and instead have the connection closed, too. You are basically losing the explicit error notification which is a little bit annoying.
>>
>> We could have the same when we are doing the same with Unbound and DNS filtering. Potentially we would need to whitelist the local DNS resolver then, but how is Suricata supposed to know that the same categories are activated in both places?
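The parsing error quoted above comes down to a rule whose option block does not end in ";" before the closing parenthesis. A minimal sketch of a pre-flight check for that one condition, assuming rules of the shape "header (options)"; the function name is hypothetical, and this is not how Suricata itself parses rules:

```python
def last_option_unterminated(rule: str) -> bool:
    """Return True if the final option inside (...) lacks a terminating ';'.

    Simplified on purpose: it only inspects the end of the option block
    and does not validate anything else about the rule.
    """
    rule = rule.strip()
    start = rule.find("(")
    if start == -1 or not rule.endswith(")"):
        raise ValueError("expected a rule of the form 'header (options)'")
    options = rule[start + 1:-1].rstrip()
    return not options.endswith(";")


# The rule from the log above, joined back into a single line.
logged_rule = (
    'drop dns any any -> any any (msg:"IPFire DBL [Advertising] Blocked DNS Query"; '
    "dns.query; domain; dataset:isset,ads,type string,load datasets/ads.txt; "
    "classtype:policy-violation; priority:3; sid:983041; rev:1; "
    "reference:url,https://www.ipfire.org/dbl/ads; metadata:dbl ads.dbl.ipfire.org)"
)

if last_option_unterminated(logged_rule):
    print("last option is missing its ';'")
```

Running a check like this over every generated rule before export would catch this class of error early, though it is no substitute for testing the ruleset with Suricata itself.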
>>
>>> - I even took the 'smart-tv' domains from the IPFire DBL blacklist and
>>> copied/pasted them in my fritzbox filter lists.
>>
>> LOL Why not use IPFire to filter this as well?
>>
>>> Everything works as expected. Besides, the download of the IPFire
>>> DBL-list loads a lot faster than the list from 'Univ. Toulouse'... ;-)
>>
>> Yes, we don’t have much traffic on the server, yet.
>>
>>> Functionality is good - no false positives or seen problems. Good work -
>>> thanks!
>>
>> Nice. We need to distinguish a little between what is a technical issue and what is a false-positive/missing domain on the list. However, testing both at the same time is something we will all cope quite well with :)
>>
>> -Michael
>>
>>> Best
>>> Matthias
>>>
>>>> Over the past few weeks I have made significant progress on this all, and I think we're getting close to something the community will be really happy with. I'd love to get feedback from the team before we finalise things.
>>>>
>>>> So what has happened?
>>>>
>>>> First of all, the entire project has been renamed. DNSBL is not entirely what this is. Although the lists can be thrown into DNS, they have so much more use outside of it that I thought we should simply go with DBL, short for Domain Blocklist. After all, we are only importing domains. The new home of the project therefore is https://www.ipfire.org/dbl
>>>>
>>>> I have added a couple more lists that I thought interesting and I have added a couple more sources that I considered a good start. Hopefully, we will soon gather some more feedback on how well this is all holding up. My main focus has however been on the technology that will power this project.
>>>>
>>>> One of the bigger challenges was to create Suricata rules from the lists. Initially I tried to create a ton of rules but since our lists are so large, this quickly became too complicated.
>>>> I have now settled on using a feature that is only available in more recent versions of Suricata (I believe 7 and later), but since we are already on Suricata 8 in IPFire this won’t be a problem for us. All domains for each list are basically compiled into one massively large dataset and one single rule is referring to that dataset. This way, we won’t have the option to remove any false-positives, but at least Suricata and the GUI won’t die a really bad death when loading millions of rules.
>>>>
>>>> Suricata will now be able to use our rules to block access to any listed domains of each of the categories over DNS, HTTP, TLS or QUIC. Although I don’t expect many users to use Suricata to block porn or other things, this is a great backstop to enforce any policy like that. For example, if there is a user on the network who is trying to circumvent the DNS server that might filter out certain domains, even after getting an IP address resolved through other means, they won’t be able to open a TLS/QUIC connection or send an HTTP request to any blocked domain. Some people have said they were interested in blocking DNS-over-HTTPS and this is a perfect way to do this and actually be sure that any server that is being blocked on the list will actually be completely inaccessible.
>>>>
>>>> Those Suricata rules are already available for testing in Core Update 200: https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=9eb8751487d23dd354a105c28bdbbb0398fe6e85
>>>>
>>>> I have chosen various severities for the lists. If someone was to block advertising using DBL, this is fine, but not a very severe alert. If someone chooses to block malware and there is a system on the network trying to access those domains, this is an alert worth being investigated by an admin.
>>>> Our new Suricata Reporter will show those violations in different colours based on the severity which helps to identify the right alerts to further investigate.
>>>>
>>>> Previously I asked you to test the lists using URL Filter. Those rules are now available as well in Core Update 200: https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=db160694279a4b10378447f775dd536fdfcfb02a
>>>>
>>>> I talked about a method to remove any dead domains from any sources which is a great way to keep our lists smaller. The pure size of them is a problem in so many ways. That check was however a little bit too ambitious and I had to make it a little bit less eager. Basically if we are in doubt, we need to still list the domain because it might be resolvable by a user.
>>>>
>>>> https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=bb5b6e33b731501d45dea293505f7d42a61d5ce7
>>>>
>>>> So how else could we make the lists smaller without losing any actual data? Since we sometimes list a whole TLD (e.g. .xxx or .porn), there is very little point in listing any domains of this TLD. They will always be caught anyways. So I built a check that marks all domains that don’t need to be included on the exported lists because they will never be needed and was able to shrink the size of the lists by a lot again.
>>>>
>>>> The website does not show this data, but the API returns the number of “subsumed” domains (I didn’t have a better name):
>>>>
>>>> curl https://api.dbl.ipfire.org/lists | jq .
>>>>
>>>> The number shown would otherwise be added to the total number of domains; without this check the lists would usually be 50-200% larger.
>>>>
>>>> Those stats will now also be stored in a history table so that we will be able to track growth of all lists.
>>>>
>>>> Furthermore, the application will now send email notifications for any incoming reports.
>>>> This way, we will be able to stay in close touch with the reporters and keep them up to date on their submissions as well as inform moderators that there is something to have a look at.
>>>>
>>>> The search has been refactored as well, so that we can show clearly whether something is blocked or not at one glance: https://www.ipfire.org/dbl/search?q=github.com. There is detailed information available on all domains and what happened to them. In case of GitHub.com, this seems to be blocked and unblocked by someone all of the time and we can see a clear audit trail of that: https://www.ipfire.org/dbl/lists/malware/domains/github.com
>>>>
>>>> On the DNS front, I have added some metadata to the zones so that people can programmatically request some data, like when it has been last updated (in a human-friendly timestamp and not only the serial), license, description and so on:
>>>>
>>>> # dig +short ANY _info.ads.dbl.ipfire.org @primary.dbl.ipfire.org
>>>> "total-domains=42226"
>>>> "license=CC BY-SA 4.0"
>>>> "updated-at=2026-01-20T22:17:02.409933+00:00"
>>>> "description=Blocks domains used for ads, tracking, and ad delivery"
>>>>
>>>> Now, I would like to hear more feedback from you. I know we've all been stretched thin lately, so I especially appreciate anyone who has time to review and provide input. Share ideas, or just say if you like it or not. Where could this go in the future?
>>>>
>>>> Looking ahead, I would like us to start thinking about the RPZ feature that has been on the wishlist. IPFire DBL has been a bigger piece of work, and I think it's worth having a conversation about sustainability. Resources for this need to be allocated and paid for. Open source is about freedom, not free beer, and to keep building features like this, we will need to explore some funding options. I would be interested to hear any ideas you might have that could work for IPFire.
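The `_info` TXT records shown in the dig output above are plain key=value strings, so they are easy to consume from a script. A minimal sketch, assuming the records have already been fetched (for example with dig) and are passed in as quoted strings; `parse_info` is a hypothetical helper and not part of the DBL tooling:

```python
from datetime import datetime


def parse_info(txt_records):
    """Turn '"key=value"' TXT strings into a dict, skipping anything else."""
    info = {}
    for record in txt_records:
        key, sep, value = record.strip().strip('"').partition("=")
        if sep:
            info[key] = value
    return info


# Sample data copied from the dig output above.
records = [
    '"total-domains=42226"',
    '"license=CC BY-SA 4.0"',
    '"updated-at=2026-01-20T22:17:02.409933+00:00"',
    '"description=Blocks domains used for ads, tracking, and ad delivery"',
]

info = parse_info(records)
total_domains = int(info["total-domains"])
updated_at = datetime.fromisoformat(info["updated-at"])
print(total_domains, info["license"], updated_at.isoformat())
```

Splitting on the first "=" only keeps values that themselves contain "=" intact.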
>>>>
>>>> Please share your thoughts on the mailing list when you can - even a quick 'looks good' or 'I have concerns about X' is valuable. Public discussion helps everyone stay in the loop and contribute.
>>>>
>>>> I am aiming to move forward with this in a week's time, so if you have input, now would be a good time to share it.
>>>>
>>>> Best,
>>>> -Michael
>>>>
>>>>> On 6 Jan 2026, at 10:20, Michael Tremer wrote:
>>>>>
>>>>> Good Morning Adolf,
>>>>>
>>>>> I had a look at this problem yesterday and it seems that parsing the format is becoming a little bit difficult this way. Since this is only affecting very few domains, I have simply whitelisted them all manually and duckduckgo.com and others should now be easily reachable again.
>>>>>
>>>>> Please let me know if you have any more findings.
>>>>>
>>>>> All the best,
>>>>> -Michael
>>>>>
>>>>>> On 5 Jan 2026, at 11:48, Michael Tremer wrote:
>>>>>>
>>>>>> Hello Adolf,
>>>>>>
>>>>>> This is a good find.
>>>>>>
>>>>>> But if duckduckgo.com is blocked, we will have to have a source somewhere that blocks that domain. Not only a sub-domain of it. Otherwise we have a bug somewhere.
>>>>>>
>>>>>> This is most likely as the domain is listed here, but with some stuff afterwards:
>>>>>>
>>>>>> https://raw.githubusercontent.com/mtxadmin/ublock/refs/heads/master/hosts/_malware_typo
>>>>>>
>>>>>> We strip everything after a # away because we consider it a comment. However, that leaves a line with only the domain, which causes it to be listed.
>>>>>>
>>>>>> The # sign is used as some special character but at the same time it is being used for comments.
>>>>>>
>>>>>> I will fix this and then refresh the list.
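One way to picture the "#" ambiguity described above: treat "#" as a comment marker only at the start of a line or after whitespace, and skip any entry where a "#" is still attached to the token afterwards. This is a sketch under assumptions, not the fix that was actually committed; the exact line format of the source list is not shown in this thread, and the `##` example below is hypothetical.

```python
import re

# A '#' starts a comment only at the beginning of the line or after whitespace.
COMMENT = re.compile(r"(^|\s)#.*$")


def extract_domain(line: str):
    """Return the domain on a list line, or None if the line should be skipped."""
    entry = COMMENT.sub("", line).strip()
    if not entry or "#" in entry:
        # Empty after stripping the comment, or '#' is used as a special
        # character inside the entry: not a plain domain, so do not list it.
        return None
    # Hosts-style lines carry an address first, e.g. "0.0.0.0 example.com".
    return entry.split()[-1].lower()


for sample in (
    "example.com # typo of example.org",
    "# just a comment",
    "duckduckgo.com##.banner",  # hypothetical line where '#' is not a comment
    "0.0.0.0 ads.example.net",
):
    print(repr(sample), "->", extract_domain(sample))
```

The design choice here is to err on the side of not listing: an entry that cannot be read as a plain domain is dropped rather than truncated.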
>>>>>>
>>>>>> -Michael
>>>>>>
>>>>>>> On 5 Jan 2026, at 11:31, Adolf Belka wrote:
>>>>>>>
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>>
>>>>>>> On 05/01/2026 12:11, Adolf Belka wrote:
>>>>>>>> Hi Michael,
>>>>>>>>
>>>>>>>> I have found that the malware list includes duckduckgo.com
>>>>>>>>
>>>>>>> I have checked through the various sources used for the malware list.
>>>>>>>
>>>>>>> The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its list. I suspect that this one is the one causing the problem.
>>>>>>>
>>>>>>> The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times but not directly as a domain name - looks more like a reference.
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Adolf.
>>>>>>>
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Adolf.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/01/2026 14:02, Adolf Belka wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 02/01/2026 12:09, Michael Tremer wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>>> On 30 Dec 2025, at 14:05, Adolf Belka wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Michael,
>>>>>>>>>>>
>>>>>>>>>>> On 29/12/2025 13:05, Michael Tremer wrote:
>>>>>>>>>>>> Hello everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I hope everyone had a great Christmas and a couple of quiet days to relax from all the stress that was the year 2025.
>>>>>>>>>>> Still relaxing.
>>>>>>>>>>
>>>>>>>>>> Very good, so let’s have a strong start into 2026 now!
>>>>>>>>>
>>>>>>>>> Starting next week, yes.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> Having a couple of quieter days, I have been working on a new, little (hopefully) side project that has probably been high up on our radar since the Shalla list has shut down in 2020, or maybe even earlier. The goal of the project is to provide good lists with categories of domain names which are usually used to block access to these domains.
>>>>>>>>>>>>
>>>>>>>>>>>> I simply call this IPFire DNSBL which is short for IPFire DNS Blocklists.
>>>>>>>>>>>>
>>>>>>>>>>>> How did we get here?
>>>>>>>>>>>>
>>>>>>>>>>>> As stated before, the URL filter feature in IPFire has the problem that there are not many good blocklists available any more. There used to be a couple more - most famously the Shalla list - but we are now down to a single list from the University of Toulouse. It is a great list, but it is not always the best fit for all users.
>>>>>>>>>>>>
>>>>>>>>>>>> Then there has been talk about whether we could implement more blocking features into IPFire that don’t involve the proxy. Most famously blocking over DNS. The problem here remains that the blocking feature is only as good as the data that is fed into it. Some people have been putting forward a number of lists that were suitable for them, but they would not have replaced the blocking functionality as we know it. Their aim is to provide “one list for everything” but that is not what people usually want. It is targeted at a classic home user and the only separation that is being made is any adult/porn/NSFW content which usually is put into a separate list.
>>>>>>>>>>>>
>>>>>>>>>>>> It would have been technically possible to include these lists and let the users decide, but that is not the aim of IPFire. We want to do the job for the user so that their job is getting easier. Including obscure lists that don’t have a clear outline of what they actually want to block (“bad content” is not a category) and passing the burden of figuring out whether they need the “Light”, “Normal”, “Pro”, “Pro++”, “Ultimate” or even a “Venti” list with cream on top is really not going to work. It is all confusing and will lead to a bad user experience.
>>>>>>>>>>>>
>>>>>>>>>>>> An even bigger problem that is however completely impossible to solve is bad licensing of these lists. A user has asked the publisher of the HaGeZi list whether they could be included in IPFire and under what terms. The response was that the list is available under the terms of the GNU General Public License v3, but that does not seem to be true. The list contains data from various sources. Many of them are licensed under incompatible licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless there is a non-public agreement that this data may be redistributed, there is a huge legal issue here. We would expose our users to potential copyright infringement which we cannot do under any circumstances. Furthermore many lists are available under a non-commercial license which excludes them from being used in any kind of business. Plenty of IPFire systems are running in businesses, if not even the vast majority.
>>>>>>>>>>>>
>>>>>>>>>>>> In short, these lists are completely unusable for us. Apart from HaGeZi, I consider OISD to have the same problem.
>>>>>>>>>>>>
>>>>>>>>>>>> Enough about all the things that are bad. Let’s talk about the new, good things:
>>>>>>>>>>>>
>>>>>>>>>>>> Many blacklists on the internet are an amalgamation of other lists. These lists vary in quality with some of them being not that good and without a clear focus and others being excellent data. Since we don’t have the manpower to start from scratch, I felt that we can copy the concept that HaGeZi and OISD have started and simply create a new list that is based on other lists at the beginning to have a good starting point. That way, we have much better control over what is going on these lists and we can shape and mould them as we need them.
>>>>>>>>>>>> Most importantly, we don’t create a single list, but many lists that have a clear focus and allow users to choose what they want to block and what not.
>>>>>>>>>>>>
>>>>>>>>>>>> So the current experimental stage that I am in has these lists:
>>>>>>>>>>>>
>>>>>>>>>>>> * Ads
>>>>>>>>>>>> * Dating
>>>>>>>>>>>> * DoH
>>>>>>>>>>>> * Gambling
>>>>>>>>>>>> * Malware
>>>>>>>>>>>> * Porn
>>>>>>>>>>>> * Social
>>>>>>>>>>>> * Violence
>>>>>>>>>>>>
>>>>>>>>>>>> The categories have been determined by what source lists we have available with good data and are compatible with our chosen license CC BY-SA 4.0. This is the same license that we are using for the IPFire Location database, too.
>>>>>>>>>>>>
>>>>>>>>>>>> The main use-cases for any kind of blocking are to comply with legal requirements in networks with children (i.e. schools) to remove any kind of pornographic content, sometimes block social media as well. Gambling and violence are commonly blocked, too. Even more common would be filtering advertising and any malicious content.
>>>>>>>>>>>>
>>>>>>>>>>>> The latter is especially difficult because so many source lists throw phishing, spyware, malvertising, tracking and other things into the same bucket. Here this is currently all in the malware list which has therefore become quite large. I am not sure whether this will stay like this in the future or if we will have to make some adjustments, but that is exactly why this is now entering some larger testing.
>>>>>>>>>>>>
>>>>>>>>>>>> What has been built so far? In order to put these lists together properly and track any data about where it is coming from, I have built a tool in Python available here:
>>>>>>>>>>>>
>>>>>>>>>>>> https://git.ipfire.org/?p=dnsbl.git;a=summary
>>>>>>>>>>>>
>>>>>>>>>>>> This tool will automatically update all lists once an hour if there have been any changes and export them in various formats.
>>>>>>>>>>>> The exported lists are available for download here:
>>>>>>>>>>>>
>>>>>>>>>>>> https://dnsbl.ipfire.org/lists/
>>>>>>>>>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the custom url works fine.
>>>>>>>>>>>
>>>>>>>>>>> However you need to remember not to put the https:// at the front of the url otherwise the WUI page completes without any error messages but leaves an error message in the system logs saying
>>>>>>>>>>>
>>>>>>>>>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist
>>>>>>>>>>>
>>>>>>>>>>> I found this out the hard way.
>>>>>>>>>>
>>>>>>>>>> Oh yes, I forgot that there is a field on the web UI. If that does not accept https:// as a prefix, please file a bug and we will fix it.
>>>>>>>>>
>>>>>>>>> I will confirm it and raise a bug.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The other thing I noticed is that if you already have the Toulouse University list downloaded and you then change to the ipfire custom url then all the existing Toulouse blocklists stay in the directory on IPFire and so you end up with a huge number of category tick boxes, most of which are the old Toulouse ones, which are still available to select and it is not clear which ones are from Toulouse and which ones from IPFire.
>>>>>>>>>>
>>>>>>>>>> Yes, I got the same thing, too. I think this is a bug, too, because otherwise you would have a lot of unused categories lying around that will never be updated. You cannot even tell which ones are from the current list and which ones from the old list.
>>>>>>>>>>
>>>>>>>>>> Long-term we could even consider removing the Univ. Toulouse list entirely and only have our own lists available which would make the problem go away.
>>>>>>>>>>
>>>>>>>>>>> I think if the blocklist URL source is changed or a custom url is provided the first step should be to remove the old ones already existing.
>>>>>>>>>>> That might be a problem because users can also create their own blocklists and I believe those go into the same directory.
>>>>>>>>>>
>>>>>>>>>> Good thought. We of course cannot delete the custom lists.
>>>>>>>>>>
>>>>>>>>>>> Without clearing out the old blocklists you end up with a huge number of checkboxes for lists but it is not clear what happens if there is a category that has the same name for the Toulouse list and the IPFire list such as gambling. I will have a look at that and see what happens.
>>>>>>>>>>>
>>>>>>>>>>> Not sure what the best approach to this is.
>>>>>>>>>>
>>>>>>>>>> I believe it is removing all old content.
>>>>>>>>>>
>>>>>>>>>>> Manually deleting all contents of the urlfilter/blacklists/ directory and then selecting the IPFire blocklist url for the custom url I end up with only the 8 categories from the IPFire list.
>>>>>>>>>>>
>>>>>>>>>>> I have tested some gambling sites from the IPFire list and the block worked on some. On others the site no longer exists so there is nothing to block or has been changed to an https site and in that case it went straight through. Also if I chose the http version of the link, it was automatically changed to https and went through without being blocked.
>>>>>>>>>>
>>>>>>>>>> The entire IPFire infrastructure always requires HTTPS. If you start using HTTP, you will be automatically redirected. It is 2026 and we don’t need to talk HTTP any more :)
>>>>>>>>>
>>>>>>>>> Some of the domains in the gambling list (maybe quite a lot) seem to only have an http access. If I tried https it came back with the fact that it couldn't find it.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I am glad to hear that the list is actually blocking. It would have been bad if it didn’t. Now we have the big task to check out the “quality” - however that can be determined.
>>>>>>>>>> I think this is what needs some time…
>>>>>>>>>>
>>>>>>>>>> In the meantime I have set up a small page on our website:
>>>>>>>>>>
>>>>>>>>>> https://www.ipfire.org/dnsbl
>>>>>>>>>>
>>>>>>>>>> I would like to run this as a first-class project inside IPFire like we are doing with IPFire Location. That means that we need to tell people about what we are doing. Hopefully this page is a little start.
>>>>>>>>>>
>>>>>>>>>> Initially it has a couple of high-level bullet points about what we are trying to achieve. I don’t think the text is very good, yet, but it is the best I had in that moment. There is then also a list of the lists that we currently offer. For each list, a detailed page will tell you about the license, how many domains are listed, when the last update has been, the sources and there is even a history page that shows all the changes whenever they have happened.
>>>>>>>>>>
>>>>>>>>>> Finally there is a section that explains “How To Use?” the list which I would love to extend to include AdGuard Plus and things like that as well as Pi-Hole and whatever else could use the list. In a later step we should go ahead and talk to any projects to include our list(s) into their dropdown so that people can enable them nice and easy.
>>>>>>>>>>
>>>>>>>>>> Behind the web page there is an API service that is running on the host that is running the DNSBL. The frontend web app that is running www.ipfire.org is connecting to that API service to fetch the current lists, any details and so on. That way, we can split the logic and avoid creating a huge monolith of a web app. This also means that the page could be down a little as I am still working on the entire thing and will frequently restart it.
>>>>>>>>>>
>>>>>>>>>> The API documentation is available here and the API is publicly available: https://api.dnsbl.ipfire.org/docs
>>>>>>>>>>
>>>>>>>>>> The website/API allows filing reports for anything that does not seem to be right on any of the lists. I would like to keep it as an open process, however, long-term, this cannot cost us any time. In the current stage, the reports are getting filed and that is about it. I still need to build out some way for admins or moderators (I am not sure what kind of roles I want to have here) to accept or reject those reports.
>>>>>>>>>>
>>>>>>>>>> In case of us receiving a domain from a source list, I would rather like to submit a report to upstream for them to de-list. That way, we don’t have any admin to do and we are contributing back to other lists. That would be a very good thing to do. We cannot however throw tons of emails at some random upstream projects without co-ordinating this first. By not reporting upstream, we will probably over time create large whitelists and I am not sure if that is a good thing to do.
>>>>>>>>>>
>>>>>>>>>> Finally, there is a search box that can be used to find out if a domain is listed on any of the lists.
>>>>>>>>>>
>>>>>>>>>>>> If you download and open any of the files, you will see a large header that includes copyright information and lists all sources that have been used to create the individual lists. This way we ensure maximum transparency, comply with the terms of the individual licenses of the source lists and give credit to the people who help us to put together the most perfect list for our users.
>>>>>>>>>>>>
>>>>>>>>>>>> I would like this to become a project that is not only being used in IPFire. We can and will be compatible with other solutions like AdGuard, PiHole so that people can use our lists if they would like to even though they are not using IPFire.
>>>>>>>>>>>> Hopefully, these users will also feed back to us so that we can improve our lists over time and make them one of the best options out there.
>>>>>>>>>>>>
>>>>>>>>>>>> All lists are available as a simple text file that lists the domains. Then there is a hosts file available as well as a DNS zone file and an RPZ file. Each list is individually available to be used in squidGuard and there is a larger tarball available with all lists that can be used in IPFire’s URL Filter. I am planning to add Suricata/Snort signatures whenever I have time to do so. Even though it is not a good idea to filter pornographic content this way, I suppose that catching malware and blocking DoH are good use-cases for an IPS. Time will tell…
>>>>>>>>>>>>
>>>>>>>>>>>> As a start, we will make these lists available in IPFire’s URL Filter and collect some feedback about how we are doing. Afterwards, we can see where else we can take this project.
>>>>>>>>>>>>
>>>>>>>>>>>> If you want to enable this on your system, simply add the URL to your autoupdate.urls file like here:
>>>>>>>>>>>>
>>>>>>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>>>>>>>>>>> I also tested out adding the IPFire url to autoupdate.urls and that also worked fine for me.
>>>>>>>>>>
>>>>>>>>>> Very good. Should we include this already with Core Update 200? I don’t think we would break anything, but we might already gain a couple more people who are helping us to test this all?
>>>>>>>>>
>>>>>>>>> I think that would be a good idea.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The next step would be to build and test our DNS infrastructure. In the “How To Use?” Section on the pages of the individual lists, you can already see some instructions on how to use the lists as an RPZ.
In comparison to other =E2=80=9Cproviders=E2= =80=9D, I would prefer if people would be using DNS to fetch the lists. = This is simply to push out updates in a cheap way for us and also do it = very regularly. >>>>>>>>>>=20 >>>>>>>>>> Initially, clients will pull the entire list using AXFR. = There is no way around this as they need to have the data in the first = place. After that, clients will only need the changes. As you can see in = the history, the lists don=E2=80=99t actually change that often. = Sometimes only once a day and therefore downloading the entire list = again would be a huge waste of data, both on the client side, but also = for us hosting them. >>>>>>>>>>=20 >>>>>>>>>> Some other providers update their lists =E2=80=9Cevery 10 = minutes=E2=80=9D, and there won't be any changes whatsoever. We don=E2=80=99= t do that. We will only export the lists again when they have actually = changed. The timestamps on the files that we offer using HTTPS can be = checked by clients so that they won=E2=80=99t re-download the list again = if it has not been changed. But using HTTPS still means that clients would = have to re-download the entire list and not only the changes. >>>>>>>>>>=20 >>>>>>>>>> Using DNS and IXFR will update the lists by only transferring = a few kilobytes and therefore we can have clients check once an hour if = a list has actually changed and only send out the raw changes. That way, = we will be able to serve millions of clients at very low cost and they = will always have a very up to date list. >>>>>>>>>>=20 >>>>>>>>>> As far as I can see any DNS software that supports RPZs = supports AXFR/IXFR with the exception of Knot Resolver which expects the = zone to be downloaded externally. There is a ticket for AXFR/IXFR = support (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195). >>>>>>>>>>=20 >>>>>>>>>> Initially, some of the lists have been *huge* which is why a = simple HTTP download is not feasible. The porn list was over 100 MiB.
We = could have spent thousands on just traffic alone which I don=E2=80=99t = have for this kind of project. It would also be unnecessary money being = spent. There are simply better solutions out there. But then I built = something that basically tests the data that we are receiving from = upstream by simply checking if a listed domain still exists. The result = was very astonishing to me. >>>>>>>>>>=20 >>>>>>>>>> So whenever someone adds a domain to the list, we will = (eventually, but not immediately) check if we can resolve the domain=E2=80= =99s SOA record. If not, we mark the domain as non-active and will no = longer include it in the exported data. This brought down the porn = list from just under 5 million domains to just 421k. On the sources page = (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the = percentage of dead domains from each of them and the UT1 list has 94% = dead domains. Wow. >>>>>>>>>>=20 >>>>>>>>>> If we cannot resolve the domain, neither can our users. So we = would otherwise fill the lists with tons of domains that simply could = never be reached. And if they cannot be reached, why would we block = them? We would waste bandwidth and a lot of memory on each single = client. >>>>>>>>>>=20 >>>>>>>>>> The other sources have similarly high ratios of dead = domains. Most of them are in the 50-80% range. Therefore I am happy that = we are doing some extra work here to give our users much better data for = their filtering. >>>>>>>>>=20 >>>>>>>>> Removing all dead entries sounds like an excellent step. >>>>>>>>>=20 >>>>>>>>> Regards, >>>>>>>>>=20 >>>>>>>>> Adolf. >>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>>> So, if you like, please go and check out the RPZ blocking = with Unbound. Instructions are on the page. I would be happy to hear how = this is turning out. >>>>>>>>>>=20 >>>>>>>>>> Please let me know if there are any more questions, and I = would be glad to answer them.
>>>>>>>>>>=20 >>>>>>>>>> Happy New Year, >>>>>>>>>> -Michael >>>>>>>>>>=20 >>>>>>>>>>>=20 >>>>>>>>>>> Regards, >>>>>>>>>>> Adolf. >>>>>>>>>>>> This email is just a brain dump from me to this list. I = would be happy to answer any questions about implementation details, = etc. if people are interested. Right now, this email is long enough = already=E2=80=A6 >>>>>>>>>>>>=20 >>>>>>>>>>>> All the best, >>>>>>>>>>>> -Michael >>>>>>>>>>>=20 >>>>>>>>>>> --=20 >>>>>>>>>>> Sent from my laptop >>>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>>>=20 >>>>>>>>>=20 >>>>>>>>=20 >>>>>>>=20 >>>>>>> --=20 >>>>>>> Sent from my laptop >>>>>>>=20 >>>>>>>=20 >>>>>>=20 >>>>>=20 >>>>=20 >>>>=20 >>>=20 >>>=20 >>=20 >=20 >=20
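For anyone following the thread, the dead-domain filtering described in the quoted text above could be sketched roughly as follows. This is only an illustration and not the actual IPFire DBL code, which is not shown in this thread. The function name filter_active_domains and the has_soa callback are hypothetical; the callback stands in for whatever SOA lookup is really used, so the sketch runs without any network access.

```python
from typing import Callable, Iterable, List, Tuple


def filter_active_domains(
    domains: Iterable[str],
    has_soa: Callable[[str], bool],
) -> Tuple[List[str], float]:
    """Keep only domains whose SOA lookup succeeds.

    `has_soa` is a hypothetical callback (an assumption of this sketch):
    it should return True if an SOA record could be resolved for the
    domain. Returns the active domains in input order and the fraction
    of unique domains that were found to be dead.
    """
    seen = set()
    active = []
    total = 0
    for raw in domains:
        # Normalise: trim whitespace, lowercase, drop a trailing root dot.
        name = raw.strip().lower().rstrip(".")
        if not name or name in seen:
            continue  # skip blanks and duplicates
        seen.add(name)
        total += 1
        if has_soa(name):
            active.append(name)
    dead_ratio = 0.0 if total == 0 else (total - len(active)) / total
    return active, dead_ratio


if __name__ == "__main__":
    # A set of "live" domains stands in for real DNS in this example.
    live = {"example.org", "ipfire.org"}
    active, dead = filter_active_domains(
        ["example.org", "gone.invalid", "IPFire.org.", "also-gone.invalid"],
        live.__contains__,
    )
    print(active, f"{dead:.0%} dead")
```

Passing the lookup in as a callback keeps the filtering logic separate from the resolver, so the same function can be exercised with a fake lookup in tests and with a real DNS query in production.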