Re[2]: Let's launch our own blocklists...


From: "Jon Murphy" <jon.murphy@ipfire.org>
To: "Adolf Belka" <adolf.belka@ipfire.org>,
	"Michael Tremer" <michael.tremer@ipfire.org>
Cc: "IPFire: Development-List" <development@lists.ipfire.org>
Subject: Re[2]: Let's launch our own blocklists...
Date: Tue, 30 Dec 2025 15:52:28 +0000
Message-ID: <emfe28f6ab-381c-468b-80b8-09758b618f01@ipfire.org>

Forgot this part…

Comments are below.


------ Original Message ------
From "Adolf Belka" <adolf.belka@ipfire.org>
To "Michael Tremer" <michael.tremer@ipfire.org>
Cc "IPFire: Development-List" <development@lists.ipfire.org>
Date 12/30/2025 8:05:20 AM
Subject Re: Let's launch our own blocklists...

>Hi Michael,
>
>On 29/12/2025 13:05, Michael Tremer wrote:
>>Hello everyone,
>>
>>I hope everyone had a great Christmas and a couple of quiet days to relax from all the stress that was the year 2025.
>Still relaxing.
>>Having a couple of quieter days, I have been working on a new, little (hopefully) side project that has probably been high up on our radar since the Shalla list shut down in 2020, or maybe even earlier. The goal of the project is to provide good lists with categories of domain names which are usually used to block access to these domains.
>>
>>I simply call this IPFire DNSBL which is short for IPFire DNS Blocklists.
>>
>>How did we get here?
>>
>>As stated before, the URL filter feature in IPFire has the problem that there are not many good blocklists available any more. There used to be a couple more - most famously the Shalla list - but we are now down to a single list from the University of Toulouse. It is a great list, but it is not always the best fit for all users.
>>
>>Then there has been talk about whether we could implement more blocking features into IPFire that don’t involve the proxy. Most famously blocking over DNS. The problem here remains that the blocking feature is only as good as the data that is fed into it. Some people have been putting forward a number of lists that were suitable for them, but they would not have replaced the blocking functionality as we know it. Their aim is to provide “one list for everything” but that is not what people usually want. It is targeted at a classic home user and the only separation that is being made is any adult/porn/NSFW content which usually is put into a separate list.
>>
>>It would have been technically possible to include these lists and let the users decide, but that is not the aim of IPFire. We want to do the job for the user so that their job is getting easier. Including obscure lists that don’t have a clear outline of what they actually want to block (“bad content” is not a category) and passing the burden of figuring out whether they need the “Light”, “Normal”, “Pro”, “Pro++”, “Ultimate” or even a “Venti” list with cream on top is really not going to work. It is all confusing and will lead to a bad user experience.
>>
>>An even bigger problem that is however completely impossible to solve is bad licensing of these lists. A user has asked the publisher of the HaGeZi list whether they could be included in IPFire and under what terms. The response was that the list is available under the terms of the GNU General Public License v3, but that does not seem to be true. The list contains data from various sources. Many of them are licensed under incompatible licenses (CC BY-SA 4.0, MPL, Apache2, …) and unless there is a non-public agreement that this data may be redistributed, there is a huge legal issue here. We would expose our users to potential copyright infringement which we cannot do under any circumstances. Furthermore many lists are available under a non-commercial license which excludes them from being used in any kind of business. Plenty of IPFire systems are running in businesses, if not even the vast majority.
>>
>>In short, these lists are completely unusable for us. Apart from HaGeZi, I consider OISD to have the same problem.
>>
>>Enough about all the things that are bad. Let’s talk about the new, good things:
>>
>>Many blacklists on the internet are an amalgamation of other lists. These lists vary in quality with some of them being not that good and without a clear focus and others being excellent data. Since we don’t have the manpower to start from scratch, I felt that we can copy the concept that HaGeZi and OISD have started and simply create a new list that is based on other lists at the beginning to have a good starting point. That way, we have much better control over what goes onto these lists and we can shape and mould them as we need them. Most importantly, we don’t create a single list, but many lists that have a clear focus and allow users to choose what they want to block and what not.
>>
>>So the current experimental stage that I am in has these lists:
>>
>>    * Ads
>>    * Dating
>>    * DoH
>>    * Gambling
>>    * Malware
>>    * Porn
>>    * Social
>>    * Violence
>>
>>The categories have been determined by what source lists we have available with good data and are compatible with our chosen license CC BY-SA 4.0. This is the same license that we are using for the IPFire Location database, too.
>>
>>The main use-cases for any kind of blocking are to comply with legal requirements in networks with children (i.e. schools) to remove any kind of pornographic content, sometimes block social media as well. Gambling and violence are commonly blocked, too. Even more common would be filtering advertising and any malicious content.
>>
>>The latter is especially difficult because so many source lists throw phishing, spyware, malvertising, tracking and other things into the same bucket. Here this is currently all in the malware list which has therefore become quite large. I am not sure whether this will stay like this in the future or if we will have to make some adjustments, but that is exactly why this is now entering some larger testing.
>>
>>What has been built so far? In order to put these lists together properly and to track where the data is coming from, I have built a tool in Python available here:
>>
>>https://git.ipfire.org/?p=dnsbl.git;a=summary
>>
>>This tool will automatically update all lists once an hour if there have been any changes and export them in various formats. The exported lists are available for download here:
>>
>>https://dnsbl.ipfire.org/lists/
>The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the custom url works fine.
>
>However, you need to remember not to put https:// at the front of the URL. Otherwise the WUI page completes without any error message but leaves an entry in the system logs saying:
>
>URL filter blacklist - ERROR: Not a valid URL filter blacklist
>
>I found this out the hard way.
>
>The other thing I noticed is that if you already have the Toulouse University list downloaded and then change to the IPFire custom URL, all the existing Toulouse blocklists stay in the directory on IPFire. You end up with a huge number of category tick boxes, most of which are the old Toulouse ones. They are still available to select, and it is not clear which ones are from Toulouse and which ones are from IPFire.
>
>I think that if the blocklist URL source is changed or a custom URL is provided, the first step should be to remove the old lists that already exist.
>That might be a problem because users can also create their own blocklists and I believe those go into the same directory.
>
>Without clearing out the old blocklists you end up with a huge number of checkboxes, and it is not clear what happens if a category has the same name in both the Toulouse list and the IPFire list, such as gambling. I will have a look at that and see what happens.
>
>Not sure what the best approach to this is.
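
One way to see which category names would collide before switching sources is to unpack the new tarball into a scratch directory and compare it with the existing one. This is only a minimal sketch. It assumes each category is a sub-directory of the blacklists directory, and the function name and the scratch path are made up for illustration:

```shell
# Hypothetical helper: print every category (sub-directory) name that
# exists in both DIR_A and DIR_B, one name per line.
common_categories() {
    dir_a=$1
    dir_b=$2
    for path in "$dir_a"/*/; do
        # Skip the unexpanded pattern when DIR_A has no sub-directories.
        [ -d "$path" ] || continue
        name=$(basename "$path")
        if [ -d "$dir_b/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
    return 0
}
```

With the new lists unpacked to, say, /tmp/ipfire-dnsbl (an assumed path), it could be called like this:

     common_categories /var/ipfire/urlfilter/blacklists /tmp/ipfire-dnsbl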


To find the older files (those not modified in more than 365 days), I used this:
     find /var/ipfire/urlfilter/blacklists -mtime +365 -type f -ls

To delete those older files, and then any folders left empty, I did this:
     find /var/ipfire/urlfilter/blacklists -mtime +365 -type f -delete -o -type d -empty -delete
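
The same clean-up can be wrapped in a small function so that the directory and the age are explicit and the top-level directory itself is never removed. This is only a sketch: the function name is hypothetical, and it assumes a find that supports -mindepth, -empty and -delete (GNU findutils does).

```shell
# Hypothetical helper: prune_old_blacklists DIR [DAYS]
# Deletes regular files under DIR last modified more than DAYS days ago
# (default 365), then removes any directories that are left empty.
prune_old_blacklists() {
    dir=$1
    days=${2:-365}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # -mindepth 1 keeps DIR itself out of the match set.
    find "$dir" -mindepth 1 -type f -mtime +"$days" -delete
    # -delete implies -depth, so nested empty directories go first.
    find "$dir" -mindepth 1 -type d -empty -delete
}
```

Running the first find with -ls instead of -delete beforehand shows what would be removed. Note that any custom blocklists older than the cut-off would be deleted as well.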


>
>
>Manually deleting all contents of the urlfilter/blacklists/ directory and then selecting the IPFire blocklist url for the custom url I end up with only the 8 categories from the IPFire list.
>
>I have tested some gambling sites from the IPFire list and the block worked on some. On others the site no longer exists so there is nothing to block or has been changed to an https site and in that case it went straight through. Also if I chose the http version of the link, it was automatically changed to https and went through without being blocked.
>
>
>>If you download and open any of the files, you will see a large header that includes copyright information and lists all sources that have been used to create the individual lists. This way we ensure maximum transparency, comply with the terms of the individual licenses of the source lists and give credit to the people who help us to put together the most perfect list for our users.
>>
>>I would like this to become a project that is not only being used in IPFire. We can and will be compatible with other solutions like AdGuard, PiHole so that people can use our lists if they would like to even though they are not using IPFire. Hopefully, these users will also feed back to us so that we can improve our lists over time and make them one of the best options out there.
>>
>>All lists are available as a simple text file that lists the domains. Then there is a hosts file available as well as a DNS zone file and an RPZ file. Each list is individually available to be used in squidGuard and there is a larger tarball available with all lists that can be used in IPFire’s URL Filter.
>
I tested the URL Filter with the change to autoupdate.urls. It only picked up one URL for me, but I think my Web Proxy is configured incorrectly.

  • Is non-transparent proxy mode needed to use the IPFire DNSBL (a.k.a. IPFire DNS Blocklists)?

  • Does the Web Proxy Auto-Discovery Protocol (WPAD) need to be set up?

I ask because I disabled the Web Proxy once Shalla Services stopped a few years ago.

I think the answer is yes, in order to get HTTPS sites recognized.
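
When a site goes straight through, it helps to first rule out that the domain is simply missing from the list before blaming the proxy setup. The plain-text lists have one domain per line, so a lookup is easy to script. A minimal sketch, where the function name and the file path are made up, and where a subdomain of a listed domain is assumed to count as listed (whether that holds depends on the software consuming the list):

```shell
# Hypothetical helper: domain_listed DOMAIN LISTFILE
# Succeeds if DOMAIN, or one of its parent domains, appears as a whole
# line in LISTFILE. Header/comment lines never match a bare domain.
domain_listed() {
    domain=$1
    listfile=$2
    while [ -n "$domain" ]; do
        # -F: fixed string, -x: match the whole line, -q: no output
        if grep -Fxq -- "$domain" "$listfile"; then
            return 0
        fi
        case $domain in
            *.*) domain=${domain#*.} ;;  # strip the left-most label
            *)   return 1 ;;             # no parent domain left to try
        esac
    done
    return 1
}
```

For example, with a list saved as /tmp/gambling.txt (an assumed path):

     domain_listed www.example.com /tmp/gambling.txt && echo listed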

>
>>I am planning to add Suricata/Snort signatures whenever I have time to do so. Even though it is not a good idea to filter pornographic content this way, I suppose that catching malware and blocking DoH are good use-cases for an IPS. Time will tell…
>>
>>As a start, we will make these lists available in IPFire’s URL Filter and collect some feedback about how we are doing. Afterwards, we can see where else we can take this project.
>>
>>If you want to enable this on your system, simply add the URL to your autoupdate.urls file like here:
>>
>>https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>I also tested out adding the IPFire url to autoupdate.urls and that also worked fine for me.
>
>Regards,
>Adolf.
>>This email is just a brain dump from me to this list. I would be happy to answer any questions about implementation details, etc. if people are interested. Right now, this email is long enough already…
>>
>>All the best,
>>-Michael
>
>-- Sent from my laptop
>
>
>

