* Auto-generated plaintext updates; re: regression
@ 2022-03-09 23:42 Jordan Savoca
2022-03-10 9:15 ` Michael Tremer
0 siblings, 1 reply; 5+ messages in thread
From: Jordan Savoca @ 2022-03-09 23:42 UTC (permalink / raw)
To: location
Hello o/
Would it be possible to deploy the patch to the 0.9.11 regression on the
system which generates the database.txt file committed to the
location-database repository[1]? The database is currently devoid of IP
announcement information.
Thanks so much!
[1] https://git.ipfire.org/?p=location/location-database.git;a=summary
--
Jordan Savoca
* Re: Auto-generated plaintext updates; re: regression
2022-03-09 23:42 Auto-generated plaintext updates; re: regression Jordan Savoca
@ 2022-03-10 9:15 ` Michael Tremer
2022-03-10 14:45 ` Jordan Savoca
0 siblings, 1 reply; 5+ messages in thread
From: Michael Tremer @ 2022-03-10 9:15 UTC (permalink / raw)
To: location
Hello Jordan,
Thanks for your email and welcome here.
> On 9 Mar 2022, at 23:42, Jordan Savoca <jsavoca(a)posteo.net> wrote:
>
> Hello o/
>
> Would it be possible to deploy the patch to the 0.9.11 regression on the system which generates the database.txt file committed to the location-database repository[1]? The database is currently devoid of IP announcement information.
Yes, I will do this as soon as possible. At the moment, there are lots of other changes in the repository and I don’t yet know for certain that they won’t break anything.
But in general, I would like to say that you should not use the text dump. It is only there for audit purposes and is committed to the Git repository so that everyone can easily see any changes to the database.
I would strongly recommend using the binary database and the Python/Perl bindings. What is your application for the data?
Best,
-Michael
> Thanks so much!
>
> [1] https://git.ipfire.org/?p=location/location-database.git;a=summary
>
> --
> Jordan Savoca
* Re: Auto-generated plaintext updates; re: regression
2022-03-10 9:15 ` Michael Tremer
@ 2022-03-10 14:45 ` Jordan Savoca
2022-03-10 14:48 ` Michael Tremer
0 siblings, 1 reply; 5+ messages in thread
From: Jordan Savoca @ 2022-03-10 14:45 UTC (permalink / raw)
To: location
On 3/10/22 02:15, Michael Tremer wrote:
> Yes, I will do this as soon as possible. At the moment, there are lots of other changes in the repository and I don’t yet know for certain that they won’t break anything.
Awesome, thank you!
> But in general, I would like to say that you should not use the text dump. It is only there for audit purposes and is committed to the Git repository so that everyone can easily see any changes to the database.
>
> I would strongly recommend using the binary database and the Python/Perl bindings. What is your application for the data?
I had originally intended to use the binary database but I found the
non-standard format and introduction of dependencies less flexible for
my use case relative to fetching the plaintext set via submodule to
store in a sqlite db where it can then be queried using standard
libraries. For now my use case is limited to visualizing announcement
changes over time but I hope to make something of import at some point.
The semi-recent change to maxmind's license + requiring an account to
fetch geolocation data led me to look into alternative sources, so thank
you for your work on location(8) :).
--
Jordan Savoca
* Re: Auto-generated plaintext updates; re: regression
2022-03-10 14:45 ` Jordan Savoca
@ 2022-03-10 14:48 ` Michael Tremer
2022-03-10 16:02 ` Jordan Savoca
0 siblings, 1 reply; 5+ messages in thread
From: Michael Tremer @ 2022-03-10 14:48 UTC (permalink / raw)
To: location
Hello,
> On 10 Mar 2022, at 14:45, Jordan Savoca <jsavoca(a)posteo.net> wrote:
>
> On 3/10/22 02:15, Michael Tremer wrote:
>> Yes, I will do this as soon as possible. At the moment, there are lots of other changes in the repository and I don’t yet know for certain that they won’t break anything.
>
> Awesome, thank you!
>
>> But in general, I would like to say that you should not use the text dump. It is only there for audit purposes and is committed to the Git repository so that everyone can easily see any changes to the database.
>> I would strongly recommend using the binary database and the Python/Perl bindings. What is your application for the data?
>
> I had originally intended to use the binary database but I found the non-standard format and introduction of dependencies less flexible for my use case relative to fetching the plaintext set via submodule to store in a sqlite db where it can then be queried using standard libraries. For now my use case is limited to visualizing announcement changes over time but I hope to make something of import at some point.
I would still strongly recommend the binary database, because parsing the text database and going with the first match is quite likely to give you inaccurate results.
We use a binary format because it is organised as a binary tree and can therefore be searched *very* quickly. It also allows us to store multiple networks with the same start address but different prefix lengths. A relational database does not give you that out of the box, so you will quite likely get inaccurate results.
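To make the first-match problem concrete, here is a minimal Python sketch that uses only the standard-library ipaddress module. It is an illustration, not the library's implementation: the two networks, the AS numbers (taken from the documentation ranges) and the function names are all made up for this example.

```python
import ipaddress

# Hypothetical sample data: two networks with the same start address
# but different prefix lengths, each mapped to a made-up AS number.
NETWORKS = [
    (ipaddress.ip_network("198.51.100.0/24"), 64496),
    (ipaddress.ip_network("198.51.100.0/25"), 64497),
]


def first_match(address, networks):
    """Return the first (network, asn) pair that contains the address."""
    addr = ipaddress.ip_address(address)
    for network, asn in networks:
        if addr in network:
            return network, asn
    return None


def longest_prefix_match(address, networks):
    """Return the most specific (network, asn) pair that contains the address."""
    addr = ipaddress.ip_address(address)
    candidates = [(network, asn) for network, asn in networks if addr in network]
    if not candidates:
        return None
    # The longest prefix is the most specific network.
    return max(candidates, key=lambda item: item[0].prefixlen)
```

With this data, 198.51.100.10 lies in both networks: first_match() returns the /24 simply because it comes first in the list, while longest_prefix_match() returns the more specific /25. The linear scan is only there to keep the sketch short; it says nothing about lookup speed in a tree.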
We kept the library as small as possible, and we are looking at upstreaming it to all major distributions because that will make integrating it into other software a lot easier. Any help with this would be greatly appreciated.
> The semi-recent change to maxmind's license + requiring an account to fetch geolocation data led me to look into alternative sources, so thank you for your work on location(8) :).
You are welcome :) You can also support us with a donation if you like at https://www.ipfire.org/donate.
Best,
-Michael
>
> --
> Jordan Savoca
* Re: Auto-generated plaintext updates; re: regression
2022-03-10 14:48 ` Michael Tremer
@ 2022-03-10 16:02 ` Jordan Savoca
0 siblings, 0 replies; 5+ messages in thread
From: Jordan Savoca @ 2022-03-10 16:02 UTC (permalink / raw)
To: location
On 3/10/22 07:48, Michael Tremer wrote:
> I would still strongly recommend the binary database, because parsing the text database and going with the first match is quite likely to give you inaccurate results.
>
> We use a binary format because it is organised as a binary tree and can therefore be searched *very* quickly. It also allows us to store multiple networks with the same start address but different prefix lengths. A relational database does not give you that out of the box, so you will quite likely get inaccurate results.
Yes! I store block announcements w/ autonomous system table foreign keys
to support anycast networks and varied prefixes without relation to or
dependence on the order in which the set is parsed, albeit at a
performance cost as you suggested.
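A rough sketch of what such a layout could look like, using Python's standard sqlite3 and ipaddress modules. This is not my actual schema: the table names, column names and sample rows are hypothetical, and the sketch is IPv4-only because IPv6 addresses do not fit into SQLite's 64-bit integers.

```python
import ipaddress
import sqlite3

# Hypothetical sample rows: (CIDR, AS number, AS name). The /25 is
# announced by two ASes to mimic an anycast prefix.
SAMPLE_ROWS = [
    ("198.51.100.0/24", 64496, "example-a"),
    ("198.51.100.0/25", 64497, "example-b"),
    ("198.51.100.0/25", 64498, "example-c"),
]


def build_db(rows):
    """Create an in-memory database and load the announcements."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE autonomous_systems (
            asn  INTEGER PRIMARY KEY,
            name TEXT
        );
        CREATE TABLE announcements (
            first_address INTEGER NOT NULL,
            last_address  INTEGER NOT NULL,
            prefix_length INTEGER NOT NULL,
            asn INTEGER NOT NULL REFERENCES autonomous_systems(asn),
            PRIMARY KEY (first_address, prefix_length, asn)
        );
    """)
    for cidr, asn, name in rows:
        net = ipaddress.ip_network(cidr)
        conn.execute(
            "INSERT OR IGNORE INTO autonomous_systems VALUES (?, ?)",
            (asn, name),
        )
        conn.execute(
            "INSERT INTO announcements VALUES (?, ?, ?, ?)",
            (int(net.network_address), int(net.broadcast_address),
             net.prefixlen, asn),
        )
    conn.commit()
    return conn


def lookup(conn, address):
    """Return (asn, prefix_length) rows covering the address, most specific first."""
    value = int(ipaddress.ip_address(address))
    return conn.execute(
        "SELECT asn, prefix_length FROM announcements "
        "WHERE ? BETWEEN first_address AND last_address "
        "ORDER BY prefix_length DESC, asn ASC",
        (value,),
    ).fetchall()
```

Because the query sorts by prefix length rather than relying on insertion order, the result does not depend on the order in which the text dump was parsed, and a prefix announced by several ASes simply yields several rows.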
> We kept the library as small as possible, and we are looking at upstreaming it to all major distributions because that will make integrating it into other software a lot easier. Any help with this would be greatly appreciated.
Very cool, that would be sweet.
--
Jordan Savoca