CAIDA's AS relationship #64

Open
romain-fontugne opened this issue Aug 15, 2023 · 9 comments
@romain-fontugne
Member

romain-fontugne commented Aug 15, 2023

Import CAIDA AS relationship data. The crawler should be very similar to the BGPKIT as2rel crawler.

The data is available here:
https://publicdata.caida.org/datasets/as-relationships/serial-2/
https://publicdata.caida.org/datasets/as-relationships/serial-1/

@roopeshsn roopeshsn self-assigned this Aug 15, 2023
@roopeshsn
Member

roopeshsn commented Aug 27, 2023

The README at the link you provided states:

"The as-rel files contain p2p and p2c relationships.
The format is:
<provider-as>|<customer-as>|-1
<peer-as>|<peer-as>|0|<source>"

However, in the latest file (20230801.as-rel2.txt.bz2) I could only find data in the latter, four-field format (with a trailing source field), for example:
1|5467|0|bgp

Could you clarify this, @romain-fontugne?

@romain-fontugne
Member Author

Thanks @roopeshsn for looking at that. I just checked the latest file (20230801.as-rel2.txt.bz2) and the first few lines (after the long comments) seem OK to me:

1|5467|0|bgp
1|8641|0|bgp
1|50377|-1|bgp
1|51705|0|bgp
1|51728|0|bgp
1|59572|0|bgp
2|3999|-1|bgp

I think the README is wrong; the format is:

<provider-as>|<customer-as>|-1|<source>
<peer-as>|<peer-as>|0|<source>

I will report that to CAIDA, thanks!
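
For reference, the line format discussed above could be parsed roughly as follows. This is only a minimal sketch, not the actual crawler: the names `ASRel` and `parse_as_rel` are made up for illustration, and it assumes that comment lines start with `#` and that the `<source>` field may be missing.

```python
from typing import Iterable, Iterator, NamedTuple


class ASRel(NamedTuple):
    """One relationship line (hypothetical container, not part of IYP)."""
    as1: int     # provider if rel == -1, otherwise one of the two peers
    as2: int     # customer if rel == -1, otherwise the other peer
    rel: int     # -1 = provider-to-customer, 0 = peer-to-peer
    source: str  # e.g. "bgp"; empty string if the field is absent


def parse_as_rel(lines: Iterable[str]) -> Iterator[ASRel]:
    """Yield one ASRel per data line, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        # Assumption: the header comments are prefixed with '#'.
        if not line or line.startswith('#'):
            continue
        fields = line.split('|')
        as1, as2, rel = (int(f) for f in fields[:3])
        # The fourth field is optional in this sketch.
        source = fields[3] if len(fields) > 3 else ''
        yield ASRel(as1, as2, rel, source)
```

The decompressed file could be streamed into this with `bz2.open(path, 'rt')` from the standard library.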

@romain-fontugne
Member Author

I heard back from CAIDA: we should use the data in https://publicdata.caida.org/datasets/as-relationships/serial-1/ (not serial-2).

@roopeshsn
Member

These are the open questions blocking me right now:

  • I need to process only the file named 20230801.as-rel.txt.bz2, right?
  • The README mentions two relationship types, <provider-as>|<customer-as>|-1 and <peer-as>|<peer-as>|0. So the relationships will look like (:AS {asn: xxxx})-[:PEERS_WITH {rel: -1}]-(:AS {asn: xxxx}) and (:AS {asn: xxxx})-[:PEERS_WITH {rel: 0}]-(:AS {asn: xxxx}), right?

@m-appel
Member

m-appel commented Sep 11, 2023

Ideally the crawler should fetch the latest version of the *.as-rel.txt.bz2 file, yes.
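
Picking the latest file could look something like the sketch below. It assumes the file names have already been extracted from the directory listing (that step is not shown) and that every name starts with a `YYYYMMDD` date, as in `20230801.as-rel.txt.bz2`. The function name `latest_as_rel_file` is hypothetical.

```python
import re
from typing import Iterable, Optional

# Matches e.g. "20230801.as-rel.txt.bz2" but not "20230801.as-rel2.txt.bz2".
AS_REL_RE = re.compile(r'^(\d{8})\.as-rel\.txt\.bz2$')


def latest_as_rel_file(names: Iterable[str]) -> Optional[str]:
    """Return the as-rel file name with the most recent date prefix."""
    dated = []
    for name in names:
        match = AS_REL_RE.match(name)
        if match:
            # YYYYMMDD strings sort chronologically as plain strings.
            dated.append((match.group(1), name))
    return max(dated)[1] if dated else None
```

Matching on the full name keeps other files in the same directory (and the serial-2 `as-rel2` files) from being selected by accident.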

Your relationships are correct. You will have to assign a direction when creating them, but it can be arbitrary since we always match them without direction.

I wonder if we should normalize the rel format with BGPKIT, since they use rel: 1 for customer-provider relationships instead of rel: -1. Any thoughts @romain-fontugne?

@m-appel
Member

m-appel commented Sep 13, 2023

Actually, now I think we should not change the source data, because comparing it with the corresponding README would then get confusing. I propose leaving the rel property as-is for now and maybe creating a new "parallel", but directed, relationship for the customer-provider case at some point (not now).

@romain-fontugne
Member Author

Yes, I think we can keep the data as it is. But note that this data contains directed links: for provider-customer relationships the direction is important.

@m-appel
Member

m-appel commented Sep 13, 2023

Is it, though, if we add it as a PEERS_WITH relationship? As far as I am aware, we always match these without direction (and it is also not intuitive which end of a directed PEERS_WITH relationship should be the provider and which the customer).

Anyway, to be consistent with the BGPKIT crawler, you can parse the lines in their current order. The direction in the case of a provider-customer relationship is:

(Provider:AS)-[:PEERS_WITH {rel: -1}]->(Customer:AS)
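
As a rough illustration of keeping the file order as the link direction, something like the following could work. It is a sketch only and does not use the project's own helper functions: the query string and `link_params` are hypothetical, and the Cypher is meant to be run through whatever Neo4j client the crawler already uses.

```python
from typing import Dict

# Hypothetical parameterised query. The first AS on each line is the start
# node, so for rel == -1 the link points from provider to customer.
PEERS_WITH_QUERY = (
    "MERGE (a:AS {asn: $asn1}) "
    "MERGE (b:AS {asn: $asn2}) "
    "MERGE (a)-[r:PEERS_WITH {rel: $rel}]->(b)"
)


def link_params(line: str) -> Dict[str, int]:
    """Turn one data line into query parameters, preserving field order."""
    asn1, asn2, rel = (int(f) for f in line.strip().split('|')[:3])
    # The rel value is passed through unchanged (-1 or 0), as in the source data.
    return {'asn1': asn1, 'asn2': asn2, 'rel': rel}
```

For peer-to-peer lines (rel 0) the direction produced this way carries no meaning; it simply follows the order of the two ASNs in the file.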

@romain-fontugne
Member Author

There is at least one example where we use the PEERS_WITH direction:
https://github.com/InternetHealthReport/internet-yellow-pages/blob/main/documentation/iij.md#iijs-main-competitors

But yes, let's just be consistent with whatever we are doing in the BGPKIT crawler.
