
Generate overview page for node and relationship types #21

Open · m-appel opened this issue Jan 19, 2023 · 4 comments
m-appel (Member) commented Jan 19, 2023

The different node and relationship types created by the crawlers are currently only documented in the individual READMEs of the crawlers (e.g., APNIC). To make it easier to get started with the database, we need an overview page that briefly summarizes all node and relationship types. However, to avoid maintaining everything twice, this page should be generated from the individual READMEs.

  1. Figure out what information to include and how to represent it in the overview.
  2. Decide on a template for crawler READMEs.
  3. Create a script (maybe a GitHub Action?) to generate the overview page.
Yh010 (Contributor) commented Mar 6, 2023

In my opinion:

A) For the overview HTML page, we could present the data in a table that lists, for each crawler:
1. the node types it generates
2. the relationship types between the nodes
3. a brief description of the nodes and the relationships
4. a link to that crawler's README

The table will contain the above for each of the crawlers used (see the example below).
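For illustration, such an overview table could look like the following (the crawler name and the node and relationship types are placeholders, not taken from any actual README):

| Crawler | Node types | Relationship types | Description | README |
|---------|------------|--------------------|-------------|--------|
| example_crawler | ExampleNode | EXAMPLE_REL | One-line summary of what this crawler imports | link to README |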

B) For the crawler README template:
Again, we can use a table displaying the node types, the relationship types between the nodes, and a brief description. This table will have to be updated by the crawler's developer.

C) Script (GitHub Action) to generate the overview page:
1. It should parse the crawlers' READMEs and extract the relevant info.
2. Next, it should generate an HTML page.
3. This should be an automation script, run as a GitHub Action, triggered on each update to a crawler.

The nodes could include IP addresses, domain names, URLs, email addresses, usernames, and other types of information that can be identified and extracted from web pages.

Yh010 (Contributor) commented Mar 7, 2023

Should I work on this issue?

romain-fontugne (Member) commented

Yes, sure. But I think the final output should be md, not html.

Yh010 (Contributor) commented Mar 16, 2023

@romain-fontugne, what if we create the crawler README template this way:
https://round-bobcat-0ac.notion.site/GSOC-75bc3ef547614960b154275359d12562
The link above shows how the data would be displayed on the overview page in .md format. If new columns are required, they can be added to the template in the same fashion.

All new crawlers will be added in the above format (see the example template below).
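For illustration, a README template along those lines could look like this (the node and relationship names are placeholders, not taken from the linked document):

| Node type | Relationship type | Description |
|-----------|-------------------|-------------|
| ExampleNode | EXAMPLE_REL | Brief description of the nodes and the relationship |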

The script required to automate the process:

  1. Whenever a new crawler is added, its README info should be parsed and added to the overview page (a rough sketch follows).
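A minimal sketch of what that script could look like, assuming crawler READMEs live under iyp/crawlers/<source>/README.md and each contains a Markdown table whose header row starts with `| Node type`; the paths and the table marker are assumptions for illustration, not the repository's actual layout:

```python
"""Generate a Markdown overview page from the crawlers' READMEs.

Assumptions (for illustration only): READMEs live under
iyp/crawlers/<source>/README.md and each contains a Markdown table
whose header row starts with '| Node type'.
"""
from pathlib import Path

CRAWLERS_DIR = Path("iyp/crawlers")   # assumed location of the crawlers
OVERVIEW_FILE = Path("OVERVIEW.md")   # hypothetical output file
TABLE_HEADER = "| Node type"          # assumed marker for the template table


def extract_table(readme: Path) -> list[str]:
    """Return the contiguous Markdown table rows starting at the header marker."""
    rows = []
    in_table = False
    for line in readme.read_text().splitlines():
        if line.startswith(TABLE_HEADER):
            in_table = True
        if in_table:
            if line.startswith("|"):
                rows.append(line)
            else:
                break  # first non-table line ends the table
    return rows


def main() -> None:
    sections = ["# Node and relationship types\n"]
    for readme in sorted(CRAWLERS_DIR.glob("*/README.md")):
        crawler = readme.parent.name
        table = extract_table(readme)
        if not table:
            continue  # README does not follow the template yet
        sections.append(f"## {crawler}\n")
        sections.extend(table)
        sections.append(f"\n[Full README]({readme.as_posix()})\n")
    OVERVIEW_FILE.write_text("\n".join(sections) + "\n")


if __name__ == "__main__":
    main()
```

A GitHub Actions workflow could then invoke this script on every push that touches a crawler README and commit the regenerated overview page.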
