Added support for JSON containing multiple events #2545

Open
wants to merge 1 commit into
base: develop


@tim427 tim427 commented Dec 11, 2024

Currently, intelmq.bots.parsers.json.parser can only parse either a single event in JSON, or multiple events in JSON with each event on its own line.

This PR adds an option to parse multiple events contained in a single JSON document, enabled by setting the new multiple_events (boolean) parameter in the config.
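
The three input shapes mentioned above can be sketched as follows. This is a minimal illustration using only the standard library, not the bot's actual code: the helper name `split_raw`, the skipping of blank lines, and the precedence of `multiple_events` over `splitlines` are all assumptions.

```python
import json
from typing import List


def split_raw(raw: str, splitlines: bool = False, multiple_events: bool = False) -> List[str]:
    """Return one JSON string per event (hypothetical helper, not IntelMQ API)."""
    if multiple_events:
        # One JSON array holding several event objects.
        return [json.dumps(event) for event in json.loads(raw)]
    if splitlines:
        # One JSON object per line; blank lines are skipped (an assumption).
        return [line for line in raw.splitlines() if line.strip()]
    # A single JSON object.
    return [raw]


single = '{"source.ip": "127.1.2.1"}'
per_line = '{"source.ip": "127.1.2.1"}\n{"source.ip": "127.1.2.2"}'
array = '[{"source.ip": "127.1.2.1"}, {"source.ip": "127.1.2.2"}]'

print(len(split_raw(single)))                        # 1
print(len(split_raw(per_line, splitlines=True)))     # 2
print(len(split_raw(array, multiple_events=True)))   # 2
```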

@sebix sebix self-assigned this Dec 16, 2024
Member

@sebix sebix left a comment

Do you have an example feed at hand (so we can extract an example for the tests, add it to the docs)? To my knowledge no documented feed is using such a format.


def process(self):
    report = self.receive_message()
    if self.splitlines:
    if self.multiple_events:
        lines = [json.dumps(event) for event in json.loads(base64_decode(report['raw']))]
Member

Converting the data back and forth appears to be inefficient.

Author

Yeah, I can imagine that. Any tips on how to do this properly?

Currently this PR is running in our production and works just fine, but I agree that the double JSON conversion isn't the most efficient way to do this.
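
One possible way to avoid the round trip, sketched under assumptions: decode and parse the report once, then pass each event on as a dict, serializing an individual event only where a string is really needed. The helper name `iter_events` is hypothetical, and the standard library's `base64` module stands in for the `base64_decode` helper used in the diff.

```python
import base64
import json
from typing import Any, Dict, Iterator


def iter_events(raw_b64: str) -> Iterator[Dict[str, Any]]:
    """Yield each event of a base64-encoded JSON array as a dict (hypothetical helper)."""
    payload = json.loads(base64.b64decode(raw_b64).decode("utf-8"))
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array of events")
    for event in payload:
        yield event


# Stand-in for report['raw']: a base64-encoded JSON array with two events.
report_raw = base64.b64encode(
    b'[{"source.ip": "127.1.2.1"}, {"source.ip": "127.1.2.2"}]'
).decode("ascii")

for event in iter_events(report_raw):
    # The event is already a dict; no json.dumps()/json.loads() round trip.
    print(event["source.ip"])
```

Whether this fits depends on how the rest of process() consumes `lines`; if later code expects JSON strings, that part would need to accept dicts as well.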

Author

tim427 commented Dec 16, 2024

> Do you have an example feed at hand (so we can extract an example for the tests, add it to the docs)? To my knowledge no documented feed is using such a format.

Our National Cyber Security Centre (NCSC) sends us "IntelMQ JSONs" in a ZIP file by mail.
The ZIP file contains a single JSON file.

Here's an example (I tried to anonymise most values):

[
    {
        "extra.dataset_collections": "0",
        "extra.dataset_files": "1",
        "extra.dataset_infected": "false",
        "extra.dataset_ransom": "null",
        "extra.dataset_rows": "0",
        "extra.dataset_size": "301",
        "protocol.application": "https",
        "protocol.transport": "tcp",
        "source.asn": 12345689,
        "source.fqdn": "fqdn-example-1.tld",
        "source.geolocation.cc": "NL",
        "source.geolocation.city": "Enschede",
        "source.geolocation.latitude": 52.0000000000000,
        "source.geolocation.longitude": 6.0000000000000,
        "source.geolocation.region": "Overijssel",
        "source.ip": "127.1.2.1",
        "source.network": "127.1.0.0/16",
        "source.port": 80,
        "time.source": "2024-12-16T02:08:06+00:00"
    },
    {
        "extra.dataset_collections": "0",
        "extra.dataset_files": "1",
        "extra.dataset_infected": "false",
        "extra.dataset_ransom": "null",
        "extra.dataset_rows": "0",
        "extra.dataset_size": "615",
        "extra.os_name": "Ubuntu",
        "extra.software": "Apache",
        "extra.tag": "rescan",
        "extra.version": "2.4.58",
        "protocol.application": "https",
        "protocol.transport": "tcp",
        "source.asn": 12345689,
        "source.fqdn": "fqdn-example-2.tld",
        "source.geolocation.cc": "NL",
        "source.geolocation.city": "Eindhoven",
        "source.geolocation.latitude": 51.0000000000000,
        "source.geolocation.longitude": 5.0000000000000,
        "source.geolocation.region": "North Brabant",
        "source.ip": "127.1.2.2",
        "source.network": "127.1.0.0/16",
        "source.port": 443,
        "time.source": "2024-12-16T02:08:12+00:00"
    },
    {
        "extra.dataset_collections": "0",
        "extra.dataset_files": "1",
        "extra.dataset_infected": "false",
        "extra.dataset_ransom": "null",
        "extra.dataset_rows": "0",
        "extra.dataset_size": "421",
        "protocol.application": "http",
        "protocol.transport": "tcp",
        "source.asn": 12345689,
        "source.geolocation.cc": "NL",
        "source.geolocation.city": "Enschede",
        "source.geolocation.latitude": 52.0000000000000,
        "source.geolocation.longitude": 6.0000000000000,
        "source.geolocation.region": "Overijssel",
        "source.ip": "127.1.2.3",
        "source.network": "127.1.0.0/16",
        "source.port": 9000,
        "time.source": "2024-12-15T21:09:49+00:00"
    }
]
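
If a file in this shape ends up as a test fixture, it could be sanity-checked before use. The sketch below is only an assumption about what such a check might look like, using the standard library; `check_events` is a hypothetical helper, and the sample is a shortened stand-in for the array above.

```python
import ipaddress
import json
from typing import List


def check_events(text: str) -> List[str]:
    """Return a list of problems found in a JSON array of events (hypothetical helper)."""
    events = json.loads(text)
    if not isinstance(events, list):
        return ["top-level value is not a JSON array"]
    problems = []
    for index, event in enumerate(events):
        if not isinstance(event, dict):
            problems.append(f"event {index}: not a JSON object")
            continue
        network = event.get("source.network")
        if network is not None:
            try:
                # Raises ValueError for anything that is not a valid CIDR network.
                ipaddress.ip_network(network)
            except ValueError:
                problems.append(f"event {index}: invalid source.network {network!r}")
    return problems


sample = """
[
    {"source.ip": "127.1.2.1", "source.network": "127.1.0.0/16"},
    {"source.ip": "127.1.2.2", "source.network": "127.1.0.0/16"}
]
"""

print(check_events(sample))  # []
```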

2 participants