ENH: Caching, advanced mapping and separating events for MISP Feed output bot #2509

Open
wants to merge 17 commits into
base: develop
10 changes: 9 additions & 1 deletion CHANGELOG.md
@@ -14,6 +14,8 @@
### Core
- `intelmq.lib.utils.drop_privileges`: When IntelMQ is called as `root` and dropping the privileges to user `intelmq`, also set the non-primary groups associated with the `intelmq` user. Makes the behaviour of running intelmqctl as `root` closer to the behaviour of `sudo -u intelmq ...` (PR#2507 by Mikk Margus Möll).
- `intelmq.lib.utils.unzip`: Ignore directories themselves when extracting data to prevent the extraction of empty data for directory entries (PR#2512 by Kamil Mankowski).
- `intelmq.lib.mixins.cache.CacheMixin` was extended to support temporarily storing messages in a cache queue
  (PR#2509 by Kamil Mankowski).

### Development

@@ -43,7 +45,13 @@
- Treat value `false` for parameter `filter_regex` as false (PR#2499 by Sebastian Wagner).

#### Outputs
- `intelmq.bots.outputs.misp.output_feed`: Handle failures if saved current event wasn't saved or is incorrect (PR by Kamil Mankowski).
- `intelmq.bots.outputs.misp.output_feed`:
  - Handle failures if saved current event wasn't saved or is incorrect (PR by Kamil Mankowski).
  - Allow saving messages in bulk instead of refreshing the feed immediately (PR#2509 by Kamil Mankowski).
  - Add `attribute_mapping` parameter to allow selecting a subset of event attributes as well as additional attribute parameters (PR#2509 by Kamil Mankowski).
  - Add `event_separator` parameter to allow keeping IntelMQ events in separate MISP Events based on a given field (PR#2509 by Kamil Mankowski).
  - Add `tagging` parameter to allow adding tags to MISP events (PR#2509 by Kamil Mankowski).
  - Add `additional_info` parameter to extend the default description of MISP Events (PR#2509 by Kamil Mankowski).
- `intelmq.bots.outputs.smtp_batch.output`: Documentation on multiple recipients added (PR#2501 by Edvard Rejthar).

### Documentation
14 changes: 13 additions & 1 deletion docs/dev/bot-development.md
@@ -197,14 +197,26 @@ The `CacheMixin` provides methods to cache values for bots in a Redis database.
- `redis_cache_ttl: int = 15`
- `redis_cache_password: Optional[str] = None`

and provides the methods:
and provides the methods to cache key-value pairs:

- `cache_exists`
- `cache_get`
- `cache_set`
- `cache_flush`
- `cache_get_redis_instance`

and the following methods to cache objects in a queue:

- `cache_put`
- `cache_pop`
- `cache_length`.

Caching key-value pairs and queue caching are two separate mechanisms. The first is designed
for arbitrary values, the second is focused on temporarily storing messages (but can handle other
data). Entries from one are not visible in the other. For example, adding a key-value pair using
`cache_set` does not change the value returned by `cache_length`, and after adding an element using
`cache_put` you cannot use `cache_exists` to look for it.
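
The separation described above can be illustrated with a minimal in-memory sketch. It is not the real `CacheMixin` (which stores data in Redis): the class name `InMemoryCacheSketch`, the method signatures and the first-in-first-out order of `cache_pop` are assumptions made only for illustration.

```python
from collections import deque
from typing import Any, Deque, Dict, Optional


class InMemoryCacheSketch:
    """Hypothetical stand-in for CacheMixin; signatures are assumed, not copied."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}  # key-value cache
        self._queue: Deque[Any] = deque()  # queue cache, kept separately

    # Key-value mechanism
    def cache_set(self, key: str, value: Any) -> None:
        self._values[key] = value

    def cache_get(self, key: str) -> Optional[Any]:
        return self._values.get(key)

    def cache_exists(self, key: str) -> bool:
        return key in self._values

    # Queue mechanism (FIFO order is an assumption of this sketch)
    def cache_put(self, value: Any) -> None:
        self._queue.append(value)

    def cache_pop(self) -> Optional[Any]:
        return self._queue.popleft() if self._queue else None

    def cache_length(self) -> int:
        return len(self._queue)


cache = InMemoryCacheSketch()
cache.cache_set("last_run", "2024-07-09")  # does not affect the queue
cache.cache_put("message-1")               # not visible to cache_exists
```

In this sketch, `cache.cache_length()` counts only the queued element, and `cache.cache_exists("message-1")` is false because the queue and the key-value store never share entries.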

### Pipeline Interactions

We can call three methods related to the pipeline:
103 changes: 103 additions & 0 deletions docs/user/bots.md
@@ -4608,6 +4608,12 @@ Create a directory layout in the MISP Feed format.
The PyMISP library >= 2.4.119.1 is required, see
[REQUIREMENTS.txt](https://github.com/certtools/intelmq/blob/master/intelmq/bots/outputs/misp/REQUIREMENTS.txt).

Note: please test the produced feed before using it in production. This bot allows extensive
customisation of the MISP feed, including creating multiple events and tags, but it can
be tricky to configure properly. Misconfiguration can prevent the bot from starting or have bad
consequences for your MISP instance (e.g. spamming it with events). Use the `intelmqctl check` command
to validate your configuration against common mistakes.

**Module:** `intelmq.bots.outputs.misp.output_feed`

**Parameters:**
@@ -4632,6 +4638,103 @@ The PyMISP library >= 2.4.119.1 is required, see
() The output bot creates one event per each interval, all data in this time frame is part of this event. Default "1
hour", string.
Review comment (Member): What happens when this is set to 0?

Reply (Contributor Author): I've looked at the code, and it looks like this would result in a ValueError in that case. However, you can use "0 hours", which should result in creating a new MISP event for every IntelMQ event.


**`bulk_save_count`**

(optional, int) If set to a non-zero value, the bot won't refresh the MISP feed immediately, but will cache
incoming messages until the given number of them is reached. Use it if your bot processes a high number of
messages and constantly saving to the disk is a problem. Reloading or restarting the bot, as well as generating
a new MISP event based on `interval_event`, triggers regenerating the MISP feed regardless of the cache size.
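
The buffering behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the bot's code: the class `BulkSaver`, its `regenerate_feed` callback and the in-memory list used as a buffer are hypothetical (the bot itself keeps pending messages in the cache queue).

```python
from typing import Callable, Dict, List


class BulkSaver:
    """Hypothetical helper: buffers messages and regenerates the feed in bulk."""

    def __init__(self, bulk_save_count: int,
                 regenerate_feed: Callable[[List[Dict]], None]) -> None:
        self.bulk_save_count = bulk_save_count
        self.regenerate_feed = regenerate_feed
        self.pending: List[Dict] = []

    def add(self, message: Dict) -> None:
        self.pending.append(message)
        # 0 means: refresh the feed for every single message
        if not self.bulk_save_count or len(self.pending) >= self.bulk_save_count:
            self.flush()

    def flush(self) -> None:
        """Write everything pending; also meant for reload, restart or a new MISP event."""
        if self.pending:
            self.regenerate_feed(self.pending)
            self.pending = []


saved_batches: List[List[Dict]] = []
saver = BulkSaver(bulk_save_count=3,
                  regenerate_feed=lambda batch: saved_batches.append(list(batch)))
for i in range(7):
    saver.add({"source.ip": f"192.0.2.{i}"})
saver.flush()  # e.g. on shutdown: the one remaining message is written as well
```

With seven messages and `bulk_save_count=3`, the sketch regenerates the feed three times: twice when the threshold is reached and once for the explicit flush.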

**`attribute_mapping`**

(optional, dict) If set, allows selecting which IntelMQ event fields are mapped to MISP attributes
as well as attribute parameters (e.g. a comment). The expected format is a *dictionary of dictionaries*:
each first-level key represents an IntelMQ field that will be directly translated to a MISP attribute; the nested
dictionary represents additional parameters PyMISP can take when creating an attribute. The parameters can use
names of other IntelMQ fields (then the value of such a field will be used), or static values. If no parameters
are needed, leave the nested dictionary empty.

For available attribute parameters, refer to the
[PyMISP documentation](https://pymisp.readthedocs.io/en/latest/_modules/pymisp/mispevent.html#MISPObjectAttribute)
for the `MISPObjectAttribute`.

For example:

```yaml
attribute_mapping:
  source.ip:
  feed.name:
    comment: event_description.text
  destination.ip:
    to_ids: False
```

would create a MISP object with three attributes, `source.ip`, `feed.name` and `destination.ip`,
and set their values as in the IntelMQ event. In addition, `feed.name` would have a comment
taken from `event_description.text` of the IntelMQ event, and `destination.ip` would be marked
as not usable for IDS.
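
How such a mapping could be resolved for a single message is sketched below with plain dictionaries. The function `build_attributes` is hypothetical and does not use PyMISP; treating a string parameter as a field reference only when that field exists in the message, and skipping mapped fields that are missing, are assumptions of this sketch.

```python
from typing import Any, Dict, Optional


def build_attributes(
    event: Dict[str, Any],
    attribute_mapping: Dict[str, Optional[Dict[str, Any]]],
) -> Dict[str, Dict[str, Any]]:
    """Return the attribute parameters for every mapped field present in the event."""
    attributes: Dict[str, Dict[str, Any]] = {}
    for field, params in attribute_mapping.items():
        if field not in event:
            continue  # assumption: fields missing from the message are skipped
        resolved: Dict[str, Any] = {}
        for name, value in (params or {}).items():  # an empty YAML value loads as None
            # A string naming another IntelMQ field is replaced by that field's value
            if isinstance(value, str) and value in event:
                resolved[name] = event[value]
            else:
                resolved[name] = value  # static value, used as given
        attributes[field] = {"value": event[field], **resolved}
    return attributes


event = {
    "source.ip": "192.0.2.1",
    "destination.ip": "198.51.100.7",
    "feed.name": "Example Feed",
    "event_description.text": "C2 server",
    "source.asn": 64496,  # not in the mapping, so it is left out
}
mapping = {
    "source.ip": None,
    "feed.name": {"comment": "event_description.text"},
    "destination.ip": {"to_ids": False},
}
attributes = build_attributes(event, mapping)
```

Here `attributes["feed.name"]` carries the comment `"C2 server"`, `attributes["destination.ip"]` has `to_ids` set to `False`, and `source.asn` does not appear at all.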

**`event_separator`**

(optional, string): If set to a field name from the IntelMQ event, the bot will work on several
MISP Events in parallel instead of saving all incoming messages to a single one. Each unique value of the
field will use its own MISP Event. This is useful if your feed provides data about multiple entities you would
like to group, for example IPs of C2 servers from different botnets. For a given value, the bot will
use the same MISP Event as long as it's allowed by the `interval_event`.
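
The grouping idea can be sketched as follows. The function `group_by_separator` is hypothetical, and putting messages without the separator field under an empty-string key is an assumption of this sketch, not documented behaviour.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_separator(
    messages: List[Dict[str, Any]], event_separator: str
) -> Dict[str, List[Dict[str, Any]]]:
    """Group messages by the value of the separator field, one group per MISP Event."""
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for message in messages:
        # Assumption: messages lacking the field share the "" group
        key = str(message.get(event_separator, ""))
        groups[key].append(message)
    return dict(groups)


messages = [
    {"source.ip": "192.0.2.1", "malware.name": "botnet-1"},
    {"source.ip": "192.0.2.2", "malware.name": "botnet-2"},
    {"source.ip": "192.0.2.3", "malware.name": "botnet-1"},
]
groups = group_by_separator(messages, "malware.name")
```

In this example the three messages end up in two groups, so two MISP Events would be maintained in parallel: one for `botnet-1` with two IPs and one for `botnet-2` with a single IP.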

**`additional_info`**

(optional, string): If set, the generated MISP Event will use it in the `info` field of the event,
in addition to the standard IntelMQ description with the time frame (you cannot remove the latter, as the bot
depends on the datetimes saved there). If you use `event_separator`, you may want to use the `{separator}`
placeholder, which will then be replaced with the value of the separator.

For example, the following configuration can be used to create a MISP Feed with IPs of C2 servers
of different botnets, with each botnet in a separate MISP Event with an appropriate description.
Each MISP Event will contain objects with the `source.ip` field only, and the events' info will look
like *C2 Servers for botnet-1. IntelMQ event 2024-07-09T14:51:10.825123 - 2024-07-10T14:51:10.825123*

```yaml
event_separator: malware.name
additional_info: C2 Servers for {separator}.
attribute_mapping:
  source.ip:
```
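
How the `info` text from the example could be assembled is sketched below. The function `build_info` is hypothetical; using `str.format` for the `{separator}` placeholder and the exact wording of the time-frame suffix are assumptions inferred from the sample description above.

```python
from datetime import datetime, timedelta


def build_info(additional_info: str, separator: str,
               start: datetime, end: datetime) -> str:
    """Combine the custom prefix with the time frame description."""
    prefix = additional_info.format(separator=separator)
    return f"{prefix} IntelMQ event {start.isoformat()} - {end.isoformat()}"


start = datetime(2024, 7, 9, 14, 51, 10, 825123)
info = build_info("C2 Servers for {separator}.", "botnet-1",
                  start, start + timedelta(days=1))
```

For these inputs the sketch yields the same text as the sample description quoted before the configuration.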

**`tagging`**

(optional, dict): Allows setting MISP tags on MISP events. The structure is a *dict of lists of dicts*.
The keys refer to the MISP events you want to tag. If you want to tag all of them, use `__all__`.
If you use `event_separator` and want to add additional tags to some events, use the expected values
of the separation field as keys. Each *list of dicts* defines MISP tags as parameters to create `MISPTag`
objects from. Each dictionary has to have at least `name`. For all available parameters refer to the
[PyMISP documentation](https://pymisp.readthedocs.io/en/latest/_modules/pymisp/abstract.html#MISPTag)
for `MISPTag`.

Note: setting `name` is enough for MISP to match the correct tag from the global collection. You may
see it lacking the colour in the MISP Feed view, but the colour will be retrieved after importing to your
instance.

Example 1 - set two tags for every MISP event:

```yaml
tagging:
  __all__:
    - name: tlp:red
    - name: source:intelmq
```

Example 2 - create separate events based on `malware.name` and set an additional family tag:

```yaml
event_separator: malware.name
tagging:
  __all__:
    - name: tlp:red
  njrat:
    - name: njrat
```
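
The way tags from `__all__` and from a separator-specific key could be combined is sketched below. The function `tags_for_event` is hypothetical and works on plain dictionaries rather than `MISPTag` objects; raising a `ValueError` for a tag without `name` is an assumption of this sketch.

```python
from typing import Any, Dict, List, Optional


def tags_for_event(
    tagging: Dict[str, List[Dict[str, Any]]],
    separator_value: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Collect the tag definitions that apply to one MISP event."""
    tags = list(tagging.get("__all__", []))
    if separator_value is not None:
        # Extra tags only for events whose separator value has its own key
        tags.extend(tagging.get(separator_value, []))
    for tag in tags:
        if "name" not in tag:
            raise ValueError(f"Tag definition without a name: {tag!r}")
    return tags


tagging = {
    "__all__": [{"name": "tlp:red"}],
    "njrat": [{"name": "njrat"}],
}
njrat_tags = [tag["name"] for tag in tags_for_event(tagging, "njrat")]
other_tags = [tag["name"] for tag in tags_for_event(tagging, "emotet")]
```

With the configuration from Example 2, an event for `njrat` gets both `tlp:red` and `njrat`, while an event for any other malware name gets only `tlp:red`.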

**Usage in MISP**

Configure the destination directory of this feed as a feed in MISP, either as a local location or served via a web server.