Embiggen the goodness of maghemite #385

taspelund · 2024-10-03T20:37:46Z

This addresses a few different items found in Maghemite.

- remove_prefix_path() did not delete the prefix key from rib_in after the last path was removed.
  Fixes: Static route key retained despite deleting all paths #369
- remove_prefix_path() removed every path for a prefix that happened to share a nexthop. A single call to static_remove_v4_route() would therefore delete all paths with that nexthop, even when fields like rib_priority didn't match. This would likely also allow the deletion of a floating static route to inadvertently clobber a BGP route via the same nexthop.
- apply_update() and send_update() incorrectly applied the relevant ImportExportPolicy to the list of withdrawn routes. Filtering route withdrawals causes stale routing information: we can end up ignoring a legitimate withdrawal from a peer, or we can fail to inform a peer that we're no longer originating a route (e.g. if we first set an export policy matching N routes in an announce-set, then removed a route from that announce-set). In either case, this is incorrect behavior that can lead to forwarding issues.
  Fixes: When Export Policy is configured, withdrawn NLRI can only ever contain routes allowed by Export Policy #330
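
The first two fixes can be illustrated with a minimal sketch. It is not the code from this PR: the Path and Rib types below are simplified, hypothetical stand-ins for the rdb types, and prefixes are plain strings. The idea is that the caller supplies a predicate that decides which paths to remove, and the prefix key is dropped once its path set becomes empty.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::net::Ipv4Addr;

// Hypothetical, simplified stand-in for rdb's Path type.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Path {
    nexthop: Ipv4Addr,
    rib_priority: u8,
}

// Hypothetical stand-in for the RIB: prefix (as a string) -> set of paths.
type Rib = BTreeMap<String, BTreeSet<Path>>;

/// Remove only the paths for `prefix` that satisfy `should_remove`, then
/// drop the prefix key itself if no paths are left.
fn remove_prefix_path<F>(rib: &mut Rib, prefix: &str, should_remove: F)
where
    F: Fn(&Path) -> bool,
{
    let now_empty = match rib.get_mut(prefix) {
        Some(paths) => {
            // Keep every path the predicate does not select.
            paths.retain(|p| !should_remove(p));
            paths.is_empty()
        }
        None => false,
    };
    if now_empty {
        // Avoid leaving behind a key that maps to an empty path set.
        rib.remove(prefix);
    }
}
```

With two paths that share nexthop 100.64.0.0 but differ in rib_priority, a predicate that compares both fields removes one path at a time, and the key disappears only after the second removal.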

taspelund · 2024-10-03T20:48:16Z

rib_in key behavior before:

➜  maghemite git:(f5dbb6a) ./target/debug/mgadm static get-v4-routes
{
    "1.1.1.1/32": [
        Path {
            bgp: None,
            local_pref: None,
            nexthop: 100.64.0.1,
            shutdown: false,
            vlan_id: None,
        },
    ],
}
➜  maghemite git:(f5dbb6a) ./target/debug/mgadm static remove-v4-routes 1.1.1.1/32 100.64.0.1
➜  maghemite git:(f5dbb6a) ./target/debug/mgadm static get-v4-routes
{
    "1.1.1.1/32": [],
}

rib_in key behavior after:

➜  maghemite git:(trey/cleanup_static_route_key) ./target/debug/mgadm static get-v4-routes
{
    "1.1.1.1/32": [
        Path {
            bgp: None,
            nexthop: 100.64.0.0,
            rib_priority: 2,
            shutdown: false,
            vlan_id: None,
        },
        Path {
            bgp: None,
            nexthop: 100.64.0.0,
            rib_priority: 10,
            shutdown: false,
            vlan_id: None,
        },
    ],
}
➜  maghemite git:(trey/cleanup_static_route_key) ✗ ./target/debug/mgadm static remove-v4-routes 1.1.1.1/32 100.64.0.0
➜  maghemite git:(trey/cleanup_static_route_key) ✗ ./target/debug/mgadm static get-v4-routes
{}

zeeshanlakhani

This seems fine to me, but can we add a test demonstrating the example you showed for removing the prefix from the rib?

Import/Export filters are meant to modify which advertised prefixes are allowed. For Import, this is simply an allow-list that accepts a subset of the advertised NLRI in a received update. For Export, this is an allow-list that accepts a subset of the locally originated NLRI. In neither case do you want to apply these filters to the list of withdrawn NLRI, as this can result in stale routes if a legitimate withdrawal is not sent or received.

Fixes: #330
Signed-off-by: Trey Aspelund <[email protected]>
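
A minimal sketch of that rule, using hypothetical simplified types (the Update and Policy definitions below are assumptions for illustration, not the bgp crate's actual message or ImportExportPolicy types): the allow-list is applied to the announced NLRI only, and the withdrawn list passes through untouched.

```rust
use std::collections::BTreeSet;

// Hypothetical, simplified update message; prefixes are plain strings.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Update {
    nlri: Vec<String>,
    withdrawn: Vec<String>,
}

// Hypothetical stand-in for an import/export policy.
enum Policy {
    NoFiltering,
    Allow(BTreeSet<String>),
}

/// Apply an allow-list policy to the announced prefixes only.
fn apply_policy(update: &mut Update, policy: &Policy) {
    match policy {
        Policy::NoFiltering => {}
        Policy::Allow(allowed) => {
            update.nlri.retain(|p| allowed.contains(p));
            // `update.withdrawn` is deliberately not filtered: dropping a
            // withdrawal would leave a stale route on one side of the session.
        }
    }
}
```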

Removes an unused function. Guards an illumos-specific import. Consolidates (bfd) nexthop enabled functions into one that takes a bool. Moves RIB locking and looping of prefixes and PrefixChangeNotifications into RIB removal helper functions to improve PCN batching and consolidate locations for future RIB batch work. Removes re-processing of BGP path attributes. Removes re-looping over routes/paths in a few places. Eliminates some return types when no callers handled Result. Adds some TODOs.

Signed-off-by: Trey Aspelund <[email protected]>

taspelund · 2024-10-20T08:38:33Z

I added two more commits + rebased off main.

The second commit adds the requested tests, plus additional unit testing for rdb::Db methods.

The first commit is a refactor of various rdb::Db methods and their callers. Highlights below:

- Renamed various add/remove methods and helper functions to keep naming more consistent (e.g. using "peer" instead of "id", adding "bgp"/"static" to method names).
- In the add/remove codepaths, a single PrefixChangeNotification now batches all PCNs together instead of firing them off individually. I tried to keep the updated routes together in a collection up until the point of processing them, so it will be clearer in the future where to make changes to support bulk prefix updates.
- The delete codepath now accepts a closure that determines which paths are removed.
- Removed an unused method, get_imported(), and consolidated the enable_nexthop()/disable_nexthop() methods into a single method accepting a bool: set_nexthop_shutdown().
- Changed return types from Result to no return value, since callers were not actually checking the Result anyway.
- loc_rib() now returns a Rib, which aligns with full_rib(), static_rib(), and bgp_rib().
- Moved BGP attribute -> Path parsing out of a for loop, since these are per-update, not per-route.
- Guarded an illumos-specific debug import.
- Added some TODOs in the comments, primarily for "bgp: router id insufficient discriminator for route-refresh stale marking" (#241) and to check that we cover all cases for nexthop parsing.
- Introduced get_selected_prefix_paths() to get an entry from rib_loc.
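
The PCN batching described above might look roughly like the following sketch. The types are hypothetical simplifications (string prefixes, a Rib that maps to placeholder path IDs, a std mpsc channel standing in for the watcher); only the shape of the idea comes from the description: collect every changed prefix first, then send one notification.

```rust
use std::collections::{BTreeMap, BTreeSet};
use std::sync::mpsc::Sender;

// Hypothetical, simplified notification type: the set of changed prefixes.
#[derive(Debug, Default, PartialEq, Eq)]
struct PrefixChangeNotification {
    changed: BTreeSet<String>,
}

// Hypothetical stand-in for the RIB: prefix -> placeholder path IDs.
type Rib = BTreeMap<String, Vec<u32>>;

/// Remove a batch of prefixes and send at most one notification covering
/// all of the prefixes that actually changed.
fn remove_prefixes(
    rib: &mut Rib,
    prefixes: &[&str],
    watcher: &Sender<PrefixChangeNotification>,
) {
    let mut pcn = PrefixChangeNotification::default();
    for prefix in prefixes {
        // Only record prefixes that were present in the RIB.
        if rib.remove(*prefix).is_some() {
            pcn.changed.insert((*prefix).to_string());
        }
    }
    if !pcn.changed.is_empty() {
        // A single send for the whole batch; a closed channel is ignored here.
        let _ = watcher.send(pcn);
    }
}
```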

Other refactoring we might consider later:

- Moving test helpers into a common spot (e.g. wait_for_eq macros in bgp)
- Moving logging helpers into a common spot (e.g. all of bgp/src/log.rs)
- Cleaning up remaining instances of .lock().unwrap() by moving to lock!()
- More ergonomic conversions between messages::Prefix and rdb::Prefix types/subtypes (e.g. Prefix4)

edit: Addressed the logging helpers and updates to use lock!() in new commits
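
For readers unfamiliar with the pattern, here is a sketch of what a lock!() helper of this kind typically looks like. The macro body below is an assumption written for illustration; the real definition lives in mg-common and may differ. The bump() function is likewise hypothetical.

```rust
use std::sync::{Arc, Mutex};

// Assumed shape of the helper: lock the mutex and panic with a fixed
// message if it is poisoned, instead of repeating .lock().unwrap().
macro_rules! lock {
    ($mtx:expr) => {
        $mtx.lock().expect("lock mutex")
    };
}

/// Hypothetical caller: increment a shared counter and return the new value.
fn bump(counter: &Arc<Mutex<u32>>) -> u32 {
    let mut guard = lock!(counter);
    *guard += 1;
    *guard
}
```

Centralizing the call in one macro means the poisoning behavior and panic message are defined in a single place.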

rcgoodfellow

Thanks Trey! Some great improvements here. Just a few comments.

rdb/src/db.rs

bgp/src/session.rs

rcgoodfellow · 2024-10-22T05:28:22Z

rdb/src/db.rs

@@ -378,8 +379,13 @@ impl Db {
        }
    }

-    pub fn get_imported(&self) -> Rib {
-        lock!(self.rib_in).clone()
+    pub fn get_selected_prefix_paths(&self, prefix: &Prefix) -> Vec<Path> {


If this function is only used for tests, it should probably have a #[cfg(test)] attribute.

Today it's only used in tests, but it seems like this would be a useful function for exposing per-prefix details up through mgadm/nexus. I can add the attribute if you'd like, but if so then I'd probably want to add a TODO comment along with it.

get_prefix_paths() is used in the same way today, except it's being imported by #[test] functions in bgp/src/test.rs... and mod test; in bgp/src/lib.rs is also guarded by #[cfg(test)].

For whatever reason, clippy refuses to accept the import of get_prefix_paths() in bgp/src/test.rs while its definition is guarded with #[cfg(test)].

I think that's probably something that could be addressed while doing cleanup of the existing test code. If you feel that's important to address in this PR, I can do that. Otherwise I can follow up on that later.
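
One possible explanation for the behavior described above, offered as an assumption rather than something confirmed in this thread: cfg(test) is only set for the crate that is itself being compiled as a test target. If get_prefix_paths() is defined in rdb, then when the bgp tests are built, rdb is compiled as an ordinary dependency without cfg(test), so an item gated that way does not exist from bgp's point of view. The sketch below uses a hypothetical Db with made-up method names to show the distinction.

```rust
// Hypothetical, simplified stand-in for a database type in a library crate.
pub struct Db {
    paths: Vec<u32>,
}

impl Db {
    pub fn new(paths: Vec<u32>) -> Self {
        Db { paths }
    }

    // Always compiled: usable by other crates, including their tests.
    pub fn path_count(&self) -> usize {
        self.paths.len()
    }

    // Compiled only when *this* crate is built as a test target. A test in
    // a different crate that depends on this one cannot call it, because
    // dependencies are built without cfg(test).
    #[cfg(test)]
    pub fn first_path(&self) -> Option<u32> {
        self.paths.first().copied()
    }
}
```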

rcgoodfellow · 2024-10-22T05:43:21Z

rdb/src/db.rs

        }

-        Self::update_loc_rib(&rib, &mut lock!(self.rib_loc), prefix);
-        self.notify(prefix.into());
+        self.notify(pcn);


I think we may want to flush before sending out the PCN.

I moved the flush after the notification because an error can cause an early return. My thought was that a successful update to the loc_rib should trigger a PCN even if the flush of the persistent route DB fails later, since the "running state" is what other listeners are likely intending to react to.

You can correct me if my read on the situation is wrong, or if we need to give some more thought to robust handling here.

The thing I worry about here is sending out a PCN that causes a persistent DB read before the data has been flushed. I think there is a deeper issue here, as well as in several other methods that modify in-memory state and then write the result to disk (which were also there before this PR). We need transactional semantics for these methods: either everything succeeds or nothing does. If it's possible to run all the fallible commands that persist state up front and then run the soft state updates that don't have error conditions, that would be ideal.

Probably not an important detail in this context, but it doesn't look like PCNs trigger a read of the persistent DB.
mg-lower seems to be the only watcher and it reacts to PCNs by kicking off sync_prefix(), which looks at running state known in dpd (ASIC) or the loc_rib.

All that seems appropriate to me, as I don't think we want mg-lower paying attention to rib_in or state changes that don't make their way to the set of best paths. And since the persistent bits here are specific to the static route DB, I also think it makes sense to only update the running state if the persistent change succeeds.

I do agree we should have transactional semantics (especially when the config and running state overlap so closely), but I'm not sure how to go about reverting a failed operation, or what an effective way to propagate errors for a collection of operands (routes) would be.
e.g.
- If flush() fails after making a file smaller (removing a route), would we expect the revert action (presumably re-adding the route back into the DB and calling flush() again) to succeed? What happens if the subsequent flush() to get us into the last known good state also fails?
- The API also allows for a collection of static routes to come in through this delete path. What happens if an error happens for just a subset of those routes? Should we bail out upon first error (which I think we did prior to this PR), stop and try to revert the operations we've already done, or try to complete the operations for as many routes as we can (aggregating or transforming the errors we've gotten up until this point)?

For the revert action, I think it might behoove us to understand what guarantees flush() makes when it returns an Err. Do we have any guarantees which prior writes have/haven't made it to disk? If not, then I'm not sure what other approach to take beyond calling flush() again... which feels a bit like the definition of insanity (something something... expecting a different outcome)
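
The ordering suggested in this thread (fallible persistence first, then infallible soft-state updates, then the notification) can be sketched as follows. Everything here is hypothetical: FakeStore, PersistError, and the Db fields are invented for illustration, and the fake store sidesteps the open question of what is actually on disk after a failed flush().

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
struct PersistError(String);

// Hypothetical persistent store whose flush can be made to fail.
struct FakeStore {
    routes: BTreeSet<String>,
    fail_flush: bool,
}

impl FakeStore {
    fn remove_and_flush(&mut self, prefix: &str) -> Result<(), PersistError> {
        if self.fail_flush {
            return Err(PersistError(format!("flush failed for {}", prefix)));
        }
        self.routes.remove(prefix);
        Ok(())
    }
}

// Hypothetical database: persistent store, running state, and a log of
// notifications that would have been sent to watchers.
struct Db {
    store: FakeStore,
    rib_loc: BTreeSet<String>,
    notified: Vec<String>,
}

impl Db {
    fn remove_static_route(&mut self, prefix: &str) -> Result<(), PersistError> {
        // 1. Fallible step first: if persisting fails, return before any
        //    running state has been touched.
        self.store.remove_and_flush(prefix)?;
        // 2. Infallible soft-state update.
        if self.rib_loc.remove(prefix) {
            // 3. Notify only after both earlier steps have completed.
            self.notified.push(prefix.to_string());
        }
        Ok(())
    }
}
```

This only covers the single-route case; how to report or roll back partial failures across a batch of routes remains the open design question raised above.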

taspelund added Bug mgd Maghemite daemon labels Oct 3, 2024

taspelund requested a review from rcgoodfellow October 3, 2024 20:37

taspelund self-assigned this Oct 3, 2024

taspelund marked this pull request as ready for review October 3, 2024 22:25

taspelund requested a review from internet-diglett October 3, 2024 22:32

taspelund marked this pull request as draft October 4, 2024 07:50

taspelund added the needs testing label Oct 4, 2024

taspelund changed the title ~~Cleanup key in rib_in when all paths are removed~~ Embiggen the goodness of maghemite Oct 4, 2024

taspelund added bgp Border Gateway Protocol static Static Routing labels Oct 4, 2024

taspelund marked this pull request as ready for review October 4, 2024 08:19

taspelund requested a review from zeeshanlakhani October 4, 2024 17:18

zeeshanlakhani reviewed Oct 7, 2024

taspelund added 4 commits October 18, 2024 03:44

Cleanup key in rib_in when all paths are removed

082ff27

Fixes: #369 Signed-off-by: Trey Aspelund <[email protected]>

Use proper path comparison during route path deletion

a40b7b8

Signed-off-by: Trey Aspelund <[email protected]>

taspelund force-pushed the trey/cleanup_route_key branch 2 times, most recently from 334fbdb to eb07d9d Compare October 20, 2024 08:29

Add tests for RIB insertion/removal

67b520a

Signed-off-by: Trey Aspelund <[email protected]>

taspelund force-pushed the trey/cleanup_route_key branch from eb07d9d to 67b520a Compare October 20, 2024 08:39

make clippy happy

d3dd364

taspelund force-pushed the trey/cleanup_route_key branch from c546a03 to d3dd364 Compare October 20, 2024 09:01

taspelund added 4 commits October 21, 2024 10:19

Remove duplicate lock! macro

72e6b00

mgd/src/bgp_admin.rs was re-defining lock! identically to the one in mg-common. Remove this dupe definition and just import from mg-common. Signed-off-by: Trey Aspelund <[email protected]>

Move completely over to use of lock! macro

9ac2822

Replace all old instances of .lock().unwrap() with lock!() Signed-off-by: Trey Aspelund <[email protected]>

Move log module from bgp into mg_common

4255029

Signed-off-by: Trey Aspelund <[email protected]>

Move bgp test to mg_common::log

85277cc

Signed-off-by: Trey Aspelund <[email protected]>

Make clippy happy

0d8270f

Actually run the same cargo clippy command as CI, so I can see errors locally :/ Signed-off-by: Trey Aspelund <[email protected]>

rcgoodfellow reviewed Oct 22, 2024

taspelund added 3 commits October 21, 2024 23:15

add breadcrumb for bgp-id issue

112a142

Make remove_prefix_path closure more intuitive

ad22371

Signed-off-by: Trey Aspelund <[email protected]>

remove pcn todo

daf90ba
