-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Embiggen the goodness of maghemite #385
base: main
Are you sure you want to change the base?
Conversation
rib_in key behavior before:
rib_in key behavior after:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to me, but can we add a test demonstrating the example you showed for removing the prefix from the rib?
Fixes: #369 Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
Import/Export filters are meant to modify which advertised prefixes are allowed. For Import, this is simply an allow-list that accepts a subset of the advertised nlri in a received update. For Export, this is an allow-list that accepts a subset of the locally originated nlri. In neither case do you want to apply these filters to the list of withdrawn nlri, as this can result in stale routes if a legitimate withdrawal is not sent or received. Fixes: #330 Signed-off-by: Trey Aspelund <[email protected]>
Removes an unused function. Guards an illumos-specific import. Consolidates (bfd) nexthop enabled functions into one that takes a bool. Moves RIB locking and looping of prefixes and PrefixChangeNotifications into RIB removal helper functions to improve pcn batching and consolidate locations for future RIB batch work. Removes re-processing of BGP path attributes. Removes re-looping over routes/paths in a few places. Eliminates some return types when no callers handled Result. Adds some TODOs. Signed-off-by: Trey Aspelund <[email protected]>
334fbdb
to
eb07d9d
Compare
I added two more commits + rebased off main. The second commit adds the requested tests, plus additional unit testing for rdb::Db methods. The first commit is a refactor of various rdb::Db methods and their callers. Highlights below:
Other refactoring we might consider later:
edit: Addressed the logging helpers and updates to use lock!() in new commits |
Signed-off-by: Trey Aspelund <[email protected]>
eb07d9d
to
67b520a
Compare
c546a03
to
d3dd364
Compare
mgd/src/bgp_admin.rs was re-defining lock! identically to the one in mg-common. Remove this dupe definition and just import from mg-common. Signed-off-by: Trey Aspelund <[email protected]>
Replace all old instances of .lock().unwrap() with lock!() Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
Signed-off-by: Trey Aspelund <[email protected]>
Actually run the same cargo clippy command as CI, so I can see errors locally :/ Signed-off-by: Trey Aspelund <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Trey! Some great improvements here. Just a few comments.
@@ -378,8 +379,13 @@ impl Db { | |||
} | |||
} | |||
|
|||
pub fn get_imported(&self) -> Rib { | |||
lock!(self.rib_in).clone() | |||
pub fn get_selected_prefix_paths(&self, prefix: &Prefix) -> Vec<Path> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function is only used for tests, it should probably have a #[cfg(test)]
attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today it's only used in tests, but it seems like this would be a useful function for exposing per-prefix details up through mgadm/nexus. I can add the attribute if you'd like, but if so then I'd probably want to add a TODO comment along with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_prefix_paths()
is used in the same way today, except it's being imported by #[test]
functions in bgp/src/test.rs... and mod test;
in bgp/src/lib.rs is also guarded by #[cfg(test)]
.
For whatever reason, clippy refuses to accept the import of get_prefix_paths()
in bgp/src/test.rs while its definition is guarded with #[cfg(test)]
.
I think that's probably something that could be addressed while doing cleanup of the existing test code. If you feel that's important to address in this PR, I can do that. Otherwise I can follow up on that later.
} | ||
|
||
Self::update_loc_rib(&rib, &mut lock!(self.rib_loc), prefix); | ||
self.notify(prefix.into()); | ||
self.notify(pcn); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may want to flush before sending out the PCN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the flush after the notification because an error can cause an early return. My thought was that a successful update to the loc_rib should trigger a PCN even if the flush of the persistent route DB fails later, since the "running state" is what other listeners are likely intending to react to.
You can correct me if my read on the situation is wrong, or if we need to give some more thought to robust handling here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing I worry about here is sending out a PCN that causes a persistent DB read and the data not being flushed yet. I think there is a deeper issue here, as well as in several other methods that modify in-memory state and then write the result to disk (that were also there before this PR). We need transactional semantics for these methods, either everything succeeds or nothing does. If it's possible to run all the fallible commands that persist state up front and then run the soft state updates that don't have error conditions, that would be ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not an important detail in this context, but it doesn't look like PCNs trigger a read of the persistent DB.
mg-lower seems to be the only watcher and it reacts to PCNs by kicking off sync_prefix()
, which looks at running state known in dpd (ASIC) or the loc_rib.
All that seems appropriate to me, as I don't think we want mg-lower paying attention to rib_in or state changes that don't make their way to the set of best paths. And since the persistent bits here are specific to the static route DB, I also think it makes sense to only update the running state if the persistent change succeeds.
I do agree we should have transactional semantics (especially when the config and running state overlap so closely), but I'm not sure how to go about reverting a failed operation, or what an effective way to propagate errors for a collection of operands (routes) would be.
e.g.
If flush() fails after making a file smaller (removing a route), would we expect the revert action (presumably re-adding the route back into the DB and calling flush() again) to succeed? What happens if the subsequent flush() to get us into the last known good state also fails?
The API also allows for a collection of static routes to come in through this delete path. What happens if an error happens for just a subset of those routes? Should we bail out upon first error (which I think we did prior to this PR), stop and try to revert the operations we've already done, or try to complete the operations for as many routes as we can (aggregating or transforming the errors we've gotten up until this point)?
For the revert action, I think it might behoove us to understand what guarantees flush() makes when it returns an Err. Do we have any guarantees which prior writes have/haven't made it to disk? If not, then I'm not sure what other approach to take beyond calling flush() again... which feels a bit like the definition of insanity (something something... expecting a different outcome)
Signed-off-by: Trey Aspelund <[email protected]>
This addresses a few different items found in Maghemite.
remove_prefix_path() did not delete the prefix key from
rib_in
after the last path was removed.Fixes: Static route key retained despite deleting all paths #369
remove_prefix_path() was removing all paths for a prefix that happened to share a nexthop. This is bad because a single call to
static_remove_v4_route()
would delete all paths with that nexthop even when fields like rib_priority didn't match. This would likely also allow the deletion of a floating static route to inadvertently clobber a BGP route via the same next-hop.apply_update() and send_update() were incorrectly applying the relevant ImportExportPolicy to the list of withdrawn routes. Filtering route withdrawals causes stale routing information, as we can end up ignoring a legitimate withdrawal from a peer or we could forget to inform a peer that we're no longer originating a route (e.g. if we first set an export policy matching N routes in an announce-set, then removed a route from that announce-set). In either case, this is incorrect behavior that can lead to forwarding issues.
Fixes: When Export Policy is configured, withdrawn NLRI can only ever contain routes allowed by Export Policy #330