'Couldn't find non overlapping protected path' when restore broken ISL on protected path. #5608

Open
pkazlenka opened this issue Mar 11, 2024 · 4 comments

Comments

@pkazlenka
Collaborator

pkazlenka commented Mar 11, 2024

Our automated test "ProtectedPathSpec.Unable to swap paths for an inactive flow()" currently fails every time switches 8 and 9 are used as the switch pair (for some reason, the other pairs work fine).
At some point in the test (after the step "Restore ISL for the protected path"), both the protected and main paths are up, but the flow stays in the Degraded state with the error: Couldn't find non overlapping protected path. Skipped creating it:
image
Flow history entries:

{
    "clazz": "org.openkilda.messaging.payload.history.FlowHistoryEntry",
    "flow_id": "11Mar085334_860_cloves0146",
    "timestamp": 1710146451,
    "timestamp_iso": "2024-03-11T08:40:51.324Z",
    "actor": "AUTO",
    "action": "Flow rerouting",
    "task_id": "88bc4925-e861-40f4-98ae-41cf39788b44 : W : 00:00:00:00:00:00:00:08_16 : DM : 00:00:00:00:00:00:00:08_16 : 8208bb89-97f6-47c2-b662-6f71b92fd81a : 11Mar085334_860_cloves0146",
    "details": "Reason: ISL 00:00:00:00:00:00:00:03_6 <===> 00:00:00:00:00:00:00:08_16 status become ACTIVE",
    "payload": [
      {
        "timestamp": 1710146451,
        "action": "Flow rerouting operation has been started.",
        "details": null,
        "timestamp_iso": "2024-03-11T08:40:51.324Z"
      },
      {
        "timestamp": 1710146451,
        "action": "The flow has been validated successfully",
        "details": null,
        "timestamp_iso": "2024-03-11T08:40:51.341Z"
      },
      {
        "timestamp": 1710146451,
        "action": "Couldn't find non overlapping protected path. Skipped creating it",
        "details": null,
        "timestamp_iso": "2024-03-11T08:40:51.394Z"
      },
      {
        "timestamp": 1710146451,
        "action": "The flow status was set to DEGRADED",
        "details": null,
        "timestamp_iso": "2024-03-11T08:40:51.407Z"
      },
      {
        "timestamp": 1710146451,
        "action": "Failed to reroute the flow",
        "details": "Couldn't find non overlapping protected path",
        "timestamp_iso": "2024-03-11T08:40:51.413Z"
      }
    ],
    "dumps": []
}
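
A minimal sketch of how a history entry like the one above could be checked programmatically. It assumes the entry has already been loaded as a dict with the same `payload` layout; the helper names `reroute_failure_reason` and `is_degraded_without_protected_path` are hypothetical and not part of the project.

```python
import json

# Trimmed copy of the history entry above; only the fields the helpers read.
SAMPLE_ENTRY = json.loads("""
{
  "flow_id": "11Mar085334_860_cloves0146",
  "action": "Flow rerouting",
  "payload": [
    {"action": "Flow rerouting operation has been started.", "details": null},
    {"action": "The flow has been validated successfully", "details": null},
    {"action": "Couldn't find non overlapping protected path. Skipped creating it", "details": null},
    {"action": "The flow status was set to DEGRADED", "details": null},
    {"action": "Failed to reroute the flow", "details": "Couldn't find non overlapping protected path"}
  ]
}
""")


def reroute_failure_reason(entry):
    """Return the 'details' of the failed-reroute step, or None if no such step exists."""
    for step in entry.get("payload", []):
        if step.get("action") == "Failed to reroute the flow":
            return step.get("details")
    return None


def is_degraded_without_protected_path(entry):
    """True when the reroute failed because no non-overlapping protected path was found."""
    reason = reroute_failure_reason(entry)
    return reason is not None and "non overlapping protected path" in reason


print(reroute_failure_reason(SAMPLE_ENTRY))
```

A check like this would distinguish the failure described here from a reroute that fails for another reason, such as insufficient bandwidth.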

Workaround: the flow goes into the 'Up' state when it is rerouted manually.

@pkazlenka
Collaborator Author

The same problem (I think) occurs in the test ProtectedPathSpec."Flow swaps to protected path when main path gets broken, becomes DEGRADED if protected path is unable to reroute(no bw)"()
A flow with a protected path fails to swap paths for the same reason when the flow is between virtual switches 8-7:
image

{
    "clazz": "org.openkilda.messaging.payload.history.FlowHistoryEntry",
    "flow_id": "11Mar115334_734_turnips5072",
    "timestamp": 1710154663,
    "timestamp_iso": "2024-03-11T10:57:43.233Z",
    "actor": "AUTO",
    "action": "Flow rerouting",
    "task_id": "af5afadf-d137-4282-87f9-7a178bbe9d11 : 11Mar115334_734_turnips5072",
    "details": "Reason: ISL 00:00:00:00:00:00:00:07_50 <===> 00:00:00:00:00:00:00:08_7 become INACTIVE due to physical link DOWN event on 00:00:00:00:00:00:00:07_50",
    "payload": [
      {
        "timestamp": 1710154663,
        "action": "Flow rerouting operation has been started.",
        "details": null,
        "timestamp_iso": "2024-03-11T10:57:43.233Z"
      },
      {
        "timestamp": 1710154663,
        "action": "The flow has been validated successfully",
        "details": null,
        "timestamp_iso": "2024-03-11T10:57:43.244Z"
      },
      {
        "timestamp": 1710154663,
        "action": "Couldn't find non overlapping protected path. Skipped creating it",
        "details": null,
        "timestamp_iso": "2024-03-11T10:57:43.266Z"
      },
      {
        "timestamp": 1710154663,
        "action": "The flow status was set to DEGRADED",
        "details": null,
        "timestamp_iso": "2024-03-11T10:57:43.271Z"
      },
      {
        "timestamp": 1710154663,
        "action": "Failed to reroute the flow",
        "details": "Couldn't find non overlapping protected path",
        "timestamp_iso": "2024-03-11T10:57:43.275Z"
      }
    ],
    "dumps": []
}

@izadorozhna
Collaborator

I reproduced this issue in the test "Flow swaps to protected path when main path gets broken, becomes DEGRADED if protected path is unable to reroute(no bw)" even without switches 8-9. In my case, the switch pair was 2-3. I am investigating.

@izadorozhna
Collaborator

The issue is reproduced with switches 2-3.

Here the main path has one ISL: 00:00:00:00:00:00:00:02-3 -> 00:00:00:00:00:00:00:03-1. The protected path also has only one ISL: 00:00:00:00:00:00:00:02-1 -> 00:00:00:00:00:00:00:03-2.

In this case, the main path ISL is broken, while the protected path ISL is UP and has enough bandwidth to carry the flow, so the test does not fail with "Not enough bandwidth or no path found.", since the bandwidth is sufficient. Please see the picture.

However, the new protected path (which was the main path before the swap) runs over the broken ISL, shown in red, so a new protected path cannot be found. The test then fails because the flow status is DEGRADED for a different reason.

image
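
The situation above can be sketched as a small path search. This is only an illustration under stated assumptions: the `Isl` tuple and `find_protected_path` are hypothetical names, ISLs are treated as bidirectional, and only ISL overlap with the main path is checked, which is simpler than the real path computation.

```python
from collections import deque, namedtuple

# Hypothetical ISL record: endpoints, ports, and whether the link is active.
Isl = namedtuple("Isl", "src src_port dst dst_port active")


def find_protected_path(isls, src, dst, main_path):
    """Breadth-first search over active ISLs that are not used by main_path.

    Returns the list of ISLs forming the protected path, or None if none exists.
    """
    usable = [isl for isl in isls if isl.active and isl not in main_path]
    queue = deque([(src, [])])
    visited = {src}
    while queue:
        switch, path = queue.popleft()
        if switch == dst:
            return path
        for isl in usable:
            if isl.src == switch:
                neighbour = isl.dst
            elif isl.dst == switch:
                neighbour = isl.src
            else:
                continue
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [isl]))
    return None


SW2 = "00:00:00:00:00:00:00:02"
SW3 = "00:00:00:00:00:00:00:03"

former_main = Isl(SW2, 3, SW3, 1, active=False)      # broken ISL
former_protected = Isl(SW2, 1, SW3, 2, active=True)  # carries the flow after the swap
main_path = [former_protected]

# While the old main ISL is down there is nothing left for a protected path.
no_path = find_protected_path([former_main, former_protected], SW2, SW3, main_path)

# Once the ISL is restored, a non-overlapping protected path exists again.
restored = former_main._replace(active=True)
new_protected = find_protected_path([restored, former_protected], SW2, SW3, main_path)
```

With only two ISLs between the switches, the search has no candidate while one of them is down, which matches the "Couldn't find non overlapping protected path" message in the history.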

izadorozhna added a commit that referenced this issue Apr 25, 2024
Implements #5390
Related to #5608

* Fixed the test "Flow swaps to protected path when main path gets
  broken, becomes DEGRADED if protected path is unable to
  reroute(no bw)"
* Earlier in some cases when the switchPair was set to 2-3 or 8-9,
  the protected path had only 1 ISL and it had enough BW, so the
  test failed because the BW was not reduced for some protected
  path ISLs.
* Now this test passes even when the swPair is 2-3 or 8-9, and
  other switches. So the temporary fix to skip 8-9 switches
  is removed.
@izadorozhna
Collaborator

I think that, to fix the issue, we just need to add the protected path ISLs to the list of ISLs whose bandwidth is decreased. Then the protected path ISLs will not have enough bandwidth and the test will pass.
Please see PR #5648
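
One way to read this proposal, as a rough sketch rather than the actual test code: split the ISLs between the switch pair into main, protected, and other, then reduce bandwidth on everything that is not on the main path. The function names, the string ISL identifiers, and the detour via switch 1 are all assumptions made for illustration.

```python
def partition_isls(all_paths, main_path, protected_path):
    """Split every ISL seen in all_paths into (main, protected, other) sets."""
    main = set(main_path)
    protected = set(protected_path) - main
    every_isl = {isl for path in all_paths for isl in path}
    other = every_isl - main - protected
    return main, protected, other


def isls_to_limit(all_paths, main_path, protected_path):
    """ISLs whose bandwidth would be reduced: everything except the main path."""
    _, protected, other = partition_isls(all_paths, main_path, protected_path)
    return protected | other


# Hypothetical topology for the 2-3 switch pair:
# two direct ISLs plus a detour through switch 1.
MAIN = ["2_3-3_1"]
PROTECTED = ["2_1-3_2"]
DETOUR = ["2_5-1_5", "1_6-3_6"]
ALL_PATHS = [MAIN, PROTECTED, DETOUR]

main, protected, other = partition_isls(ALL_PATHS, MAIN, PROTECTED)
limited = isls_to_limit(ALL_PATHS, MAIN, PROTECTED)
```

Keeping the three sets separate makes it explicit which ISLs belong to neither path, so the protected path ISL cannot be silently left out of the bandwidth reduction.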

izadorozhna added a commit that referenced this issue Apr 25, 2024
Implements #5390
Related to #5608

* Fixed the test "Flow swaps to protected path when main path gets
  broken, becomes DEGRADED if protected path is unable to
  reroute(no bw)"
* Earlier in some cases when the switchPair was set to 2-3 or 8-9,
  the protected path had only 1 ISL and it had enough BW, so the
  test failed because the BW was not reduced for some protected
  path ISLs.
* Now this test passes even when the swPair is 2-3 or 8-9, and
  other switches. So the temporary fix to skip 8-9 switches
  is removed.
* Removed skip 8-9 switches workaround from the test "Flow swaps
  to protected path when main path gets broken, becomes DEGRADED
  if protected path is unable to reroute(no path)"
izadorozhna added a commit that referenced this issue Apr 25, 2024
Implements #5390
Related to #5608

* Fixed the test "Flow swaps to protected path when main path gets
  broken, becomes DEGRADED if protected path is unable to
  reroute(no bw)"
* Earlier in some cases when the switchPair was set to 2-3 or 8-9,
  the protected path had only 1 ISL and it had enough BW, so the
  test failed because the BW was not reduced for some protected
  path ISLs.
* Now this test passes even when the swPair is 2-3 or 8-9, and
  other switches. So the temporary fix to skip 8-9 switches
  is removed.
* Removed the workaround to skip 8-9 from 2 tests
izadorozhna added a commit that referenced this issue May 8, 2024
Implements #5390
Related to #5608
Fixes #5653

* Fixed the test "Flow swaps to protected path when main path gets
  broken, becomes DEGRADED if protected path is unable to
  reroute(no bw)"
* Earlier the otherIsls list was not correct since it contained
  some ISLs from the protected path. Now the otherIsls (not
  involved in main or protected paths) list is correct and
  the “Couldn’t find non overlapping protected path” message
  is correct.
* Removed the workaround to skip 8-9 from 2 tests
oleksir pushed a commit that referenced this issue Jul 9, 2024
Implements #5390
Related to #5608
Fixes #5653

* Fixed the test "Flow swaps to protected path when main path gets
  broken, becomes DEGRADED if protected path is unable to
  reroute(no bw)"
* Earlier the otherIsls list was not correct since it contained
  some ISLs from the protected path. Now the otherIsls (not
  involved in main or protected paths) list is correct and
  the “Couldn’t find non overlapping protected path” message
  is correct.
* Removed the workaround to skip 8-9 from 2 tests