Adjust transfer methods behaviour when interrupted #308
A transfer is attempted even if the transport has been interrupted (with a timeout). When the timeout is reached, transfer methods will return TransferResult::interrupted (-3).
This makes me wonder whether we should take this change as an opportunity to move fairmq/sdk/Error.* up to fairmq/ and add the new named transfer error codes there instead of creating a new enum?
Fine with me, as long as they keep their numerical values of -1 and -2, for backwards compatibility.
At some point we will change the return value of the transfer methods, see #85. But for now I would not break it.
ok, then let's do it at a later point in time when we touch the
87fae46 to b575626
Looks like on the new macOS 10.15 machine, the operation completes within 1 millisecond^^
diff --git a/fairmq/sdk/Topology.h b/fairmq/sdk/Topology.h
index 7f3ab9a..557d471 100644
--- a/fairmq/sdk/Topology.h
+++ b/fairmq/sdk/Topology.h
@@ -464,7 +464,7 @@ class BasicTopology : public AsioBase<Executor, Allocator>
}
}
- using Duration = std::chrono::milliseconds;
+ using Duration = std::chrono::microseconds;
using ChangeStateCompletionSignature = void(std::error_code, TopologyState);
private:
diff --git a/test/sdk/_topology.cxx b/test/sdk/_topology.cxx
index 54027e9..183da18 100644
--- a/test/sdk/_topology.cxx
+++ b/test/sdk/_topology.cxx
@@ -361,7 +361,7 @@ TEST_F(Topology, AsyncSetPropertiesTimeout)
topo.AsyncSetProperties({{"key1", "val1"}},
"",
- std::chrono::milliseconds(1),
+ std::chrono::microseconds(1),
[=](std::error_code ec, sdk::FailedDevices) mutable {
LOG(info) << ec;
EXPECT_EQ(ec, MakeErrorCode(ErrorCode::OperationTimeout));
fixes it on
Another error class I see (also on current
This happens for all
The errors in
Are the results from macOS 10.15 uploaded to cdash?
Nope, something fails with the certificate. I guess the Spack cmake does not have access to the correct CA certificate list. That's why I posted logs here.
diff --git a/test/device/_error_state.cxx b/test/device/_error_state.cxx
index 61a06f0..60bc6ac 100644
--- a/test/device/_error_state.cxx
+++ b/test/device/_error_state.cxx
@@ -74,7 +74,7 @@ void RunErrorStateIn(const string& state, const string& control, const string& i
device_thread.join();
- ASSERT_NE(string::npos, result.console_out.find("going to change to Error state from " + state + "()"));
+ // ASSERT_NE(string::npos, result.console_out.find("going to change to Error state from " + state + "()"));
exit(result.exit_code);
}
This fixes it on
I think I understand why: with newer FairLogger versions debug logging is disabled, which is why the asserted log line is missing.
Let's make it a template arg to the methods to avoid any breakage, like we do in FairMQDevice:
We can store the internal value in microseconds.
This would allow any granularity. My original idea was to require a minimal resolution. See https://godbolt.org/z/c7debW
Ok, minimal is enough. I didn't realize it can convert downwards automatically. Want to push it?
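The "converts downwards automatically" point can be illustrated with a small standard-library sketch. The linked godbolt snippet is not reproduced here; `TimeoutInMicroseconds` is a hypothetical stand-in for an API that takes its timeout as `std::chrono::microseconds`, and it only shows the `std::chrono` conversion rules, not the actual Topology code.

```cpp
#include <chrono>
#include <type_traits>

// Assumed alias, mirroring the Duration change in the Topology.h diff above.
using Duration = std::chrono::microseconds;

// Hypothetical function standing in for an API with a microsecond timeout.
// Callers may pass any coarser std::chrono duration (milliseconds, seconds):
// those conversions are lossless and therefore implicit.
inline long long TimeoutInMicroseconds(Duration timeout)
{
    return static_cast<long long>(timeout.count());
}

// Coarser-to-finer conversion is implicit ...
static_assert(std::is_convertible<std::chrono::milliseconds, Duration>::value,
              "milliseconds convert implicitly to microseconds");
// ... while a finer duration would lose precision, so it is rejected unless
// the caller asks for it explicitly with std::chrono::duration_cast.
static_assert(!std::is_convertible<std::chrono::nanoseconds, Duration>::value,
              "nanoseconds do not convert implicitly to microseconds");
```

With this shape, existing callers that pass `std::chrono::milliseconds(1)` keep compiling, while tests that need a shorter timeout can pass `std::chrono::microseconds(1)`.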
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 53df7fa..0b12f3e 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -12,6 +12,18 @@ include(GTestHelper)
# FairMQ Testsuites/helpers #
#############################
+if(FairLogger_VERSION VERSION_LESS 1.9.0 AND FairLogger_VERSION VERSION_GREATER_EQUAL 1.7.0)
+ LIST(APPEND definitions FAIR_MIN_SEVERITY=trace)
+endif()
+
+if(BUILD_OFI_TRANSPORT)
+ LIST(APPEND definitions BUILD_OFI_TRANSPORT)
+endif()
+
+if(definitions)
+ set(definitions DEFINITIONS ${definitions})
+endif()
+
add_testhelper(runTestDevice
SOURCES
helper/runTestDevice.cxx
@@ -30,16 +42,9 @@ add_testhelper(runTestDevice
helper/devices/TestExceptions.h
LINKS FairMQ
+ ${definitions}
)
-if(BUILD_OFI_TRANSPORT)
- LIST(APPEND definitions BUILD_OFI_TRANSPORT)
-endif()
-
-if(definitions)
- set(definitions DEFINITIONS ${definitions})
-endif()
-
set(MQ_CONFIG "${CMAKE_BINARY_DIR}/test/testsuite_FairMQ.IOPatterns_config.json")
set(RUN_TEST_DEVICE "${CMAKE_BINARY_DIR}/test/testhelper_runTestDevice")
set(FAIRMQ_BIN_DIR ${CMAKE_BINARY_DIR}/fairmq)
This fixes the other tests.
ya, I'll push both fixes in a minute.
…fast enough to complete within 1ms
https://cdash.gsi.de/testDetails.php?test=7170379&build=246945 @rbx I have seen this one already, does not happen always.
It fails to create some shmem resource. Looks like this particular test doesn't use a unique id for the session. I'll push a fix in a bit.
Currently, a Stop transition during the Running state will interrupt the transfers. Blocking transfer calls (Send/Receive) will return -2 (timeout). If another transfer is attempted before transitioning to Ready state, the behaviour of zmq/shmem transport is inconsistent. Another call to zmq transfer will attempt to complete the transfer with a certain timeout, while shmem transfer will not attempt any transfer and immediately return -2 again.

There are scenarios when doing another transfer can be useful. One such scenario is to send an end-of-stream message when Stop happens. See some discussion of such a use case here: https://alice.its.cern.ch/jira/browse/O2-1499

This PR:
TransferResult::error, TransferResult::timeout, TransferResult::interrupted.
TransferResult::interrupted.
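As a rough illustration of the end-of-stream use case described above, the following self-contained sketch shows how calling code might react to the new result code. The numeric values follow this thread (-1 and -2 are kept, timeout is -2, interrupted is -3), but the enum layout is an assumption, and `TransferFn`, `WasInterrupted` and `SendWithEndOfStream` are hypothetical names that are not part of FairMQ.

```cpp
#include <functional>

// Assumed layout: values taken from the discussion, not from the FairMQ sources.
enum class TransferResult : int
{
    error = -1,
    timeout = -2,
    interrupted = -3
};

// Hypothetical transfer callable: returns the number of bytes transferred
// (>= 0) or one of the negative TransferResult values.
using TransferFn = std::function<long long(int timeoutMs)>;

// Hypothetical helper: true if the return code signals an interrupted transport.
inline bool WasInterrupted(long long rc)
{
    return rc == static_cast<long long>(TransferResult::interrupted);
}

// Tries to send the payload. If the transfer was interrupted (for example by a
// Stop transition), makes one more attempt to send an end-of-stream message.
// Returns true only if the payload itself was sent.
inline bool SendWithEndOfStream(const TransferFn& sendData,
                                const TransferFn& sendEndOfStream,
                                int timeoutMs)
{
    const long long rc = sendData(timeoutMs);
    if (rc >= 0) {
        return true;
    }
    if (WasInterrupted(rc)) {
        // A further transfer is still attempted here, bounded by the timeout.
        sendEndOfStream(timeoutMs);
    }
    return false;
}
```

Distinguishing interrupted from timeout lets the caller send the final message only when a state change is in progress, rather than on every ordinary timeout.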