-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(dashmate): invalid drive status check #2248
Conversation
WalkthroughThe pull request introduces several modifications across multiple files to enhance platform status determination and error handling. Key changes include the enhancement of the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (9)
packages/dashmate/src/status/scopes/overview.js (1)
60-62
: LGTM! Consider adding a comment for clarity.The changes look good and align with the PR objectives. The addition of the
drive
property to theplatform
object allows for more comprehensive status reporting.Consider adding a brief comment explaining the purpose of the
drive
property, similar to howtenderdash
is documented in the initialplatform
object declaration. This would improve code readability and maintainability.if (config.get('platform.enable')) { const { drive, tenderdash } = await getPlatformScope(config); + // Add drive status to the platform object platform.drive = drive; platform.tenderdash = tenderdash; }
packages/dashmate/src/status/determineStatus.js (1)
52-55
: Approved: Good addition to handle non-started status.The new conditional block correctly addresses the case when the docker status is
not_started
, returning astopped
status instead oferror
. This aligns well with the PR objectives and improves the accuracy of status reporting.For consistency with the rest of the method, consider using a single-line return statement:
if (dockerStatus === DockerStatusEnum.not_started) { - return ServiceStatusEnum.stopped; + return ServiceStatusEnum.stopped; }This minor change would make the new block consistent with the style used in the rest of the method.
packages/dashmate/src/config/generateEnvsFactory.js (1)
70-73
: LGTM! Consider future configurability for metrics URL.The implementation of the
PLATFORM_DRIVE_ABCI_METRICS_URL
environment variable looks good and aligns with the PR objectives. It correctly sets the metrics URL only when the feature is enabled.For future consideration: The hardcoded IP and port (0.0.0.0:29090) for the metrics URL might limit flexibility. Consider making these values configurable in the future if there's a possibility that different setups might require different IP/port combinations.
Also applies to: 89-89
packages/dashmate/src/commands/status/index.js (1)
111-116
: Approved: Good consolidation of platform statusThe introduction of the
platformStatus
variable effectively consolidates the status of tenderdash and drive services, prioritizing tenderdash status. This change aligns well with the PR objectives and improves the clarity of the status determination logic.Consider adding a brief comment explaining the priority given to tenderdash status over drive status. This would enhance code readability and make the intent clearer for future maintainers. For example:
// Prioritize tenderdash status over drive status for overall platform status const platformStatus = platform.tenderdash.serviceStatus !== ServiceStatusEnum.up ? platform.tenderdash.serviceStatus : platform.drive.serviceStatus;packages/dashmate/src/status/scopes/platform.js (1)
166-193
: Improved error handling and status checking.The changes enhance the robustness of the
getDriveInfo
function:
- Distinguishing between a missing container and other errors.
- Determining service status more accurately by checking if the service is responding.
- Consistent use of the newly imported
DockerComposeError
.These improvements align well with the PR objectives of addressing issues related to drive status checks.
Consider adding a comment explaining the significance of the exit code check on line 187:
// Non-zero exit code indicates an error in the drive-abci status command if (e instanceof DockerComposeError && e.dockerComposeExecutionResult && e.dockerComposeExecutionResult.exitCode !== 0) { info.serviceStatus = ServiceStatusEnum.error; } else { throw e; }packages/rs-drive-abci/src/main.rs (4)
229-229
: Adjust logging syntax for error reportingIn the
tracing::error!
macro, using the{e}
formatting inside the message is acceptable, but you can enhance clarity by using the provided syntax of thetracing
crate.Consider modifying the error logging for consistency:
- tracing::error!(error = e, "drive-abci failed: {e}"); + tracing::error!(%e, "drive-abci failed");This way, the error is logged both as structured data (
%e
uses theDisplay
implementation) and the message remains clear.
446-446
: Remove unnecessary leading colon inuse
statementThe leading
::
in the use statement is redundant since paths inuse
statements are already absolute from the crate root.Apply this diff to simplify the import:
-use ::drive::{drive::Drive, query::Element}; +use drive::{drive::Drive, query::Element};
447-448
: Organize imports for clarityThe imports in the test module can be grouped and organized for better readability.
Consider reorganizing the imports:
-use dpp::block::epoch::Epoch; -use drive::drive::credit_pools::epochs::epoch_key_constants; +use drive::{ + drive::Drive, + drive::credit_pools::epochs::{epoch_key_constants, paths::EpochProposers}, + query::Element, +}; +use dpp::block::epoch::Epoch;
112-112
: Log server start completion for better observabilityAn additional log statement indicating that the server has started successfully can improve observability.
Consider adding a log statement after the server has started:
server::start(runtime, Arc::new(platform), config, cancel); +tracing::info!("drive-abci server started successfully");
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (7)
- packages/dashmate/src/commands/status/index.js (1 hunks)
- packages/dashmate/src/config/generateEnvsFactory.js (1 hunks)
- packages/dashmate/src/status/determineStatus.js (1 hunks)
- packages/dashmate/src/status/scopes/overview.js (1 hunks)
- packages/dashmate/src/status/scopes/platform.js (2 hunks)
- packages/dashmate/test/unit/status/scopes/platform.spec.js (3 hunks)
- packages/rs-drive-abci/src/main.rs (8 hunks)
🧰 Additional context used
🔇 Additional comments (11)
packages/dashmate/src/status/determineStatus.js (2)
Line range hint
1-55
: Summary: Improved status determination with accurate handling of non-started status.The changes made to the
platform
method in this file effectively address one of the key objectives of the PR. By adding a specific condition to handle theDockerStatusEnum.not_started
case, the code now correctly reports astopped
status instead of anerror
status for non-started drives. This improvement enhances the accuracy of the status determination process and aligns well with the overall goals of the PR.The changes are well-integrated with the existing code structure and logic. They do not introduce any breaking changes and maintain the overall readability and consistency of the method.
To further ensure the robustness of this change, consider:
- Adding a unit test specifically for this new condition.
- Documenting this new behavior in any relevant documentation or comments.
Overall, these changes represent a positive improvement to the status determination logic.
52-55
: Verify consistency ofDockerStatusEnum.not_started
usage.The addition of handling for
DockerStatusEnum.not_started
is a good improvement. To ensure consistency across the codebase, let's verify how this status is used in other parts of the system.Run the following script to check the usage of
DockerStatusEnum.not_started
:This will help us ensure that the new handling of
not_started
status is consistent with its usage elsewhere in the codebase.✅ Verification successful
Consistency of
DockerStatusEnum.not_started
usage confirmed.No issues found regarding the usage of
DockerStatusEnum.not_started
across the codebase.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check usage of DockerStatusEnum.not_started across the codebase # Search for DockerStatusEnum.not_started in all JavaScript files echo "Occurrences of DockerStatusEnum.not_started:" rg --type js "DockerStatusEnum\.not_started" -n # Search for other methods that determine docker status echo "\nOther methods determining docker status:" rg --type js "docker.*status" -nLength of output: 1485
packages/dashmate/src/status/scopes/platform.js (3)
2-2
: LGTM: New import for improved error handling.The addition of the
DockerComposeError
import is appropriate for the new error handling implemented in thegetDriveInfo
function.
194-195
: LGTM: Consistent return statement placement.Moving the return statement outside of the try-catch block ensures that the
info
object is always returned, regardless of whether an error occurred. This is a good practice that improves the function's consistency and readability.
Line range hint
1-324
: Summary: Improved drive status checking aligns with PR objectives.The changes in this file, primarily in the
getDriveInfo
function, successfully address the PR objectives:
- The function now correctly handles cases where the Drive container is stopped, representing its status as "stopped" instead of "errored".
- The improved error handling ensures more accurate status reporting, particularly when the gRPC getStatus endpoint is used.
These modifications enhance the reliability of the platform status determination without introducing breaking changes. The overall structure of the file remains intact, maintaining code readability and organization.
packages/dashmate/test/unit/status/scopes/platform.spec.js (2)
1-2
: LGTM: New import for error handlingThe addition of
ContainerIsNotPresentError
import is appropriate. This new import suggests improved error handling for scenarios where a container is not present, which aligns with the PR objectives of addressing issues related to drive status checks.
382-382
: LGTM: Improved handling of non-present drive containerThese changes correctly implement the PR objective of addressing the incorrect treatment of a non-started drive. By throwing a
ContainerIsNotPresentError
for the 'drive_abci' container and updating the expecteddockerStatus
tonot_started
andserviceStatus
tostopped
, the test now accurately represents the scenario where the drive is not running. This improvement enhances the robustness of the test suite and ensures more accurate status reporting.Also applies to: 435-436
packages/rs-drive-abci/src/main.rs (4)
6-9
: Imported modules are appropriately addedThe new imports from
dapi_grpc
are necessary for the gRPC client implementation in thecheck_status
function. They are correctly specified and used in the code.
149-149
: Ensure asynchronouscheck_status
is properly awaitedThe
check_status
function is now asynchronous. Usingruntime.block_on
is appropriate for synchronously executing this asynchronous function.
419-419
: Verify conversion from verbose level to log levelWhen converting
cli.verbose
usingtry_into()
, ensure that all possible verbosity levels are correctly handled by the implementation ofTryFrom<u8>
for your log level type.To confirm that all verbosity levels are supported, review the implementation:
#!/bin/bash # Description: Check the implementation of TryFrom<u8> for LogLevel. # Search for the TryFrom implementation in the codebase. rg --type rust 'impl.*TryFrom<u8>.*for.*LogLevel'
14-14
: Confirm necessity of Prometheus importThe
Prometheus
import is retained, and it is still being used in thestart_prometheus
function. Ensure that retaining Prometheus metrics aligns with the PR objectives of moving away from relying on Prometheus for status checks.To verify if
Prometheus
is still required, check for its usage across the codebase:✅ Verification successful
Prometheus import is still necessary
The
Prometheus
import is actively used in thestart_prometheus
function and other parts of the codebase. Retaining Prometheus metrics aligns with the current implementation.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for usages of Prometheus in the codebase. # Search for 'Prometheus' import statements and usages. rg --type rust 'Prometheus'Length of output: 127
Script:
#!/bin/bash # Description: Check for definitions and usages of Prometheus and start_prometheus in the codebase. # Search for 'Prometheus' definitions and usages. rg --type rust 'Prometheus' # Search for 'start_prometheus' function definitions and usages. rg --type rust 'start_prometheus'Length of output: 1808
tracing::info!( | ||
version = env!("CARGO_PKG_VERSION"), | ||
features = list_enabled_features().join(","), | ||
rust = env!("CARGO_PKG_RUST_VERSION"), | ||
"drive-abci server initializing", | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential undefined environment variable CARGO_PKG_RUST_VERSION
The use of env!("CARGO_PKG_RUST_VERSION")
may result in a compilation error because CARGO_PKG_RUST_VERSION
is not a standard Cargo-provided environment variable. This could cause the build to fail if the environment variable isn't defined.
Consider using the rustc_version
crate to obtain the Rust compiler version:
-
Add
rustc_version
to yourCargo.toml
dependencies:[dependencies] rustc_version = "0.4"
-
Modify the code to use
rustc_version
:tracing::info!( version = env!("CARGO_PKG_VERSION"), features = list_enabled_features().join(","), - rust = env!("CARGO_PKG_RUST_VERSION"), + rust = rustc_version::version().unwrap().to_string(), "drive-abci server initializing", );
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a standard env.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shumkov, understood. Thank you for the clarification.
(・ω・)?
✏️ Learnings added
Learnt from: shumkov
PR: dashpay/platform#2248
File: packages/rs-drive-abci/src/main.rs:106-111
Timestamp: 2024-10-17T08:52:54.159Z
Learning: In this project, the environment variable `CARGO_PKG_RUST_VERSION` is defined and can be used safely with the `env!` macro.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
async fn check_status(config: &PlatformConfig) -> Result<(), String> { | ||
// Convert the gRPC bind address string to a Uri | ||
let uri = Uri::from_str(&format!("http://{}", config.grpc_bind_address)) | ||
.map_err(|e| format!("invalid url: {e}"))?; | ||
|
||
// Connect to the gRPC server | ||
let mut client = PlatformClient::connect(uri.clone()) | ||
.await | ||
.map_err(|e| format!("can't connect to grpc server {uri}: {e}"))?; | ||
|
||
// Make a request to the server | ||
let request = dapi_grpc::platform::v0::GetStatusRequest { | ||
version: Some(get_status_request::Version::V0(GetStatusRequestV0 {})), | ||
}; | ||
|
||
println!("{}", body); | ||
Ok(()) | ||
} else { | ||
Err("ABCI_PROMETHEUS_BIND_ADDRESS not defined, cannot check status".to_string()) | ||
} | ||
// Should return non-zero error code if Drive is not responding | ||
client | ||
.get_status(request) | ||
.await | ||
.map(|_| ()) | ||
.map_err(|e| format!("can't request status: {e}")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance error handling in check_status
function
The current error handling in the check_status
function could be improved to provide more context and robustness. Adding timeouts and handling specific error cases can make the function more reliable.
Consider implementing timeouts and more detailed error messages:
use tokio::time::{timeout, Duration};
async fn check_status(config: &PlatformConfig) -> Result<(), String> {
// Convert the gRPC bind address string to a Uri
let uri = Uri::from_str(&format!("http://{}", config.grpc_bind_address))
.map_err(|e| format!("Invalid gRPC URL '{}': {}", config.grpc_bind_address, e))?;
// Set a timeout duration
let timeout_duration = Duration::from_secs(5);
// Connect to the gRPC server within the timeout period
let mut client = timeout(
timeout_duration,
PlatformClient::connect(uri.clone())
)
.await
.map_err(|_| format!("Timeout connecting to gRPC server at {}", uri))?
.map_err(|e| format!("Can't connect to gRPC server {}: {}", uri, e))?;
// Create the request
let request = dapi_grpc::platform::v0::GetStatusRequest {
version: Some(get_status_request::Version::V0(GetStatusRequestV0 {})),
};
// Send the request within the timeout period
timeout(timeout_duration, client.get_status(request))
.await
.map_err(|_| "Timeout requesting status from gRPC server".to_string())?
.map(|_| ())
.map_err(|e| format!("Can't request status: {}", e))
}
This adds a timeout to both the connection and the request, providing better control over potential network issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This connection happening in local network. Either port is listening or not. We do not expect timeouts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shumkov, understood. Since the connection is within the local network and timeouts are not expected, the current implementation should suffice.
✏️ Learnings added
Learnt from: shumkov
PR: dashpay/platform#2248
File: packages/rs-drive-abci/src/main.rs:311-331
Timestamp: 2024-10-17T08:52:46.706Z
Learning: In our codebase, connections to the gRPC server happen within the local network, so timeouts are not expected, and adding timeouts is unnecessary.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
Issue being fixed or feature implemented
Drive's and Platform's statuses in dashmate doesn't work correctly due to multiple issues:
What was done?
How Has This Been Tested?
Breaking Changes
None
Checklist:
For repository code-owners and collaborators only
Summary by CodeRabbit
New Features
PLATFORM_DRIVE_ABCI_METRICS_URL
, enhancing configuration options.drive
, to the platform overview, providing more detailed service information.Bug Fixes
drive_abci
service.Documentation
Tests