Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: use commons module in server #272

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ata-nas
Copy link
Contributor

@ata-nas ata-nas commented Oct 17, 2024

Description:

  • refactor :server to start using logic from :common

Related issue(s):
Fixes #252

Notes for reviewer:

  • in the places where com.hedera.block.common.utils.FileUtilities#createPathIfNotExists is used, the previous alternative used Files#createDirectory, where the new version uses Files#createDirectories which acts quite differently, in fact, one unit test, namely testProvidesBlockWriter_IOException, which expects an exception to be thrown was not able to run successfully since it now creates the path using the new implementation with Files#createDirectories and previously it was not able to (the test is now modified). We should be careful about this since it is possible that the new usage could lead to silent bugs if unit tests do not already cover the needed edge cases.
  • have I missed any places to refactor?

Checklist

  • refactor com.hedera.block.server.Constants remove app.properties and logging.properties and use them from com.hedera.block.common.constants.StringsConstants;
  • remove com.hedera.block.server.persistence.storage.FileUtils and start using com.hedera.block.common.utils.FileUtilities instead;

  • Documented (Code comments, README, etc.)
  • Tested (unit, integration, etc.) - N/A

@ata-nas ata-nas added Improvement Code changes driven by non business requirements Common Issue/PR related to Common module labels Oct 17, 2024
@ata-nas ata-nas added this to the 0.2.0 milestone Oct 17, 2024
@ata-nas ata-nas self-assigned this Oct 17, 2024
@ata-nas ata-nas linked an issue Oct 17, 2024 that may be closed by this pull request
@ata-nas ata-nas marked this pull request as ready for review October 18, 2024 13:16
@ata-nas ata-nas requested a review from a team as a code owner October 18, 2024 13:16
@ata-nas ata-nas force-pushed the 252-refactor-use-commons-module-in-server branch 4 times, most recently from 40350e5 to b8092ec Compare October 23, 2024 07:57
@ata-nas ata-nas force-pushed the 252-refactor-use-commons-module-in-server branch from b8092ec to b58d804 Compare October 24, 2024 06:56
@ata-nas ata-nas force-pushed the 252-refactor-use-commons-module-in-server branch from b58d804 to 7ead1c1 Compare October 24, 2024 12:59
AlfredoG87
AlfredoG87 previously approved these changes Oct 24, 2024
Copy link
Contributor

@AlfredoG87 AlfredoG87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Member

@jsync-swirlds jsync-swirlds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reviewing, but these changes are not correct.

try {
createPathIfNotExists(path, ERROR, BLOCK_NODE_ROOT_DIRECTORY_SEMANTIC_NAME, true);
} catch (final IOException e) {
throw new RuntimeException(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't just wrap an exception in RuntimeException; that just makes code super hard to debug.
At minimum, if you cannot declare the exception in throws, include a message that explains clearly why an exception is being rewrapped in such a way that it is effectively hidden.

@@ -16,37 +16,40 @@

package com.hedera.block.server.persistence.storage.read;

import com.hedera.block.server.persistence.storage.FileUtils;
import static com.hedera.block.common.utils.FileUtilities.DEFAULT_DIR_PERMISSIONS;
Copy link
Member

@jsync-swirlds jsync-swirlds Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't copy the default here (it really should be private in common, and it was).
Either accept the default using the overload method, or set your own from a local constant.

Copy link
Contributor Author

@ata-nas ata-nas Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about that. It should be the responsibility of the BlockAsDirReader to either call the overloaded OR to supply it's own set that is internal to the reader, read by configuration (based on other comments of yours on the builders for the reader/writer).

On another note, from other comments you have made, it is my understanding that the BlockAsDirWriter is not a production class, which would imply that BlockAsDirReader would also not be a production class.

Also, I do agree that the permission set here should come from configuration and by default, we should use the overloaded method in the file utility class directly, but considering that these classes are not production, do we still need to provide anything other than the default (using the overloaded method in the files utility as suggested)? Meaning do you expect that for this class we will ever supply anything else than the defaults, as it is now, nothing except the default has ever been provided. If not, then could we consider to remove the filePerms field here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes, both reader and writer are not expected to be used in production, but we mustn't design them as test code, because any operator might choose to use them as the production storage for whatever reason (perhaps their node stores a small subset of block data and they want other processes to have easy access).
  2. Always try to design software well, and make the formal design documents (whether Java, C++, Rust, etc...) as readable and clear as you can within constraints.
    • This includes good practice of configurable classes with reasonable defaults managed as private data within the class.
  3. It is reasonable to remove the filePerms field, but might be better to leave it in with a default value of null.

Comment on lines -114 to +112
@NonNull final FileAttribute<Set<PosixFilePermission>> filePermissions,
@NonNull final FileAttribute<Set<PosixFilePermission>> folderPermissions,
@NonNull final FileAttribute<Set<PosixFilePermission>> permissions,
Copy link
Member

@jsync-swirlds jsync-swirlds Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong. Files and folders MUST NOT have the same permissions, the split was added very deliberately to prevent a serious design flaw.

Copy link
Contributor Author

@ata-nas ata-nas Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seem to be confused about this.

I do understand of course that files and directories must not have the same permissions, so that part is clear.

But technically we will always either create a directory or a file. We will never need both filePermissions and folderPermissions, we technically need only one set of permissions and to have the boolean createDir to either create a directory or a folder using the permissions we pass. To me, if I will supply file and folder permissions and the boolean createDir is true, I will always use only the folderPermissions and if the boolean createDir is false, then I will always use only the filePermissions, so if I will always use only one set of permissions, why should I be supplying both?

But why I must be supplying both, when only one will ever be used based on the boolean? What is the design flaw in supplying only one set of permissions? Is it because it is error prone in the sense that I can supply mistakenly directory permissions and have the boolean createDir set to false and vice versa? If that is the reason, then I would argue that forcing the caller to supply both file and folder permissions is the same in terms of being error prone, if not even more error prone since we need to now consider to not mistakenly put the file permissions in the place of the folder permissions as well?

But I believe I am missing something here and I fail to understand the design flaw of having to supply only one set of permissions and having the boolean instead of both permissions and having the boolean as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error prone nature is exactly why we have both, it's not perfect, but it's less likely to have an unnoticed error (because code review can more easily notice mismatch between input and parameter name).

For this case, if we can guarantee that no current usage needs to create a file, then rename this to only be about folders, remove file permissions and the boolean flag, and adjust the callers as appropriate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes to this file are incorrect.
You are re-introducing a severe design flaw by merging the permissions types, and the constants are not supposed to be public for good reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree for the constants of the file permissions that should be private, it does make sense for them to be only part of the FileUtilities internally.

I made them public, because I only wanted to make changes for this PR to reroute the already existing places where usages of the now deleted file utility class that was part of the server. I do however agree that some things need to be changed in general and we should have a bit of a different approach for the permission sets as you have suggested in previous comments, for example in the builders and tests and other places you have mentioned.

As for the re-introduction of a severe design flaw by merging the permission types into one parameter, I do fail to understand that part, explained in the comment above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, it comes down to file and folder permissions and the need to keep them separate.
Also, it seems the commons method should have just been for creating folders, not both files and folders (and making it more specific clears up the design flaws too).

If we need a similar method for files in the future, we're probably better off having two methods, the current method for folders, and a new method for files.

Copy link
Member

@jsync-swirlds jsync-swirlds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the rest reviewed.
Let's chat about this code, as there are obviously some gaps in terms of understanding why some things are written the way they were.

Comment on lines 174 to 179
createPathIfNotExists(
calculateBlockPath(),
DEBUG,
filePerms,
BLOCK_NODE_ROOT_DIRECTORY_SEMANTIC_NAME,
true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the override with defaults for this class; it's not a production class.
In the production classes we'll always set file and folder permissions, which should not have a default outside of the configuration records.

Copy link
Contributor Author

@ata-nas ata-nas Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about that, but since this class is not a production class, do you envision that we will ever need anything else than the defaults, meaning to use the overloaded method? If not, can we then consider to remove the filePerms field, as well as in the builder for this class and also to do the same in the BlockAsDirReader and it's builder, since it is the same thing there (hopefully I am implying correctly that the reader is also a non-production class). The only place currently where for the BlockAsDirReaderBuilder and BlockAsDirReaderWriter the filePerms field is anything other than the default (the setter in the builder is used), is in tests, but they again are currently setting the defaults themselves, so basically they are currently not doing anything. But if you envision that for these non-production classes, we will ever need anything other than the default, then they should stay as fields and be set via configs as you suggested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of the overload is to enable a class like this to accept defaults without having to manage that default value.

It's fine to remove the class attribute, but it also makes sense to allow for override in a test/development class like this. If the value is not set (it is null) then it was not overridden and the code will call the overload without a permissions. If the value is set (it is not null), then the code calls the version that requires permissions. The tricky part is that file and folder must not have the same permissions.
For now, just removing the permissions support in this class and calling the overload without the permissions parameters is acceptable.

@@ -16,8 +16,9 @@

package com.hedera.block.server.persistence.storage.write;

import static com.hedera.block.common.utils.FileUtilities.DEFAULT_DIR_PERMISSIONS;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be digging into commons for constants, any constants we need belong in this module.
That said, there should not be defaults for file or folder permissions. Those need to be in configuration or nowhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my response for this in the above comments, for the BlockAsDirWriterBuilder and BlockAsDirWriterBuilder, it applies here as well.

private final BlockNodeContext blockNodeContext;
private FileAttribute<Set<PosixFilePermission>> filePerms = FileUtils.defaultPerms;
private FileAttribute<Set<PosixFilePermission>> filePerms = DEFAULT_DIR_PERMISSIONS;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted elsewhere, it's never correct to copy a constant like this into a class attribute. Either the caller supplies the value, or it's null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my response for this in the above comments, for the BlockAsDirWriterBuilder and BlockAsDirWriterBuilder, it applies here as well.

this.blockRemover =
new BlockAsDirRemover(Path.of(config.rootPath()), FileUtils.defaultPerms);
new BlockAsDirRemover(Path.of(config.rootPath()), DEFAULT_DIR_PERMISSIONS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we setting permissions for removing folders? That seems broken and incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I am not sure as well. The file perms are used in a method inside the remover remove(long). Based on the comments inside and the single usage it has currently, I believe the method is intended as a best-effort attempt to delete side effects caused by exceptions that could occur during the writing of a BlockItem to a file. There is a retry for the write which will first attempt to fix the file permissions and if it fails again then we go into the remove(long) method of the remover. Inside the remove method, the file perms that are passed in the constructor are used as Files.setPosixFilePermissions(blockPath, filePerms.value()); before attempting to do the actual removal of the file. Maybe @mattp-swirldslabs could let us know and give us more context as to why this should be done like so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend just removing these permissions from the remover.
If we cannot delete a file, we also cannot change permissions (at least in any POSIX compliant system).

@@ -241,7 +241,7 @@ private void removeBlockItemReadPerms(
// IOException while allowing the real setPerm() method to remain protected.
private static final class TestBlockAsDirReader extends BlockAsDirReader {
public TestBlockAsDirReader(PersistenceStorageConfig config) {
super(config, FileUtils.defaultPerms);
super(config, DEFAULT_DIR_PERMISSIONS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either accept private defaults, or set explicit values in the test. Having a test explicitly set a value it doesn't control is asking for a brittle test failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed about that. Based on comments above and on what we shall agree on, if to remove or not the filePerms field in the super, this would either become null and the actual reader will use the overloaded method internally, or I will supply them by locally having them declared here and not use from the common module

@@ -76,7 +76,7 @@ public void testRemoveNonExistentBlock() throws IOException, ParseException {
}

// Remove a block that does not exist
final BlockRemover blockRemover = new BlockAsDirRemover(testPath, FileUtils.defaultPerms);
final BlockRemover blockRemover = new BlockAsDirRemover(testPath, DEFAULT_DIR_PERMISSIONS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, my comment is the same as my above response, considered only for the tests isolated.

Copy link

codecov bot commented Oct 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.61%. Comparing base (446103e) to head (4d4f691).

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #272      +/-   ##
============================================
- Coverage     99.62%   99.61%   -0.01%     
+ Complexity      288      285       -3     
============================================
  Files            57       56       -1     
  Lines          1060     1051       -9     
  Branches         78       78              
============================================
- Hits           1056     1047       -9     
  Misses            3        3              
  Partials          1        1              
Files with missing lines Coverage Δ
...a/com/hedera/block/common/utils/FileUtilities.java 95.00% <100.00%> (ø)
.../persistence/storage/PersistenceStorageConfig.java 100.00% <100.00%> (ø)
...sistence/storage/read/BlockAsDirReaderBuilder.java 100.00% <100.00%> (ø)
...er/persistence/storage/write/BlockAsDirWriter.java 100.00% <100.00%> (ø)
...istence/storage/write/BlockAsDirWriterBuilder.java 100.00% <100.00%> (ø)

@ata-nas
Copy link
Contributor Author

ata-nas commented Oct 25, 2024

@jsync-swirlds thanks for the reviews.

Keep in mind that when I was doing these changes (as well as the ones in the simulator in the other PR), my aim was to only kind of 'redirect' all current usages of the utilities and constants to the new common module that we introduced. That is why I have left everything as is and have not made any significant changes, but took the approach to modify a bit the common module in order to fit the 'redirection', which I do agree with you is not a good thing to do.

I have fixed the obvious and commented on the rest, but I agree, lets have a conversation in general about the topics here and others as well, since this would help me to gain more insight into the Hedera philosophy and the team's specific views on what we should aim for in terms of implementation and thought process. This will help avoid such big gaps and so many questions in the future. Thanks! :)

@jsync-swirlds
Copy link
Member

@jsync-swirlds thanks for the reviews.

Keep in mind that when I was doing these changes (as well as the ones in the simulator in the other PR), my aim was to only kind of 'redirect' all current usages of the utilities and constants to the new common module that we introduced. That is why I have left everything as is and have not made any significant changes, but took the approach to modify a bit the common module in order to fit the 'redirection', which I do agree with you is not a good thing to do.

I have fixed the obvious and commented on the rest, but I agree, lets have a conversation in general about the topics here and others as well, since this would help me to gain more insight into the Hedera philosophy and the team's specific views on what we should aim for in terms of implementation and thought process. This will help avoid such big gaps and so many questions in the future. Thanks! :)

I understand the intent. I think (as we discussed) that you would be better off (and the result will be better designed) to keep the shared library as clean as practical, and make whatever changes are needed in the server to correctly use that library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Common Issue/PR related to Common module Improvement Code changes driven by non business requirements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor: use commons module in server
3 participants