Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: use commons module in simulator #271

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

ata-nas
Copy link
Contributor

@ata-nas ata-nas commented Oct 17, 2024

Description:

  • refactor :simulator to start using logic from :common

Related issue(s):
Fixes #253

Notes for reviewer:

  • have I missed any places to refactor?

Checklist

  • removed com.hedera.block.simulator.Constants from :simulator and replace usages with com.hedera.block.common.constants.StringsConstants from :common
  • removed com.hedera.block.simulator.generator.Utils from :simulator and replace usages with com.hedera.block.common.utils.FileUtilities from :common

  • Documented (Code comments, README, etc.) - N/A
  • Tested (unit, integration, etc.) - N/A

@ata-nas ata-nas added Improvement Code changes driven by non business requirements Common Issue/PR related to Common module labels Oct 17, 2024
@ata-nas ata-nas added this to the 0.2.0 milestone Oct 17, 2024
@ata-nas ata-nas self-assigned this Oct 17, 2024
@ata-nas ata-nas linked an issue Oct 17, 2024 that may be closed by this pull request
@ata-nas ata-nas marked this pull request as ready for review October 17, 2024 14:24
@ata-nas ata-nas requested a review from a team as a code owner October 17, 2024 14:24
@ata-nas ata-nas force-pushed the 253-refactor-use-commons-module-in-simulator branch 4 times, most recently from a44ffcc to db7654d Compare October 23, 2024 07:22
@ata-nas ata-nas force-pushed the 253-refactor-use-commons-module-in-simulator branch from 3e5f927 to 7b202ca Compare October 24, 2024 06:55
@ata-nas ata-nas force-pushed the 253-refactor-use-commons-module-in-simulator branch from 7b202ca to 76efcc6 Compare October 24, 2024 12:59
AlfredoG87
AlfredoG87 previously approved these changes Oct 24, 2024
Copy link
Contributor

@AlfredoG87 AlfredoG87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG, just 1 question/nit.

* @throws IOException if unable to read the file.
* @throws OutOfMemoryError if a byte array large enough to contain the file contents cannot be
* allocated (either because it exceeds MAX_INT bytes or exceeds
* available heap memory).
*/
public static byte[] readFileBytesUnsafe(@NonNull final Path filePath) throws IOException {
return readFileBytesUnsafe(filePath, ".blk", ".gz");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this use the Constants instead of hardcoding the values .blk and .gz?

Copy link
Contributor Author

@ata-nas ata-nas Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlfredoG87 good question. I was thinking about that today when reviewing #300, since that issue should IMO use this logic (see this comment in the review there). That being said, we used to have both these strings as constants in the beginning, but after a conversation we had with @jsync-swirlds and @mattp-swirldslabs we decided to remove them, since .blk is not generic and .gz we did not see any point in having, which I agree with both.

The reason you see that implemented like that is because this is a single use, does not require a constant and is intended to be here only and not visible to the outside world (@jsync-swirlds actually finished that part after I left a comment on that method, I believe this was the thought process, correct me if I am wrong). And this method was extracted from a previously existing one and it was implemented with that if-else check for the extensions. I agree that currently this is maybe not very generic because of the .blk extension check.

I would propose that this can be utilized now by @jasperpotts for this comment. and I think that the method should be refactored to not check for .blk but to be in the same fashion of what @jasperpotts currently has at 6bbb159 from my comment, and to also ommit the null check. Personally I would propose the following change to the readFileByteUnsafe:

    public static byte[] readFileBytesUnsafe(@NonNull final Path filePath) throws IOException {
        final String filePathAsString = Objects.requireNonNull(filePath).toString();
        if (filePathAsString.endsWith(".gz")) {
            return readGzipFileUnsafe(filePath);
        } else {
            return Files.readAllBytes(filePath);
        }
    }

and of course we remove the overloaded method that already exists since they have now the same parameters.

I do not believe this change will break the places this method with it's current implementation is currently used at, since they always hit either the .gz or the .blk so I expect no change in behavior, but of course this needs to be double checked.

This seems more generic to me and better fits the common module. @jsync-swirlds @mattp-swirldslabs @AlfredoG87 @jasperpotts What do you guys think about that change? I think is better to be done and @jasperpotts can use the generic logic in his tools now. Let me know

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of these extensions is likely to remain in place forever. The whole point here is to have a method that accepts parameters for file extensions and offer a sane default via an override. The values are hardcoded in the override just to avoid breaking existing code. Once existing code changes to call the "parameterized" version, this convenience method can be removed entirely.

Let's not (ever) rewrite things as a single method that hardcodes things like .gz extensions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsync-swirlds I do share your concern for breaking the existing logic if we change the implementation, I did mention that in my comment, I do not believe that if we change to an if-else we will currently break anything (but need to check thoroughly) but yet it is indeed dangerous to change the implementation in such a drastic way. So yes, based on that I do tend to agree that this should remain implemented as is and not get changed. @AlfredoG87 hope that resolves it, from my side it does, the danger of breaking something (even if it might not be visible now) is enough for me to drop the proposal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not concerned about breaking existing logic; we own that code and can fix it easily enough.

Designing methods that include hard-coded default values is just bad design. The whole point of the design before this change is that we have separate methods for default (take whatever the utility assumes with regard to file names), and specific (provide what values the caller requires with regard to file names).

Baking a filename suffix assumption into the method is just poor design, in my opinion (and is what we replaced in the PR that added this code to the common library).

My concern here is that we are taking well designed methods in a library, and making them less well-designed in order to not change logic in callers that is, itself, not well designed (because it had to be done quickly, with limited requirements visibility).

Copy link
Member

@jsync-swirlds jsync-swirlds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears you ran an IDE reformat on these files without disabling a lot of very bad defaults. It's important to keep in mind that most IDEs do not, and often cannot, correctly format the codebase. The gradle spotlessApply task is much preferred.
Most of these comments are to not apply those reformats (particularly in Javadoc which is using a very resource-expensive, completely unnecessary, and relatively difficult to read formatting pattern).

Beyond that the Constants file should not have been removed (it holds exactly the constants we should be keeping and using from the Simulator).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should actually be retained.
These constants are specific to the simulator and should be passed to the methods in common, rather than relying on defaults in that module.
We didn't leave constants out of common to hide things, but to encourage modules like simulator to retain their own constants and pass them in as parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the file should generally be retained, I removed it since all the usages it had are no longer used, and we would have had a class with either no fields (if I deleted the unused, in our case these are all fields), or only with fields that have 0 usages.

If you think I should bring that back now, let me know, and also if I bring it back, should I exclude the fields that were there, since they have 0 usages and we will be left with an empty placeholder file, or should I return it as it was and it will be a file with all fields unused?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be restored and the constants used in the appropriate calls. The reason you don't see usage is that you removed that usage. In reality we should be using these constants as parameters provided to methods in the common library when called from this module.

Copy link

codecov bot commented Oct 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.62%. Comparing base (446103e) to head (ee058f2).

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #271      +/-   ##
============================================
- Coverage     99.62%   99.62%   -0.01%     
+ Complexity      288      285       -3     
============================================
  Files            57       56       -1     
  Lines          1060     1054       -6     
  Branches         78       77       -1     
============================================
- Hits           1056     1050       -6     
  Misses            3        3              
  Partials          1        1              
Files with missing lines Coverage Δ
...a/com/hedera/block/common/utils/FileUtilities.java 95.12% <100.00%> (+0.12%) ⬆️
...ulator/generator/BlockAsDirBlockStreamManager.java 100.00% <100.00%> (ø)
...lator/generator/BlockAsFileBlockStreamManager.java 100.00% <100.00%> (ø)
.../simulator/generator/BlockAsFileLargeDataSets.java 100.00% <100.00%> (ø)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Common Issue/PR related to Common module Improvement Code changes driven by non business requirements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

refactor: use commons module in simulator
3 participants