Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EC2 Instance sync => new data model #1146

Merged
merged 3 commits into from
Mar 31, 2023
Merged

Conversation

achantavy
Copy link
Contributor

@achantavy achantavy commented Mar 20, 2023

Refactors EC2 instance sync to use the cartography data model. Adds previously missing tests.

This is not a breaking change as the ID conventions are still the same as before (e.g. using EC2Instance instanceid for id instead of arn), even if they are inconsistent with other modules.

A follow-up PR will standardize IDs using ARNs.


for network_interface in instance['NetworkInterfaces']:
for security_group in network_interface['Groups']:
network_interface_list.append({
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@resilience-jychp - remember how we talked about cardinality? This is how I'm dealing with it in a transform function. In this case, an instance can have multiple network interfaces, and each of those can have multiple security groups, so I'll just add them each as their own entry to the list.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@achantavy did the same, not totaly convinced it's the best way to do it, but definitively do the job

@achantavy achantavy requested a review from jychp March 25, 2023 04:58
jychp
jychp previously approved these changes Mar 27, 2023
Copy link
Collaborator

@jychp jychp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This model migration seems OK
I have a much better understanding of the role of the SubResource relationship now

Did not check, but maybe module doc need to be updated after this migration ?

InstanceId=instance_data['InstanceId'],
update_tag=update_tag,
)
def transform_ec2_instances(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ramonpetgrave64 - what you think about this big function, should I split it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather that return such a big tuple. Maybe convert it to a namedtuple, so we don't have to be so careful about the ordering?

It's a big one, but if we break it into many functions, we would be repeating a bit of code to dig deep into the objects. I'm okay with it either way.

)


def test_transform_and_load_ec2_tags(neo4j_session):
@patch.object(cartography.intel.aws.ec2.instances, 'get_ec2_instances', return_value=DESCRIBE_INSTANCES['Reservations'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocker: i feel like the patch belongs attached to _ensure_local_neo4j_has_test_ec2_instance_data . Maybe you can convert it to an autouse fixture, and do the patch within there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm going to defer this one, we can refactor the test in a follow-up.

)


def test_load_instance_information(neo4j_session):
@patch.object(cartography.intel.aws.ec2.instances, 'get_ec2_instances', return_value=DESCRIBE_INSTANCES['Reservations'])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nonblocker: same comment about patches

@achantavy
Copy link
Contributor Author

Did not check, but maybe module doc need to be updated after this migration ?

Nah, this PR keeps all the fields the same so should be good.

@achantavy achantavy merged commit 915c254 into master Mar 31, 2023
@achantavy achantavy deleted the ec2instancenewdatamodel branch March 31, 2023 21:25
achantavy added a commit that referenced this pull request Apr 3, 2023
Fast follow of #1146 - removes
duplicate calls to cleanup jobs. All these manual jobs will soon be
deprecated as we refactor modules to the new data model.
chandanchowdhury pushed a commit to juju4/cartography that referenced this pull request Jun 26, 2024
Refactors EC2 instance sync to use the cartography data model. Adds
previously missing tests.

This is not a breaking change as the ID conventions are still the same
as before (e.g. using EC2Instance instanceid for id instead of arn),
even if they are inconsistent with other modules.

A follow-up PR will standardize IDs using ARNs.
chandanchowdhury pushed a commit to juju4/cartography that referenced this pull request Jun 26, 2024
Fast follow of cartography-cncf#1146 - removes
duplicate calls to cleanup jobs. All these manual jobs will soon be
deprecated as we refactor modules to the new data model.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants