Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service module defaulting user to root #221

Open
1 task
thenom opened this issue Aug 27, 2024 · 14 comments
Open
1 task

Service module defaulting user to root #221

thenom opened this issue Aug 27, 2024 · 14 comments
Labels

Comments

@thenom
Copy link

thenom commented Aug 27, 2024

Description

The container definition sub module is configured to set the user to null if not specified as it should. When using the container definition submodule in the service sub module, it gets set to 0 when not supplied.

https://github.com/terraform-aws-modules/terraform-aws-ecs/blob/3b70e1e46e1b96a2da7fbfe6e2c11d44009607f1/modules/service/main.tf#L573C3-L573C93

user                     = try(each.value.user, var.container_definition_defaults.user, 0)
  • ✋ I have searched the open/closed issues and my issue is not listed.

This issue is listed and i totally agree with the original posters comment by @jackylamhk:
#190 (comment)

I also spent too long trying to find out why my container was failing when moving to this module due to this issue. Defaulting to root (especially when un-documented) is surely a security issue.

I am posting this again as i noticed that the original one was closed/locked which means this it drops off the radar. I get the breaking change but surely security should be the priority and be cause enough for this to stay on the radar.

@bryantbiggs
Copy link
Member

And what value should it be set to knowing that not setting a value causes conflicts with the ECS APIs?

@thenom
Copy link
Author

thenom commented Aug 27, 2024

The AWS documentation states that the user property of a container definition is not required:
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ContainerDefinition.html

"The user to use inside the container. This parameter maps to User in the docker container create command and the --user option to docker run."

This suggests that if it does map to the docker --user then if not provided then the User specified in the image is used. Are you saying that the problem is that the service module always provides user so you need a value? Surely if its null then it just doesnt get provided at all or am i missing something?

Out of interest, what conflict do you get when its not set? To be honest, this is the first time i have encountered the user property in a task definition as i have always driven it with the dockerfile.

@bryantbiggs
Copy link
Member

the issue is not within this module, but a conflict between the ECS API and the Terraform AWS provider. You can see here for more details hashicorp/terraform-provider-aws#11526 (comment)

While I agree that using the root user is far from ideal - it then becomes a question of what value do you use? We cannot simply guess at a UID/GUID and "hope" that works - the root user tends to work most often because of its permissions and therefore a safer default value to ensure things "just work" in the majority of cases. This then puts the onus on users to explicitly set a value that works for their container that is not the root user ID

@jackylamhk
Copy link

jackylamhk commented Aug 28, 2024

Setting the user to null also works and lets ECS use the user set in the Dockerfile. If the value 0 is needed to work around the issue mentioned, I would suggest including this in the documentation so new users are warned about this default behavior.

For example:

Note: the default user is set to 0 as a workaround for an upstream issue. If you want to change this behavior, set the user to null.

Happy to create a new PR if it looks good to you.

@bryantbiggs
Copy link
Member

Happy to create a new PR if it looks good to you.

Thank you but no thank you - if folks aren't reading their Terraform plan output, why do we think they'll read the obscure notes we put in the repo docs 😅

@thenom
Copy link
Author

thenom commented Aug 29, 2024

As a new greenfield deploy there is a lot to take in during a plan which could have (and obviously has a few times) been missed. It wouldn't do any harm to highlight this in the docs as a someone doing a new setup would be referencing them. Not sure why the offer to update them for you is not wanted, is there something i am missing here that would be a negative?

@bryantbiggs
Copy link
Member

Not sure why the offer to update them for you is not wanted, is there something i am missing here that would be a negative?

Its not a negative, I just don't see any value in that one line blurb. For example, we have this which is under an explicit docs/ directory and yet users fail to read it:

I am all for improvements but we have been down this path and it hasn't yielded the results expected so why would we go down that same path? if there is a different way of making this information more apparent, or in this case, a different way to resolve this specific issue around diffs with the API and needing to set a default of the root users - I am all ears. In fact, if anyone wants to submit a pull request, I would really really REALLY appreciate if you can submit one on the provider that removes this issue so we can set null as default and things "just work" as one would expect. That would be very useful

@thenom
Copy link
Author

thenom commented Aug 29, 2024

Ok, I still think a note on the user variable would be worthwhile but I do like a golang challenge and will take a look at the provider 😛

@psantus
Copy link

psantus commented Sep 21, 2024

Using root is not such a big issue if you stick to read-only file system

@gmeligio
Copy link

Hi. Thank you for maintaining this module and caring for improvements. I hope this gives a new perspective.

TLDR

The default should be null:

user                     = try(each.value.user, var.container_definition_defaults.user, null)

Context

I was in the process of creating a PR now because I've been using my fork in production for the past month and it's been working OK. The only change needed for this issue is that line mentioned in the issue description.

I was drafting a PR today because I saw a month ago this past issue #102 that was closed without resolution by the github-actions bot because it was stale. But when creating the PR and adding the description, I noticed this new issue and wanted your feedback.

I'm happy to discuss this further and try to help because I would like to continue using this great module.

Longer explanation for the module

The default should be null, causing AWS ECS to pass null to the container runtime and it will either resolve the the USER directive passed in the image or to root if it's not specified. So, essentially, the null value keeps compatibility with the ECS API. From my point of view, it's not the responsibility of this module nor Terraform to guess the container user. The possible values would be like this depending on the use case:

  • user = null: it would be detected by the container runtime. For example, this is preferred if the image build is controlled by passing the desired USER directive.
  • user != null: defined by the person consuming this module through using the existing input variables. For example, this could happen when the image build is not controlled, but it should be less common.

I quote two explanations from the AWS and Docker documentation below.

AWS ECS "Run containers as a non-root user"

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/security-tasks-containers.html#security-tasks-containers-recommendations-run-non-root-users

You should run containers as a non-root user. By default, containers run as the root user unless the USER directive is included in your Dockerfile. The default Linux capabilities that are assigned by Docker restrict the actions that can be run as root, but only marginally. For example, a container running as root is still not allowed to access devices.

Docker blog "Understanding the Docker USER Instruction"

https://www.docker.com/blog/understanding-the-docker-user-instruction

Remember, if you don’t set a USER in your Dockerfile, the user will default to root. Always explicitly set a user, even if it’s just to make it clear who the container will run as.

@bryantbiggs
Copy link
Member

hi @gmeligio - tl;dr - its an upstream issue, not a module issue. we're talking about things that are inconsequential unless its fixed at the source

@gmeligio
Copy link

But the module sets the root (value 0) by default, changing how the upstream services would work by default (USER directive). Shouldn't the module have sensible defaults without changing upstream behaviour? The default root (value 0) it's currently modifying the security best practices of non-root user, which are supported by the upstream services if we pass null.

I'm not following what's the upstream issue. Could you elaborate a bit more?

@gmeligio
Copy link

Sorry for not reading your specific comment. To me it's not the same issue as this one.

the issue is not within this module, but a conflict between the ECS API and the Terraform AWS provider. You can see here for more details hashicorp/terraform-provider-aws#11526 (comment)

Regarding this upstream issue you mentioned, there is a comment that says

Still an issue with 4.63.0. Setting values to the defaults or null helps.

I might be missing something and it could be related. Anyways, the workaround of passing null doesnt' work currently in this module. I tried that but it keeps forcing the root user (value 0) because the try function in Terraform fails if it finds null, unless is the last argument. Maybe using coalesce could work in this case.

@gmeligio
Copy link

I was analyzing more the following links:

After analyzing a bit more, I think I understand now why the change was introduced in #101, but maybe it was unintended to force root and not allow the default ECS behaviour, and a way of passing user = null was expected to work.

The tradeoff between the options I see is:

  1. Option 1 Security best practice: Having a default user = null that is compatible with upstream ECS, which will cause permanent diffs and replacement of the task definition, because of the upstream issue Terraform forcing replacement of ECS task definition - (only on task definitions with mount points) hashicorp/terraform-provider-aws#11526.
  2. Option 2 Predictable plan: Having default user = 0 will keep the plans happy by default and won't replace the task definition. But it will behave differently than ECS, enforcing the root user and not allowing to pass null to use the default ECS behavior because of the use of the try function.

For my use case, I always replace the task definition in the CI/CD pipeline because the image is rebuilt in each pipeline run, so I didn't experience the upstream issue. That's why I considered only Option 1. Maybe applying Option 2 is the best for the users of this module, and I'm OK with that. I would only need a workaround for setting user = null, until the upstream issue is resolved as you mentioned. I've tried some alternatives without success so far.

Does someone know a workaround when consuming this module to pass user = null to use the USER directive from the image and bypass the enforcement of root in the container definition in this line?

user                     = try(each.value.user, var.container_definition_defaults.user, 0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants