Skip to content
This repository has been archived by the owner on Oct 18, 2022. It is now read-only.

Custom Config reader for reading configs and secrets from environment variables #6

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

venkatvghub
Copy link

Description

This PR adds a custom config reader to read configurations from environment variables.

Why is this needed

Specifically in container environments like docker and kubernetes, we might have multiple environments and also possibly want to control secrets and configurations using kubernetes secrets / configmaps. In addition, we do not want to commit these credentials into code as well. So, in essence, instead of directly file mounting the data-sources-config.yml as a secret, we can selectively define some of the configs like hostname, port, username, password etc as kubernetes or docker secrets.
E.g.: Our default data-sources-config.yml might look like this:

...
properties:
      MySQL:
        - db:
            mysql: jdbc:mysql://?somemysqlhost:3306/?thirdeye
          user: thirdeye_user
          password: thirdeye_password
...

With this PR, we can do the following:

- className: org.apache.pinot.thirdeye.datasource.sql.SqlThirdEyeDataSource
    properties:
      MySQL:
        - db:
            mysql: jdbc:mysql://?$MYSQL_HOSTNAME:?$MYSQL_PORT/?$THIRDEYE_DATABASE
          user: ?$MYSQL_USERNAME
          password: ?$MYSQL_PASSWORD

In essence, for all the above configs, add ?$<ENVIRONMENT_VARIABLE_NAME> to ensure we can pass these as environment variables
And then the kubernetes deployment will look like the following:

...
- name: MYSQL_USERNAME
   valueFrom:
      secretKeyRef:
        key: MYSQL_USERNAME
        name: thirdeye
- name: MYSQL_PASSWORD
   valueFrom:
      secretKeyRef:
        key: MYSQL_PASSWORD
        name: thirdeye
 ....

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New Config Options : As mentioned earlier, the new data store configs are provided above
  • TODO: Update the thirdeye documentation to provide examples for the same
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

  • This Section needs a bit of update based on updates to the documentation.

Documentation

TODO

@venkatvghub venkatvghub added the enhancement New feature or request label Mar 30, 2021
@venkatvghub venkatvghub force-pushed the custom_config_reader_upstream branch 2 times, most recently from 90b30be to 6b574b4 Compare March 30, 2021 15:10
@venkatvghub venkatvghub force-pushed the custom_config_reader_upstream branch from 6b574b4 to aeb6974 Compare March 31, 2021 08:51
@suvodeep-pyne
Copy link
Contributor

suvodeep-pyne commented Apr 6, 2021

Hi @venkatvghub

I like the idea. But I was wondering if we could adhere to the pinot convention of using ${ENV_VARIABLE} instead of the above style. This would keep things consistent. @fx19880617 Let me know your thoughts here.

Also, the implementation seems to be field specific? Do all the fields respect environment variables? I think it might be confusing if one field translates the env var while the other doesn't.

@venkatvghub
Copy link
Author

venkatvghub commented Apr 6, 2021

Hi @suvodeep-pyne

  1. Regarding ${ENV_VARIABLE}: sure thats just our style internally. Can definitely change this.
  2. Regarding seems to be field specific : In essence, we are trying to cover here only the data sources config thingie. Which means, have made modifications in the sql utils related stuff and the pinot related stuff. As far as i can think, beyond these, the rest all seem like frontend specific configurations. We can surely evolve that over time. In essence, the problem i see is no single config holder across the thirdeye universe. So, once that gets addressed, these could be addressed too. Or best, this could be a starting point and we can add those alongside. Thoughts?
    Could you provide specific examples where we might need this? The sole reason for me to do this was actually dealing with kubernetes secrets across multiple environments and being able to deal with this in a more container friendly way.

@suvodeep-pyne
Copy link
Contributor

Hi @venkatvghub

we could just do data-sources-config.yaml as a start. However, I think it'd be consistent to have that file support environment variables in general rather than a specific set of fields. Also, having the implementation inside PinotThirdEyeDataSource suggests that this feature is supported only on Pinot and nothing else. That is another surprise. So I'm a bit concerned regarding the architecture of this feature and its scalability.

IMO. To support environment variables, we could start with data-sources-config.yaml; no problem there. However, I'd incline towards a generalized approach which works for all data sources and for all fields (even just inside DataSourcesConfiguration). Also, a good thing to have is a few tests to ensure that we are processing files correctly and also outline the error mechanism: if let's say the variable does not exist.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants