Redesign Build Deployment Process (External) (#961)
* Read secret_list from the actual file if present, else use the sample file

  Added a try/except block, similar to how the other config files are handled: use the actual (non-sample) file if it exists, otherwise fall back to the sample. The file can now be removed from the internal repo since it is no longer needed.

* Matched conf/log files with the internal repo

  Changed the level to DEBUG and the formatter to detailed, so the external and internal versions stay the same. The internal version takes priority because detailed logging is better for detecting errors. Also, debug() statements (~1500) far outnumber warning() statements (~50); all of them would be lost when the external repo is run if the default were WARNING, which is a higher level than DEBUG.

* Reading push.json values from environment variables

  push.json.sample has only 4 key-value pairs, which can be changed to environment variables internally, so code is needed to read them from the environment. Changed emission/net/ext_service/push/notify_interface.py to read from env variables, using a helper module config.py based on the existing template: first read from the .json file if it exists, else read from env variables; if the env variables are all None, report that the values are not configured.

* Choose the analysis/debug.conf file based on an ENV var

  Added a new environment variable, PROD_STAGE. If it is set to TRUE, debug.conf.internal.json is selected as the conf file. Otherwise, the existing try/except block runs, which reads debug.conf.json if present or falls back to the sample file. The try path is still needed because some test files copy the sample over as debug.conf.json and then reload the config via eac.reload_config(), which must read debug.conf.json as well.
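The selection order described in this commit can be sketched as follows (a minimal sketch; the function name and the relative paths are illustrative, not the repo's actual code):

```python
import os

def choose_debug_conf(prod_stage=None):
    """Return the analysis debug config path to load, per the order above:
    PROD_STAGE=TRUE wins, then the actual conf file, then the sample."""
    if prod_stage == "TRUE":
        return "conf/analysis/debug.conf.internal.json"
    if os.path.exists("conf/analysis/debug.conf.json"):
        return "conf/analysis/debug.conf.json"
    return "conf/analysis/debug.conf.json.sample"
```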
  Hence, three files now exist:
  - debug.conf.json.sample
  - debug.conf.json
  - debug.conf.json.internal

* Testing method to share tag between repos

  This is the first part of the method to share the server image tag with other repositories. It is shared as tag.txt, which is overwritten every time the image build runs.

* Changed webserver.conf.json to environment variables + removed sed/jq from docker_start_script.sh

  Summary: replaced config file values with environment variables; developers are expected to set env vars manually instead of editing conf files. So, for instance, when running containers with `docker-compose` or `docker run`, these values need to be set, just as the push environment variables need to be set.

  Key-value pairs in the webserver config file:
  - REMOVED: log_base_dir, python_path, log_file
  - ENV: static_path, 404_redirect, host, port, timeout, auth, aggregate_call_auth

  Changes:
  1. webserver.conf.sample - Kept the file as-is and changed how values are read in cfc_webapp.py, TestWebserver.py, and docker_start_script.sh.
  2. TestWebserver.py - In setup(): store the original ENV variables, then replace them with test values. In teardown(): restore the original ENV variables.
  3. cfc_webapp.py - Removed log_base_dir, python_path, and log_file, since they were only used in cfc_webapp.py to read values from the config file and are not used elsewhere. Added environment variables for the other config key-value pairs, which are used in cfc_webapp.py, to remove the dependence on the config file.
  4. docker_start_script.sh - Removed the sed/jq editing of webserver.conf.sample.json (and copying it over as webserver.conf.json); the script now simply sets the WEB_SERVER_HOST environment variable.

  Special notes:
  1. config.json <-> webserver.conf.sample.json - Some files still use config.json; those changes were made 7-10 years ago.
  This config.json file has the same contents as the current webserver.conf.sample - so webserver.conf.sample.json was actually config.json at some point! a028dec
  2. The sed localhost replacement isn't functionally correct - The default sample value for server.host is "0.0.0.0", but the sed command replaces "localhost" with the ENV variable value. Since the localhost keyword isn't in the sample file, the replacement never takes effect.

* Corrected logic to set test webserver ENV variables

  Initially the test values were being set inside the if block that checked whether the ENV variables already existed, so they were never set when the condition evaluated to False. The if condition should mainly store the existing values; the test values must be set in any case, so the assignment was moved outside the if block.

* Changed webserver.conf + db.conf to environment variables + removed sed/jq use

  Details of files changed:
  1. Start scripts (.docker/docker_start_script.sh, emission/integrationTests/start_integration_tests.sh, setup/tests/start_script.sh) - Replaced the sed/jq usage with directly setting the environment variable to the desired value; no need to save the sample file as the actual conf JSON file.
  2. config.py files (emission/core/config.py, emission/net/api/config.py, emission/net/ext_service/push/config.py) - Based on emission/analysis/config.py; these read from the conf files if present, else from environment variables instead of the sample files, with default values taken from the sample files. check_unset_env_vars() can be used to check whether ALL the environment variables are unset.
  3. DB, webapp, and push application files (emission/core/get_database.py, emission/net/api/cfc_webapp.py, emission/net/ext_service/push/notify_interface.py) - Changed to read through the config.py files in (2), which prefer the actual (non-sample) config files if present and fall back to environment variables.
  4. Test files (emission/integrationTests/storageTests/TestMongodbAuth.py, emission/tests/netTests/TestWebserver.py, emission/tests/netTests/TestPush.py) - These exercise the logic in (3). Earlier, config files were replaced with test values and copied over for testing; now the tests use environment variables, reaching the config files in (2) indirectly via the application files in (3). The flow for reading and restoring the original environment variable values:
     - setup(): stores the original vars if set, then sets the test ENV vars as the new values.
     - TestFunc(): importing the modules in (3) reads in the values, which are now the test values set in setup(); only the ENV vars with test values are set - unchanged ones are left untouched or read with defaults via os.getenv(var_name, default).
     - teardown(): unsets the test ENV vars with del os.environ[var_name] and restores the original values from the saved dict.

* Reverting Natalie testing artifact changes + adding image-push-merge

* Fixes for failing TestTokenQueries print assertions

  test_run_script_empty() and test_run_script_show() were failing because logging.debug() was used instead of print(), so the stdout stream never received the print("storage not configured") statement.
  Some more fixes:
  - db.conf was incorrectly read instead of webserver.conf in config.py in emission/net/api.
  - TestPush had an incomplete test, used only to invoke calls on importing the pni module.
  - TestWebserver.py: changed print() statements to logging.debug().

* Fixes for failing TestTokenQueries print assertions

  An assertion error this time in another CI GitHub Actions workflow, test-with-docker, which succeeded earlier when the other workflow (ubuntu-only-test-with-manual-install) failed: https://github.com/e-mission/e-mission-server/actions/runs/8658279606/job/23741854476
  - This happened because "storage not configured" wasn't being printed, which is what the test expects, since docker compose is used to set up the MongoDB container.
  - With docker, localhost becomes db as the DB_HOST ENV var or "url" value.
  SOLVED: added an if condition to print "storage not configured" only when the url value is still equal to the sample value "localhost".

* Try/except block brought to the top

  First check whether debug.conf exists, like the other conf files. If it does not, the except block checks whether the PROD_STAGE environment variable is set and whether debug.conf.internal is present; else, fall back to the sample conf.

* Removed extraneous seed_model.json

  Accidentally added after running all tests.

* TODO added to change to master branch in YML file

* Adding image-push-merge branch to automated CI/CD tests

* Upload artifact test - 1

  Removed test workflow execution; added upload artifact.

* Upload artifact test - 2

  Commented out the image build and push for testing purposes. Fixed indentation - learnt that YAML does not permit tabs for indentation; spaces are allowed.

* Added temporary test file

  Temporary test file while working on automating tag modification; checking whether new artifacts are generated even when overwrite is set to true.

* Changes to actions + echo

  Trying to build an image off this branch instead of the other branch, so that testing is more relevant to the changes on consolidate-differences.

* Upload artifact test - 3

  Storing the actual required tag with a date timestamp.

* Repository dispatch send - 1

  Checked out from the image-push-merge branch, which has the artifact method working.
  - Changed the trigger branch on push to tags-dispatch.
  - Dispatching a repository event from the server so that the join page repo's workflow is triggered.
  - Once this dispatch method works, artifacts may not even be needed, since repository dispatch has a client_payload option to pass data. This is helpful because only 500 artifacts are allowed in any repo: https://github.com/actions/upload-artifact#:~:text=has%20a%20limit-,of%20500%20artifacts.,-For%20assistance%20with

* Workflow dispatch send - 1

  Trying with workflow dispatch.

* Workflow dispatch send - 2

  Removing failed inputs.

* Workflow dispatch send - 3

  Testing triggering workflow dispatch again.

* Workflow dispatch send - 3

  Changed the branch name to the new branch so that join picks up workflow events from this branch.

* Workflow dispatch send - 4

  Testing whether this works with the token scopes reduced to only actions: write.

* Workflow dispatch send - 5

  Testing sending docker image tags as input values in the POST request.

* Workflow dispatch send - 6

  Sending the actual date timestamp, which is used as a suffix to the docker image tag.

* Workflow dispatch send - 7

  Removed the code for the artifact method of sending docker image tags. Workflow dispatch uses a POST request through which input data values can be sent to the repository whose workflow is to be triggered.

* Matrix build send - 1

  Copied over the YAML file from the e-mission-server repo after successfully transmitting docker image tags from e-mission-server to join with artifacts and workflow dispatch. Added another dispatch job to trigger workflows via a matrix strategy.

* Matrix build send - 2

  Updated the branch name, which had been set for the workflow dispatch without the matrix build.

* Matrix build send - 3

  Jobs completed successfully, but nothing happens in the other three target repos. I observed in real time that the matrix "dispatch" jobs start before or at the same time as the main build job and finish very quickly.
  Hence, the error makes sense, as the docker image tag is not generated yet: { "message": "Required input 'docker_image_tag' not provided", "documentation_url": "https://docs.github.com/rest/actions/workflows#create-a-workflow-dispatch-event" }
  The timeout-minutes parameter therefore makes sense to use, as mentioned in this blog post: https://www.amaysim.technology/blog/using-github-actions-to-trigger-actions-across-repos - 5 minutes seems too long; I'll give it a minute for now. Alternatively, no timeout is needed: "needs" can be used, similar to how the fetch_run_id job was run first before trying to download the artifact.

* Matrix build send - 3

  Since the date is generated in a different job, it must be accessed using the needs keyword.

* Matrix build send - 4

  Properly added the date by setting it as an output value of the first build job, storing it as an environment variable in the second job (accessed via the needs keyword), and then referencing that variable in the curl POST request.

* Matrix build send - 5

  Editing the sample file as a sanity check to see whether the matrix build works.

* Fix for "url" KeyError observed in public-dash redesign testing

  The bug was triggered by a .gitignore-d conf/ directory in public-dash which had a db.conf file. This was being loaded when docker-compose.dev.yml was used to test the dev version of public-dash. It occurred because there is a nested dictionary in the db.conf.sample and db.conf files, while I had initially stored only the nested keys (url, result_limit). Since the file was still being read, it stored the nested dictionary format with timeseries as the parent key and (url, result_limit) as children. Fixed by adding the same nested dictionary structure in emission/core/config.py and emission/core/get_database.py.

* Fix for "url" KeyError observed in public-dash redesign testing

  The bug was triggered by a .gitignore-d conf/ directory in public-dash which had a db.conf file.
  This was being loaded when docker-compose.dev.yml was used to test the dev version of public-dash. It occurred because there is a nested dictionary in the db.conf.sample and db.conf files, while I had initially stored only the nested keys (url, result_limit). Since the file was still being read, it stored the nested dictionary format with timeseries as the parent key and (url, result_limit) as children. Fixed by adding the same nested dictionary structure in emission/core/config.py and emission/core/get_database.py.

* Artifact + Matrix - 1

  Added build-and-push so that the latest image generated with the updated YAML file is available in Docker Hub. Removed the join repo from the workflow dispatch. Added an artifact so that it can be used for the push trigger event in the admin-dash and public-dash repos, with the help of the python script fetch_runID.py in those repos, to get the latest run_ID for a specific branch with the artifact. Workflow dispatch still uses the input parameters, which contain the latest timestamp tag.

* Artifact + Matrix - 2

  Changed the branch name to tags-combo-approach; this resulted in a failed workflow in admin-dash and public-dash, since the branch was still set to tags-matrix.

* Artifact + Matrix - 3

  Changed the step name to exclude join-page; it was included earlier for testing, but the focus is now on the dashboard repos, which actually use the server as the base image and hence need the workflow to be triggered.

* Revert "Adding image-push-merge branch to automated CI/CD tests"

  This reverts commit 2fdb469.

* Revert "TODO added to change to master branch in YML file"

  This reverts commit 273c2ca.

* Added TODOs in github actions workflow YAML file

  Reminder of things to change per the master branch of e-mission-server once the changes are finalized.

* Artifact + Matrix - 4

  Adding extra words to the TODO just to trigger a workflow dispatch run and check whether the .env commit action also works for this trigger type.
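The workflow_dispatch sends in these commits all reduce to one authenticated POST against the GitHub REST API. A minimal sketch that only builds the request (the owner/repo/ref names here are illustrative; `docker_image_tag` is the input named in the error message above):

```python
import json

def build_dispatch_request(owner, repo, workflow, ref, docker_image_tag):
    """Build the URL and JSON body for a workflow_dispatch POST
    (POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches)."""
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow}/dispatches")
    body = json.dumps({"ref": ref,
                       "inputs": {"docker_image_tag": docker_image_tag}})
    return url, body
```

Actually sending it (e.g. with curl) additionally needs an `Authorization: Bearer <token>` header, where the token has the `actions: write` scope mentioned in the experiments above.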
* Cleanup changes

  Getting the server changes ready for merge. Modified the yml jobs to run on master again, since this was switched off for testing. This might trigger a cascade of events; I think all the other PRs are ready, so we can see whether their jobs are triggered too.

* Update image_build_push.yml

  Re-add run on the gis-based-mode-detection branch.

* More cleanup + testing image build?

  Trying to trigger image_build_push.yml and removing more push triggers on other branches.

* Hardcoded webhost

  Changes to set the webhost to 0.0.0.0 at all times.

* secret.py

  Removing the duplicate section.

* Restore intake.conf.sample

  Reverting the logging level to WARNING.

* Reverting webserver.conf.sample

  Restoring the WARNING logging level.

* Removing check_unset_env_vars

  Removing the check_unset_env_vars functionality; DB_HOST should be caught in start_script.sh. When the image was rebuilt without this functionality and the container run, all tests passed.

* Removing check_unset_env_vars functionality

* Update image_build_push.yml

  Removing the consolidate-differences branch.

* Update docker_start_script.sh

  Removing the DB_HOST if/else functionality, as it's not necessary for the container.

* Setting DB_HOST=db

  Setting DB_HOST=db as the default for running in a docker container.

* Update docker_start_script.sh

  Removing an unnecessary echo.

* Rename debug.conf.internal.json to debug.conf.prod.json

* Update config.py

  Updating to reflect the changed names of the debug.conf.json files.

* Push to rename

  My machine is acting up and not allowing me to rename this file. Going to do it manually in GitHub, but need to make an inconsequential change first.

* Update and rename debug.conf.json.sample to debug.conf.dev.json

  Removing the inconsequential change from the last push and renaming manually. This was my issue: desktop/desktop#13588

* common.py fix?

  Thought I would quickly change the name of debug.conf.json.sample to debug.conf.dev.json; this was not inconsequential. Seeing if this fixes it.
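The check_unset_env_vars() helper removed in the commits above was described earlier as checking whether ALL the environment variables are unset (i.e. the caller should fall back to config-file or sample defaults). A minimal sketch of that described behavior, not the removed code itself; the variable list is illustrative:

```python
import os

# Variable names taken from the DB config discussed in these commits
DB_VARS = ["DB_HOST", "DB_RESULT_LIMIT"]

def check_unset_env_vars(var_names=DB_VARS):
    """Return True only when ALL of the listed environment variables
    are unset, meaning no override was provided at all."""
    return all(os.getenv(name) is None for name in var_names)
```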
* Removing redundant DB_HOST setting

  DB_HOST is hardcoded in the compose file that runs this script anyway.

* Reverting dockerfile + start script changes

  Reverting to DB_HOST='', since setting it to db probably isn't the best idea. Also removing the DB_HOST fallback in start_script.sh, since it'll be caught in config.py.

* Testing workflows with compose

  Had to revert some changes, but need to test the docker compose functionality.

* Triggering workflows.

* Reverting image_build_push.yml

* Not showing changes to branches for some reason in image_build_push.yml

  I removed the consolidate-differences branch a long time ago, but this change is not reflected in the code review space for some reason. Trying to force it to display the change.

* Adding comment to see if it resolves github display error

  GitHub isn't displaying the changes I've made to this file, namely removing the consolidate-differences branch. I'm not sure why, since when I go to edit the file in GitHub, it looks correct. So I'm adding a comment to see if it updates the file and makes it display correctly.

* 🩹 Don't cat the db.conf file

  Since it won't exist #961 (comment)

* 🩹 Use the correct filename in the gitignore

  To be consistent with 29869cd

* 🩹 Set the default value for the `DB_HOST` as well

  Since we now don't set it in the start script, we should do so in the `Dockerfile`, consistent with the `WEB_SERVER_HOST` set in MukuFlash03@912bd34. It is interesting that this was set earlier but then reverted in MukuFlash03@e6b388b. As long as we can override it, I think we should be fine.

* 🩹 Unify the supported user inputs across the debug and prod configs

  Before this change:

  ```
  < "userinput.keylist": ["manual/mode_confirm", "manual/purpose_confirm", "manual/trip_user_input", "manual/place_user_input"]
  ---
  > "userinput.keylist": ["manual/mode_confirm", "manual/purpose_confirm", "manual/replaced_mode", "manual/trip_user_input"]
  ```

  After this change, no diffs in the `userinput.keylist`.
  The only changes are to the assertions being enabled:

  ```
  $ diff conf/analysis/debug.conf.dev.json conf/analysis/debug.conf.prod.json
  3,4c3,4
  <     "intake.cleaning.clean_and_resample.speedDistanceAssertions": true,
  <     "intake.cleaning.clean_and_resample.sectionValidityAssertions": true,
  ---
  >     "intake.cleaning.clean_and_resample.speedDistanceAssertions": false,
  >     "intake.cleaning.clean_and_resample.sectionValidityAssertions": false,
  ```

* 🩹 Remove the cat of db.conf from the integration tests as well

  Similar to 2067055 but in a different file.

* 🩹 Cleanup environment variables in the basic start script

  - Remove cat
  - Don't set WEB_HOST since it is set in the Dockerfile

  Consistent with 2067055 and 51e16de.

* ♻️ Move the config to a different file name that makes more sense

  So that we can consolidate all the backwards-compat code in one place. If there is a config file and the environment variable is set, we need to decide which one wins. We will pull the code out into a common class to ensure DRY. Once the backwards compat has been removed, this can be merged into the individual config files.

* ♻️ Refactor the backwards config file to be reusable

  1. This avoids having multiple, almost identical copies of the same file.
  2. It ensures that we read the code primarily "the new way", from a dict.
  3. It should make removing the backwards-compat layer easier in the future, since we are reading from a dict with a default anyway.

  Testing done:
  ```
  $ ./e-mission-py.bash emission/tests/storageTests/TestTimeSeries.py
  ----------------------------------------------------------------------
  Ran 8 tests in 18.819s

  OK
  ```

* 🔊 Log the full backtrace if the config file is formatted incorrectly

  This allows us to troubleshoot the config files and fix them, instead of only falling back to the defaults.
* ♻️ Move the api configuration into the backwards compat as well

  Consistent with #961 (comment) and 7f1be92 (which was the original example for the database)

* ♻️ Move the api configuration into the backwards compat as well

  Consistent with #961 (comment) and 7f1be92 (which was the original example for the database)

  Testing done:
  1. Started the webapp with no overridden config
  ```
  Config file not found, returning a copy of the environment variables instead...
  Finished configuring logging for <RootLogger root (WARNING)>
  Using auth method skip
  Replaced json_dumps in plugin with the one from bson
  Changing bt.json_loads from <function <lambda> at 0x10b2b08b0> to <function <lambda> at 0x10bd6a940>
  Running with HTTPS turned OFF - use a reverse proxy on production
  Bottle v0.13-dev server starting up (using CherootServer())...
  Listening on http://0.0.0.0:8080/
  Hit Ctrl-C to quit.
  ```
  2. Started the webapp with an invalid config override
  ```
  ERROR:root:Expecting ',' delimiter: line 12 column 5 (char 464)
  Traceback (most recent call last):
    File "/Users/kshankar/Desktop/data/e-mission/gis_branch_tests/emission/core/backwards_compat_config.py", line 28, in get_config
      loaded_val = pd.json_normalize(json.load(config_file)).iloc[0]
  ...
  json.decoder.JSONDecodeError: Expecting ',' delimiter: line 12 column 5 (char 464)
  Config file not found, returning a copy of the environment variables instead...
  ```
  3. Started the webapp with a valid config override
  ```
  Finished configuring logging for <RootLogger root (WARNING)>
  Using auth method token_list
  Replaced json_dumps in plugin with the one from bson
  Changing bt.json_loads from <function <lambda> at 0x10fd2a8b0> to <function <lambda> at 0x1107e5940>
  Running with HTTPS turned OFF - use a reverse proxy on production
  Bottle v0.13-dev server starting up (using CherootServer())...
  Listening on http://0.0.0.0:8080/
  Hit Ctrl-C to quit.
  ```

* ♻️ Move the push configuration into the backwards compat as well

  Consistent with #961 (comment) and
  - 3dea305 (example for the api)
  - 7f1be92 (original example for the database)

  Testing done:
  ```
  $ ./e-mission-py.bash emission/tests/netTests/TestPush.py
  ----------------------------------------------------------------------
  Ran 5 tests in 0.276s

  OK
  ```

* 🔊 Indicate that we are using the default production config if no override is found

* ♻️ Pull out the code to reset the environment variable overrides to a common file

  This is consistent with
  - a0f0c6a (push config)
  - 3dea305 (api config)
  but ensures that we follow DRY.

* ✅ Fix the expected text while checking for tokens

  In 7f1be92, we changed the DB configuration to be based on environment variables. This changed the output text when importing the module and launching the script. This change fixes the token tests that compare the output text with a baseline, to reflect the new expected text.

* 🔥 Remove the copied over config file

  Since we now use the standard backwards compat module (a0f0c6a)

* ♻️ Access the environment variables from the config using `.get`

  So that we can get a full list of the environment variables by grepping properly:
  ```
  $ grep -r 'config.get("[A-Z]' emission/
  emission//net/ext_service/push/notify_interface_impl/firebase.py: self.server_auth_token = push_config.get("PUSH_SERVER_AUTH_TOKEN")
  emission//net/ext_service/push/notify_interface_impl/firebase.py: self.app_package_name = push_config.get("PUSH_APP_PACKAGE_NAME")
  emission//net/ext_service/push/notify_interface_impl/firebase.py: self.is_fcm_format = push_config.get("PUSH_IOS_TOKEN_FORMAT") == "fcm"
  emission//net/ext_service/push/notify_interface.py: return NotifyInterfaceFactory.getNotifyInterface(push_config.get("PUSH_PROVIDER"))
  emission//net/api/cfc_webapp.py:server_port = config.get("WEBSERVER_PORT", 8080)
  emission//net/api/cfc_webapp.py:socket_timeout = config.get("WEBSERVER_TIMEOUT", 3600)
  emission//net/api/cfc_webapp.py:auth_method = config.get("WEBSERVER_AUTH", "skip")
  emission//net/api/cfc_webapp.py:aggregate_call_auth = config.get("WEBSERVER_AGGREGATE_CALL_AUTH", "no_auth")
  emission//net/api/cfc_webapp.py:not_found_redirect = config.get("WEBSERVER_NOT_FOUND_REDIRECT", "https://nrel.gov/openpath")
  emission//core/get_database.py:url = config.get("DB_HOST", "localhost")
  emission//core/get_database.py:result_limit = config.get("DB_RESULT_LIMIT", 250000)
  ```

* ✅ Remove environment variables that are likely to be different across runs

  The tests failed because we now print all environment variables, and the `hostname` changes between runs, so comparing the output with the known good output always fails. We strip out variables that are likely to change over time (hostname, conda version...) to make this test more robust.
  1. We technically only need the hostname right now, but the conda version is stripped as well so that we don't get spurious failures when versions change.
  2. This was not an issue earlier because we read the values from the config file. We now read environment variables, which brings in variables that we did not set, so this is a new issue that we resolve by stripping them out for the baseline comparison.
  ```
  $ ./e-mission-py.bash emission/tests/storageTests/TestTokenQueries.py
  ----------------------------------------------------------------------
  Ran 21 tests in 23.591s

  OK
  ```

* ✅ Delete all irrelevant config variables

  Consistent with #961 (comment)

  Testing done:
  ```
  ----------------------------------------------------------------------
  Ran 21 tests in 23.106s

  OK
  ```

* ✅ Copy/paste the actual tests from the failed CI run

---------

Co-authored-by: Mahadik, Mukul Chandrakant <[email protected]>
Co-authored-by: Natalie Schultz <[email protected]>
Co-authored-by: Shankari <[email protected]>