Governed operationalization of AI models is a framework of process, people, and technology that helps ensure the trustworthiness of AI solutions used in business. The approach integrates data and AI technologies with an open and diverse ecosystem and is rooted in the principles of trustworthy AI ethics. Governed operationalization of AI models encompasses the entire lifecycle of ML models, from inception to decommissioning. The diagram below captures the process and people aspects of this lifecycle.
For more details on governed operationalization of AI models, please refer to https://opendatascience.com/trustworthy-ai-operationalizing-ai-models-with-governance-part-1/.
The enablement materials in this GitHub repository take you through end-to-end Governed ML Operationalization for a given use case across heterogeneous platforms. In this lab, we demonstrate end-to-end pipeline creation using the Airline Delay dataset. The stages of Governed MLOps covered in this lab include:
- Model Governance Workflow Initiation
- Model Candidacy Validation
- Data Acquisition
- Model Development
- Model Validation
- Model Deployment
- Model Productionization
- Model Monitoring
Step by Step Lab Instructions:
Flight delay has become a very important subject in air transportation all over the world because of the financial losses the aviation industry incurs. According to data from the Bureau of Transportation Statistics (BTS) of the United States, over 20% of US flights were delayed during 2018, resulting in a severe economic impact equivalent to US$41 billion.
These delays inconvenience not only the airlines but also the passengers. The result is an increase in travel time, which raises expenses for food and lodging and ultimately causes stress among passengers. The airlines incur extra costs associated with their crews, aircraft repositioning, and fuel consumption while trying to reduce elapsed times, among others, which together tarnish their reputation and often result in a loss of passenger demand.
The reasons for these delays vary from air congestion to weather conditions, mechanical problems, difficulties while boarding passengers, and simply the airline's inability to handle the demand given its capacity. In this course, we showcase the machine learning operationalization (MLOps) workflow applied to a model that predicts flight delays.
There are 2 datasets we will be using for this lab.
- The first dataset contains flight information (FLIGHT_ID, MONTH, DAY, DAY_OF_THE_WEEK, DEPARTURE_DELAY, TAXI_OUT, DISTANCE, DELAYED, YEAR). The DELAYED field is the target variable to be predicted.
- The second dataset contains flight destination information (FLIGHT_ID, ORIGIN_AIRPORT, DESTINATION_AIRPORT).
Each group will get a different version of the above datasets with the same attributes/columns but different data values.
### Details of Governed MLOps lifecycle workflow
In this lab, you will run through the following modules in order to execute the Governed MLOps flow. The various phases and steps of the Governed ModelOps flow are depicted in the diagram below.
The notebooks in the various sections of this README contain instructions for executing the steps in the Governed MLOps flow.
In this part of the enablement, you will learn how to operationalize models developed and deployed on the IBM platform. This part has various modules that will take you through the stages of the model lifecycle workflow for Governed Model Operationalization.
Please note the following key aspects of this part:
- In this part, each person will execute every step of the model lifecycle workflow shown above by switching roles.
- Each person needs 1 dataset, 1 catalog, 2 projects and 2 deployment spaces. The catalog and deployment spaces are pre-created for each person in the IBM tool. However, you have to create the 2 projects (instructions provided) as you go.
This module covers Governance Workflow Initiation in IBM OpenPages with Watson. The workflow feature of IBM OpenPages is used for the model governance lifecycle for risk and compliance.
Role: This step needs to be executed by `ModelOwner`.
- Instructions are provided in the notebook: Model Governance Workflow Initiation
- After executing the instructions, you will have entered a unique model name in OpenPages. Please note the model name, which will be used for the rest of the lab. Also, the model workflow will be assigned to `ModelApprover` for Model Candidacy Validation.
This module of the workflow captures the model details as part of the Model Governance Workflow using IBM OpenPages.
Role: This step needs to be executed by `ModelApprover`.
- Log into OpenPages as ModelApprover. From the task list, select the model corresponding to the unique model name as obtained in the previous module.
- Instructions are provided in the notebook: Model Candidacy Validation
- After this step, the model workflow will be assigned to `ModelDataEngineer` for Data Acquisition.
This module provides the steps where a data engineer can source the data needed to develop a model.
3.a. In this sub-step of this module, the Data Engineer needs to review the information that is needed for data sourcing.
Role: This step needs to be executed by `ModelDataEngineer`.
- Log into OpenPages as ModelDataEngineer. From the task list, select the model corresponding to the unique model name obtained from the Model Governance Initiation module.
- Review the necessary information (Model Details, Model Catalog, details about the data source, etc.) needed for data sourcing. This is for review or information purposes only. No action needs to be taken in OpenPages yet.
3.b. In this sub-step, the Data Engineer creates a joined, virtualized view of the raw datasets by navigating back to the home page URL of Cloud Pak for Data. The virtualized dataset will then be added to your respective catalog and profiled.
Role: This step needs to be executed by `ModelDataEngineer`.
Note: The datasets are pre-loaded in DB2 and Postgres.
- Naming convention:
- <MONTH>_FLIGHT_INFORMATION for DB2
- <month>_airport_information for postgres
- We will do some feature engineering and join the datasets using Data Virtualization in the notebook below. An illustrative sketch of the join follows this list.
- Instructions are provided in the notebook: Data Acquisition
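The join itself happens in the Data Acquisition notebook; purely for orientation, the virtualized view corresponds to SQL along the following lines. This is a minimal sketch assuming the `ibm_db_dbi` driver and a January (`JAN_`) dataset per the naming convention above; the database name, host, port, and credentials are placeholders.

```python
# Illustrative only: query the joined view through Data Virtualization.
# Connection string values and the month prefix are placeholders.
import ibm_db_dbi
import pandas as pd

conn = ibm_db_dbi.connect(
    "DATABASE=BIGSQL;HOSTNAME=<dv-host>;PORT=<port>;UID=<user>;PWD=<password>;",
    "", "")

# Join flight information (DB2) with airport information (Postgres) on FLIGHT_ID.
sql = """
SELECT f.*, a.ORIGIN_AIRPORT, a.DESTINATION_AIRPORT
FROM JAN_FLIGHT_INFORMATION f
JOIN jan_airport_information a
  ON f.FLIGHT_ID = a.FLIGHT_ID
"""
df = pd.read_sql(sql, conn)
print(df.head())
```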
3.c. In this sub-step, the relevant information about data sourcing is updated in the model workflow.
Role: This step needs to be executed by `ModelDataEngineer`.
- 3.c.1. Log into OpenPages as ModelDataEngineer, navigate to the tasks section and click on your model name obtained from the Model Governance Initiation module.
- 3.c.2. Click on the Data Sourcing view and fill out the necessary fields (Training Data Asset Name, Training Data Quality Flag, indicating whether the training data quality is acceptable). A profiling sketch that illustrates this check follows this list.
  - Training Data Asset Name: Provide the name of the virtualized dataset that you created using Data Virtualization and added to the catalog.
  - Training Data Quality Flag: Set the Training Data Quality Flag to `true` if, after data profiling, you find that the data has no null values, the attributes are not skewed, etc.
- 3.c.3. Then, click on Save. Next, click on `Data Acquisition Verified` by navigating to Actions at the top right. This will move the model to the `Data Acquisition Completed` stage.
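The quality checks behind the flag can be eyeballed with a few lines of pandas. This is a minimal sketch, not the lab's profiling tooling; the file name is a placeholder for your exported training data.

```python
# Minimal profiling sketch: checks the two conditions behind the
# Training Data Quality Flag -- null values and attribute skew.
import pandas as pd

df = pd.read_csv("training.csv")  # placeholder for the exported dataset

nulls = df.isnull().sum()
skew = df.select_dtypes("number").skew()

print("Columns with nulls:\n", nulls[nulls > 0])
print("Highly skewed numeric columns (|skew| > 1):\n", skew[skew.abs() > 1])

# Set the flag to true only if both checks pass.
quality_ok = nulls.sum() == 0 and (skew.abs() <= 1).all()
print("Training Data Quality Flag:", quality_ok)
```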
3.d. In this sub-step, the model owner needs to take the appropriate action in order to make the model ready for development.
Role: This step needs to be executed by `ModelOwner`.
- 3.d.1. Log into OpenPages as ModelOwner, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 3.d.2. The Model Owner verifies the Training Data Quality Flag and the training dataset by getting the name from OpenPages and investigating the data asset in the respective catalog, navigating from the home page URL of Cloud Pak for Data.
- 3.d.3. The Model Owner then updates the `Model Life Cycle Stage` to `Approved for Development` and clicks on Save. Then, the Model Owner clicks on Actions at the top right and selects `Model Development`, indicating the model is ready for the development stage. This will move the model to the `Approved for Development` stage.
This module provides the steps where a data scientist will develop a model. In the Model Development step, the developer or data scientist creates the development project.
4.a. In this sub-step of this module, the Model Developer needs to review the information that is needed for Model Development.
Role: This step needs to be executed by `ModelDeveloper`.
- Log into OpenPages as ModelDeveloper. From the task list, select the model corresponding to the unique model name obtained from the Model Governance Initiation module.
- Review the necessary information (Model Details, Model Catalog, details about the data source such as Training Data and Data Quality, etc.).
4.b. Inspect the respective project located in Cloud Pak for Data to be used for Model Development.
Role: This step needs to be executed by `ModelDeveloper`.
- Each user will create a new development project `Airline-MLOps-Dev-User-Name`, where User-Name is allocated to you (e.g., Airline-MLOps-Development-Cluster1-User10). You will use this project on Cloud Pak for Data during the model development phase. Go to the Cloud Pak for Data home page and log in with your credentials.
- After logging in, click on the hamburger icon at the top left, scroll down to the Projects tab and click on All Projects. Then, click on New Project, provide your project name, and import the `dev` zip file provided to you at the time of project creation.
4.c. This step involves adding the virtualized dataset to your respective project.
Role: This step needs to be executed by `ModelDeveloper`.
- Instructions are provided in the notebook: Data Catalog to Project
4.d. This step converts the training dataset that you added to the project to a CSV file, which will eventually be used for Model Development.
Role: This step needs to be executed by `ModelDeveloper`.
- Instructions are provided in the Feature Transformation notebook named `Virtualization Data Format Conversion`, pre-loaded in your development project after importing the zip file. A sketch of the conversion idea follows below.
- This step is typically needed for some additional feature engineering during the model development phase. In this lab, we are not doing any additional feature engineering, for simplicity.
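The pre-loaded notebook performs the conversion; as a rough illustration, and assuming the `project_lib` API available in Cloud Pak for Data notebooks, the asset-to-CSV step might look like the sketch below. The project ID, token, and asset name are placeholders.

```python
# Illustrative only -- the Virtualization Data Format Conversion notebook
# is the source of truth. Assumes a Cloud Pak for Data notebook runtime.
import pandas as pd
from project_lib import Project

project = Project(project_id="<project-id>", project_access_token="<token>")

# Load the virtualized training asset that was added to the project.
data = project.get_file("<virtualized-asset-name>")
df = pd.read_csv(data)

# Persist it back to the project as a plain CSV for model development.
project.save_data("airline_training.csv", df.to_csv(index=False), overwrite=True)
```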
Note: After running any particular notebook, make sure to stop the kernel to avoid slowing down the cluster, as shown in the image below. Repeat this for every notebook after running it on the Cloud Pak for Data platform.
4.e. Model Development - In this sub-step, we will execute two approaches for model development using the IBM environment on Cloud Pak for Data:
Role: This step needs to be executed by `ModelDeveloper`.
- 4.e.1. Model Development using the UI-based approach:
  - 4.e.1.1. Instructions for performing Model Development using the UI-based approach (AutoAI): Model Development Using AutoAI
  - 4.e.1.2. This sub-step involves accessing and saving the performance metrics and details of the various AutoAI experiments (in the form of an Excel file). Instructions are provided in the notebook named `WML-AutoAI-API-Integration`, pre-loaded in your development project.
  - 4.e.1.3. Store the model to the catalog - Publish Model to Catalog
  - 4.e.1.4. Next, store the model experiment csv/Excel file that you created in 4.e.1.2 to the catalog. In the Data Assets section, click on the 3 dots next to the file, and select Publish to your respective catalog from the drop-down.
- 4.e.2. Model Development using the code-based approach:
  - 4.e.2.1. Instructions for performing Model Development using the code-based approach (scikit-learn model) are provided in the notebook named `Airline Model Development using Scikit-Learn`, provided in the development project. A minimal sketch of this step follows this list.
  - 4.e.2.2. Store the model to the catalog - Publish Model to Catalog
  - 4.e.2.3. Next, store the model experiment csv file you created in 4.e.2.1 into the catalog (similar to the step executed for the AutoAI model in 4.e.1). In the Data Assets section, click on the 3 dots next to the file, and select Publish to your respective catalog from the drop-down.
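For orientation, here is a minimal sketch of the code-based development step: train a scikit-learn classifier on the flight data to predict DELAYED and record the headline metric for the model experiment file. Column names follow the dataset description above; file names are placeholders, and the pre-loaded notebook remains the source of truth.

```python
# Minimal sketch, not the lab notebook: train a classifier predicting DELAYED.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("airline_training.csv")  # placeholder file name

features = ["MONTH", "DAY", "DAY_OF_THE_WEEK", "DEPARTURE_DELAY",
            "TAXI_OUT", "DISTANCE"]
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["DELAYED"], test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.3f}")

# Record the headline metric for the model experiment file (step 4.e.2.3).
pd.DataFrame([{"model": "RandomForestClassifier", "accuracy": accuracy}]) \
    .to_csv("model_experiments.csv", index=False)
```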
4.f. Create and store the necessary artifacts for Model Validation/Monitoring in the IBM platform:
Role: This step needs to be executed by `ModelDeveloper`.
Note: You must provide Watson OpenScale access to the training data. Watson OpenScale needs your training data to generate contrastive explanations, display training data statistics, and create and calibrate drift detection.
- 4.f.1. In this sub-step, you need to create the training data statistics JSON in the project and publish it to the catalog. A sketch of the idea follows this list.
  - Instructions are provided in the notebook named `Create training data Statistics JSON`, pre-loaded in your development project.
  - After executing the notebook, publish the `training_stats.json` that was stored in the project into your respective model catalog.
- 4.f.2. In this sub-step, you are provided with a drift detection model file that is pre-loaded in the respective development project.
  - Next, publish the `Drift_Model.tar.gz` file that was stored in your development project into your respective model catalog, similar to how you pushed the model and training data statistics files to the catalog previously.
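The notebook generates `training_stats.json` in the exact schema OpenScale expects; purely to illustrate what "training data statistics" means here, a hand-rolled summary could look like the sketch below (this snippet does not produce the OpenScale schema).

```python
# Illustrative only: summarize the training data as JSON. The lab notebook
# produces the actual training_stats.json in the schema OpenScale expects.
import json
import pandas as pd

df = pd.read_csv("airline_training.csv")  # placeholder file name

stats = {
    "rows": len(df),
    "columns": {
        col: {
            "dtype": str(df[col].dtype),
            "nulls": int(df[col].isnull().sum()),
            "unique": int(df[col].nunique()),
        }
        for col in df.columns
    },
}

with open("training_stats_preview.json", "w") as f:
    json.dump(stats, f, indent=2)
```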
4.g. In this sub-step, the relevant information about Model Development is updated in the model workflow.
Role: This step needs to be executed by `ModelDeveloper`.
- 4.g.1. Log into OpenPages as ModelDeveloper, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 4.g.2. Click on the `Model Development` view and fill out the necessary/key fields:
  - Model Asset Name: Enter the name of the model asset that you used to save the model in step 4.e.
  - Development System Name: Enter `IBM Platform` in the text field. For other platforms, you should provide the details appropriately.
  - Training Data Statistics File: Enter the name of the training data statistics file that you created in step 4.f.1.
  - Drift Detection Model File: Enter the name of the drift detection model file from step 4.f.2 stored in the catalog.
  - Model Experiment FileName: Enter the name of the model experiment file (csv/Excel) that you created in 4.e.1.2 and pushed to the catalog (for AutoAI; a similar step can be executed for the scikit-learn model).
  - Model Quality: Enter the overall performance metric of the model. In our case, you can use Accuracy. It can be Accuracy, Precision, F1-Score, etc., as appropriate for the use case. The Model Quality field has to be an integer. (Go back to AutoAI and look at the accuracy result of the model in the pipeline that you saved in the project.)
- 4.g.3. Then, click on Save. Next, click on `Model Development Verified` by navigating to Actions at the top right. This will move the model to the `Model Development Completed` stage.
4.h. In this sub-step, the model owner needs to take the appropriate action in order to make the model ready for validation.
Role: This step needs to be executed by `ModelOwner`.
- 4.h.1. Log into OpenPages as ModelOwner, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 4.h.2. The Model Owner verifies the model development details (Model Experiment File, Model Quality, Training Data Statistics File and Drift Detection Model File) by investigating the corresponding assets in the catalog.
- 4.h.3. The Model Owner then updates the `Model Life Cycle Stage` from `Model Development Completed` to `Approved for Validation` and clicks on Save. Then, the Model Owner clicks on Actions at the top right and selects `Model Validation`, indicating the model is ready for the validation stage. This will move the model to the `Approved for Validation` stage.
This module provides the steps where a Model Validator will perform independent validation of the model. The Model Validator will have access to the Production (or Validation, used interchangeably) project, catalog and UAT/Pre-prod Deployment Space.
5.a. In this sub-step of this module, the Model Validator needs to review the information that is needed for Model Validation.
Role: This step needs to be executed by `ModelValidator`.
- Log into OpenPages as ModelValidator. From the task list, select the model corresponding to the model name obtained from the Model Governance Initiation module.
- Review the necessary information (Model Details, Model Catalog, Model Development Details including Data Quality, Training Data Statistics File and Drift Statistics File).
5.b Model Validation:
Note: The developer has pushed the developed model to the catalog. The validator doesn't have access to the development project. The validator has to get the model from the catalog and push it to the validation/production project.
Role: This step needs to be executed by `ModelValidator`.
- 5.b.1. In this sub-step, CREATE a new validation/production project named `Airline_MLOps_Prod_User-name` in IBM Cloud Pak for Data by importing the zip file provided by the instructor. The validator then pushes the model to the validation/production project.
  - Instructions are provided in the notebook: Pushing Model to Validation Project
- 5.b.2. In this sub-step, the validator stages the model to the UAT/Pre-prod deployment space.
  - Instructions are provided in the notebook: Staging the Model in UAT/Pre-prod Deployment Space
- 5.b.3. In this sub-step, the validator smoke-tests the model using the REST API. A sketch of such a scoring request follows this list.
  - Instructions are provided in the notebook named `Getting Prediction from Model using REST`, pre-loaded in the validation project.
- 5.b.4. Configuring the model in OpenScale for independent validation:
  - 5.b.4.1. In this sub-step, we configure the model in Watson OpenScale.
    - Instructions are provided in the notebook: Configuring Model in OpenScale
  - 5.b.4.2. In this sub-step, we configure the model monitors in Watson OpenScale.
    - Instructions are provided in the notebook: Model Monitors configuration in Openscale
  - 5.b.4.3. In this sub-step, we evaluate the model in the staging area.
    - Instructions are provided in the notebook: Pre-prod Model Evaluation and Model Approval
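As a rough illustration of the smoke test in 5.b.3, a WML online deployment can be scored over REST as sketched below. The URL, token, and example values are placeholders; the payload shape follows the WML v4 scoring convention.

```python
# Illustrative smoke test: send one scoring request to the staged deployment.
import requests

scoring_url = "<deployment-scoring-url>"  # from the UAT/Pre-prod space
headers = {"Authorization": "Bearer <token>",
           "Content-Type": "application/json"}

payload = {
    "input_data": [{
        "fields": ["MONTH", "DAY", "DAY_OF_THE_WEEK", "DEPARTURE_DELAY",
                   "TAXI_OUT", "DISTANCE"],
        "values": [[6, 15, 3, 12.0, 14.0, 650.0]],  # one sample row
    }]
}

resp = requests.post(scoring_url, json=payload, headers=headers)
print(resp.status_code, resp.json())
```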
5.c. In this sub-step, the summary information (the detailed information was already updated in the previous step) about Model Validation is updated in the model workflow.
Role: This step needs to be executed by `ModelValidator`.
- 5.c.1. Log into OpenPages as ModelValidator, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 5.c.2. Click on the `Model Validation` view and fill out the necessary/key fields (Model Validation Report, Validation Dataset).
  - Model Validation Report: Enter the validation report file that was uploaded from OpenScale to OpenPages. To get the name of the report file that was sent to OpenPages, navigate to the `Issues and Documents` section in the OpenPages model workflow and copy the validation report filename.
  - Validation Dataset: Enter the name of the validation dataset from the catalog that was used for validation.
- 5.c.3. Then, click on Save. Next, click on `Model Validation Verified` by navigating to Actions at the top right. This will move the model to the `Model Validation Completed` stage.
5.d. In this sub-step, the model owner needs to take the appropriate action in order to make the model ready for deployment.
Role: This step needs to be executed by `ModelOwner`.
- 5.d.1. Log into OpenPages as ModelOwner, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 5.d.2. The Model Owner verifies the model validation details by reviewing the Model Validation Report file (by navigating to the `Issues and Documents` section in the OpenPages model workflow) and the validation dataset stored in the catalog.
  - Note: The Model Validation Report file stored in the `Issues and Documents` section in OpenPages will be named differently from what you saved from OpenScale. Make sure you use the name fetched from OpenPages.
- 5.d.3. The Model Owner then updates the `Model Life Cycle Stage` from `Model Validation Completed` to `Approved for Deployment` and clicks on Save. Then, the Model Owner clicks on Actions at the top right and selects `Model Deployment`, indicating the model is ready for the deployment stage. This will move the model to the `Approved for Deployment` stage.
This module provides the steps where a Model Deployer will perform deployment of the model.
6.a. In this sub-step of this module, the Model Deployer needs to review the information that is needed for Model Deployment.
Role: This step needs to be executed by `ModelDeployer`.
- Log into OpenPages as ModelDeployer. From the task list, select the model corresponding to the model name obtained from the Model Governance Initiation module.
- Review the necessary information (Model Development Details, Model Catalog, Model Validation Details).
Note: There are 2 approaches for pushing a model from the pre-prod to the production environment. In this step, we will follow both approaches to push the model created in the IBM environment into production.
6.b. Model Deployment in Production Environment:
Role: This step needs to be executed by `ModelDeployer`.
- 6.b.1. In this sub-step, we push the model into production using the CI/CD approach. A minimal sketch of the promotion idea follows this list.
  - Follow the instructions in the notebook: Notebook_Instructions
  - This covers the CI/CD approach for model deployment from the pre-prod to the production environment using the code-based approach.
- 6.b.2. In this sub-step, we push the AutoAI model into production using the UI-based approach.
  - Instructions are provided in the notebook: Model Deployment and Configuration for Monitoring
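For orientation, the code-based promotion in 6.b.1 boils down to storing the validated model in the production space and creating an online deployment. This is a minimal sketch assuming the `ibm_watson_machine_learning` client; credentials, IDs, the runtime name, and the model type string are placeholders that depend on your cluster and library version, and the notebook remains authoritative.

```python
# Minimal sketch of code-based promotion to the production space.
from ibm_watson_machine_learning import APIClient

wml_credentials = {"url": "<cpd-url>", "username": "<user>",
                   "apikey": "<api-key>", "instance_id": "openshift",
                   "version": "4.0"}  # placeholder credentials
client = APIClient(wml_credentials)

# Point the client at the production deployment space.
client.set.default_space("<production-space-id>")

# Store the validated model in the production space...
model_details = client.repository.store_model(
    model="<path-or-model-object>",
    meta_props={
        client.repository.ModelMetaNames.NAME: "airline-delay-model",
        client.repository.ModelMetaNames.TYPE: "scikit-learn_1.0",
        client.repository.ModelMetaNames.SOFTWARE_SPEC_UID:
            client.software_specifications.get_uid_by_name("runtime-22.1-py3.9"),
    },
)
model_uid = client.repository.get_model_uid(model_details)

# ...then expose it as an online deployment.
deployment = client.deployments.create(
    model_uid,
    meta_props={
        client.deployments.ConfigurationMetaNames.NAME: "airline-delay-prod",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    },
)
print(client.deployments.get_scoring_href(deployment))
```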
6.c. In this sub-step, the summary information (the detailed information was already updated in the previous step) about Model Deployment is updated in the model workflow.
Role: This step needs to be executed by `ModelDeployer`.
- 6.c.1. Log into OpenPages as ModelDeployer, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 6.c.2. Click on the `Model Deployment` view and fill out the necessary/key fields.
  - Endpoint: Enter the endpoint URL that you can retrieve from the WML deployment.
  - Model Deployment Space: Enter the model deployment space of the production environment where the model was deployed.
  - Deployment Platform: Select WML as the deployment platform from the drop-down menu.
- 6.c.3. Then, click on Save. Next, click on `Model Deployment Verified` by navigating to Actions at the top right. This will move the model to the `Model Deployment Completed` stage.
6.d. In this sub-step, the model owner needs to take the appropriate action in order to make the model ready for productionization.
Role: This step needs to be executed by `ModelOwner`.
- 6.d.1. Log into OpenPages as ModelOwner, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 6.d.2. The Model Owner verifies the model deployment details by reviewing the model deployment endpoint.
- 6.d.3. The Model Owner then updates the `Model Life Cycle Stage` from `Model Deployment Completed` to `Approved for Productionization` and clicks on Save. Then, the Model Owner clicks on Actions at the top right and selects `Productionize Model`, indicating the model is ready for productionization. This will move the model to the `Approved for Productionization` stage.
This module provides the steps where a Model Deployer will perform configuration for Productionization of the model.
7.a. In this sub-step of this module, the Model Deployer needs to review the information that is needed for Model Productionization.
Role: This step needs to be executed by `ModelDeployer`.
- Log into OpenPages as ModelDeployer. From the task list, select the model corresponding to the model name obtained from the Model Governance Initiation module.
- Review the necessary information (Model Development Details, Model Catalog, Model Validation Details and Model Deployment Details).
7.b. Configure Models in Production Environment:
Role: This step needs to be executed by `ModelDeployer`.
- 7.b.1. In this sub-step, we create the WML model wrapper function in the production environment. A sketch of the wrapper pattern follows this list.
  - Instructions are provided in the notebook: WML Model Wrapper Function
- 7.b.2. In this sub-step, we configure the model for monitoring in production.
  - Instructions are provided in the notebook: Openscale Configuration via API
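The wrapper pattern for WML deployable Python functions is roughly as follows: an outer function captures configuration, and the inner score() function forwards payloads to the underlying model deployment. This is a sketch with placeholder endpoint and token handling; the WML Model Wrapper Function notebook defines the real wrapper.

```python
# Sketch of a WML deployable Python function wrapping a model endpoint.
# The outer function closes over configuration; WML invokes the inner score().
def airline_wrapper_function(params={"scoring_url": "<model-scoring-url>",
                                     "token": "<bearer-token>"}):
    import requests

    def score(payload):
        # Forward the scoring payload to the underlying model deployment
        # and return its response as the function output.
        headers = {"Authorization": "Bearer " + params["token"],
                   "Content-Type": "application/json"}
        resp = requests.post(params["scoring_url"], json=payload,
                             headers=headers)
        return resp.json()

    return score
```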
7.c. In this sub-step, the summary information (the detailed information was already updated in the previous step) about Model Productionization is updated in the model workflow.
Role: This step needs to be executed by `ModelDeployer`.
- 7.c.1. Log into OpenPages as ModelDeployer, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 7.c.2. Click on the `Model Deployment` view and fill out the necessary/key fields.
  - Wrapper Function Endpoint: Enter the endpoint URL that you can retrieve from the production deployment space.
  - Wrapper Function Name: Enter the wrapper function name you retrieve from the production deployment space.
  - Model Monitoring URL: Enter the model monitoring subscription URL (the browser URL of the monitoring tile for the wrapper function in OpenScale).
- 7.c.3. Then, click on Save. Next, click on `Verify Model Readiness for Production` by navigating to Actions at the top right. This will move the model to the `Pre Production Review` stage.
7.d. In this sub-step, the model owner needs to take the appropriate action in order to make the model ready for production use.
Role: This step needs to be executed by `ModelOwner`.
- 7.d.1. Log into OpenPages as ModelOwner, navigate to the tasks section and click on the model name obtained from the Model Governance Initiation module.
- 7.d.2. The Model Owner verifies the model readiness details in OpenPages.
- 7.d.3. The Model Owner then updates the `Model Life Cycle Stage` from `Pre Production Review` to `Ready to Use` and clicks on Save. Then, the Model Owner clicks on Actions at the top right and selects `Ready to Use`, indicating the model is ready for use.
In this sub-step, the Model Deployer will send scoring requests to the model and investigate how model monitoring works.
Role: This step needs to be executed by `ModelDeployer`.
Run the notebooks below in sequence for the developed model that has been pushed to production.
8.a. Instructions are provided in the notebook named `Send Scoring Requests from Virtualized Data`, pre-loaded in your validation/production project.
8.b. Model Monitoring in Production Environment
8.c. Investigate Monitoring Results using Watson OpenScale
8.d. Check the usability status in OpenPages - the usability status in OpenPages should change to red after a few minutes.
Note: The usability status in OpenPages depends on values received from OpenScale. The usability status changes from Accept to Reject if any of the model monitoring metrics fall below their thresholds and appear as red.
8.e. Test the model again with a new scoring request from the UI, as you did during smoke-testing in the validation phase. This time, there will be an error response indicating that model execution is not allowed.
In this part of the enablement, you will learn how to operationalize models developed and deployed on a non-IBM platform.
In this part, you will execute a project as a group, working through the various stages of the model lifecycle workflow for Governed Model Operationalization.
Please note the following key aspects of this project:
- Each group will execute 1 end-to-end model lifecycle workflow.
- Each person in the group selects one (or 2) roles to perform. The roles to select from are: Model Owner, Model Approver, Model Data Engineer, Model Developer, Model Validator, Model Deployer, and Model Monitor.
- In this part, each person will execute each step of the model lifecycle workflow (shown above) based on the role he/she has selected.
- For the entire group, there will be 1 dataset, 1 catalog, 2 projects and 2 deployment spaces. The catalog and deployment spaces are pre-created for each person in the IBM tool. However, you have to create the dataset and the 2 projects (instructions provided) as you go.
Repeat the exact same steps for this module as in Part 1. Create a new catalog for your group. Add all your group members to the catalog with the Editor profile.
This time, this module should be executed by the appropriate person(s) from the group who play the respective role(s) applicable for this module.
Repeat the exact same steps for this module as in Part 1.
This time, this module should be executed by the appropriate person(s) from the group who play the respective role(s) applicable for this module.
Repeat the exact same steps for this module as in Part 1.
This time, this module should be executed by the appropriate person(s) from the group who play the respective role(s) applicable for this module. The Data Engineer can use the same dataset that he/she used for Part 1.
In this step, the Model Developer will develop the model outside the IBM platform.
Role: The following steps need to be executed by `ModelDeveloper`.
4.a. Follow the same instructions for this sub-step as in Part 1.
4.b. Create a group-level development project for your group. Use the same zip file that you used for creating the development project in Part 1. Add the other group members (Data Engineer, Model Owner) to the project.
4.c. Follow the same instructions for this sub-step as in Part 1. Use the group-level catalog, the group-level dataset and the group-level development project.
4.d. Follow the same instructions for this sub-step as in Part 1.
4.e. Model Development outside the IBM environment (3rd-party environment) - Instructions are provided in the notebook named `Airline Model development Heroku`, pre-loaded in the development project. You can download the csv file that you created in the previous step to your platform/machine to build the model. After the model is developed, upload the pkl file to the development project. Then move this pkl file to the group-level catalog. A sketch of exporting the model as a pkl file follows below.
- Note: If you want to skip this step, the model pkl file will be provided to you through your respective catalog. Simply push the model file from the catalog to your project.
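As a rough illustration of producing the pkl artifact outside the IBM platform (file and column names are placeholders consistent with the earlier steps):

```python
# Illustrative sketch: train locally and export the model as a pkl file
# for upload to the development project.
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("airline_training.csv")  # the csv downloaded in step 4.d
features = ["MONTH", "DAY", "DAY_OF_THE_WEEK", "DEPARTURE_DELAY",
            "TAXI_OUT", "DISTANCE"]

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(df[features], df["DELAYED"])

joblib.dump(model, "airline_delay_model.pkl")  # upload this to the project
```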
4.f. Create and store the necessary artifacts for Model Validation/Monitoring in the catalog. You can use the same artifacts that you created in Part 1.
4.g. Follow the same instructions for this sub-step as in Part 1.
Role: The following step needs to be executed by `ModelOwner`.
4.h. Follow the same instructions for this sub-step as in Part 1.
In this part, we will skip the actual Model Validation step due to limited time. Follow the instructions below.
Role: The following steps need to be executed by `ModelValidator`.
5.a. Review the same sub-step in Part 1. However, skip executing this step for this part.
5.b. In this sub-step, create a new group-level operationalization project by using the same zip file that you used in Part 1. Add the other group members (Data Engineer, Model Owner, Model Deployer, Data Scientist) to the project.
5.c. Execute the instructions for the same sub-step in Part 1. You can use the same validation dataset name and report as in Part 1.
Role: The following step needs to be executed by `ModelOwner`.
5.d. Execute the instructions for the same sub-step in Part 1.
In this step, we deploy the developed model using the Heroku platform (3rd-party environment).
Role: The following steps need to be executed by `ModelDeployer`.
6.a. Follow the instructions as in Part 1 for this sub-step.
6.b. Follow the instructions provided in the notebook named `Deploy model to Heroku`, pre-loaded in your validation/production project. A minimal sketch of a Heroku scoring app follows below.
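For orientation, a Heroku deployment of a pickled scikit-learn model typically boils down to a small web app like the sketch below. The file names and request schema are placeholders; the notebook defines the actual app.

```python
# app.py -- illustrative Flask scoring service for the pickled model.
# Deployed to Heroku with a Procfile such as: web: gunicorn app:app
import joblib
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("airline_delay_model.pkl")

@app.route("/predict", methods=["POST"])
def predict():
    # Expected payload (placeholder schema):
    # {"fields": [...], "values": [[...], ...]}
    payload = request.get_json()
    df = pd.DataFrame(payload["values"], columns=payload["fields"])
    return jsonify({"predictions": model.predict(df).tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```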
6.c. Execute the instructions for the same sub-step in Part 1. Use the following:
- Endpoint: Enter the endpoint URL that you got after deploying the model to Heroku.
- Deployment Platform: Select Heroku (or Data Robot - as we mimic the Data Robot signature) as the deployment platform from the drop-down menu.
- Model Deployment Space: Enter the model deployment space of the wrapper function for the model deployed in Heroku.
Role: The following step needs to be executed by `ModelOwner`.
6.d. Execute the instructions for the same sub-step in Part 1.
This module provides the steps where a Model Deployer will perform configuration for Productionization of the model.
Role: The following steps need to be executed by `ModelDeployer`.
7.a. Follow the instructions as in Part 1 for this sub-step.
7.b. Configure Models in Production Environment:
- 7.b.1. In this step, you create a wrapper function in the production deployment space of the IBM platform for the model deployed in Heroku. Follow the steps provided in this notebook: Model Wrapper Function in Heroku Instructions
- 7.b.2. In this sub-step, you configure the model (the wrapper function) for monitoring in production. A sketch of the monitor configuration idea follows this list.
  - Instructions are provided in the notebook named `Configure Openscale via API Instructions`, pre-loaded in your validation/production project.
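For orientation, enabling a monitor on the wrapper function's subscription via the OpenScale API looks roughly like the sketch below. This is a minimal sketch assuming the `ibm_watson_openscale` client; the URL, credentials, data mart ID, subscription ID, and parameter values are placeholders, and the pre-loaded notebook is authoritative.

```python
# Minimal sketch: enable a quality monitor on an OpenScale subscription.
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient
from ibm_watson_openscale.supporting_classes.enums import TargetTypes
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Target

authenticator = CloudPakForDataAuthenticator(
    url="<cpd-url>", username="<user>", password="<password>",
    disable_ssl_verification=True)
wos_client = APIClient(service_url="<cpd-url>", authenticator=authenticator)

wos_client.subscriptions.show()  # find the wrapper function's subscription

target = Target(target_type=TargetTypes.SUBSCRIPTION,
                target_id="<subscription-id>")
wos_client.monitor_instances.create(
    data_mart_id="<data-mart-id>",
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters={"min_feedback_data_size": 50},
)
```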
7.c. Follow the instructions as in Part 1 for this sub-step. Use the following:
- Wrapper Function Endpoint: Enter the endpoint URL that you can retrieve from the production deployment space.
- Wrapper Function Name: Enter the wrapper function name you retrieve from the production deployment space.
- Model Monitoring URL: Enter the model monitoring subscription URL (the browser URL of the monitoring tile for the wrapper function in OpenScale).
Role: The following step needs to be executed by `ModelOwner`.
7.d. Execute the instructions for the same sub-step in Part 1.
In this sub-step, the Model Deployer will send scoring requests to the model and investigate how model monitoring works.
Role: The following steps need to be executed by `ModelDeployer`.
Run the notebooks below in sequence for the developed model that has been pushed to production.
8.a. Sending scoring requests in the production environment:
- 8.a.1. Instructions are provided in the notebook named `Send Scoring Requests from Virtualized Data`, pre-loaded in your validation/production project.
- 8.a.2. (Optional) Instructions are provided in the notebook named `Sending Mult. Scoring Req. to Deployment`, pre-loaded in your validation/production project, which sends random scoring requests to the productionized model.
8.b. Model Monitoring in Production Environment
8.c. Investigate Monitoring Results using Watson OpenScale
8.d. Check the usability status in OpenPages - the usability status in OpenPages should change to red after a few minutes.
Note: The usability status in OpenPages depends on values received from OpenScale. The usability status changes from Accept to Reject if any of the model monitoring metrics fall below their thresholds and appear as red.
8.e. Test the model again with a new scoring request from the UI, as you did during smoke-testing in the validation phase. This time, there will be an error response indicating that model execution is not allowed.
- Model Development in an open-source environment using a Kedro pipeline - Model Development with Kedro Pipeline. A minimal Kedro sketch follows this list.
- GitHub Integration for the Kedro pipeline: You will have to integrate your Cloud Pak for Data project with GitHub if you intend to work with JupyterLab, as per the following instructions: Github Integration
- Store the model to the catalog in `Model Store` - Publish Model to Catalog. This will store the respective model to the catalog.
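For orientation, a Kedro version of the development step wraps the training logic in a node inside a pipeline. This is a minimal sketch; the catalog entry names (`airline_training`, `delay_model`) are placeholders, and the linked material defines the actual pipeline.

```python
# Minimal Kedro sketch: one node that trains the delay model in a pipeline.
import pandas as pd
from kedro.pipeline import Pipeline, node
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["MONTH", "DAY", "DAY_OF_THE_WEEK", "DEPARTURE_DELAY",
            "TAXI_OUT", "DISTANCE"]

def train_model(df: pd.DataFrame) -> RandomForestClassifier:
    """Train the delay classifier on the joined flight data."""
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(df[FEATURES], df["DELAYED"])
    return model

def create_pipeline(**kwargs) -> Pipeline:
    # Inputs/outputs refer to entries in the Kedro data catalog.
    return Pipeline([
        node(train_model, inputs="airline_training", outputs="delay_model",
             name="train_delay_model"),
    ])
```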
- UI-based production environment push
- 7.b.2. In this sub-step, run the model configuration in production in OpenScale:
  - 7.b.2.1. This sub-step involves configuring the model in the production space. Instructions are provided in the notebook: Configuring Model in OpenScale
  - 7.b.2.2. Model Monitors configuration in Openscale