Skip to content

Commit

Permalink
Gradient generation tool to support GENN models in FOQUS (#1164)
Browse files Browse the repository at this point in the history
* Brady's gradient generation script

* Adding helpful docstrings and comments

* Brady's working files for gradient model regression

* Integrate Brady's regression script into main file, clean up

* fix formatting for black and pylint

* Generalize for multiple outputs, add settings optimization loop

* Add documentation for gradient tool

* Manually run spellchecker on ML AI Plugin files

* Fix links and file names

* Add option to use simple partial derivative in lieu of chain rule formula

* Fix comment

* run black again

* minor typo

* Give docs page reference name

---------

Co-authored-by: Keith Beattie <[email protected]>
  • Loading branch information
bpaul4 and ksbeattie authored Sep 29, 2023
1 parent baa13c8 commit 237022b
Show file tree
Hide file tree
Showing 7 changed files with 659 additions and 2 deletions.
83 changes: 83 additions & 0 deletions docs/source/chapt_surrogates/gradients.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
.. _gengrad:

Gradient Generation to Support Gradient-Enhanced Neural Networks
================================================================

Neural networks are useful in instances where multivariate process data
is available and the mathematical functions describing the variable
relationships are unknown. Training deep neural networks is most efficient
when samples of the variable derivatives, or gradients, are collected
simultaneously with process data. However, gradient data is often unavailable
unless the physics of the system are known and predetermined such as in
fluid dynamics with outputs of known physical properties.

These gradients may be estimated numerically using solely the process data. The
gradient generation tool described below requires a Comma-Separated Value (CSV) file
containing process samples (rows), with inputs in the left columns and outputs in the rightmost
columns. Multiple outputs are supported, as long as they are the rightmost columns, and
the variable columns may have string (text) headings or data may start in row 1. The method
produces a CSV file for each output variable containing gradients with respect to each input
variable (columns), for each sample point (rows). After navigating to the FOQUS directory
*examples/other_files/ML_AI_Plugin*, the code below sets up and calls the gradient generation
method on the example dataset *MEA_carbon_capture_dataset_mimo.csv*:

.. code:: python
# required imports
>>> import pandas as pd
>>> import numpy as np
>>> from generate_gradient_data import generate_gradients
>>>
>>> data = pd.read_csv(r"MEA_carbon_capture_dataset_mimo.csv") # get dataset
>>> data_array = np.array(data, ndmin=2) # convert to Numpy array
>>> n_x = 6 # we have 6 input variables, in the leftmost 6 columns
>>> gradients = generate_gradients(
>>> xy_data=data_array,
>>> n_x=n_x,
>>> show_plots=False, # flag to plot regression results during gradient training
>>> optimize_training=True, # will try many regression settings and pick the best result
>>> use_simple_diff=True # flag to use simple partials instead of chain rule formula; defaults to False if not passed
>>> )
>>> print("Gradient generation complete.")
>>> for output in range(len(gradients)): # save each gradient array to a CSV file
>>> pd.DataFrame(gradients[output]).to_csv("gradients_output" + str(output) + ".csv")
>>> print("Gradients for output ", str(output), " written to gradients_output" + str(output) + ".csv",)
Internally, the gradient generation methods automatically executes a series of actions on the dataset:

1. Import process data of size *(m, n_x + n_y)*, where *m* is the number of sample rows,
*n_x* is the number of input columns and *n_y* is the number of output columns. Given *n_x*,
the data is split into an input array *X* and an output array *Y*.

2. For each input *xi* and each output *yj*, estimate the gradient using a multivariate
chain rule approximation. For example, the gradient of y1 with respect to x1 is
calculated at each point as:

:math:`\frac{Dy_1}{Dx_1} = \frac{dy_1}{dx_1} \frac{dx_1}{dx_1} + \frac{dy_1}{dx_2} \frac{dx_2}{dx_1} + \frac{dy_1}{dx_3} \frac{dx_3}{dx_1} + ...`

where *D/D* represents the total derivative, *d/d* represents a partial derivative at each
sample point. *y1*, *x1*, *x2*, *x3*, and so on are vectors with values at each sample point *m*, and
this formula produces the gradients of each output with respect to each input at each sample point by iterating
through the dataset. The partial derivatives are calculated by simple finite difference. For example:

:math:`\frac{dy_1}{dx_1} (m_{1.5}) = \frac{y_1 (m_2) - y_1 (m_1)}{x_1 (m_2) - x_1 (m_1)}`

where *m_1.5* is the midpoint between sample points *m_2* and *m_1*. As a result, this scheme
calculates gradients at the points between the sample points, not the actual sample points.

3. Train an MLP model on the calculated midpoint and midpoint-gradient values. After normalizing the data
via linear scaling (see :ref:`mlaiplugin.datanorm`),
the algorithm leverages a small neural network model to generate gradient data for the actual
sampe points. Passing the argument *optimize_training=True* will train models using the optimizers
*Adam* or *RMSProp*, with activation functions *ReLu* or *Sigmoid* on hidden layers, using a *Linear*
or *ReLu* activation function on the output layer, building *2* or *8* hidden layers with *6* or *12*
neurons per hidden layer. The algorithm employs cross-validation to check the mean-squared-error (MSE) loss
on each model and uses the model with the smallest error to predict the sample gradients.

4. Predict the gradients at each sample point from the regressed model. This produces *n_y*
arrays with each having size *(m, n_x)* - the same size as the original input array *X*.

5. Concatenate the predicted gradients into a single array of size *(m, n_x, n_y)*. This is the
single object returned by the gradient generation method.
1 change: 1 addition & 0 deletions docs/source/chapt_surrogates/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@ Contents
.. toctree::
:maxdepth: 2

gradients
mlaiplugin
reference
tutorial/index
6 changes: 5 additions & 1 deletion docs/source/chapt_surrogates/mlaiplugin.rst
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
.. _mlaiplugin:

Machine Learning & Artificial Intelligence Flowsheet Model Plugins
==================================================================

Expand Down Expand Up @@ -99,7 +101,7 @@ Currently, FOQUS supports the following custom attributes:
bounds for each output variable (default: (0, 1E5))
- *normalized* – Boolean flag for whether the user is passing a normalized
neural network model; to use this flag, users must train their models with
data normalized according to a specifc scaling form and add all input and
data normalized according to a specific scaling form and add all input and
output bounds custom attributes. The section below details scaling options.
- *normalization_form* - string flag required when *normalization* is *True*
indicating a scaling option for FOQUS to automatically scale flowsheet-level
Expand All @@ -108,6 +110,8 @@ Currently, FOQUS supports the following custom attributes:
- *normalization_function* - optional string argument that is required when a
'Custom' *normalization_form* is used. The section below details scaling options.

.. _mlaiplugin.datanorm:

Data Normalization For Neural Network Models
--------------------------------------------

Expand Down
Loading

0 comments on commit 237022b

Please sign in to comment.