The current model naively takes the raw grid coordinates as input, and those include the value 0. This is bad for a neural network: the gradient of a first-layer weight is proportional to the input it multiplies, so whenever a coordinate is 0 the weights attached to it receive no update and only the bias can learn. To resolve this, we should encode the state differently. One option is a one-hot encoding with one entry per cell in the grid. Another is a simple labeling scheme that numbers the cells from 1 to 100 and feeds the normalized label as the input. Both are worth a shot. Maybe changing the input will be enough to make the on-policy learning work.
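A minimal sketch of the two encodings proposed above, to make the comparison concrete. It assumes a 10x10 grid (inferred from the 1 to 100 labels) with zero-indexed `(row, col)` coordinates in row-major order; `GRID_SIZE`, `one_hot_state`, and `normalized_label_state` are hypothetical names, not identifiers from the current model.

```python
GRID_SIZE = 10  # assumption: 10x10 grid, i.e. 100 cells


def one_hot_state(row, col, size=GRID_SIZE):
    """Return a list of size*size floats with a single 1.0 at the cell's index."""
    vec = [0.0] * (size * size)
    vec[row * size + col] = 1.0  # row-major index of the cell
    return vec


def normalized_label_state(row, col, size=GRID_SIZE):
    """Label cells 1..size*size in row-major order and scale into (0, 1]."""
    label = row * size + col + 1  # starts at 1, so the input is never 0
    return [label / (size * size)]
```

The one-hot vector still contains zeros, but every cell activates its own dedicated weight, so each state has a nonzero input to learn from. The normalized label keeps the input one-dimensional and strictly positive, at the cost of implying an ordering between cells that may not be meaningful.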