-
I am trying to replicate the simplest Autoencoder model found in this Keras example. After looking at this DJL example, I don't think I understand the API well enough to do this, or I simply don't know how the Autoencoder works. Hoping someone can shed some light on this. First things first. The Autoencoder structure used in the Keras example referred to above is, in effect, a single fully connected encoding layer of 32 units with a relu activation, followed by a 784-unit decoding layer with a sigmoid activation, applied to the flattened 28 x 28 MNIST images.
So I used the following Scala code:

```scala
val relu: JFunction[NDList, NDList] = Activation.relu
val sigmoid: JFunction[NDList, NDList] = Activation.sigmoid

def net1(): SequentialBlock =
  val net = SequentialBlock()
  net.add(Blocks.batchFlattenBlock(784))
  //net.add(Activation::relu)
  net.add(relu)
  net.add(Linear.builder().setUnits(32).build())
  net.add(sigmoid)
  net.add(Linear.builder().setUnits(784).build())
  net.setInitializer(NormalInitializer(0.01f))
  net
```

At this point I have 2 questions. The first is: does the above set-up correctly represent the Keras network? Unlike the DJL example, I seem to have placed 2 lambdas one after the other (the batchFlatten block and the relu).

The other question is related to the loss. I have found several loss functions in the DJL API, and I assume that they are for classification only. I am using:

```scala
DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss())
```

But I assume that when I call:

```scala
EasyTrain.fit(trainer, trainSetUp.numEpochs, trainingSet, validateSet)
```

where the data sets are of type …

TIA
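For reference, the surrounding set-up looks roughly like the sketch below (batchSize and numEpochs stand in for my actual trainSetUp values, the validation set is omitted, and none of this is meant as tested code):

```scala
import ai.djl.Model
import ai.djl.basicdataset.cv.classification.Mnist
import ai.djl.ndarray.types.Shape
import ai.djl.training.{DefaultTrainingConfig, EasyTrain}
import ai.djl.training.loss.Loss

val batchSize = 32 // placeholder
val numEpochs = 5  // placeholder

// MNIST images, batched and shuffled
val trainingSet = Mnist.builder().setSampling(batchSize, true).build()
trainingSet.prepare()

val model = Model.newInstance("autoencoder")
model.setBlock(net1())

// the loss choice here is exactly what is being questioned above
val config = DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss())
val trainer = model.newTrainer(config)
trainer.initialize(Shape(batchSize, 1, 28, 28)) // MNIST images are 1 x 28 x 28

EasyTrain.fit(trainer, numEpochs, trainingSet, null) // no validation set in this sketch
```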
-
I have just realized a fundamental error of mine. What is the best way to set up the MNIST dataset so that I can calculate the loss between the input and itself (i.e. the label is the input)?

TIA
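One possible way to do this (a rough, untested sketch that reuses the trainer, trainingSet and numEpochs names from the set-up above) would be to skip EasyTrain.fit and write the batch loop by hand, feeding each batch's data back in as its own label:

```scala
import scala.jdk.CollectionConverters.*
import ai.djl.ndarray.NDList

for epoch <- 0 until numEpochs do
  for batch <- trainingSet.getData(trainer.getManager()).asScala do
    val x = batch.getData()                                        // shape (batchSize, 1, 28, 28)
    val label = NDList(x.singletonOrThrow().reshape(-1, 28 * 28))  // label == flattened input
    val gc = trainer.newGradientCollector()
    try
      val pred = trainer.forward(x)
      val lossValue = trainer.getLoss().evaluate(label, pred)
      gc.backward(lossValue)
    finally gc.close()
    trainer.step()
    batch.close()
```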
-
The relu needs to be between (or after) each usage of Linear. The batchFlatten is not a linear transformation, just a reshape from a 2D image into a 1D feature vector. What the Keras (simplest possible autoencoder) example is doing is {Linear, relu, Linear, sigmoid}. We would just add a reshape (batchFlatten) before all of that.

For your loss, the softmaxCrossEntropyLoss is definitely not right. It is a classification loss where each value in your prediction is a different class and the label indicates which class is correct. What you are looking for here is a pixel-by-pixel comparison that says that the original and reconstructed images should be as similar as possible. The SigmoidBinaryCrossEntropyLoss may work (although double check that the Mnist dataset produces inputs from 0 to 1). You could also try the L2Loss.
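If it helps, here is a sketch of the block reordered along those lines (it reuses the relu/sigmoid lambdas and the initializer from the question above; illustrative only, not tested):

```scala
def net2(): SequentialBlock =
  val net = SequentialBlock()
  net.add(Blocks.batchFlattenBlock(784))           // 1 x 28 x 28 image -> 784-long vector
  net.add(Linear.builder().setUnits(32).build())   // encoder
  net.add(relu)
  net.add(Linear.builder().setUnits(784).build())  // decoder
  net.add(sigmoid)                                 // outputs in [0, 1], like the Keras example
  net.setInitializer(NormalInitializer(0.01f))
  net
```

paired with something like DefaultTrainingConfig(Loss.l2Loss()). If you try Loss.sigmoidBinaryCrossEntropyLoss() instead, keep in mind that by default it expects raw logits and applies its own sigmoid internally, so you would either drop the final sigmoid from the block or build the loss with its fromSigmoid flag set to true.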