
Create a more standard training loop interface for pretraining #8

Open
jason-fries opened this issue May 7, 2021 · 1 comment

@jason-fries (Contributor)

Currently, clmbr_train_model hides the more familiar training-loop structure from users. In most demos and APIs, the boilerplate looks like what's outlined at https://github.com/PyTorchLightning/pytorch-lightning, with this structure:

import os
import pytorch_lightning as pl
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor())
train, val = random_split(dataset, [55000, 5000])

autoencoder = LitAutoEncoder()
trainer = pl.Trainer()
trainer.fit(autoencoder, DataLoader(train), DataLoader(val))

i.e., basically this form:

  • dataloader
  • data splits
  • model architecture
  • training

Specific details around the loss are configured in the model architecture, while the Trainer class handles details like progress bars, the choice of optimizer, and so on.
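
For context, the LitAutoEncoder used in the snippet above is a LightningModule defined along these lines. This is a minimal sketch adapted from the Lightning README; the layer sizes, MSE loss, and Adam learning rate are illustrative choices, not part of any clmbr_train_model proposal:

import torch
from torch import nn
import pytorch_lightning as pl

class LitAutoEncoder(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # The model architecture lives in the module itself.
        self.encoder = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 3))
        self.decoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 28 * 28))

    def training_step(self, batch, batch_idx):
        # The loss is configured here; the Trainer owns the loop,
        # progress bars, logging, device placement, etc.
        x, _ = batch
        x = x.view(x.size(0), -1)
        x_hat = self.decoder(self.encoder(x))
        return nn.functional.mse_loss(x_hat, x)

    def configure_optimizers(self):
        # The optimizer choice is also declared alongside the model.
        return torch.optim.Adam(self.parameters(), lr=1e-3)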

What is the lift required to provide a demo and refactor to support this type of workflow?

jason-fries added the enhancement (New feature or request) label on May 7, 2021
@woffett (Contributor) commented May 21, 2021

The refactor PR puts pre-training into this kind of API:

model = CLMBRFeaturizer(config, info)
dataset = PatientTimelineDataset(extract_path, ontology_path, info_path)
model.fit(dataset)

The original clmbr_train_model still works; it just uses this API under the hood. I'll leave this issue open until piton_private is updated to reflect these changes.
