This repository has been archived by the owner on Nov 29, 2023. It is now read-only.

Pre-train PerceiverIO #85

Open
3 tasks
jacobbieker opened this issue Aug 31, 2021 · 5 comments

Comments

@jacobbieker
Member

Various ideas

Some from @JackKelly:

  • Pretrain predicting next frame from past two
  • Simulated clouds/optical flow
  • Try AutoFlow like in Perceiver paper
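
A minimal sketch of the first idea (predict the next frame from the past two), assuming the satellite frames are already available as a time-ordered sequence. The helper name `make_next_frame_samples` is hypothetical and not part of this repository:

```python
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def make_next_frame_samples(
    frames: Sequence[Frame], n_past: int = 2
) -> List[Tuple[List[Frame], Frame]]:
    """Slide a window over time-ordered frames.

    Each sample pairs `n_past` consecutive input frames with the
    frame that immediately follows them (the prediction target).
    """
    samples = []
    for t in range(n_past, len(frames)):
        past = list(frames[t - n_past:t])  # model inputs
        target = frames[t]                 # frame to predict
        samples.append((past, target))
    return samples
```

Each frame here can be anything (for example an image array); a sequence shorter than `n_past + 1` frames simply yields no samples.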
@jacobbieker jacobbieker added discussion enhancement New feature or request labels Aug 31, 2021
@jacobbieker jacobbieker self-assigned this Aug 31, 2021
@JackKelly
Member

JackKelly commented Sep 1, 2021

Sounds good!

A related trick up our sleeves would be to train on the ~10 years of data available from EUMETSAT: openclimatefix/nowcasting_dataset#81

(Training on more data isn't exactly "pre-training" :) But it might be worth trying. What do you think the priority should be: training on ~10 years of data, or pre-training using 'auxiliary' tasks? It'll likely take a while to download and prepare ~10 years of data, so maybe we should get that going 'in the background' soonish?)

@jacobbieker
Member Author

Yeah, I think getting it started in the background would be good. Having all that data could also help if we want to try the similarity idea mentioned in #65. I think the extra data is probably a higher priority, but while that's running, trying the auxiliary tasks would be helpful.

For the simulated clouds/optical flow, more data could also help with getting real clouds that we could possibly "copy/paste" for the simulated optical flow? As in, get the cloud pixel values by subtracting the base ground data for real clouds, save out those clouds, and then paste random combos or crops of those clouds and generate the optical flow from that?
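
A rough sketch of that copy/paste idea, assuming single-channel images as 2-D NumPy arrays and a pure translation of the pasted cloud between the two frames. The function names, the clear-sky background input, and the `(dy, dx)` flow layout are all assumptions for illustration, not anything from the codebase:

```python
import numpy as np


def extract_cloud_layer(image, clear_sky_background):
    """Cloud-only pixel values: what is left after subtracting the ground."""
    return np.clip(image - clear_sky_background, 0.0, None)


def make_flow_pair(background, cloud_patch, top, left, dy, dx):
    """Paste one cloud patch into two frames, shifted by (dy, dx).

    Returns (frame0, frame1, flow). flow[y, x] holds the (dy, dx)
    displacement for cloud pixels in frame0 and zeros elsewhere.
    """
    h, w = cloud_patch.shape
    height, width = background.shape
    # Both paste locations must lie fully inside the image.
    if min(top, left, top + dy, left + dx) < 0:
        raise ValueError("cloud patch would fall outside the image")
    if max(top, top + dy) + h > height or max(left, left + dx) + w > width:
        raise ValueError("cloud patch would fall outside the image")

    frame0 = background.astype(np.float32)  # astype returns a copy
    frame1 = background.astype(np.float32)
    frame0[top:top + h, left:left + w] += cloud_patch
    frame1[top + dy:top + dy + h, left + dx:left + dx + w] += cloud_patch

    # Ground-truth flow is known exactly because we chose the shift.
    flow = np.zeros((height, width, 2), dtype=np.float32)
    mask = cloud_patch > 0
    flow[top:top + h, left:left + w][mask] = (dy, dx)
    return frame0, frame1, flow
```

Random combinations or crops of saved cloud patches could be pasted by calling `make_flow_pair` repeatedly with different patches and shifts; real cloud motion is not a rigid translation, so this only gives a simplified training signal.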

@JackKelly
Member

> get the cloud pixel values by subtracting the base ground data for real clouds, save out those clouds, and then paste random combos or crops of those clouds and generate the optical flow from that

Sounds great to me!

@JackKelly
Member

> I think the extra data is probably a higher priority

Cool, in our next meeting we can chat a bit about getting more data! I agree, it feels like a priority to grab more data!

@jacobbieker
Member Author

The HuggingFace PerceiverIO has pretrained weights for the optical flow task and others, so we can start from those and then pre-train further on the historical satellite imagery.

2 participants