Implement multi-level calibration #26

Open
nikhilwoodruff opened this issue Sep 20, 2024 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments

@nikhilwoodruff

We're going to need to calibrate weights for the following areas:

  • UK (1)
  • UK regions (12)
  • Parliamentary constituencies (650)
  • Local authorities (382)

This is 1,045 areas × ~50,000 households ≈ 52 million weight values. In this issue, I'll outline how I suggest we implement this and what we should compromise on for speed and usability.
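As a quick check on that figure, here is the arithmetic as a small script. The area counts come from the list above and the household count is the approximate one quoted; the 4-bytes-per-weight storage size is an assumption (32-bit floats), not something decided in this issue.

```python
# Area counts from the list above; household count is the approximate figure quoted.
AREA_COUNTS = {
    "uk": 1,
    "regions": 12,
    "constituencies": 650,
    "local_authorities": 382,
}
N_HOUSEHOLDS = 50_000  # approximate

total_areas = sum(AREA_COUNTS.values())
total_weights = total_areas * N_HOUSEHOLDS

# Assumption: each weight is stored as a 32-bit float (4 bytes).
BYTES_PER_WEIGHT = 4
size_mb = total_weights * BYTES_PER_WEIGHT / 1e6

print(f"{total_areas} areas, {total_weights:,} weights, ~{size_mb:.0f} MB per year")
```

Under that storage assumption, one year of weights for every area is on the order of a couple of hundred megabytes uncompressed.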

First, and probably most importantly, I think we should derive the weights of containing areas by summing the weights of their component areas. We should calibrate the constituencies directly, include the national targets in the loss function (evaluated on the summed weights), and then add up the local area weights to get our UK region and UK weights. There are three reasons for this. One is that there are lots of targets that we only have at higher geographic levels, and this approach ensures that information reaches the local weights in some form. Another is internal consistency: region and national weights are, by construction, the sum of the local ones. The third is that I think this will lead to performance gains: the more we can do in one PyTorch tensor, rather than in separate processes, the more we can make use of torch's built-in parallelisation.
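To make the idea concrete, here is a minimal sketch on synthetic data, not a proposed implementation. All names (`local_metrics`, `national_metrics`, `region_membership`), the log-weight parameterisation, the squared relative error loss and the Adam settings are assumptions for illustration only.

```python
import torch

torch.manual_seed(0)

n_areas, n_households = 5, 200

# Synthetic household characteristics (placeholders for real survey variables).
local_metrics = torch.rand(n_households, 3)     # variables with local-area targets
national_metrics = torch.rand(n_households, 2)  # variables with national targets only

# Build targets from hidden "true" weights so the problem is feasible.
true_weights = torch.rand(n_areas, n_households) * 10
local_targets = true_weights @ local_metrics                    # (n_areas, 3)
national_targets = true_weights.sum(dim=0) @ national_metrics   # (2,)

# One tensor holds every area's weights; optimising in log space keeps them positive.
log_weights = torch.zeros(n_areas, n_households, requires_grad=True)
optimiser = torch.optim.Adam([log_weights], lr=0.1)


def loss_fn(log_w):
    weights = torch.exp(log_w)
    local_estimates = weights @ local_metrics
    # National weights are the sum of local weights, so national targets
    # feed gradients back into every local area's weights.
    national_estimates = weights.sum(dim=0) @ national_metrics
    local_loss = ((local_estimates / local_targets - 1) ** 2).mean()
    national_loss = ((national_estimates / national_targets - 1) ** 2).mean()
    return local_loss + national_loss


initial_loss = loss_fn(log_weights).item()
for _ in range(200):
    optimiser.zero_grad()
    loss = loss_fn(log_weights)
    loss.backward()
    optimiser.step()
final_loss = loss_fn(log_weights).item()

# Derive higher-level weights by summation rather than by separate calibration.
area_weights = torch.exp(log_weights).detach()
region_membership = torch.tensor(  # hypothetical mapping: 2 regions x 5 areas
    [[1, 1, 0, 0, 0], [0, 0, 1, 1, 1]], dtype=torch.float32
)
region_weights = region_membership @ area_weights  # (2, n_households)
national_weights = area_weights.sum(dim=0)         # (n_households,)

print(f"loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Because the region and national weights are computed from `area_weights` rather than fitted separately, they cannot disagree with the local weights, and the whole calibration is a single batched tensor computation.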

An issue I don't see a way around is that local authorities and Parliamentary constituencies cover the same areas and are at the same level, so neither set of weights can be derived by summing the other. I think we do one calibration run that outputs constituency, region and national weights, then another that outputs local authority, region and national weights, and then just take the region and national weights from one of them. We'll have to run some checks that there's no large inconsistency between the two runs (which there shouldn't be if we're targeting the same national statistics) and be OK with small differences, I think.
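The consistency check could be as simple as comparing the national estimates implied by the two runs. This is a sketch only: the function names, the 1% tolerance and the numbers are made-up placeholders, not real outputs or agreed thresholds.

```python
def relative_gaps(run_a, run_b):
    """Relative difference per statistic between two runs' national estimates."""
    return {
        name: abs(run_a[name] - run_b[name]) / abs(run_b[name])
        for name in run_a
        if name in run_b
    }


def inconsistent(run_a, run_b, tolerance=0.01):
    """Statistics whose relative gap exceeds the (assumed) tolerance."""
    return {
        name: gap
        for name, gap in relative_gaps(run_a, run_b).items()
        if gap > tolerance
    }


# Placeholder national estimates from each run (illustrative values only).
constituency_run = {"population": 100.0, "total_income": 250.0}
local_authority_run = {"population": 100.5, "total_income": 260.0}

flagged = inconsistent(constituency_run, local_authority_run)
print(sorted(flagged))
```

With these placeholder values only `total_income` is flagged, since its gap (about 3.8%) exceeds the tolerance while the `population` gap (about 0.5%) does not.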

Another thing: we currently reweight the UK for each of seven years. I think we should continue to do that for national weights, but not for local areas, because of the size of the weight files. So we should reweight all areas in 2022, then run a separate calibration of national weights only for 2023 onwards. I don't think that using different loss functions should introduce any big inconsistencies here, as long as we're still targeting the same national statistics.

cc @MaxGhenis

@nikhilwoodruff nikhilwoodruff added the enhancement New feature or request label Sep 20, 2024
@nikhilwoodruff nikhilwoodruff self-assigned this Sep 20, 2024