Child model search reproduction based on P100 #1

Open
Tiantian-Han opened this issue May 13, 2021 · 5 comments
@Tiantian-Han

Hi, thanks very much for your work.
I used the following command to reproduce the results on a P100, but the result is inconsistent with what the paper reports:
    python -u ./search.py \
        --train_percent=80 \
        --bcfw_steps=10000 \
        --initial-checkpoint=<A path to the one-shot model's weights> \
        --inference_time_limit 25  # ms

Results:
25 ms constraint -> searched model's actual latency is 62.48 ms (paper reports 27 ms)
acc (76.364/93.018), FLOPs 467M, params 6.849M

I'm very confused. Could you help me?

@soyebn

soyebn commented May 15, 2021

@Tiantian-Han, I am also facing a similar issue. I reran the search command and I am getting 72 msec. I created a new issue if you want to look at the command I used.

@Tiantian-Han
Author

Tiantian-Han commented May 16, 2021

@soyebn I think I have found and solved this problem. Check lines 434, 440, and 866 in search.py. The LUT file stores latencies in seconds, but according to the paper, line 434 should be formulated in milliseconds, and line 866 computes latency in seconds. A latency constraint of 25 ms is therefore unreasonable as given; the constraint must be expressed in the same unit as the LUT. In other words, we need to ensure unit consistency across the entire project.
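
As a rough illustration of why the mixed units make the constraint meaningless (the numbers below are made up, not taken from the repo):

    # LUT latencies are stored in seconds, so a per-block entry looks like:
    block_latency_sec = 0.0005  # i.e. 0.5 ms, expressed in seconds
    # The constraint is passed as 25, intended as milliseconds, but it is
    # compared against sums of seconds-valued latencies:
    inference_time_limit = 25
    # Any realistic model sums to far less than 25 "seconds", so the latency
    # constraint never binds and the search is effectively unconstrained.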

@soyebn

soyebn commented May 17, 2021

Hi @Tiantian-Han, I think what you are saying makes sense. There is some inconsistency in the assumed latency unit in the code: most of the latency search code uses msec, but the predicted latency computed from the LUT in extract_expected_latency() is returned in seconds. So I made the following change after line #429 in search.py, assuming lines #434 and #440 expect latency in msec.

    # extract_structure_param_list() reads the LUT, whose latencies are in seconds.
    list_alphas, fixed_latency = extract_structure_param_list(
        model, file_name=args.lut_filename,
        batch_size=args.lut_measure_batch_size,
        repeat_measure=args.repeat_measure,
        target_device=args.target_device)
    # Convert the fixed (non-searchable) latency from seconds to milliseconds.
    convert_sec_to_ms = True
    unit_scale = 1000.0 if convert_sec_to_ms else 1.0
    fixed_latency = fixed_latency * unit_scale

I reran the search command with the above change, but I still didn't get close to the target latency. This is what I got:
Latency_predicted=0.06463486208752146, latency_measured=0.04632913827896118, diff=-0.018305723808560284

Do you think more changes are needed? If you were able to fix this, could you please share your code changes?
Thanks, this was a nice discussion.

@Tiantian-Han
Author

Hi @soyebn, check again whether the units of "list_alphas" and "fixed_latency" in search.py are the same. I guess that you only converted "fixed_latency" to milliseconds. However, the chain from line 429 in search.py -> line 255 in nas/nas_utils/general_purpose.py -> lines 226~236 -> line 244 of that file shows that "list_alphas" also contains latency values in seconds:

        list_betas.append(entry)
        list_betas += alpha_entries

    # compute_latency() returns values read from the LUT, i.e. in seconds.
    dict_latency = compute_latency(model, list_betas, file_name=file_name, batch_size=batch_size,
                                   repeat_measure=repeat_measure, target_device=target_device)
    fixed_latency = dict_latency['general']
    for entry in list_betas:
        if isinstance(entry, dict) and 'alpha_entries' in entry.keys():
            alpha_entries = entry['alpha_entries']
            for alpha_entry in alpha_entries:
                # So each per-submodule latency tensor here is also in seconds.
                alpha_entry['latency'] = torch.tensor([dict_latency[s][1] for s in alpha_entry['submodules']])

    return list_betas, fixed_latency

In my code, I use the unit "milliseconds" for all latency values.
e.g. with @inference_time_limit=27:
Latency_predicted=26.91327567897737, latency_measured=27.861732769012452, diff=0.948457090035081
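
Roughly, the change looks like this (a sketch rather than my exact patch; variable names follow search.py, and SEC_TO_MS is a constant I introduce here):

    SEC_TO_MS = 1000.0

    list_alphas, fixed_latency = extract_structure_param_list(
        model, file_name=args.lut_filename,
        batch_size=args.lut_measure_batch_size,
        repeat_measure=args.repeat_measure,
        target_device=args.target_device)

    # Scale the fixed (non-searchable) latency from seconds to milliseconds.
    fixed_latency = fixed_latency * SEC_TO_MS

    # Scale every per-submodule latency tensor stored in the alpha entries,
    # so the searched latencies share the constraint's unit (ms) as well.
    for entry in list_alphas:
        if isinstance(entry, dict) and 'alpha_entries' in entry.keys():
            for alpha_entry in entry['alpha_entries']:
                alpha_entry['latency'] = alpha_entry['latency'] * SEC_TO_MS

With both fixed_latency and the per-block latencies in milliseconds, the inference_time_limit constraint is compared in a consistent unit end to end.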

@soyebn

soyebn commented May 17, 2021

Hi @Tiantian-Han, you indeed got very close to the target latency. Is it possible for you to share your code changes with me? It looks like you have already done this carefully, so it could save me some time.
