Set flag FLAGS_enable_cublas_tensor_op_math True in default. #896

GhostScreaming · 2022-11-08T08:26:06Z

When FP32 and FP16 models run on an A100 machine, they can be accelerated with Tensor Cores. Although NVIDIA states that FP32 computation is dispatched to Tensor Cores automatically on A100, we observed that this is not the case. As a result, we manually set the flag FLAGS_enable_cublas_tensor_op_math to True by default.
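For reference, here is a minimal sketch of how such a default could be applied from user code. It is not the diff in this PR. It assumes the usual Paddle convention that `FLAGS_*` environment variables are read when `paddle` is imported, and that a CUDA build exposes this flag through `paddle.set_flags`. The helper names `enable_tensor_op_math_by_default` and `apply_at_runtime` are hypothetical.

```python
import os

FLAG = "FLAGS_enable_cublas_tensor_op_math"


def enable_tensor_op_math_by_default(env=os.environ):
    """Turn the flag on unless the user has already chosen a value.

    `env` is any mutable mapping; it defaults to the process environment.
    Returns the value that ends up stored for the flag.
    """
    env.setdefault(FLAG, "1")
    return env[FLAG]


def apply_at_runtime():
    """Set the flag after import (assumes a CUDA build of Paddle).

    Not called here, because Paddle may not be installed and a CPU-only
    build may not define this flag.
    """
    import paddle  # imported lazily so the sketch runs without Paddle

    paddle.set_flags({FLAG: True})
    return paddle.get_flags([FLAG])


# Must run before `import paddle` for the environment variable to take effect.
enable_tensor_op_math_by_default()
```

Using `setdefault` keeps an explicit user choice such as `FLAGS_enable_cublas_tensor_op_math=0` intact, so the change only affects the default.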

sneaxiy

LGTM.

CLAassistant · 2024-09-28T07:18:36Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Set flag FLAGS_enable_cublas_tensor_op_math on in default.

0ccab14

sneaxiy approved these changes Nov 8, 2022
