PoC: Block weight quantize tool for LLM [skip ci] #13758

Draft: wants to merge 1 commit into base: master

hseok-oh
Contributor

@hseok-oh hseok-oh commented Aug 26, 2024

  • Block quantization of LLM weights for the FullyConnected and Gather operators
  • Select the quantization type with the circle-quantizer parameter --block_quantize_weights (Q4_0 or Q8_0)
  • Skip quantization with the circle-quantizer parameter --skipsize_block_quantize (default: 0)

Caution: This is a PoC for the circle format and for generating test models. It is not intended as a compiler implementation.
#13742 #13743
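
For readers unfamiliar with the Q8_0 naming, the block scheme can be sketched roughly as below. This is a minimal illustration, not code from this PR: it assumes Q8_0 follows the ggml convention (32 weights per block, one fp16 scale per block, int8 values), and the names `quantize_q8_0`, `dequantize_q8_0`, and `to_fp16` are hypothetical helpers that do not exist in circle-quantizer.

```python
import struct

BLOCK_SIZE = 32  # assumed block length, as in ggml's Q8_0


def to_fp16(value):
    """Round a Python float to the nearest half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def quantize_q8_0(weights):
    """Quantize a flat list of floats into (fp16 scale, int8 values) blocks."""
    if len(weights) % BLOCK_SIZE != 0:
        raise ValueError("weight count must be a multiple of BLOCK_SIZE")
    blocks = []
    for start in range(0, len(weights), BLOCK_SIZE):
        block = weights[start:start + BLOCK_SIZE]
        # One scale per block: map the largest magnitude onto 127.
        amax = max(abs(w) for w in block)
        scale = amax / 127.0
        inv = 1.0 / scale if scale != 0.0 else 0.0
        # Python's round() rounds ties to even; a C implementation using
        # roundf() rounds ties away from zero, so exact ties may differ.
        quants = [max(-127, min(127, round(w * inv))) for w in block]
        blocks.append((to_fp16(scale), quants))
    return blocks


def dequantize_q8_0(blocks):
    """Reconstruct approximate floats from (scale, int8 values) blocks."""
    out = []
    for scale, quants in blocks:
        out.extend(scale * q for q in quants)
    return out
```

Under these assumptions each block stores 32 one-byte values plus a two-byte scale, so the cost is 34 bytes per 32 weights (8.5 bits per weight). Q4_0 uses the same per-block scale idea with 4-bit values.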

@hseok-oh hseok-oh added the PR/NO TEST (Tell CI to not run test) label Aug 26, 2024
@hseok-oh hseok-oh changed the title PoC: Blckwise weight quantize tool for LLM [skip ci] PoC: Chunk weight quantize tool for LLM [skip ci] Aug 26, 2024
@hseok-oh hseok-oh force-pushed the draft/weight_quant_llm branch 3 times, most recently from b23a54b to 47eede8 Compare August 27, 2024 05:33
@hseok-oh hseok-oh changed the title PoC: Chunk weight quantize tool for LLM [skip ci] PoC: Block weight quantize tool for LLM [skip ci] Aug 27, 2024
@hseok-oh hseok-oh force-pushed the draft/weight_quant_llm branch 2 times, most recently from efac650 to 750278f Compare October 11, 2024 08:11
- Blockwise quantization for LLM: FullyConnected, Gather
- Decide quantize type by circle-quantizer parameter: `--quantize_weights_chunk` (Q4_0, Q8_0)
- Skip quantization by circle-quantizer parameter: `--skip_chunkquant_size` (default: 0)

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh <[email protected]>