Tensor parallelism generates non-sensical outputs #1663

rasbt opened this issue Aug 8, 2024 · 1 comment
rasbt commented Aug 8, 2024

Bug description

For some reason, the tensor-parallel implementation generates nonsensical outputs:

⚡ python-api-tensor-parallel ~/litgpt litgpt generate_tp checkpoints/microsoft/phi-2 
...
Instruct: What food do llamas eat?
Output: When the
.

The first

.

The first

.

Time for inference 1: 1.31 sec total, 15.23 tokens/sec

Expected output (e.g., via base or sequential generation):

Instruct: What food do llamas eat?
Output: Llamas eat grass, shrubs, and other vegetation.
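
For comparison, the reference output above comes from the non-tensor-parallel path; assuming the standard entry point of the same CLI (the exact command is not quoted in the report), this would be something like:

litgpt generate checkpoints/microsoft/phi-2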

What operating system are you using?

Linux

LitGPT Version

Current main branch

rasbt added the bug label on Aug 8, 2024

rasbt commented Aug 8, 2024

It seems to be related to the MLP class:

Has the problem:

  • microsoft/phi-2 (GptNeoxMLP)
  • EleutherAI/pythia-2.8b (GptNeoxMLP)
  • stabilityai/stablelm-base-alpha-7b (GptNeoxMLP)
  • google/gemma-2-2b (GemmaMLP)

Is fine:

  • meta-llama/Meta-Llama-3.1-8B-Instruct (LLaMAMLP)
  • openlm-research/open_llama_3b (LLaMAMLP)
  • microsoft/Phi-3-mini-4k-instruct (LLaMAMLP)
  • garage-bAInd/Platypus2-7B (LLaMAMLP)

This might get fixed automatically via #1421.
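
For context, a tensor-parallel MLP is usually sharded column-wise on the up-projection and row-wise on the down-projection, so each rank computes only a partial sum that must be combined with an all-reduce; if that reduction is only wired up for some MLP variants (e.g. LLaMAMLP but not GptNeoxMLP/GemmaMLP), the affected models would emit garbage. The sketch below is a minimal illustration of that pattern with hypothetical names, not LitGPT's actual implementation:

import torch
import torch.nn as nn
import torch.distributed as dist

class ShardedGptNeoxStyleMLP(nn.Module):
    # Hypothetical column/row-parallel MLP: `fc` is split along its output
    # dimension across ranks, `proj` along its input dimension.
    def __init__(self, n_embd: int, intermediate_size: int, world_size: int):
        super().__init__()
        assert intermediate_size % world_size == 0
        shard = intermediate_size // world_size
        self.fc = nn.Linear(n_embd, shard)    # column-parallel slice
        self.proj = nn.Linear(shard, n_embd)  # row-parallel slice

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.proj(torch.nn.functional.gelu(self.fc(x)))
        # Each rank holds only a partial sum of the output; skipping this
        # all-reduce is exactly the kind of bug that turns generations into
        # nonsense for the affected MLP classes.
        if dist.is_initialized():
            dist.all_reduce(y, op=dist.ReduceOp.SUM)
        return y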
