[Embedding] Add inf-cl in embedding trainer #9673

jie-z-0607 · 2024-12-23T08:29:08Z

PR types

Function optimization

PR changes

Others

Description

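Add inf_cl_loss to embedding training. It effectively reduces GPU memory consumption at very large batch sizes.

Testing shows that the inf-cl operator aligns well with the original loss function:

With the data type set to bf16, group_size set to 1, and gradient_accumulation_steps set to 4, the convergence curves of inf_cl_loss and the original contrastive_loss are:

[Figure: convergence curves of inf_cl_loss vs. contrastive_loss]

Testing also shows that, at very large batch sizes, the inf-cl operator effectively reduces GPU memory consumption during embedding training:

On 8 A100 (80G) GPUs, with the data type set to bf16, group_size set to 4, and gradient_accumulation_steps set to 4096, the GPU memory usage of inf_cl_loss and the original contrastive_loss compares as follows: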

| Settings | GPU memory usage | Time to complete the first step |
| --- | --- | --- |
| Without inf-cl; embedding_negatives_cross_device=True | 42238MiB; 42526MiB; 42526MiB; 42470MiB; 42470MiB; 42526MiB; 42526MiB; 42182MiB | 48min42s |
| With inf-cl; embedding_negatives_cross_device=False | 29630MiB; 28392MiB; 28372MiB; 28308MiB; 28320MiB; 28384MiB; 28316MiB; 28070MiB | 49min56s |
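
To put the table above in relative terms, the following sketch averages the eight memory readings per configuration and compares the first-step times. It assumes the eight values are one reading per GPU, which the description does not state explicitly; the variable names are illustrative only.

```python
# Memory readings (MiB) copied from the table above.
# Assumption: each list holds one reading per GPU.
baseline_mib = [42238, 42526, 42526, 42470, 42470, 42526, 42526, 42182]
inf_cl_mib = [29630, 28392, 28372, 28308, 28320, 28384, 28316, 28070]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


baseline_mean = mean(baseline_mib)
inf_cl_mean = mean(inf_cl_mib)
memory_reduction = 1.0 - inf_cl_mean / baseline_mean

# First-step wall-clock times from the same table, in seconds.
baseline_seconds = 48 * 60 + 42
inf_cl_seconds = 49 * 60 + 56
time_overhead = inf_cl_seconds / baseline_seconds - 1.0

print(f"mean memory without inf-cl: {baseline_mean:.0f} MiB")
print(f"mean memory with inf-cl:    {inf_cl_mean:.0f} MiB")
print(f"memory reduction:           {memory_reduction:.1%}")
print(f"first-step time overhead:   {time_overhead:.1%}")
```

On these figures the mean memory drops by roughly a third, while the first step takes a few percent longer.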

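On 8 A100 (80G) GPUs, with the data type set to bf16, group_size set to 1, and gradient_accumulation_steps set to 16384 (a total batch_size of 128K), the GPU memory usage of inf_cl_loss and the original contrastive_loss compares as follows: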

| Settings | GPU memory usage | Time to complete the first step |
| --- | --- | --- |
| Without inf-cl; embedding_negatives_cross_device=True | Out of GPU memory | |
| With inf-cl; embedding_negatives_cross_device=False | 46324MiB; 45192MiB; 44926MiB; 45180MiB; 44674MiB; 45022MiB; 45032MiB; 44904MiB | 2h23min46s |
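
The memory saving reported above is consistent with the general idea behind Inf-CL: computing the contrastive loss tile by tile so that the full query-by-key similarity matrix is never held in memory at once. The sketch below is not the Triton kernel added in this PR. It is a pure-Python illustration, under that assumption, of how a running log-sum-exp over key chunks gives the same loss as the full-matrix computation. All function names here are hypothetical.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def full_loss(queries, keys, labels):
    """Reference: compute every logit of a row at once, then cross-entropy."""
    total = 0.0
    for q, label in zip(queries, labels):
        logits = [dot(q, k) for k in keys]
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        total += lse - logits[label]
    return total / len(queries)


def chunked_loss(queries, keys, labels, chunk_size):
    """Same loss, but only `chunk_size` logits per query exist at a time."""
    total = 0.0
    for q, label in zip(queries, labels):
        running_max = -math.inf
        running_sum = 0.0  # sum of exp(logit - running_max) seen so far
        for start in range(0, len(keys), chunk_size):
            chunk = [dot(q, k) for k in keys[start:start + chunk_size]]
            new_max = max(running_max, max(chunk))
            # Rescale the accumulated sum to the new maximum, then add the chunk.
            running_sum *= math.exp(running_max - new_max)
            running_sum += sum(math.exp(x - new_max) for x in chunk)
            running_max = new_max
        lse = running_max + math.log(running_sum)
        total += lse - dot(q, keys[label])
    return total / len(queries)


random.seed(0)
dim, num_queries, group_size = 3, 4, 2
queries = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_queries)]
keys = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_queries * group_size)]
labels = [i * group_size for i in range(num_queries)]

difference = abs(full_loss(queries, keys, labels) - chunked_loss(queries, keys, labels, 3))
print(f"absolute difference between the two losses: {difference:.2e}")
```

The chunk size only changes how much is held in memory at once, not the result, which is why a tiled kernel can match the original loss curve while using less memory.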

paddle-bot · 2024-12-23T08:29:12Z

Thanks for your contribution!

codecov · 2024-12-23T09:02:15Z

Codecov Report

Attention: Patch coverage is 13.95349% with 37 lines in your changes missing coverage. Please review.

Project coverage is 52.76%. Comparing base (1842d6d) to head (8f55e52).
Report is 7 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| paddlenlp/transformers/contrastive_loss.py | 18.18% | 27 Missing ⚠️ |
| paddlenlp/trl/embedding_trainer.py | 0.00% | 10 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9673      +/-   ##
===========================================
- Coverage    53.18%   52.76%   -0.43%     
===========================================
  Files          718      718              
  Lines       113340   112338    -1002     
===========================================
- Hits         60282    59276    -1006     
- Misses       53058    53062       +4

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ZHUI · 2024-12-23T09:13:11Z

ops/src/paddlenlp_kernel/triton/inf_cl/inf_cl_loss.py

+__all__ = ["Simple_Inf_cl_loss", "Matryoshka_Inf_cl_loss"]
+
+
+class Simple_Inf_cl_loss(nn.Layer):


Please add some comments.

ZHUI · 2024-12-23T09:15:02Z

paddlenlp/trl/embedding_trainer.py

@@ -18,6 +18,10 @@
 from paddle.base import core
 from paddle.distributed import fleet

+from ops.src.paddlenlp_kernel.triton.inf_cl.inf_cl_loss import (


Suggested change

from ops.src.paddlenlp_kernel.triton.inf_cl.inf_cl_loss import (

from paddlenlp_kernel.triton.inf_cl.inf_cl_loss import (

ZHUI · 2024-12-23T09:26:05Z

paddlenlp/trl/embedding_trainer.py

@@ -18,6 +18,10 @@
 from paddle.base import core
 from paddle.distributed import fleet

+from ops.src.paddlenlp_kernel.triton.inf_cl.inf_cl_loss import (


This package is not installed by default, so the import needs to be wrapped in a try/except.

ZHUI · 2024-12-24T08:29:06Z

paddlenlp/transformers/contrastive_loss.py

+        group_size = p_reps.shape[0] // q_reps.shape[0]  # Number of keys per query
+        labels = paddle.arange(q_reps.shape[0], dtype="int64")  # Generate labels for queries
+        labels = labels * group_size  # Adjust labels based on group size
+        loss = cal_inf_loss(q_reps, p_reps, labels=labels, scale=None, head_dim=self.head_dim)
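
The label arithmetic in the hunk above can be checked without Paddle. The sketch below mirrors it with plain lists, assuming that the keys for each query are stored contiguously and that the positive key is the first one in each group. That layout is inferred from the code, not stated in the PR. `positive_key_labels` is a hypothetical helper name.

```python
def positive_key_labels(num_queries, num_keys):
    """Index of each query's positive key, assuming it leads its group of keys."""
    group_size = num_keys // num_queries  # number of keys per query
    return [i * group_size for i in range(num_queries)]


# 4 queries with 2 keys each: positives sit at every second key index.
labels = positive_key_labels(num_queries=4, num_keys=8)
print(labels)  # [0, 2, 4, 6]
```

With group_size 1 the labels reduce to 0, 1, 2, ..., which is the usual in-batch contrastive setup.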


Move the import code here, and if the package is missing, just raise an error.

try:
    from paddlenlp_kernel.triton.inf_cl import cal_inf_loss
except ImportError:
    logger.warning(
        "Paddlenlp_kernels are not available, which means the inf_cl loss cannot be used. If you wish to use the inf_cl loss, please follow the instructions in the README.md on the `ops`."
    )
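
One way to read the request above is to defer the import to the point of use and turn a missing package into a hard error instead of a warning. The following is a minimal sketch of that pattern using only the standard library. `load_cal_inf_loss` is a hypothetical helper, not a function from PaddleNLP, and the code merged in the PR may look different.

```python
import importlib


def load_cal_inf_loss(module_name="paddlenlp_kernel.triton.inf_cl"):
    """Import the optional kernel module on demand and fail loudly if absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with an actionable message instead of only logging a warning.
        raise ImportError(
            f"{module_name} is not available, which means the inf_cl loss cannot "
            "be used. Please follow the instructions in the README.md on the `ops`."
        ) from exc
    return module.cal_inf_loss


# Demonstration with a module name that is assumed not to exist.
try:
    load_cal_inf_loss("hypothetical_missing_pkg_xyz.inf_cl")
except ImportError as exc:
    print(f"import failed as expected: {exc}")
```

Users who never select the inf_cl loss then never trigger the import, while those who do get an immediate, explicit failure.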

add inf-cl in embedding trainer

dd7fc8a

paddle-bot bot added the contributor label Dec 23, 2024

paddle-bot bot assigned KB-Ding Dec 23, 2024

ZHUI reviewed Dec 23, 2024

jie-z-0607 added 4 commits December 23, 2024 18:00

add annotations and fix import

3b06655

rename inf_cl_loss and fix warning

e3c55c3

rename simple_inf_cl

6b6a108

Change inf_cl location

d69ac4a

ZHUI reviewed Dec 24, 2024

jie-z-0607 added 2 commits December 24, 2024 16:38

Change import location

f05dd61

Change error information

8f55e52

jie-z-0607 requested a review from ZHUI December 24, 2024 09:00

DesmonDay approved these changes Dec 25, 2024

ZHUI changed the title ~~add inf-cl in embedding trainer~~ [Embedding] Add inf-cl in embedding trainer Dec 25, 2024

ZHUI merged commit 40fa402 into PaddlePaddle:develop Dec 25, 2024
8 of 12 checks passed
