[CP] CP Lm head fp32 and temp_logprob to release/2.1 (#3766)

* [Feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing (#3552)

* [feature] Add temp_scaled_logprobs and top_p_normalized_logprobs parameters for logits and logprobs post processing

* infer engine support temp_scaled_logprobs and top_p_normalized_logprobs

* delete some code

* code check

* code check and add doc

* fix tokenizer.decoder(-1), return 'Invalid Token'

* add ci for temp_scaled and top_p logprobs

* check test

* check seq len time shape

* logprob clip inf

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>

* [Precision] Support lm_head layer running in float32 (#3597)

* support lm_head fp32 bf16 fp16

* support lm_head fp32 bf16 fp16

* add doc and check code

* lm_head_fp32 specify lm_head as fp32

* code check

* check doc

* code check

---------

Co-authored-by: sunlei1024 <sunlei5788@gmail.com>
This commit is contained in:
chen
2025-09-01 19:56:54 +08:00
committed by GitHub
parent 4da603daec
commit 1e19833ba5
22 changed files with 188 additions and 54 deletions

View File

@@ -621,7 +621,7 @@ class DeepseekV3ForCausalLM(ModelForCasualLM):
def compute_logits(self, hidden_states: paddle.Tensor):
""" """
logits = self.lm_head(hidden_states)
logits = paddle.cast(logits, paddle.float32)
logits = logits.astype(paddle.float32)
logits[:, self.ori_vocab_size :] = -float("inf")
return logits