FastDeploy/draft_model at c294fc8139af20d0f1bef24eb74964f74e8b70c6 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-10-06 00:57:33 +08:00

freeliuzc 52eda7fdb3 [Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

draft_model_postprocess.cu

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

draft_model_preprocess.cu

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

draft_model_set_value_by_flags.cu

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

draft_model_update.cu

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

eagle_get_base_model_hidden_states.cu

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00

eagle_get_self_hidden_states.cu

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

hydra_fetch_hidden_states.cu

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

mtp_save_first_token.cc

Sync v2.0 version of code to github repo

2025-06-29 23:29:37 +00:00

mtp_step_paddle.cu

[LLM] First commit the llm deployment code

2025-06-09 19:20:15 +08:00

ngram_match_mixed.cu

[Feature][MTP]support new speculative decoding method named hybrid mtp with ngram (#3610)

2025-08-26 14:29:22 +08:00