FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Files

周周周 ddb10ac509 [Inference, rename] remove padding_offsets from atten use batch_id_per_token (#2880)

* remove padding_offsets from atten

2025-07-17 18:41:31 +08:00

__init__.py

…

dcu_worker.py

[BugFix] Fix Configs (#2849)

2025-07-15 19:50:36 -07:00

eplb.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

experts_manager.py

[Sync] Update to latest code (#2679)

2025-07-03 15:43:53 +08:00

gcu_model_runner.py

[BugFix] Fix Configs (#2849)

2025-07-15 19:50:36 -07:00

gcu_worker.py

[GCU] Support gcu platform (#2702)

2025-07-08 13:00:52 +08:00

gpu_model_runner.py

[Inference, rename] remove padding_offsets from atten use batch_id_per_token (#2880)

2025-07-17 18:41:31 +08:00

gpu_worker.py

[LLM] Update Multinode Deployment (#2830)

2025-07-16 23:42:54 +08:00

iluvatar_model_runner.py

[Feature] support prompt repetition_penalty (#2806)

2025-07-17 12:05:52 +08:00

iluvatar_worker.py

Adapt for iluvatar gpu (#2684)

2025-07-07 16:53:14 +08:00

model_runner_base.py

…

output.py

Merge vl execution path into normal execution path (#2829)

2025-07-15 22:20:03 +08:00

utils.py

…

worker_base.py

…

worker_process.py

[LLM] fix serval bugs (#2878)

2025-07-17 14:21:05 +08:00

xpu_model_runner.py

[XPU] Update doc and add scripts for downloading dependencies (#2845)

2025-07-16 11:05:56 +08:00

xpu_worker.py

[BugFix] Fix Configs (#2849)

2025-07-15 19:50:36 -07:00