apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2025-12-24 13:28:13 +08:00)
d60f7c46619ba45044d1fcd866d09f0e96bd88a2
FastDeploy/fastdeploy/worker
AIbin a7392a0ff9 【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)
...
* support MLA chunk_size auto search & cuda_graph
2025-09-11 10:46:09 +08:00
..
__init__.py
…
dcu_model_runner.py
…
dcu_worker.py
…
eplb.py
…
experts_manager.py
…
gcu_model_runner.py
[Excutor] Experiment Feature-Support Prefill in cudagraph (#3459)
2025-09-08 13:12:24 +08:00
gcu_worker.py
…
gpu_model_runner.py
【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886)
2025-09-11 10:46:09 +08:00
gpu_worker.py
[BugFix] Fix the abnormal memory usage caused by shape errors in the triton moe backend (#4026)
2025-09-09 20:05:54 -07:00
iluvatar_model_runner.py
…
iluvatar_worker.py
…
metax_model_runner.py
…
metax_worker.py
…
model_runner_base.py
…
output.py
…
utils.py
…
worker_base.py
…
worker_process.py
[V1 Loader] Ernie kv cache quant support v1 loader (#3899)
2025-09-09 05:25:08 -07:00
xpu_model_runner.py
[XPU]Fixed the issue of performance degradation caused by enabling ENABLE_V1_KVCACHE_SCHEDULER (#3897)
2025-09-08 10:34:46 +08:00
xpu_worker.py
…