李泳桦
eca8fc7ca6
[feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client ( #3077 )
...
* [feat] extra parameters are all passed directly via http payload now, or in extra_body if using openai client
* [fix] delete ci test case for enable_thinking
* [fix] add reasoning_parser when server starts
* [doc] update docs related to metadata
* [fix] fix ci consistency test error with reasoning parser
* [fix] cancel enable_thinking default value
2025-07-30 19:25:39 +08:00
Yzc216
980126b83a
[Feature] multi source download ( #3005 )
...
* multi-source download
* multi-source download
* huggingface download revision
* requirement
* style
* add revision arg
* test
* pre-commit
* Change default download
* change requirements.txt
* modify English Documentation
* documentation
2025-07-24 17:42:09 +08:00
lizexu123
67990e0572
[Feature] support min_p_sampling ( #2872 )
...
* Fastdeploy support min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py
2025-07-20 23:17:59 -07:00
Zero Rains
25698d56d1
polish code with new pre-commit rule ( #2923 )
2025-07-19 23:19:27 +08:00
RAM
bbe2c5c968
Update GraphOptimizationBackend docs ( #2898 )
2025-07-17 21:38:18 +08:00
yulangz
c8c280c4d3
[XPU][Doc] fix typo ( #2892 )
2025-07-17 19:13:54 +08:00
yulangz
7dfd2ea052
[XPU][doc] Update minimal fastdeploy required ( #2863 )
...
* [XPU][doc] update minimal fastdeploy required
2025-07-17 11:33:22 +08:00
yulangz
17314ee126
[XPU] Update doc and add scripts for downloading dependencies ( #2845 )
...
* [XPU] update xvllm download
* update supported models
* fix xpu model runner in huge memory with small model
* update doc
2025-07-16 11:05:56 +08:00
zhenwenDang
5fc659b900
[Docs] add enable_logprob parameter description ( #2850 )
...
* add enable_logprob parameter description
* add enable_logprob parameter description
* add enable_logprob parameter description
* add enable_logprob parameter description
* add enable_logprob parameter description
* add enable_logprob parameter description
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-15 19:47:45 +08:00
AIbin
b7858c22d9
【Update Docs】update supported_models doc ( #2836 )
...
* update supported_models doc
2025-07-14 16:01:34 +08:00
Sunny-bot1
240d6236bc
[Fix]fix top_k_top_p sampling ( #2801 )
...
* fix topk-topp
* update
* add base_non_truncated
2025-07-10 22:35:10 +08:00
LiqinruiG
ce5adec877
[Doc] modify offline-inerence docs ( #2800 )
...
* modify offline-inerence docs
* [bug] remove tool_call_content
2025-07-10 19:41:12 +08:00
yulangz
830de5a925
[XPU] Supports TP4 deployment on 4,5,6,7 ( #2794 )
...
* Support running on cards 4,5,6,7 selected via XPU_VISIBLE_DEVICES
* Update the multi-card instructions in the XPU documentation
2025-07-10 16:48:08 +08:00
Sunny-bot1
1e2319cbef
Rename top_p_sampling to top_k_top_p_sampling ( #2791 )
2025-07-10 00:09:25 -07:00
Sunny-bot1
e45050cae3
[Feature] support top_k_top_p sampling ( #2753 )
...
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
2025-07-09 20:58:58 -07:00
LiqinruiG
54affdc44b
[Doc] modify offline_inference docs ( #2787 )
...
* modify reasoning_output docs
* modify offline inference docs
* modify offline inference docs
* modify offline_inference docs
* modify offline_inference docs
2025-07-10 01:06:14 +08:00
LiqinruiG
4ccd1696ab
[Doc] modify offline inference docs ( #2747 )
...
* modify reasoning_output docs
* modify offline inference docs
* modify offline inference docs
2025-07-09 20:53:26 +08:00
chen
888780ffde
[Feature] block_wise_fp8 support triton_moe_backend ( #2767 )
2025-07-09 19:22:47 +08:00
lifulll
1f28bdf994
dcu adapter ernie45t ( #2756 )
...
Co-authored-by: lifu <lifu@sugon.com>
Co-authored-by: yongqiangma <xing.wo@163.com>
2025-07-09 18:56:27 +08:00
zhink
b89180f1cd
[Feature] support custom all-reduce ( #2758 )
...
* [Feature] support custom all-reduce
* add vllm adapted
2025-07-09 16:00:27 +08:00
EnflameGCU
d0f4d6ba3a
[GCU] Support gcu platform ( #2702 )
...
baseline: e7fa57ebae
Co-authored-by: yongqiangma <xing.wo@163.com>
2025-07-08 13:00:52 +08:00
chen
66b321d9ec
Update eb45-0.3B cuda memory ( #2686 )
2025-07-07 11:31:15 +08:00
LiqinruiG
b38823bc66
modify reasoning_output docs ( #2696 )
2025-07-04 11:30:02 +08:00
kevin
3d3bccdf79
[doc] update docs ( #2690 )
2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
d222248d00
Update README.md
2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117
Update README.md
2025-07-03 15:28:05 +08:00
handiz
b8a8a19689
add wint2 performance ( #2673 )
2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f
Update nvidia_gpu.md
2025-07-02 16:54:14 +08:00
Jiang-Jia-Jun
685265a97d
Update nvidia_gpu.md
2025-07-02 15:43:35 +08:00
Jiang-Jia-Jun
fc4d643634
Update nvidia_gpu.md
2025-07-02 15:39:48 +08:00
liddk1121
865e856a94
update iluvatar gpu fastdeploy whl ( #2675 )
2025-07-02 14:47:21 +08:00
Jiang-Jia-Jun
9f4a65d817
Update README.md
2025-07-02 10:04:58 +08:00
freeliuzc
2b7f74d427
fix docs ( #2669 )
...
Co-authored-by: liuzichang01 <liuzichang01@baidu.com>
2025-07-01 18:02:44 +08:00
Jiang-Jia-Jun
164b83ab0b
[Doc] Update nvidia gpu installation description
2025-07-01 15:22:19 +08:00
Jiang-Jia-Jun
01d5d66d95
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:40 +08:00
Jiang-Jia-Jun
8f1dddcf35
[Doc] Update nvidia gpu installation description
2025-07-01 15:20:21 +08:00
hong19860320
8e335db645
Update kunlunxin_xpu.md ( #2662 )
2025-07-01 15:10:45 +08:00
AIbin
1bb296c5ad
update quantization doc ( #2659 )
2025-07-01 15:05:02 +08:00
hong19860320
92428a5ae4
Update kunlunxin_xpu.md ( #2657 )
2025-07-01 12:28:49 +08:00
hong19860320
6b95b42986
Update kunlunxin_xpu.md
2025-06-30 15:49:32 +08:00
hong19860320
b0d3a630ba
Merge branch 'develop' of https://github.com/hong19860320/FastDeploy into hongming/fix_xpu_doc
2025-06-30 15:42:29 +08:00
hong19860320
ef72873695
Update kunlunxin_xpu.md
2025-06-30 15:27:48 +08:00
kevin
4f7b42ce3e
update docs
2025-06-30 14:45:41 +08:00
qingqing01
90a5b18742
Update disaggregated.md
2025-06-30 11:57:12 +08:00
qingqing01
7c43500060
Update disaggregated.md
2025-06-30 11:56:33 +08:00
Jiang-Jia-Jun
ea29b01a68
Update quick_start.md
2025-06-30 11:52:05 +08:00
yongqiangma
f9431106d8
Merge branch 'develop' into doc
2025-06-30 11:42:43 +08:00
mayongqiang
0d39e23ab9
fix format
2025-06-30 11:39:59 +08:00
changwenbin
634d3c3642
update wint2 doc
2025-06-30 11:36:15 +08:00
Jiang-Jia-Jun
50c5bc1e9d
Update nvidia_gpu.md
2025-06-30 08:59:41 +08:00