Jiang-Jia-Jun
e421d51001
[Feature] Support include_stop_str_in_output ( #2919 )
...
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
v2.0.2
2025-07-18 19:43:19 +08:00
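The commit above exposes `include_stop_str_in_output` through the serving API. Below is a minimal client-side sketch, assuming an OpenAI-compatible /v1/chat/completions endpoint and that the flag is forwarded via `extra_body`; the server address and model name are placeholders, not values from the commit.

```python
# Minimal sketch: ask the server to keep the matched stop string in the output.
# Assumes an OpenAI-compatible FastDeploy server on localhost:8000; the
# extra_body passthrough of include_stop_str_in_output is an assumption.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

resp = client.chat.completions.create(
    model="default",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello, then stop."}],
    stop=["\n"],
    extra_body={"include_stop_str_in_output": True},
)
print(resp.choices[0].message.content)
```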
sg263
c71d955e9c
[Trace] fix opentelemetry not working in uvicorn ( #2907 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
* fix opentelemetry-bootstrap
* fix opentelemetry-bootstrap
* fix opentelemetry not working in uvicorn
* remove useless import
* move conf to env
* fix useless commit
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 23:16:29 +08:00
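The OpenTelemetry commits above wire tracing into the FastAPI app served under uvicorn. A generic sketch of that pattern follows, using the standard opentelemetry-sdk and opentelemetry-instrumentation-fastapi packages; this is not FastDeploy's actual wiring, and the console exporter and /ping route are illustrative only.

```python
# Generic sketch: instrument a FastAPI app so spans are emitted when it runs
# under uvicorn. Exporter (console) and route are illustrative only.
from fastapi import FastAPI
import uvicorn
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))

app = FastAPI()

@app.get("/ping")
def ping():
    return {"status": "ok"}

FastAPIInstrumentor.instrument_app(app)  # one span per incoming request

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```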
gaoziyuan
2d2468ae72
fix config get ( #2883 )
2025-07-17 15:03:26 +08:00
sg263
7deac64233
[Bug Fix] fix opentelemetry-bootstrap ( #2875 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
* fix opentelemetry-bootstrap
* fix opentelemetry-bootstrap
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:51:02 +08:00
sg263
5a5f17cf97
fix: put opentelemetry-instrumentation-fastapi in requirements ( #2874 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* fix opentelemetry-instrumentation-fastapi
* fix annotation
---------
Co-authored-by: shige <shige@baidu.com>
2025-07-17 00:41:53 +08:00
sg263
0d61c65de1
[Trace] Support trace log ( #2864 )
...
* add opentelemetry
* add opentelemetry
* add opentelemetry on dequeue
* add opentelemetry on dequeue
* add opentelemetry on dequeue
2025-07-16 15:35:44 +08:00
Jiang-Jia-Jun
e5de28bff2
Update setup.py
2025-07-15 10:11:26 +08:00
AIbin
b9eede57b6
cherry-pick PR #2820 to release/2.0.2 ( #2839 )
2025-07-14 17:05:56 +08:00
lddfym
94e1a895e3
fix spelling error ( #2826 )
...
* fix spelling error
* fix scheduler reset error
2025-07-14 13:13:08 +08:00
zhenwenDang
87203ec87b
Fix incorrect finish_reason returned after enabling "top_logprobs supports passing 0 and fix max_completion_tokens" ( #2815 )
...
* /v1/chat/completions endpoint now supports max_completion_tokens and fixes the return value of finish_reason
* top_logprobs supports passing 0
2025-07-11 16:53:12 +08:00
Sunny-bot1
4596dd7248
[FIX 2.0.2] fix top_p/top_k default values ( #2810 )
...
* fix topp topk default value
* update topk
2025-07-11 16:12:02 +08:00
lddfym
ec986642df
Global scheduler supports configuring hot updates ( #2812 )
2025-07-11 13:39:30 +08:00
chen
94691bcd90
fix enable_logprob not in rl_config ( #2808 )
2025-07-11 11:52:48 +08:00
Sunny-bot1
4025ea7e5b
[FIX 2.0.2] Topk topp sampling fix ( #2805 )
...
* fix topk-topp
* fix
2025-07-10 06:15:03 -07:00
lizexu123
e681e1e719
[BugFix] fix RMSNorm rms_norm_eps ( #2804 )
2025-07-10 05:39:02 -07:00
chen
823a47e64a
[Feature] Support return logprob of generated tokens ( #2784 )
...
* online chat support logprobs
* check xpu
* check vl_gpu_model_runner
* only cuda support logprob
* get_worker() check platform
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-10 15:47:42 +08:00
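Commit #2784 above adds per-token logprobs to the online chat API. A hedged client-side sketch, assuming the standard OpenAI-style `logprobs` / `top_logprobs` fields are honored; server address and model name are placeholders, and per the commit body only CUDA platforms support this.

```python
# Hedged sketch: request logprobs of generated tokens from an OpenAI-compatible
# /v1/chat/completions endpoint. Address and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")

resp = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "2 + 2 = ?"}],
    logprobs=True,    # return the logprob of each generated token
    top_logprobs=5,   # and the 5 most likely alternatives per position
    max_tokens=8,
)
for tok in resp.choices[0].logprobs.content:
    print(tok.token, tok.logprob)
```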
gaoziyuan
39d2a1de46
fix num_blocks_local for small models in TP2 running mode ( #2793 )
2025-07-10 13:44:56 +08:00
Sunny-bot1
1107e08cd9
[Feature 2.0.2] support top_k_top_p sampling ( #2789 )
...
* support top_k_top_p sampling
* fix
* add api param
* add api para
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* change func name
2025-07-09 21:01:51 -07:00
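Commit #2789 above adds combined top-k / top-p sampling parameters to the API. As an illustration of what that filter does (not the kernel added in the commit), here is a small NumPy sketch that keeps only tokens inside both the top-k set and the top-p nucleus before renormalizing.

```python
# Illustrative sketch of combined top-k / top-p filtering over a logits vector.
import numpy as np

def top_k_top_p_filter(logits: np.ndarray, top_k: int = 20, top_p: float = 0.95) -> np.ndarray:
    """Return a probability vector restricted to the top-k / top-p token set."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]                  # most likely tokens first
    keep = np.zeros_like(probs, dtype=bool)
    keep[order[:top_k]] = True                       # top-k cut

    cumulative = np.cumsum(probs[order])
    nucleus = order[: max(1, int(np.searchsorted(cumulative, top_p)) + 1)]
    keep &= np.isin(np.arange(len(probs)), nucleus)  # intersect with the top-p nucleus

    filtered = np.where(keep, probs, 0.0)
    return filtered / filtered.sum()

rng = np.random.default_rng(0)
next_token = rng.choice(32, p=top_k_top_p_filter(rng.normal(size=32)))
print(next_token)
```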
Jiang-Jia-Jun
1fe37cb7e8
[BugFix] Fix vocab size error for ernie model
2025-07-09 22:33:04 +08:00
gaoziyuan
337d76f094
[sync fix] ( #2759 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
* fix
2025-07-08 19:29:23 +08:00
gaoziyuan
ae2f78184d
[Sync develop] add commit info ( #2755 )
...
* add rl qwen model support
* fix
* fix
* add_commit_config
2025-07-08 17:02:50 +08:00
gaoziyuan
6851489425
[Sync] Release/2.0.1 ( #2745 )
...
* add rl qwen model support
* fix
* fix
2025-07-08 14:38:18 +08:00
Jiang-Jia-Jun
ea787d8f62
fix bug. ( #2718 ) ( #2720 )
...
Co-authored-by: Ting <wtmlon@foxmail.com>
2025-07-05 09:00:01 +08:00
Ting
90ef28d982
make spec token map lazy ( #2715 )
2025-07-05 00:14:54 +08:00
YuBaoku
b37585e693
[BugFix] fix paddle_git_commit_id error ( #2714 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
* [BugFix] fix paddle_git_commit_id error
2025-07-04 22:16:37 +08:00
lizexu123
9cb08e71e8
add support for QwQ enable_thinking ( #2706 )
...
* add support QWQ enable_thinking
* add stream=True
* fix stream=true
* fix qwen
---------
Co-authored-by: lizexu <lizexu@baidu.com>
2025-07-04 20:55:23 +08:00
YuBaoku
dacc46f04c
[CI] Add validation for MTP and CUDAGraph ( #2710 )
...
* set git identity to avoid merge failure in CI
* add ci cases
* [CI] Add validation for MTP and CUDAGraph
2025-07-04 18:13:54 +08:00
Jiang-Jia-Jun
09ded7715f
Update mkdocs.yml
2025-07-04 17:55:52 +08:00
LQX
11cfdf5d89
Add XPU CI, test=model ( #2701 )
...
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
* Add XPU CI, test=model
2025-07-04 16:16:06 +08:00
GoldPancake
e7fa57ebae
Extract eh_proj Layer from ParallelLMHead for MTP to Avoid Weight Transposition Issue ( #2707 )
...
* fix mtp eh_proj layer
* fix mtp update_cfg function
* fix docstring
* simplify class name
2025-07-04 14:15:04 +08:00
gaoziyuan
a5ae88ded9
[feature] add fd whl version info ( #2698 )
2025-07-04 14:12:42 +08:00
ltd0924
87e638498c
[RL] update reschedule finish reason ( #2709 )
2025-07-04 13:47:36 +08:00
freeliuzc
667547be59
support chunk_prefill in MTP ( #2705 )
2025-07-04 11:55:48 +08:00
LiqinruiG
b38823bc66
modify reasoning_output docs ( #2696 )
2025-07-04 11:30:02 +08:00
Divano
050d9658a5
Update requirements.txt
2025-07-04 09:53:03 +08:00
Divano
be5cabaf80
add quick benchmark ( #2703 )
...
Test scripts do not need to go through CI
2025-07-04 09:32:36 +08:00
Yuanle Liu
240bdac2a4
[feat] support fa3 backend for pd disaggregated ( #2695 )
...
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* support fa3 backend run in pd disaggregated
* delete use_fast_ffn
2025-07-03 22:33:27 +08:00
ltd0924
00863c43fd
[Bug] fix logger format ( #2689 )
2025-07-03 19:58:03 +08:00
kevin
3d3bccdf79
[doc] update docs ( #2690 )
2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
9fd74f75bd
Update dynamic_weight_manager.py
2025-07-03 15:55:22 +08:00
Jiang-Jia-Jun
05c670e593
[Sync] Update to latest code ( #2679 )
...
* [Sync] Update to latest code
* Add new code files
* Add new code files
* update code
* Try to fix build.sh
* Try to fix build.sh
* Update code
* Update requirements.txt
* Update code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com>
2025-07-03 15:43:53 +08:00
Jiang-Jia-Jun
d222248d00
Update README.md
2025-07-03 15:28:28 +08:00
Jiang-Jia-Jun
e5b94d4117
Update README.md
2025-07-03 15:28:05 +08:00
Jiang-Jia-Jun
87e2e58a22
Update gh-pages.yml
2025-07-03 15:26:21 +08:00
Jiang-Jia-Jun
de20e5a992
Update Dockerfile.xpu
2025-07-03 10:14:50 +08:00
Jiang-Jia-Jun
2f9c0618f0
Update Dockerfile.gpu
2025-07-03 10:14:39 +08:00
Yuanle Liu
9a14ab6572
add --force-reinstall --no-cache-dir when installing fastdeploy*.whl with pip ( #2682 )
2025-07-02 05:32:20 -07:00
Divano
d1cb3ed571
Update gh-pages.yml ( #2680 )
2025-07-02 17:36:18 +08:00
handiz
b8a8a19689
add wint2 performance ( #2673 )
2025-07-02 17:10:01 +08:00
Jiang-Jia-Jun
97ac82834f
Update nvidia_gpu.md
2025-07-02 16:54:14 +08:00