FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Author	SHA1	Message	Date
kevin	8e4e3ff510	[Feature] support eplb in api_server (#4782 ) * support eplb in api_server * update code * add eplb test case * update eplb * support tp+dp eplb * update test cese * update code * update code * fix bug * update copilot review * update test case name	2025-11-24 20:22:29 +08:00
Jiaxin Sui	5ff93d4998	[XPU][CI] change VL model to 28B-VL-thinking (#5169 ) * Enhance run_ci_xpu.sh with caching and prefill options * Update model path and configuration in run_ci_xpu.sh * Add '北朝' keyword to assertion in run_45vl.py * Enhance process termination logic in run_ci_xpu.sh * Set timeout for CI_XPU job to 60 minutes * Remove extra newline in stop_processes function	2025-11-24 16:50:18 +08:00
xunyoyo	7bac016c77	[CI] 【Hackathon 9th Sprint No.18】NO.18 supplementary unit tests for functional module (#5064 ) * Add unit tests for DeepEP buffer functionality This file contains unit tests for the DeepEP buffer helpers and runners, including various test cases for buffer allocation, cleanup, and dispatching processes. * Refactor DeepEP tests to use scoped stubs * Add licensing information to test_ep.py Added licensing information to the test file.	2025-11-24 15:52:34 +08:00
YuBaoku	98f1ab46a9	[CI] add output for last_token in test_streaming_with_stop_str (#5170 )	2025-11-24 10:49:17 +08:00
周周周	e297406263	[Others] unitest tests/layers/test_attention_layer.py (#5174 )	2025-11-23 22:21:01 +08:00
kevin	cceaba1c8d	[Feature] remove to_numpy (#5162 ) * remove to_numpy * update code * update name * update code * update code * update code	2025-11-21 21:54:26 +08:00
kevin	c068a4f642	[Feature] dyc8 support prefixcache (#5125 ) * dyc8 support prefixcache * fix cache_trans test case * update code	2025-11-21 19:46:26 +08:00
chenjian	3ea1b44a58	[Optimization] Improve perf for fd response token with internal adapter (#4992 ) * [Optimize] Improve perf for fd response token with internal adapter * fix * fix bug * fix ci * fix ci * fix ci * fix ci	2025-11-21 19:02:03 +08:00
xiaoxiaohehe001	6ca2651995	[Feature] Support noaux for eplb (#5143 ) * support noaux eplb * noaux_eplb * noaux_eplb * noaux_eplb	2025-11-21 14:10:32 +08:00
essos	79f18331b6	[CI]【Hackathon 9th Sprint No.51】NO.51 supplementary unit tests for functional module fastdeploy/scheduler/dp_scheduler.py (#5046 ) * update test utils * Add comprehensive unit tests for DP scheduler functionality - Add test_dp_scheduler.py with full-featured unit tests supporting both normal and standalone modes - Add test_dp_scheduler_simple.py with lightweight mock-based tests for easy execution - Add comprehensive README.md documenting test architecture and usage - Tests cover DPLocalScheduler and DPScheduler classes with focus on: - Request lifecycle management and TTL support - Response handling and routing - Resource-based scheduling and constraint handling - Multi-threading and concurrent operations - Splitwise role support (prefill vs decode) - Error handling and edge cases - Thread-safe operations with proper synchronization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Remove tests/multimodal/test_utils.py This file appears to be duplicate or misplaced, removing it to clean up the test structure. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * update * fix * rm unused file --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-11-21 10:52:33 +08:00
kevin	7454480e07	[Feature] support bos download retry (#5137 ) * support bos download retry * update code * update code	2025-11-21 10:18:32 +08:00
Yonghua Li	43097a512a	[BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132 ) * [fix] fix v1 scheduler profile run for append attention in prefill node * [fix] skip send_signal if kv signal not inited for gpu and xpu * [fix] extend fix to flash_attn & mla_attn * [fix] fix v1 pd run in ipc transfer protocol * [ci] add test for v1 pd profile run using ipc transfer protocol * [style] fix code style check * [style] fix code style again * [fix] fix profile run * [update] remove --num-gpu-blocks-override in example script * [chore] rename forward_meta is_profiling to is_dummy_or_profile_run	2025-11-20 21:39:22 +08:00
周周周	385fe6dade	[Others] clean code (#5133 )	2025-11-20 18:44:08 +08:00
周周周	6fa34102e8	[Others]get_block_shape_and_split_kv_block clean code (#5123 )	2025-11-20 16:40:04 +08:00
yangjianfengo1	af715db763	[Scheduler] Support chunk prefill for video input (#5107 ) * add video chunk prefill * add vit_merge=True for test_tokenizer_client.py --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-11-20 16:29:13 +08:00
kevin	109d48e456	[Feature] support async download features (#5003 ) * support async download features * add test case * update code	2025-11-19 22:23:36 +08:00
Zhang Yulong	be9541a97b	[CI] add metrics case (#5115 ) * add case * add case	2025-11-19 11:50:12 +08:00
Winters Montagne	4694ed2a43	[CI]【Hackathon 9th Sprint No.31】NO.31 supplementary unit tests for functional module fastdeploy/input/ernie4_5_processor.py (#5097 ) * Add unit tests for ernie4_5_processor * update * update	2025-11-19 10:51:02 +08:00
Daci	eab8384da6	[Feature] ThreadPoolExecutor async fill_token_bitmask (#5083 ) * ThreadPoolExecutor async fill_token_bitmask * ThreadPoolExecutor async fill_token_bitmask logging * fix test_guided_decoding * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * add fill_bitmask_parallel_batch_size ENV * FD_FILL_BITMASK_BATCH fastdeploy.envs --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-19 10:04:16 +08:00
kxz2002	97189079b9	[BugFix] unify max_tokens (#4968 ) * unify max tokens * modify and add unit test * modify and add unit test * modify and add unit tests --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-11-18 20:01:33 +08:00
周周周	6584ee90e8	[unitest]clean code (#5094 )	2025-11-18 17:21:35 +08:00
lizhenyun01	d11235333e	format flash_mask_attn	2025-11-18 17:18:12 +08:00
Echo-Nie	abc9fd31c7	【Hackathon 9th No.76】supplementary unit test for XGrammarChecker (#4075 ) * supplementary unit test for XGrammarChecker * mock the xgrammer,torch	2025-11-17 22:05:53 +08:00
FocusLuo	c2c1942db9	[INTEL_HPU] [CI] enabled fastdeploy PR testing (#4596 ) * [INTEL HPU] added hpu ci work flow support Signed-off-by: Luo, Focus <focus.luo@intel.com> * [INTEL HPU] added run ci hpu test scripts Signed-off-by: Luo, Focus <focus.luo@intel.com> * [INTEL HPU] enabled HPU ernie test case Signed-off-by: Luo, Focus <focus.luo@intel.com> * [INTEL HPU] updated Intel Gaudi Readme with Warmup disable cmdline Signed-off-by: Luo, Focus <focus.luo@intel.com> * Modify paddlepaddle installation command Updated paddlepaddle installation command to use a specific index URL. * Update run_ci_hpu.sh * Rename json directory to nlohmann_json Rename extracted json directory to nlohmann_json. * Update ci_hpu.yml * Set pip global index URL to Tsinghua mirror * Update CI workflow to use self-hosted runner and paths * Update Docker image in CI workflow * Modify HPU installation URLs in run_ci_hpu.sh Updated the installation URL for paddle_intel_hpu and added paddlenlp_ops installation. * Fix paddle_intel_hpu installation URL Corrected the URL for paddle_intel_hpu wheel installation. --------- Signed-off-by: Luo, Focus <focus.luo@intel.com> Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>	2025-11-17 19:24:41 +08:00
周周周	b23e684b67	revert group size 3 (#5079 )	2025-11-17 18:54:13 +08:00
plusNew001	7f94d77e08	[XPU][CI] fix ci case bug (#5084 ) * Ignore markdown and text files in CI workflow * Change GPU_ID to XPU_ID in run_ci_xpu.sh * Change GPU_ID to XPU_ID in test configuration * Change GPU_ID to XPU_ID for service port calculation * Change GPU_ID to XPU_ID for device identification * Change GPU_ID to XPU_ID in test_ep function * Update run_w4a8.py * Redirect stop_processes output to kill.log Redirect output of stop_processes to kill.log to capture logs. * Log server output for failed test cases Added logging of server.log for failed tests. * Add '-s' option to pytest commands in run_ci_xpu.sh * Refactor assertion to validate multiple keywords Updated assertion to check for multiple keywords in response. * Fix assertany to assert any in run_45vl.py	2025-11-17 16:01:27 +08:00
LiqinruiG	33f96ff93a	[BugFix] rollback max_tokens and min_tokens when continue to infer (#5052 ) Co-authored-by: liqinrui <liqinrui@baidu.com>	2025-11-17 14:31:26 +08:00
Winters Montagne	ff26158f20	Add unit tests for triton_utils_v2 (#5073 )	2025-11-17 11:46:38 +08:00
Winters Montagne	02c83d65db	[CI]【Hackathon 9th Sprint No.13】NO.13 supplementary unit tests for functional module fastdeploy/model_executor/ops/triton_ops/triton_utils.py (#5035 ) * Add unit tests for triton_utils.py * update name * update * update * update	2025-11-17 11:43:31 +08:00
qwes5s5	36216e62f0	[Log] Add trace log and add loggingInstrumentor tool (#4692 ) * add trace logger and trace print * trigger ci * fix unittest * translate notes and add copyright --------- Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com> Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-11-17 11:08:57 +08:00
zhouchong	5444af6ff6	[APIServer] metrics use port the same as api_port (#5016 ) * metrics use port the same as api_port * Be tolerant to tests that monkeypatch/partially mock args. * Reduce code redundancy --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-11-17 10:42:45 +08:00
plusNew001	0e819cd596	[CI][XPU] Optimize CI logs and variable names (#5025 ) * Ignore markdown and text files in CI workflow * Change GPU_ID to XPU_ID in run_ci_xpu.sh * Change GPU_ID to XPU_ID in test configuration * Change GPU_ID to XPU_ID for service port calculation * Change GPU_ID to XPU_ID for device identification * Change GPU_ID to XPU_ID in test_ep function * Update run_w4a8.py * Redirect stop_processes output to kill.log Redirect output of stop_processes to kill.log to capture logs. * Log server output for failed test cases Added logging of server.log for failed tests. * Add '-s' option to pytest commands in run_ci_xpu.sh	2025-11-14 19:35:35 +08:00
Daci	5fc12eddfe	[Optimization] xgrammar async compile, multi thread, speed up (#4835 ) * xgrammar async compile, multi thread, speed up * fix test_sampler.py & pre-commit err * add redis version check && fix request.llm_engine_recv_req_timestamp * xgrammar prefill & decode & v0 * fix test_gpu_prompt_logprobs.py * add test_guided_decoding.py * Update fastdeploy/scheduler/splitwise_scheduler.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/model_executor/guided_decoding/xgrammar_backend.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix torch xgrammar unittest env --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-14 18:05:26 +08:00
Winters Montagne	b925533051	add test_process_video.py (#5011 )	2025-11-14 17:23:30 +08:00
周周周	c0a4393d72	[ATTENTION] unitest (#4962 )	2025-11-14 13:45:53 +08:00
essos	191a597d9f	[CI]【Hackathon 9th Sprint No.56】NO.56 supplementary unit tests for functional module fastdeploy/multimodal/utils.py (#4954 ) * update test utils * update test utils code * update test file name	2025-11-14 10:37:27 +08:00
Juncai	36822fa49c	[PD Disaggregation] remove splitwise deployment on single node and refine the code (#4891 ) * remove splitwise deployment on single node and refine the code * up * up * up * add test * up	2025-11-14 09:56:53 +08:00
kxz2002	9703108c28	[BugFix] adjust max_tokens and min_tokens when continue to generate tokens (#5010 ) * fix max and min tokens initial commit * fix double subtraction * add unit tests	2025-11-13 23:52:54 +08:00
yangjianfengo1	ae7bee8122	【New Feature】W4afp8 supports per group quantization (#4987 ) * w4afp8 supports per group * code style * fix transpose * revert fast hardmard --------- Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com> Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>	2025-11-13 19:17:27 +08:00
zccjjj	88da9d9788	[XPU] [CI] Change CI ep test from offline to online (#4885 ) * change CI ep test from offline to online * add ep all2all ci's changes, from offline to online * change env var in ep-all2all ci test * add expected response for ep8tp8 all2all * Adapt to CI refactoring and support dual-concurrent code execution * Adapt to CI refactoring and support dual-concurrent, second * Explicitly specify the #port * change the startup method of all2all * Modify the command of all2all * Update assertion to check multiple keywords * Update assertion to check multiple keywords * Update run_w4a8.py * Update run_w4a8.py --------- Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>	2025-11-13 16:15:45 +08:00
ltd0924	303c986cc7	[FDConfig] add block number verfied (#4983 ) * Update config.py * fix * update unit test --------- Co-authored-by: ltd0924 <luotingdan@baidu.com>	2025-11-13 09:48:44 +08:00
YuBaoku	1c0b0b08b7	[CI] set DG_NVCC_OVERRIDE_CPP_STANDARD in test_quantized_linear (#4995 )	2025-11-12 23:03:21 +08:00
bukejiyu	f0189292df	[CI] fix test_model_cache (#4982 ) * ci * update	2025-11-12 20:26:49 +08:00
qwes5s5	a2d06118e1	[Logprobs]Support prompt_logprobs and max_logprobs (#4897 ) * add prompt logprobs * trigger ci * fix unitest * Update fastdeploy/config.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/entrypoints/llm.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update fastdeploy/engine/sampling_params.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update tests/engine/test_sampling_params.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update tests/engine/test_sampling_params.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix max_logprobs --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-12 19:29:48 +08:00
ltd0924	5bf48de999	[KVCache] support unified cache backend (#4903 ) * [Feature] support unified cache backend * fix * fix * fix * fix * Update metax_model_runner.py * fix * update * Update test_moba_attention_backend.py --------- Co-authored-by: ltd0924 <luotingdan@baidu.com>	2025-11-12 14:54:52 +08:00
yzwu	76e60e98f8	[Iluvatar][CI] fix safetensors_rust.SafetensorError: framework paddle is invalid (#4972 )	2025-11-12 14:13:40 +08:00
Sunny-bot1	35bd2afab3	[Benchmark] Add GEMM & MoE kernel bench (#4809 )	2025-11-12 11:56:40 +08:00
YuBaoku	8a96944a0a	[CI] Update PORT range to avoid conflict with system ports (#4953 )	2025-11-12 11:17:49 +08:00
Echo-Nie	ff653503ff	[Docs] Add License in Unittest (#4957 ) * add copyright * add CopyRight	2025-11-12 10:44:09 +08:00
Echo-Nie	2aabaecbc2	[CI] Add five unittest (#4958 ) * add unittest * Update test_logger.py	2025-11-12 10:43:33 +08:00

437 Commits