Commit Graph

  • e4e3cede7f [Feature] Support Paddle-OCR (#4396) ming1753 2025-10-24 23:34:30 +08:00
  • 822dea8d5f [XPU]Moe uses a new operator (#4585) yyssys 2025-10-24 23:01:46 +08:00
  • f42ed6d5f2 [Graph Optimization] Add dy_runnable and introduce cudagraph_switch_threshold for cudagraph mode switching (#4578) Ryan 2025-10-24 18:36:52 +08:00
  • e02a812880 [CLI]Update parameters in bench latency cli tool and fix collect-env cli tool (#4558) qwes5s5 2025-10-24 16:46:45 +08:00
  • 83d45af1f3 fix import image_ops error on some platforms (#4559) JYChen 2025-10-24 16:09:20 +08:00
  • 5fbc653238 fix v1 hang bug (#4573) yinwei 2025-10-24 15:35:10 +08:00
  • b60ce4922b [EP] fix adapter bugs (#4572) ltd0924 2025-10-24 12:30:08 +08:00
  • 8edc5cca91 [BugFix] fix create_cache_tensor for ep (#4542) 李泳桦 2025-10-24 11:31:13 +08:00
  • f7069b8057 [Metax] adapt DeepSeek (#4498) xiaozude 2025-10-24 10:14:53 +08:00
  • 08853711b4 support static C8 (#4565) Sunny-bot1 2025-10-23 22:09:07 +08:00
  • 8718fa34b2 support static C8 (#4568) Sunny-bot1 2025-10-23 22:01:03 +08:00
  • e36343d807 [FDConfig]Turn on the CUDAGraph + PD Disaggregation switch (#4530) RAM 2025-10-23 21:05:14 +08:00
  • 9dc5c3e370 [Graph Optimization] Support CUDAGraph Padding + MTP (#4545) RAM 2025-10-23 20:57:26 +08:00
  • 5a8c60454e [BugFix] Fix decode_type which has been deleted in req and optimize token client retry scheme (#4564) RichardWooSJTU 2025-10-23 20:08:10 +08:00
  • 3a43dbf82d [XPU] merge apply_tp, ops support token_num = 0 (#4507) zhupengyang 2025-10-23 19:09:58 +08:00
  • 4ffe41a747 WINT4/WINT8 dense gemm default use Machete (#4451) Sunny-bot1 2025-10-23 17:57:59 +08:00
  • a240425db9 [CI] Optimize coverage upload reporting (#4547) YuBaoku 2025-10-23 17:01:48 +08:00
  • 5443b2cffb [XPU] xpu support think length limit (#4539) ddchenhao66 2025-10-23 15:58:11 +08:00
  • fdce0d2299 add test case kevin 2025-10-23 10:22:20 +08:00
  • dd7fe27152 add hasher and ImagePosition kevin 2025-10-21 18:59:53 +08:00
  • 2676a918f0 [Doc]fix deepseek ce (#4560) tianlef 2025-10-23 14:09:11 +08:00
  • bbf06b9ff7 [BugFix]Fix finish reason (#4543) luukunn 2025-10-23 14:04:43 +08:00
  • ac4f5ca272 delete useless code (#4544) YuanRisheng 2025-10-23 13:40:34 +08:00
  • 07182103d3 Support CUDAGraph Padding + MTP (#4546) RAM 2025-10-23 11:09:50 +08:00
  • 8a02ab43a8 [FDConfig]Turn on the CUDAGraph + RL switch (#4508) RAM 2025-10-23 11:08:07 +08:00
  • 918e4e9850 [XPU] Change XPU stable third-party version (#4524) plusNew001 2025-10-22 19:43:03 +08:00
  • 3a6883ac1a c++ code format (#4527) zhupengyang 2025-10-22 17:59:50 +08:00
  • d7bcedf421 small change in test_fusedmoe.py (#4538) 周周周 2025-10-22 17:49:18 +08:00
  • 8e02a509c3 [CI] stable test_rollout_model.py (#4536) Yuanle Liu 2025-10-22 16:59:44 +08:00
  • dce988824d [Feature] Support AsyncLLM (#4458) zhouchong 2025-10-22 15:50:12 +08:00
  • 1531004085 fix image token output (#4487) guozhuangzhuang 2025-10-22 14:59:05 +08:00
  • b6cd3aec70 [Feature] support fd return decode response (#4407) guozhuangzhuang 2025-10-22 14:22:08 +08:00
  • cd9195d54c [XPU]Modify the xpu memory display unit of log (#4534) yyssys 2025-10-22 12:46:01 +08:00
  • f69c9cd122 [CI] Remove redundant .coveragerc file (#4521) YuBaoku 2025-10-21 23:24:05 +08:00
  • 3b58310c26 enhance set_stop_value_multi_ends and standardize the registration of some operators (#4525) Yuanle Liu 2025-10-21 22:06:06 +08:00
  • dc7facaa7f [Iluvatar GPU] fix ci error caused by rebuild_padding param and cuda graph (#4504) yzwu 2025-10-21 21:41:41 +08:00
  • d70aacfbdc [FDConfig] Turn on the CUDAGraph + MultiModel switch (#4512) RAM 2025-10-21 21:21:26 +08:00
  • 809c1ac7ec feat: add post-processing step for pool_output (#4462) SunLei 2025-10-21 20:24:26 +08:00
  • 31ac6c62f8 [CI] update ernie-4_5-vl baseline (#4495) (#4505) YuBaoku 2025-10-21 20:06:39 +08:00
  • 2bd3fb6315 [XPU]add xpu ci ep case (#4432) plusNew001 2025-10-21 19:19:40 +08:00
  • 175391389f Add comprehensive unit tests for limit_thinking_content_length operators (#4510) Copilot 2025-10-21 18:55:57 +08:00
  • 7cbe6b2472 [FDConfig] Turn on the CUDAGraph + Speculative Decoding switch (#4511) RAM 2025-10-21 18:34:16 +08:00
  • 153f15db39 [Doc]add deepseek wint4 ce (#4517) tianlef 2025-10-21 16:41:51 +08:00
  • fb76cdfb4f [Feature] Support mm model close prefix cache (#4459) ltd0924 2025-10-21 15:37:59 +08:00
  • 2b53c4d684 【CI】Add test cases for n parameter and streaming validation (#4503) Divano 2025-10-21 15:33:29 +08:00
  • ee915220bd [Speculative Decoding] Add draft_logprobs Support for Speculative Decode MTP (#4467) SunLei 2025-10-21 14:57:50 +08:00
  • 775edcc09a [Executor] Default use CUDAGraph (#3594) RAM 2025-10-21 14:25:45 +08:00
  • 99564349a7 [XPU] bind block_attn kernel with pybind (#4499) Lucas 2025-10-21 10:58:52 +08:00
  • d85ef5352a 【BugFix】fix ep buffer clear (#4450) gaoziyuan 2025-10-21 10:56:00 +08:00
  • ec499a0104 [Cherry-pick] fix requests & block metrics (#4500) 李泳桦 2025-10-21 10:43:33 +08:00
  • 70a29ec49e [CI] update ernie-4_5-vl baseline (#4495) YuBaoku 2025-10-21 10:18:29 +08:00
  • 3cd9d3060a [Feature] Support mm model close prefix cache (#4502) ltd0924 2025-10-21 09:56:47 +08:00
  • a498736af5 [APIServer] support define gunicorn timeout (#4496) ltd0924 2025-10-20 23:36:07 +08:00
  • cef3164c3b Optimizing the performance of think length limit using custom operators (#4279) Yuanle Liu 2025-10-20 21:09:13 +08:00
  • 36af88ff3f [BugFix][CI] Clean up SOT code cache using tearDown in CINN unitest (#4491) Ryan 2025-10-20 20:45:00 +08:00
  • bf03b6fcea fix vl bug (#4485) yinwei 2025-10-20 20:13:34 +08:00
  • 97ee3c403a [XPU]Fix w4a8 garbled code issue (#4493) yyssys 2025-10-20 19:41:11 +08:00
  • 10e85daf15 update benchmark scripts (#4497) Zhang Yulong 2025-10-20 17:03:10 +08:00
  • b8d235445e [fix] remove cache tensor creation for cache_transfer_manager (#4420) 李泳桦 2025-10-20 16:19:56 +08:00
  • f6f9c12b87 [fix] fix ipc signal suffix for ep (#4324) 李泳桦 2025-10-20 16:19:29 +08:00
  • 561a7ebc0b [Cherry-Pick] support model_id as metric labels by redefining metric update interface (#4480) 李泳桦 2025-10-20 16:11:19 +08:00
  • de2eaf4f81 add qwen-2.5-7B-PRM/ernie-rm (#4319) bukejiyu 2025-10-20 15:31:03 +08:00
  • 8d2aaf3ba4 [cp][Loader] 2.2 check paddle version for v1 loader (#4478) chen 2025-10-20 15:27:59 +08:00
  • c13e6ae481 [CI] Lock paddlepaddle-xpu==3.2.0 in release/2.2 (#4490) YuBaoku 2025-10-20 15:19:14 +08:00
  • 47595a2480 [Feature] support mtp logprob (#4464) GoldPancake 2025-10-20 15:18:12 +08:00
  • 9558912475 [CI] update paddlepaddle-xpu==3.2.0 in 0908 (#4489) YuBaoku 2025-10-20 15:13:27 +08:00
  • 1b9f351d21 Support GPT-OSS-BF16 (#4240) Haonan Luo 2025-10-20 14:44:58 +08:00
  • 80a16c4c87 [fix] adjust mctlass moe api (#4474) SuperNova 2025-10-20 14:23:54 +08:00
  • 1e59905e34 Optimization of ‘tools’ in request fields (#4380) zhuzixuan 2025-10-20 11:04:08 +08:00
  • 528c55776e [Graph Optimization][Speculative Decoding] Fix the bug of CUDAGraph + MTP + EP (#4456) RAM 2025-10-20 10:38:55 +08:00
  • 9c7187998c [Feature] support mtp logprob (#4457) GoldPancake 2025-10-20 10:18:00 +08:00
  • c4fc0073cf [CI] Handle unit test issues (#4483) YuBaoku 2025-10-20 10:13:21 +08:00
  • 817210e47f [ATTN]delete code and add ffn and moe layer level test (#4440) 周周周 2025-10-19 16:23:11 +08:00
  • beaec373c0 [Fix] add einops for requirements (#4477) yangjianfengo1 2025-10-17 22:50:26 +08:00
  • b5b993e48e 【feature】support n parameter (#4273) kxz2002 2025-10-17 20:51:59 +08:00
  • 8ccfd975b5 LLM.chat add "tools" param (#4415) kxz2002 2025-10-17 20:25:03 +08:00
  • 329d074326 [Docx] fix the broken link (#4479) yangjianfengo1 2025-10-17 18:28:50 +08:00
  • a64c0408b9 [XPU]Fix w4a8 precision bug && rollback moe algo (#4463) yinwei 2025-10-17 18:27:53 +08:00
  • 63ef593450 check paddle version for v1 loader (#4473) chen 2025-10-17 17:25:03 +08:00
  • 4b661512ca [Iluvatar GPU] Adapt VL model (#4313) yzwu 2025-10-17 16:13:38 +08:00
  • f660188a85 [cp][BugFix]2.2_fix_custom_ar_unstable_result (#4436) chen 2025-10-17 16:04:54 +08:00
  • ba5c2b7e37 [Docx] add language (en/cn) switch links (#4470) yangjianfengo1 2025-10-17 15:47:41 +08:00
  • 631a1e2339 fix mtp quant param (#4469) feature/online/45T_20250730 GoldPancake 2025-10-17 14:53:01 +08:00
  • a3e0a15495 fix seqlen sync (#4442) Ayakouji 2025-10-17 14:37:52 +08:00
  • 920df5be5a [Graph Optimization][Speculative Decoding] Fix the bug of CUDAGraph + MTP + EP (#4430) RAM 2025-10-17 14:22:05 +08:00
  • 720697e265 add environment variables (#4466) xiaolei373 2025-10-17 14:20:01 +08:00
  • 01510876ab [CI] Fix partial instability issues (#4461) YuBaoku 2025-10-17 14:17:06 +08:00
  • 14785eb65d [XPU] abstract a hardware-agnostic operator wrapper for prefix cache and specify xpu device id definition (#4455) ddchenhao66 2025-10-17 14:05:33 +08:00
  • c234b995ab [Feature] support pooling model dummy_run (#4345) lizexu123 2025-10-17 13:30:55 +08:00
  • 15b6b8dc25 [CINN] Remove the restriction of automatically falling back to SOT after enabling CINN (#4411) Ryan 2025-10-17 12:51:07 +08:00
  • b134e6afe6 [BugFix]Dev fix custom ar unstable result (#4437) chen 2025-10-17 11:47:16 +08:00
  • 6160145f82 [SOT] Change warnings to errors and remove fallback operations (#4378) Ryan 2025-10-17 11:27:04 +08:00
  • 0413c32b8f [Optimize] Set preempted schedule log as info level (#4453) chenjian 2025-10-17 11:25:46 +08:00
  • 5885953211 [Others] add PR Template (#4452) Zero Rains 2025-10-17 11:09:51 +08:00
  • 930f7b781c [Optimization] Put get_block_shape_and_split_kv_block in cuda graph for append attention backend (#4443) Sunny-bot1 2025-10-17 10:59:56 +08:00
  • 49cea8fb1c [SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694) Ryan 2025-10-17 10:57:55 +08:00
  • a37c9416ac [FDConfig]Remove reasoning_parser/guided_decoding_backend/disable_any_whitespace/device_ids in FDConfig (#4362) YuanRisheng 2025-10-17 10:40:59 +08:00
  • d1637db86a modify_comment (#4460) xiaolei373 2025-10-17 10:10:09 +08:00
  • db82e9a022 [BugFix]Fix wfp8afp8 triton moe group_topk renormalized=True (#4449) chen 2025-10-16 23:17:48 +08:00
  • dbca63f862 [bugfix] kill cache_transfer_manager process (#4401) xiaolei373 2025-10-16 20:45:24 +08:00