Commit Graph

  • 00d0ef5134 check (#5237) chen 2025-11-26 17:07:26 +08:00
  • 214942e1ae fix kernel output extract (#5208) freeliuzc 2025-11-26 16:48:42 +08:00
  • df427ba06d [Docs] add request params (#5207) LiqinruiG 2025-11-26 15:04:22 +08:00
  • cead6b26fa [Metrics] Update time_to_first_token to include tokenization & queue time, and remove redundant metrics (#4993) Yonghua Li 2025-11-26 14:42:17 +08:00
  • 287751f19d [Docs] add docs of base64 or local file mm inputs (#5193) ApplEOFDiscord 2025-11-26 14:41:43 +08:00
  • fb0a2edc25 Enhance PR template with Cherry Pick instructions Jiang-Jia-Jun 2025-11-26 13:32:12 +08:00
  • 49be443d02 [Cherry-Pick][CI] Add check trigger and logic(#5191) (#5227) YuBaoku 2025-11-26 13:20:38 +08:00
  • f25ee3a26f [Feature] enable guided decoding ENABLE_V1_KVCACHE_SCHEDULER = 1 (#5140) Daci 2025-11-26 10:22:35 +08:00
  • 2d787590c4 [Feature] The 45VL supports prompt_token_ids + messages input. (#5148) kxz2002 2025-11-25 23:11:44 +08:00
  • 66e096d509 [FDConfig] disable use_sequence_parallel_moe default (#5222) Yuanle Liu 2025-11-25 21:49:10 +08:00
  • e6b4b1fa6c [CI] Add Cherry-Pick PR check logic (#5191) YuBaoku 2025-11-25 20:47:27 +08:00
  • e0c7ebff29 [BugFix][Cherry Pick] fix ds type bug (#5220) kevin 2025-11-25 20:37:09 +08:00
  • eae34a416c [benchmark]add qwen3-235b pd+ep yaml (#5225) xiegegege 2025-11-25 19:53:30 +08:00
  • df2be1cf16 [BugFix] fix mm_positions type error (#5182) kevin 2025-11-25 19:28:18 +08:00
  • 09379183e2 [BugFix] fix work metrics not returned by metrics api (#4912) Yonghua Li 2025-11-25 19:12:29 +08:00
  • b9bdf82ce3 [XPU] [CI] Xpu ci lock PaddlePaddle Version (#5218) Jiaxin Sui 2025-11-25 16:11:22 +08:00
  • a11d17cee9 [Speculative Decoding][Cherry Pick]Update extract_mtp_weight script and optimize config (#5213) freeliuzc 2025-11-25 14:42:55 +08:00
  • e581b7d7d9 fix kernel output extract (#5212) freeliuzc 2025-11-25 14:25:20 +08:00
  • c499bd9e90 Remove lock in get_task/put_task feature/optimize_worker_process_comm_1125 root 2025-11-25 06:24:39 +00:00
  • 5c8c2d47eb [Speculative Decoding][MTP]Update extract_mtp_weight script and optimize config (#5183) freeliuzc 2025-11-25 14:09:03 +08:00
  • 9e29f3b4ea Update worker_process.py Jiang-Jia-Jun 2025-11-25 12:00:03 +08:00
  • 210c87915e Update common_engine.py Jiang-Jia-Jun 2025-11-25 11:59:18 +08:00
  • 7cada8627f [Optimize] Optimize worker process comm timecost root 2025-11-25 03:57:05 +00:00
  • edf0d09257 [CI] 【Hackathon 9th Sprint No.24】NO.24 Add unit tests for functional modules (#5055) xunyoyo 2025-11-25 11:34:57 +08:00
  • daf8b386eb [CI] 【Hackathon 9th Sprint No.17】NO.17 Add unit tests for functional modules (#5054) xunyoyo 2025-11-25 11:32:27 +08:00
  • a418d7b60b [CI] Add Unittest (#5187) Echo-Nie 2025-11-25 11:00:34 +08:00
  • 717da50b40 Update pull_request_template.md Jiang-Jia-Jun 2025-11-25 10:32:19 +08:00
  • 86d6ee90be Update pull request template warning message Jiang-Jia-Jun 2025-11-25 10:11:33 +08:00
  • 6b111ef900 Update pull_request_template.md Jiang-Jia-Jun 2025-11-25 10:10:17 +08:00
  • ea3bc5b4ca [XPU] Fix the error in MoeExpertFFN operator when valid_token_num=0 (#5196) zccjjj 2025-11-25 10:07:20 +08:00
  • 09b47c7111 [Bug fix] Send first token in D instance (#5199) chenjian 2025-11-24 23:42:20 +08:00
  • 95b39317a9 [CI] Update redis download source (#5198) YuBaoku 2025-11-24 21:14:59 +08:00
  • f69e0839f7 dummy import fd (#5192) Yuanle Liu 2025-11-24 20:23:07 +08:00
  • 8e4e3ff510 [Feature] support eplb in api_server (#4782) kevin 2025-11-24 20:22:29 +08:00
  • d5bd64336a [Metax] support ENABLE_V1_KVCACHE_SCHEDULER (#5163) xiaozude 2025-11-24 19:19:49 +08:00
  • e150a418d4 support moe offline quant (#5142) xiaoxiaohehe001 2025-11-24 18:59:18 +08:00
  • 5ff93d4998 [XPU][CI] change VL model to 28B-VL-thinking (#5169) Jiaxin Sui 2025-11-24 16:50:18 +08:00
  • af03da5127 [BugFix] fix release block ids (#5184) Juncai 2025-11-24 16:48:09 +08:00
  • 7bac016c77 [CI] 【Hackathon 9th Sprint No.18】NO.18 Add unit tests for functional modules (#5064) xunyoyo 2025-11-24 15:52:34 +08:00
  • 95f3c8c641 [Fix] Fix eplb bug and support fp8 load weight (#5178) xiaoxiaohehe001 2025-11-24 15:31:37 +08:00
  • cc588b70ab [CP][BugFix]Dev fix custom ar unstable result (#5186) chen 2025-11-24 15:28:01 +08:00
  • f5c1066245 [XPU]Update documentation (#5180) qw86972190 2025-11-24 14:00:51 +08:00
  • 98f1ab46a9 [CI] add output for last_token in test_streaming_with_stop_str (#5170) YuBaoku 2025-11-24 10:49:17 +08:00
  • b9e76f1a7a [Coverage] Ignore new custom ops stub file in codecov (#5177) Nyakku Shigure 2025-11-23 22:33:28 +08:00
  • e297406263 [Others] unitest tests/layers/test_attention_layer.py (#5174) 周周周 2025-11-23 22:21:01 +08:00
  • 5daa8d0686 [CI] fix coverage_report in daily test (#5175) YuBaoku 2025-11-23 21:48:11 +08:00
  • c06cfe2447 【Hackathon 9th No.109】[CppExtension] Add fastdeploy_ops directory to package_data to support modern packaging (#5156) megemini 2025-11-22 01:32:06 +08:00
  • cceaba1c8d [Feature] remove to_numpy (#5162) kevin 2025-11-21 21:54:26 +08:00
  • c068a4f642 [Feature] dyc8 support prefixcache (#5125) kevin 2025-11-21 19:46:26 +08:00
  • ab3a2e45ff fix mtp reschedule (#5165) GoldPancake 2025-11-21 19:31:35 +08:00
  • fde827f95d fix mtp reschedule (#5164) GoldPancake 2025-11-21 19:08:33 +08:00
  • 3ea1b44a58 [Optimization] Improve perf for fd response token with internal adapter (#4992) chenjian 2025-11-21 19:02:03 +08:00
  • 5bcf79d780 [BugFix] fix num of rdma_comm_ports check (#5168) Yuanle Liu 2025-11-21 18:31:14 +08:00
  • d2298dcb0c [Polish] Simplify __repr__ method in Request class (#5153) Jiang-Jia-Jun 2025-11-21 17:21:06 +08:00
  • 6471dade4a [Fix] Fix noaux ep test (#5161) xiaoxiaohehe001 2025-11-21 16:36:41 +08:00
  • f9b0545a7f [PD Disaggregation] [Refine] Refine splitwise deployment (#5151) Juncai 2025-11-21 15:30:24 +08:00
  • 2d1dade5e2 [Speculative Decoding][MTP] Support static CacheKV C8 quantization and optimize memory usage (#5155) freeliuzc 2025-11-21 15:10:13 +08:00
  • 3c36283d7d [ENV] support AK SK ENCPOINT while get the multi_modal's feature (#5159) lizhenyun01 2025-11-21 15:07:57 +08:00
  • 2093d35716 [CI] Temporarily lock paddlepaddle-gpu as of 20251118 (#5157) YuBaoku 2025-11-21 15:07:36 +08:00
  • 34f59d9800 [RL]Fix missing is_distributed attribute (#5150) bukejiyu 2025-11-21 14:14:25 +08:00
  • 6ca2651995 [Feature] Support noaux for eplb (#5143) xiaoxiaohehe001 2025-11-21 14:10:32 +08:00
  • e70e2279ce [PD Disaggregation][XPU] Add XPU support for PD disaggregation (#5113) ddchenhao66 2025-11-21 14:09:01 +08:00
  • 79f18331b6 [CI]【Hackathon 9th Sprint No.51】NO.51 Add unit tests for functional module fastdeploy/scheduler/dp_scheduler.py (#5046) essos 2025-11-21 10:52:33 +08:00
  • 0b0f2e320e [CI] Unified diff coverage upload logic (#5127) YuBaoku 2025-11-21 10:50:57 +08:00
  • 7454480e07 [Feature] support bos download retry (#5137) kevin 2025-11-21 10:18:32 +08:00
  • 43097a512a [BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132) Yonghua Li 2025-11-20 21:39:22 +08:00
  • a7740e56c4 Simplify __repr__ method in Request class (#5154) Jiang-Jia-Jun 2025-11-20 21:31:02 +08:00
  • 67da16bd7c fix mtp reschedule (#5144) GoldPancake 2025-11-20 21:28:21 +08:00
  • 01c30f6b87 Fix schedule error in splitwise deployment (#5149) Juncai 2025-11-20 21:18:10 +08:00
  • 147b2e5eb0 [BugFix] Fix zero workspace returned by CUB size query under CUDA Graph in MoE dispatch (#5087) Jundong Liu 2025-11-20 20:00:29 +08:00
  • 0857099191 mv import (#5146) Ryan 2025-11-20 19:25:56 +08:00
  • c3994750b1 [CI][XPU] Add XPU chunked_prefill && prefix_caching case (#5139) Jiaxin Sui 2025-11-20 18:51:50 +08:00
  • 385fe6dade [Others] clean code (#5133) 周周周 2025-11-20 18:44:08 +08:00
  • 7ac25935c7 [Optimization] default compile rdma, reduce cudagraph buffer size in mm, fix some config bug (#5121) Yuanle Liu 2025-11-20 17:19:47 +08:00
  • 6fa34102e8 [Others]get_block_shape_and_split_kv_block clean code (#5123) 周周周 2025-11-20 16:40:04 +08:00
  • af715db763 [Scheduler] Support chunk prefill for video input (#5107) yangjianfengo1 2025-11-20 16:29:13 +08:00
  • 0edda75a56 [Metax] optimize cutlass moe and flash attention backend (#5128) Neil Zhu 2025-11-20 16:12:35 +08:00
  • f1e36ff2f7 [Speculative Decoding][MTP]Support stop_seqs and pd-split mode (#5029) freeliuzc 2025-11-20 15:26:01 +08:00
  • 3e3558f492 [HPU][CI]Hpu ci update (#5116) plusNew001 2025-11-20 14:12:52 +08:00
  • e021048318 [CI] Temporarily lock paddlepaddle-gpu as of 20251118 (#5136) YuBaoku 2025-11-20 11:15:08 +08:00
  • 109d48e456 [Feature] support async download features (#5003) kevin 2025-11-19 22:23:36 +08:00
  • bde97e09f7 support dynamic activation quant for w4afp8 (#5117) Sunny-bot1 2025-11-19 21:11:16 +08:00
  • 73d46e1b1b Fix decoding speed slowly bug in 20251106 (#5102) feature/online/unify_code_20251106 chenjian 2025-11-19 20:27:20 +08:00
  • 966297e5d6 [Feature] mm prefix cache (#4554) kevin 2025-11-19 19:32:14 +08:00
  • 2716da4220 [CI] Add workflow to auto-remove skip-ci labels after new commits (#5129) YuBaoku 2025-11-19 19:22:06 +08:00
  • b319fa72e4 Revert "[Cherry-Pick][CI] Temporarily lock paddlepaddle-gpu as of 20251112(#5…" (#5099) YuBaoku 2025-11-19 19:08:02 +08:00
  • 9a640b3d6b [BugFix] unify max_tokens (#4968) (#5119) kxz2002 2025-11-19 19:05:03 +08:00
  • b2b7881cca [fix] add more logger info: max_tokens (#5122) LiqinruiG 2025-11-19 18:59:43 +08:00
  • 2e4bab35fb [fix] add more logger info: max_tokens (#5126) LiqinruiG 2025-11-19 18:44:27 +08:00
  • a5cd7c9039 [BugFix] rollback max_tokens and min_tokens when continue to infer (#5082) LiqinruiG 2025-11-19 18:43:42 +08:00
  • 43f0c7557e [Feature] Add an unquantized option for MoE and Dense quant type (#4813) Sunny-bot1 2025-11-19 16:24:03 +08:00
  • 9ff418db73 check METAX_GPU (#5114) chen 2025-11-19 16:02:21 +08:00
  • de43577a7c [Docs] add ebvlthinking yaml (#5120) tianlef 2025-11-19 15:27:46 +08:00
  • c5510e9b43 [Metrics] move prompt_tokens_total to main process metrics (#5118) feature/online/unify_code_20250922 Yonghua Li 2025-11-19 14:12:15 +08:00
  • 3c8c0f0d6c 【Hackathon 9th No.109】[CppExtension] [XPU] Support build Custom OP in setuptools 80+ -part (#5106) megemini 2025-11-19 13:33:39 +08:00
  • be9541a97b [CI] add metrics case (#5115) Zhang Yulong 2025-11-19 11:50:12 +08:00
  • 24e9e2d9c8 [CI]Exclude abstract methods and irrelevant backend files (#5031) YuBaoku 2025-11-19 11:48:28 +08:00
  • a82f25ea7b [RL]Resolve shape mismatch problems in RL-related modules (#5032) bukejiyu 2025-11-19 11:12:48 +08:00
  • 4694ed2a43 [CI]【Hackathon 9th Sprint No.31】NO.31 Add unit tests for functional module fastdeploy/input/ernie4_5_processor.py (#5097) Winters Montagne 2025-11-19 10:51:02 +08:00
  • 857d152464 [XPU][Docs]Update document (#5091) qw86972190 2025-11-19 10:20:14 +08:00