Commit Graph

  • 359dec7431 process transparent image (#4832) ApplEOFDiscord 2025-11-06 13:42:36 +08:00
  • bbae094cb9 [Optimization] Reduce memory allocate for cudaGraph (#4838) freeliuzc 2025-11-06 13:32:47 +08:00
  • 782818c031 fix: ci port conflict (#4840) SunLei 2025-11-06 11:56:17 +08:00
  • 5bdd40da5d [BugFix] Fix ernie_vl_reasoning_parsers.py 'end_token' to 'think_end_token' (#4805) kxz2002 2025-11-06 11:28:55 +08:00
  • 69fa741763 remove seq_lens_this_time (#4821) 周周周 2025-11-06 11:06:28 +08:00
  • 0df488c7bb support wint8 & wint4 (#4837) yinwei 2025-11-06 10:54:34 +08:00
  • a218291831 [Cherry-Pick] Fix reasoning parser register name (#4795) (#4816) kxz2002 2025-11-06 10:51:40 +08:00
  • 62dfad4a5f [PD Disaggregation] Support Qwen3-MoE use PD + EP inference. (#4691) K11OntheBoat 2025-11-06 10:32:15 +08:00
  • e8c3e20ee6 [CI] fix docker_build error and add tag-base (#4810) YuBaoku 2025-11-05 21:57:54 +08:00
  • e0d98d00bc [Cherry-Pick] Modify follow-up push parameters and Modify the verification method for thinking length (#4086) (#4826) kxz2002 2025-11-05 21:36:28 +08:00
  • 93fcf7e4ec 【New Feature】W4afp8 supports per group quantization (#4272) yangjianfengo1 2025-11-05 21:00:23 +08:00
  • ee37882a26 [NewFeature] support eplb noaux (#4725) xiaoxiaohehe001 2025-11-05 20:59:12 +08:00
  • 7e7a91855b [BugFix] fix messages being inplace modified in offline chat api (#4830) 李泳桦 2025-11-05 20:46:55 +08:00
  • fcd2f05dff [BugFix] fix messages being inplace modified in offline chat api (#4831) 李泳桦 2025-11-05 20:46:33 +08:00
  • 6f95df1777 Fix formatting of news section in README_EN.md Jiang-Jia-Jun 2025-11-05 19:47:34 +08:00
  • 5db1a26340 Update README_CN.md Jiang-Jia-Jun 2025-11-05 19:46:51 +08:00
  • aec1a84886 [Doc] Update docs for v2.3.0rc0 (#4828) Jiang-Jia-Jun 2025-11-05 19:45:53 +08:00
  • 4c2ad15258 add paddleocr_vl benchmark (#4833) zhang-prog 2025-11-05 19:37:45 +08:00
  • 131d76dd64 [Bug Fix] process transparent image (#4807) ApplEOFDiscord 2025-11-05 17:15:24 +08:00
  • ea1dd0e735 [XPU]Support V1 loader in weight_only Model (#4808) yinwei 2025-11-05 17:09:11 +08:00
  • cc8f5312f5 [Feature] Add timestamp for profiler (#4726) chenjian 2025-11-05 12:04:59 +08:00
  • 876e4a8935 remove input_ids from ForwardMeta (#4793) 周周周 2025-11-05 11:55:51 +08:00
  • 9676cc87d6 fix parser register name (#4795) kxz2002 2025-11-05 11:27:30 +08:00
  • 2fd254e5b7 support ep+tp at op layer (#4688) zhupengyang 2025-11-05 11:15:57 +08:00
  • 1689f7ef86 [Cherry-Pick] Fix ernie4_5_vl_processor.py and qwen_vl_processor.py can not disable thinking (#4762) (#4798) kxz2002 2025-11-05 10:59:47 +08:00
  • 937eb3c6ed [get_padding_offset.] clean get_padding_offset.cu (#4777) 周周周 2025-11-05 10:47:40 +08:00
  • 1c3ca48128 [Feature][Executor] GPU Model Runner Supports prompt_logprobs and max_logprobs (#4769) chen 2025-11-05 10:43:25 +08:00
  • 7cee8030af [CI] Disable unstable test jobs and cases (#4799) YuBaoku 2025-11-05 10:28:53 +08:00
  • 74722308f2 [Metax] adapt cutlass moe and fix mla attention (#4602) xiaozude 2025-11-05 10:03:49 +08:00
  • 2c281e617c Update Unit Test for PaddleOCR-VL (#4802) Haonan Luo 2025-11-04 22:40:15 +08:00
  • 61856e55ce [fix] fix v0 pd, let worker step_shm_value create=False (#4781) 李泳桦 2025-11-04 20:38:01 +08:00
  • 1b61d62ecf [fix] fix v0 pd, let worker step_shm_value create=False (#4780) 李泳桦 2025-11-04 20:37:57 +08:00
  • 73252641dc updata mkdocs.yml (#4804) yangjianfengo1 2025-11-04 19:30:26 +08:00
  • 722110a952 [CI] Refactor CE wheel upload for multiple target paths (#4790) YuBaoku 2025-11-04 18:56:38 +08:00
  • 1e88754133 support set dy-C8 from args (#4475) Yuan Xiaolan 2025-11-04 17:01:35 +08:00
  • 9547fa204e [Docs] Add new support models (#4801) ming1753 2025-11-04 16:49:51 +08:00
  • 08abb0dd1c [Docs] Add new support models (#4800) ming1753 2025-11-04 16:49:40 +08:00
  • 3e9dda39ab supports pd partn (#4615) lzy 2025-11-04 16:36:35 +08:00
  • af7e0f27f3 supports internode_ll_two_stage (#4162) lzy 2025-11-04 16:35:40 +08:00
  • 8a40374bfe [BugFix] Fix ernie4_5_vl_processor.py and qwen_vl_processor.py can not disable thinking (#4762) kxz2002 2025-11-04 16:00:32 +08:00
  • 6502d19e97 [XPU] add deploy doc for PaddleOCR-VL in XPU (#4792) Lucas 2025-11-04 15:07:04 +08:00
  • 007ee71208 [XPU] add deploy doc for PaddleOCR-VL in XPU (#4784) Lucas 2025-11-04 15:06:19 +08:00
  • bffa08b74b [XPU] fix thinking bug where output only contains reasoning_content (#4761) ddchenhao66 2025-11-04 14:32:35 +08:00
  • 4a4948764d Update mkdocs.yml Jiang-Jia-Jun 2025-11-04 14:31:32 +08:00
  • 41bfa1090d [CI]delete test_common_model (#4794) bukejiyu 2025-11-04 13:57:55 +08:00
  • 855a2a609a fix attn_params (#4787) freeliuzc 2025-11-04 13:01:38 +08:00
  • 78a1451eb7 [XPU] fix thinking bug where output only contains reasoning_content (#4760) ddchenhao66 2025-11-04 12:47:34 +08:00
  • 741a012d15 [Graph Optimization] cherry-pick other spec padding kernel (#4776) RAM 2025-11-04 11:03:51 +08:00
  • d65e00f8fb [Feature] support get_task with tensor (#4751) lizhenyun01 2025-11-04 11:00:13 +08:00
  • ffa57dbfac Modify base_response_104 for better clarity (#4789) plusNew001 2025-11-03 22:08:32 +08:00
  • 9887025926 Update run_w4a8.py (#4783) plusNew001 2025-11-03 21:41:00 +08:00
  • 5233825562 test scheduler (#4739) kevin 2025-11-03 20:12:14 +08:00
  • cf5e545a73 test scheduler (#4757) kevin 2025-11-03 20:12:02 +08:00
  • 35a6969a44 [Docs] PaddleOCR-VL add RTX3060 server param (#4765) ming1753 2025-11-03 19:55:05 +08:00
  • e5cbe0d6a1 [Docs] PaddleOCR-VL add RTX3060 server param (#4766) ming1753 2025-11-03 19:54:59 +08:00
  • 7df7035055 【DataProcessor】add options thinking_mode (#4735) (#4759) luukunn 2025-11-03 18:14:39 +08:00
  • 8690cf8569 fix Cfp8 for RL load (#4144) Yuan Xiaolan 2025-11-03 17:51:51 +08:00
  • c95d0740ec [Metax] adapt cutlass moe for ernie-vl (#4685) Neil Zhu 2025-11-03 17:44:27 +08:00
  • a7562ddf4b http get retry (#4770) ApplEOFDiscord 2025-11-03 16:52:26 +08:00
  • 69c2f3cda1 [CI]test common model (#4697) bukejiyu 2025-11-03 16:48:36 +08:00
  • 9ec29f6cf8 [Docs]fix error (#4768) yyssys 2025-11-03 16:41:47 +08:00
  • aa7a926931 [Bug fix] Robust cache messager send cache when send cache slower than prefill (#4659) chenjian 2025-11-03 16:37:13 +08:00
  • 25498efcf3 [Optimize] Support and robust for tpN for PD (#4595) chenjian 2025-11-03 15:38:31 +08:00
  • 7b35488779 【DataProcessor】add options thinking_mode (#4735) luukunn 2025-11-03 14:30:07 +08:00
  • 377f3bf5f2 [XPU] add v1 support for bf16 (#4744) yinwei 2025-11-03 14:13:17 +08:00
  • f83d0cf127 [Feature] Support eplb for fd (#4599) chenjian 2025-11-03 14:08:15 +08:00
  • 6bfa4fed6e [Docs]Update XPU document version to 2.3.0 (#4754) yyssys 2025-11-03 13:36:12 +08:00
  • c657f8d16a [Docs] fix PaddleOCR-VL docs bug (#4702) ming1753 2025-11-03 12:12:14 +08:00
  • b1dd508965 [Docs]Add parameter (#4755) yyssys 2025-11-03 11:57:32 +08:00
  • 4d4c13f1b5 [CherryPick] Fix thinking bug cp (#4736) v2.3.0-rc0 Yuanle Liu 2025-11-03 11:41:03 +08:00
  • 44ce91adea [Docs]Add parameter to the start service command (#4753) yyssys 2025-11-03 11:14:07 +08:00
  • d1d3876c16 [FDConfig] [PD Disaggregation] [Graph Optimization] Close Cudagraph for P node when PD Disaggregation (#4632) (#4734) Jundong Liu 2025-11-03 10:59:34 +08:00
  • 11398790d3 [Speculative Decoding][MTP]Support attn mask offset (#4641) freeliuzc 2025-11-03 10:08:01 +08:00
  • f44f4bafd1 support mtp in splitewise and scheduler_v1 mode (#4743) freeliuzc 2025-11-03 10:07:15 +08:00
  • b8bf57138f [Docs]Update XPU document version to 2.3.0 (#4741) yyssys 2025-11-03 09:54:51 +08:00
  • 8632b778f5 [CI] update paddlepaddle_gpu==3.2.1 and fix rollout_model test logic (#4738) YuBaoku 2025-11-02 21:30:23 +08:00
  • 9eff788658 [CI] fix some ci yaml (#4747) YuBaoku 2025-11-02 21:28:04 +08:00
  • 6e01be28e0 format code (#4720) 周周周 2025-11-01 19:13:50 +08:00
  • b4aa189483 [XPU] Support V1 Loader in Bf16 (#4746) yinwei 2025-11-01 16:13:25 +08:00
  • 0861eb027f [Cherry-Pick] Fix finish reason in _create_chat_completion_choice (#4582) (#4716) kxz2002 2025-11-01 15:50:47 +08:00
  • 24b85b752b [Cherry-Pick] Unify the registration name recognition for tool_parser and reasoning_parser to “-” (#4668) (#4737) kxz2002 2025-10-31 23:27:21 +08:00
  • d11e27a188 [Bugfix] fix test_get_save_output_v1 (#4732) Longzhi Wang 2025-10-31 22:52:04 +08:00
  • 4ac6de9a3c [Feature] support pooling model runner (#4590) lizexu123 2025-10-31 22:32:05 +08:00
  • acef624049 [CI] Fix rollout_model test logic (#4730) YuBaoku 2025-10-31 22:25:24 +08:00
  • b301bd6c31 [BugFix] fix thinking bug (#4710) Yuanle Liu 2025-10-31 22:00:31 +08:00
  • 10358bf1a0 fix noaux (#4731) 周周周 2025-10-31 21:25:11 +08:00
  • 96a44c8574 Skip building native architecture when specifying arch list (#4728) ming1753 2025-10-31 20:45:49 +08:00
  • 27746026c1 Skip building native architecture when specifying arch list (#4727) ming1753 2025-10-31 20:32:46 +08:00
  • e4463c37fe [XPU][CI] Release ci fix bug (#4742) plusNew001 2025-10-31 20:30:35 +08:00
  • ce53cdccd2 [XPU] xpu support neox style ROPE (#4723) ddchenhao66 2025-10-31 18:17:21 +08:00
  • 3cbca75cc8 [XPU] xpu support neox style ROPE (#4719) ddchenhao66 2025-10-31 18:14:25 +08:00
  • 00d0da0c18 [Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4669) RAM 2025-10-31 17:34:09 +08:00
  • 88a94c821b [FDConfig] [PD Disaggregation] [Graph Optimization] Close Cudagraph for P node when PD Disaggregation (#4632) Jundong Liu 2025-10-31 16:44:25 +08:00
  • 9a647cb61c [OP] Add InferShape&InferDtype for per_token_quant_padding (#4667) (#4683) Ryan 2025-10-31 16:42:52 +08:00
  • 316f784016 fix wint2 config (#4721) AIbin 2025-10-31 15:44:14 +08:00
  • 7b013c63e2 [XPU ][CI] Release XPU ci update (#4722) plusNew001 2025-10-31 15:36:14 +08:00
  • c801d31c9c add checker (#4711) kevin 2025-10-31 15:26:35 +08:00
  • 9835697163 [Feature] Check bos url (#4677) kevin 2025-10-31 15:26:30 +08:00
  • 139342d953 fix bug (#4680) kevin 2025-10-31 15:23:33 +08:00
  • 096d87d335 fix bug (#4679) kevin 2025-10-31 14:59:18 +08:00