Commit Graph

  • f75697c2d1 [Feature] support clear data (#4185) ltd0924 2025-09-21 20:41:27 +08:00
  • 91912cc2e1 fix t2i (#4163) RichardWooSJTU 2025-09-19 18:07:13 +08:00
  • cc6e14d2ec 【Hackathon 9th No.46】add test_fused_rotary_position_encoding (#3848) Echo-Nie 2025-09-19 17:50:19 +08:00
  • 24180fba0a [FDConfig]Remove splitwise_role and engine_worker_queue_port in FDConfig (#4147) YuanRisheng 2025-09-19 17:01:52 +08:00
  • ee9d8a840a [fix]Modify follow-up push parameters and Modify the verification method for thinking length (#4086) luukunn 2025-09-19 14:26:01 +08:00
  • 66a98b44ed ep support logprob (#4089) (#4151) chen 2025-09-19 14:07:31 +08:00
  • a685e5ad35 Each module should have its own plugins_loaded (#4164) Yuanle Liu 2025-09-19 14:06:10 +08:00
  • bba279cf38 [Feature] support rdma IB transfer (#4123) ltd0924 2025-09-19 12:54:49 +08:00
  • ddf5606263 Bugfix test exception (#4171) xiaolei373 2025-09-19 11:48:49 +08:00
  • 4f460db556 [CP2.2] Machete support group scale & wint8 & v1 loader (#4166) Sunny-bot1 2025-09-19 11:13:12 +08:00
  • 1e86418c4a optimize dy_cfp8's performance (#4145) Yuan Xiaolan 2025-09-19 09:35:28 +08:00
  • c3b8ebeb18 [Optimize] Machete using group scale default (#4121) Sunny-bot1 2025-09-18 13:51:11 +08:00
  • 62b8b02e08 fix_unitest (#4159) qwes5s5 2025-09-18 11:17:15 +08:00
  • 74d7b9151d fix mtp (#4153) JYChen 2025-09-18 10:53:07 +08:00
  • 98447beb4d Add param valid log (#4113) xiaolei373 2025-09-18 10:39:24 +08:00
  • 0fa28b1068 [fix] fix ep group all-reduce (#4140) 李泳桦 2025-09-18 10:34:49 +08:00
  • 618ccdbfba [Feature] Support mixed deployment with yiyan adapter in develop (#3976) chenjian 2025-09-18 01:52:20 +08:00
  • 2745f37017 [CI] enhance clean port and add waiting time (#4152) YuBaoku 2025-09-17 20:31:49 +08:00
  • 896e3bb606 [NewFeture]add ep rollout model init and update/clear ep buffer (#4039) gaoziyuan 2025-09-17 20:24:53 +08:00
  • 0d3a57a2c6 fix unittest (#4155) YuanRisheng 2025-09-17 20:20:26 +08:00
  • b52971749c Print KV Cache available memory and block memory usage in GB format (#4148) qw86972190 2025-09-17 20:01:55 +08:00
  • cffde70949 Add assertion for ENABLE_V1_KVCACHE_SCHEDULER (#4146) Jiang-Jia-Jun 2025-09-17 16:02:56 +08:00
  • 2adca04f1f Reconstruct streaming data transfer with zmq (#3836) RichardWooSJTU 2025-09-17 14:30:39 +08:00
  • 5027ed7239 【BugFif】fix ep decode (#4138) gaoziyuan 2025-09-17 14:18:31 +08:00
  • f9766f917b [BugFix] Forbiden FD_DISABLED_RECOVER while ENABLE_V1_KVCACHE_SCHEDULER (#4142) Jiang-Jia-Jun 2025-09-17 14:11:44 +08:00
  • 7f9a9b37f3 Support limit thinking lengths (#4070) K11OntheBoat 2025-09-17 12:40:08 +08:00
  • 25aa2d94aa cp dynamic Cfp8 (#4120) Yuan Xiaolan 2025-09-17 11:55:47 +08:00
  • 2e9e53ff7e [FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116) YuanRisheng 2025-09-17 10:43:35 +08:00
  • b41988f4bc fix gid (#4038) gaoziyuan 2025-09-16 20:56:36 +08:00
  • c01a756912 mv test to tests (#4129) YUNSHEN XIE 2025-09-16 20:45:40 +08:00
  • cd09913552 Update test_w4a8_model.py (#4125) Zhang Yulong 2025-09-16 20:43:10 +08:00
  • 67e6d8c691 [Feature] Set prefix caching as default (#3814) chenjian 2025-09-16 20:34:27 +08:00
  • de8638b1e9 fix dynamic Cfp8 computing error (#4119) Yuan Xiaolan 2025-09-16 20:21:49 +08:00
  • 4f8901489c ci: Increase compilation task time limit (#4098) YUNSHEN XIE 2025-09-16 20:05:45 +08:00
  • e79a1a7938 x1_a3b config (#4135) tianlef 2025-09-16 19:44:46 +08:00
  • b6caf6e622 suppoort expert num 3 pre rank (#4133) gaoziyuan 2025-09-16 17:34:44 +08:00
  • d682c97dd3 [benchmark]add lite-vl and x1 yaml (#4130) xiegegege 2025-09-16 16:38:36 +08:00
  • 8e49d99009 Addcase (#4112) Divano 2025-09-16 16:12:14 +08:00
  • 83bf1fd5aa [Doc]add plas attention config (#4128) tianlef 2025-09-16 15:55:12 +08:00
  • b70ca35c0b 【Hackathon 9th No.52】add test_dynamic_per_token_scaled_fp8_quant (#4015) co63oc 2025-09-16 14:11:29 +08:00
  • befe463f01 【Hackathon 9th No.37】add test_top_k_renorm_probs (#3755) Echo-Nie 2025-09-16 11:12:46 +08:00
  • 7ccbcc5a62 [feat] support prefix cache clearing when /clear_load_weight is called (#4091) 李泳桦 2025-09-16 11:11:20 +08:00
  • 442543cd6b fix ep wint8 (#4102) Sunny-bot1 2025-09-16 11:05:33 +08:00
  • d381fa8194 fix reasoning parsers plugin (#4104) Yuanle Liu 2025-09-15 22:30:16 +08:00
  • ed2dcec829 add ignore=all for deepgemm (#4118) Yuanle Liu 2025-09-15 21:52:00 +08:00
  • a04365a0c7 Update api_server.py Jiang-Jia-Jun 2025-09-15 21:31:33 +08:00
  • 03b3d6175d fix mtp (#4105) YuanRisheng 2025-09-15 20:26:07 +08:00
  • fbb4e0f8d1 [CP]Glm45 air 2.2 (#4073) chen 2025-09-15 18:52:58 +08:00
  • 17a27170bc fix typos (#4093) co63oc 2025-09-15 18:33:30 +08:00
  • 113e330030 fix bf16 and add comments (#4106) bukejiyu 2025-09-15 17:23:07 +08:00
  • 69aa2781a1 [MTP]Support mtp reshard (#4099) freeliuzc 2025-09-15 17:13:53 +08:00
  • 46911f903d [MTP]update hybrid-mtp-with-ngram (#4047) freeliuzc 2025-09-15 17:13:31 +08:00
  • b1b33211e8 [CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 (#4062) Yuanle Liu 2025-09-15 16:21:30 +08:00
  • 9409665713 [xpu] support ep (#4067) zhupengyang 2025-09-15 13:53:11 +08:00
  • 29ed617f0f [v1 loader]qwen Offline fp8 (#4036) bukejiyu 2025-09-15 13:44:11 +08:00
  • b1a5b756a3 [Optimize] Support WINT8 and group scale for Machete (#3905) Sunny-bot1 2025-09-15 12:01:34 +08:00
  • d2ab369427 [MTP]Support RL reshard (#4074) freeliuzc 2025-09-15 11:47:06 +08:00
  • 4e8ba62241 [setup optimize]Support git submodule (#4033) (#4080) YuanRisheng 2025-09-15 11:41:55 +08:00
  • 4408dc7f67 【Hackathon 9th No.49】add test_pre_cache_len_concat (#3847) Echo-Nie 2025-09-15 11:20:14 +08:00
  • ef4a1aa2da 【Hackathon 9th No.61、65】add test_draft_model_update (#3940) co63oc 2025-09-15 11:19:50 +08:00
  • f213ae1e86 [Bug Fix]fix the bug for cache_messager signal loss (#3879) Zero Rains 2025-09-15 11:16:24 +08:00
  • 553adb299e 【FastDeploy CLI】collect-env subcommand (#4044) qwes5s5 2025-09-15 10:31:23 +08:00
  • 958abebeab Support offline inference with streaming output (#4071) zhouchong 2025-09-15 10:27:03 +08:00
  • 2883746132 fix model_weights_signal (#4092) Yuanle Liu 2025-09-13 11:55:25 +08:00
  • 2485333f71 ep support logprob (#4089) chen 2025-09-12 21:11:16 +08:00
  • 4871f18dad fix(CE): update concurrency to stop CE tasks from canceling each other (#4083) YUNSHEN XIE 2025-09-12 19:16:26 +08:00
  • 987609c894 [BugFix] Fix image_feature 0-Size causing insert failed (#4042) Ayakouji 2025-09-12 19:13:08 +08:00
  • 9ac539471d [format] Valid para format error info (#4035) xiaolei373 2025-09-12 19:05:17 +08:00
  • 88ea565aba [BugFix]Fix load kv cache quant scale (#4077) YuanRisheng 2025-09-12 17:44:03 +08:00
  • c86b3357ce 【Hackathon 9th No.78】add test_chat.py (#3958) co63oc 2025-09-12 16:53:27 +08:00
  • 06f4b49ca3 【Hackathon 9th No.25】add test_fused_get_rotary_embedding (#3892) Echo-Nie 2025-09-12 15:38:43 +08:00
  • 805f29a06c [Feature] refactor metax_gpu attention and moe and remove some useless code (#3688) SuperNova 2025-09-12 14:40:25 +08:00
  • 10768a4d79 [NewFeture]add ep rollout model init and update/clear ep buffer (#3927) gaoziyuan 2025-09-12 14:15:13 +08:00
  • cab7a633fe [CI] add multi api server test (#4049) ltd0924 2025-09-12 11:18:38 +08:00
  • 58e0785bab [metrics] update metrics markdown file (#4061) qwes5s5 2025-09-12 11:13:43 +08:00
  • 8466219ec8 fix typos (#3840) co63oc 2025-09-12 11:04:38 +08:00
  • 82dab8a91a Add token processor plugin support (#4059) RichardWooSJTU 2025-09-12 10:17:23 +08:00
  • 37f1632732 [Optimize] optimize prefix cache in develop (#3890) chenjian 2025-09-12 10:15:59 +08:00
  • 7e3148ed81 [CI] update paddlepaddle==3.2.0 in release/2.2 (#3997) YuBaoku 2025-09-11 22:04:40 +08:00
  • c64ceac34d Update ce_job.yml (#4060) Zhang Yulong 2025-09-11 20:44:09 +08:00
  • 4859f40b20 [Feature] GLM-45-AIR Support Mix Quantization(Dense wfp8afp8 and wint8 triton_moe_backend) (#4051) chen 2025-09-11 20:08:09 +08:00
  • 2056a428bd [bug fix] Fix the placeholder in qwen prompt and add some unittests (#4065) lddfym 2025-09-11 20:00:02 +08:00
  • 850465e8ed [Feature] add cli command chat,complete (#4037) memoryCoderC 2025-09-11 19:53:14 +08:00
  • a47976e82d [Echo] Support more types of prompt echo (#4022) zhuzixuan 2025-09-11 19:34:44 +08:00
  • abdcef30aa [BugFix] mm_post_fix (#4005) xiaoxiaohehe001 2025-09-11 19:09:46 +08:00
  • d2ec7f6aa2 update ci (#4064) Zhang Yulong 2025-09-11 18:36:25 +08:00
  • fec58639db [CI] skip test_structured_outputs* temporarily (#4055) YuBaoku 2025-09-11 18:07:50 +08:00
  • d2d04c2d5e [setup optimize]Support git submodule (#4033) YuanRisheng 2025-09-11 17:41:16 +08:00
  • d60f7c4661 fix import tests.utils error in tests/model_loader/test_load_mtp.py (#4027) SuperNova 2025-09-11 16:47:16 +08:00
  • 447297a7b5 fix gid (#4054) gaoziyuan 2025-09-11 16:08:00 +08:00
  • 63d24b2210 [Executor] Adjust signal sending order in RL training (#3773) (#4066) RAM 2025-09-11 15:41:32 +08:00
  • e4c64a71cc [BugFix] qwen2.5vl enable_thinking=true and image_patch_id bug fix (#3921) CSWYF3634076 2025-09-11 15:08:24 +08:00
  • 2650f58740 [docs] Update environment variables documentation (#3957) bukejiyu 2025-09-11 12:17:06 +08:00
  • 48f2ab3fb3 support cuda graph (#4056) Yuanle Liu 2025-09-11 11:38:32 +08:00
  • 2af0f671b1 【Hackathon 9th No.55】add test_update_inputs_v1.py (#3992) co63oc 2025-09-11 11:34:22 +08:00
  • a7392a0ff9 【Inference Optimize】DeepSeek-V3-model MLA Optimize (#3886) AIbin 2025-09-11 10:46:09 +08:00
  • 637d96c6ae [Feature] Support zai-org/GLM-4.5-Air BF16 model (#3928) chen 2025-09-10 19:36:10 +08:00
  • 7ee100903f support rope_3d in spec mode (#4034) freeliuzc 2025-09-10 18:15:05 +08:00
  • 684e93269b [Fix] fix multi api server log dir (#3967) ltd0924 2025-09-10 17:15:30 +08:00
  • 749f074e44 Update multi_api_server.py (#4023) ltd0924 2025-09-10 17:15:01 +08:00