RAM
|
fc5cd1adb1
|
[BugFix] Fix graph opt test case (#4634)
* fix bug and refine code
* add debug count
* refine code
* fix ci test case
|
2025-10-29 13:28:04 +08:00 |
|
Ryan
|
6160145f82
|
[SOT] Change warnings to errors and remove fallback operations (#4378)
* Change warnings to errors and remove fallback operations
* fix unit test
* fix codestyle
|
2025-10-17 11:27:04 +08:00 |
|
YuanRisheng
|
a2ec2c4152
|
[FDConfig]Remove max_model_len in FDConfig (#4350)
* modify max_model_len
* fix unittest
* fix unittest
---------
Co-authored-by: root <root@yqlcc01-sys-rpm12rzmwjd.yqlcc01.baidu.com>
|
2025-10-11 14:04:17 +08:00 |
|
YuanRisheng
|
24180fba0a
|
[FDConfig]Remove splitwise_role and engine_worker_queue_port in FDConfig (#4147)
* remove splitwise_role and engine_worker_queue_port
* fix xpu
* fix xpu
* fix xpu
* fix unittest
* resolve conflict
|
2025-09-19 17:01:52 +08:00 |
|
YuanRisheng
|
2e9e53ff7e
|
[FDConfig]Remove max_num_batched_tokens/max_num_seqs in parallel config (#4116)
* remove max_num_batched_tokens in parallel config
* remove max_num_seqs
* update test case
* fix test
* fix
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
|
2025-09-17 10:43:35 +08:00 |
|
Jundong Liu
|
3d0aaa5923
|
[Excutor] Experiment Feature-Support Prefill in cudagraph (#3459)
* Support prefill in Cudagraph
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.1
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.2
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.3
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.4
* Refactor GetBlockShapeAndSplitKVBlock Kernel V2.5
* Solve problem about encoder_num_blocks_x_cpu
* Add early-exit mechanism for attention kernel
* fix test case about append-attention
* Update testcode, Add annotations to related tensors
* move get_input_length_list
* solve test_code
* Add annotations about early-exit for attention kernel
* Add annotations about early-exit for attention kernel2
* solve comment
* solve mtp
---------
Co-authored-by: RAM <gstian5555@outlook.com>
|
2025-09-08 13:12:24 +08:00 |
|
co63oc
|
5441538173
|
rename fused_get_rope.cu (#3752)
* rename fused_get_rope.cu
* fix
* fix typos
* fix
* fix
|
2025-09-03 10:54:34 +08:00 |
|
Ryan
|
a5b4866ff1
|
[CudaGraph][SOT] Add unit tests for splitting the static graph into piecewise graphs that support cuda_graph (#3590)
* add unit test
* change sot_warmup_sizes
* wtf; add missed commit
|
2025-08-26 11:25:04 +08:00 |
|