copilot-swe-agent[bot]
53f4a9ad27
Simplify implementation: use inline acquire/release with shared memory counter
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 11:09:17 +00:00
copilot-swe-agent[bot]
6f9b25902a
Address code review feedback: improve IPCSignal initialization, remove unused function, fix formatting
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 10:56:26 +00:00
copilot-swe-agent[bot]
fa43c5f83e
Fix grammar in log message
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:49:48 +00:00
copilot-swe-agent[bot]
f35fa87e5f
Fix connection release logic and add bounds checking to prevent negative counter
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:47:39 +00:00
copilot-swe-agent[bot]
041a361f8a
Address code review feedback: move imports, fix race condition, improve exception handling
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:45:57 +00:00
copilot-swe-agent[bot]
cd844979e9
Remove unused connection_semaphore and fix manual release calls
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:42:24 +00:00
copilot-swe-agent[bot]
581fed5f8e
Use shared memory to enforce global concurrency limit across workers
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:39:34 +00:00
copilot-swe-agent[bot]
10b9f19441
Fix concurrency control logic to not divide by workers
...
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
2025-12-17 09:23:24 +00:00
fmiao2372
404cf0ece4
[Intel HPU] enable tensor_wise_fp8 ( #5324 )
...
* [Intel HPU] enable tensor_wise_fp8
* update code based on comments
* fix code style issue
* fix bug about RP 5138
* mv kv_cache modifications to HPU backend
* fix FP8 Precision Issues
* fix FP8 Precision Issues
* Add quantization UT
---------
Co-authored-by: yanfeich <yanfei.cheng@intel.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-17 16:45:03 +08:00
freeliuzc
15f5112ecb
[Speculative Decoding]Support different inferseed in speculate decoding ( #5568 )
...
* fix mtp entropy drop in RL
* optimize usage and fix unit test
* optimize padding_sampling_params speed(vectorized)
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
2025-12-17 16:14:29 +08:00
Yonghua Li
0c8c6369ed
[Feature] [PD Disaggregation] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports ( #5415 )
...
* [feat] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports
* [fix] fix some bugs
* [fix] fix rdma port for cache manager/messager
* [fix] temporarily cancel port availability check to see if it can pass ci test
* [feat] simplify args for multi api server
* [fix] fix dp
* [fix] fix port for xpu
* [fix] add tests for ports post processing & fix ci
* [test] fix test_multi_api_server
* [fix] fix rdma_comm_ports args for multi_api_server
* [fix] fix test_common_engine
* [fix] fix test_cache_transfer_manager
* [chore] automatically setting FD_ENABLE_MULTI_API_SERVER
* [fix] avoid api server from creating engine_args twice
* [fix] fix test_run_batch
* [fix] fix test_metrics
* [fix] fix splitwise connector init
* [test] add test_rdma_transfer and test_expert_service
* [fix] fix code syntax
* [fix] fix test_rdma_transfer and build wheel with rdma script
2025-12-17 15:50:42 +08:00
周周周
e29b005520
[Others] Clean code && remove GPU sync code ( #5548 )
2025-12-16 21:09:37 +08:00
Yuanle Liu
867803ae10
[BugFix] fix speculate_limit_thinking_content_length ( #5590 )
...
* fix speculate_limit_thinking_content_length
* update
2025-12-16 04:31:45 -08:00
kevin
7140939c51
[BugFix] fix video bug ( #5557 )
...
* fix video bug
* add eb5 moe model
2025-12-16 20:06:50 +08:00
Jiang-Jia-Jun
2ad3bff4ff
[Optim] Optimize costtime in checking tasks in engine-worker-queue ( #5580 )
...
* [Optim] Optimize costtime in checking tasks in engine-worker-queue
* Update fastdeploy/engine/common_engine.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-16 19:27:31 +08:00
Yonghua Li
eeb99d2af5
[BugFix] skip model executing after clearing/updating is done ( #5527 )
...
* [fix] fix ep loop
* [fix] another try
* [fix] again
2025-12-16 17:39:03 +08:00
RAM
6fc5eccf83
[RL] R3 Support RDMA Store ( #5467 )
...
* [RL] R3 support rdma store
* refine notes
* refine code
* disable prefix cache
* support preempted task and put cpu tensor
2025-12-16 16:50:13 +08:00
xiaolei373
a30b4da260
[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 ( #5458 )
2025-12-16 16:36:09 +08:00
kevin
c9b47f90ce
[BugFix] fix cpu prefix cache bug ( #5544 )
...
* fix_dy_c8_bug
* add block_num check
* fix test case
* update ci case
2025-12-16 14:21:42 +08:00
Jiang-Jia-Jun
021399f7c9
Revert "[Feature] Use paddle.compat.enable_torch_proxy in `fastdeploy/__ini…" ( #5579 )
...
This reverts commit ff45ac078e .
2025-12-16 13:55:27 +08:00
gaoziyuan
5db08cc1d5
[NewFeature] support load fp8 weight ( #5565 )
2025-12-16 11:23:57 +08:00
Jiang-Jia-Jun
8b6395478a
Revert "[BugFix] reschedule_preempt_task append waiting & PREEMPTED blocksize…" ( #5575 )
...
This reverts commit dbedb0797b .
2025-12-16 11:12:57 +08:00
Jiang-Jia-Jun
9058cc712d
Update gpu_model_runner.py
2025-12-16 11:12:07 +08:00
Jiang-Jia-Jun
075bd71272
Remove GPUMemoryChecker initialization
...
Removed memory checker initialization from GPU model runner.
2025-12-16 11:09:27 +08:00
Jundong Liu
ff45ac078e
[Feature] Use paddle.compat.enable_torch_proxy in fastdeploy/__init__.py ( #5211 )
...
* test feature
* fix xgrammar
* fix paddleformer
* try whitelist
* manual patch PaddlePaddle/Paddle#76706 for test
* remove triton version
* add comment
* Update scripts/run_ci_xpu.sh
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* remove use-triton-in-paddle in requirement.txt
---------
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-16 11:05:30 +08:00
Yuanle Liu
b8e4828373
[BugFix] fix dynamic c8 in v1 loader ( #5562 )
2025-12-15 04:07:54 -08:00
MingkunZhang
5265d844e9
[Metax] fix GetStopFlagsMulti kernel crash issue ( #5556 )
2025-12-15 01:56:20 -08:00
chenjian
0100ee885f
Fix bug for caching output when preempted ( #5502 )
2025-12-15 17:25:35 +08:00
chenjian
7b0fdf7055
add check health in FD ( #5534 )
2025-12-15 15:14:45 +08:00
zhang-chenyi
77f8ba06e7
[Metax] fix release2.4 and support cudagraph ( #5547 )
...
Co-authored-by: xiaozude <xiaozude@outlook.com >
2025-12-15 14:23:33 +08:00
周周周
722de5ace1
[Others] Clean code ( #5543 )
2025-12-15 10:57:59 +08:00
Ryan
d01cb274d6
[Graph Optimization][CI] Add ERNIE45T 21B sot test ( #5538 )
2025-12-13 00:43:15 +08:00
kevin
bebd722b5d
fix encoder cache bug ( #5528 )
2025-12-12 19:25:03 +08:00
Daci
dbedb0797b
[BugFix] reschedule_preempt_task append waiting & PREEMPTED blocksize ( #5506 )
...
* bugfix reschedule_preempt_task append waiting & PREEMPTED blocksize
* bugfix reschedule_preempt_task append waiting & PREEMPTED blocksize
* comments
* [bugfix] PREEMPTED task blocksize
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-12-12 17:43:29 +08:00
Lucas
888c4b992d
[XPU] refactor of block_attn param 'pos_emb_type' ( #5511 )
2025-12-12 14:30:09 +08:00
Ryan
4eb55332f6
[Models] Add forward_meta to VocabParallelEmbedding of all models ( #5524 )
2025-12-12 14:11:31 +08:00
cmcamdy
6cc3cb4bcf
fix mtp multi batch ( #5521 )
2025-12-12 14:11:20 +08:00
GoldPancake
909059c60a
[Feature] Support for request-level speculative decoding metrics monitoring. ( #5518 )
...
* support spec metrics monitor per request
* fix bug
* remove debug log
* fix ut bugs
2025-12-12 12:22:18 +08:00
kevin
954a145d57
[Optimization] support mm prefill batch ( #5313 )
...
* support mm prefill batch
* update code
* update code
* update code
* update code
* fix encoder cache bug
* update code
* update code
* fix bug
* fix paddle ocr bug
* fix xpu bug
* update code
2025-12-11 22:21:14 +08:00
chen
747b16e021
[BugFix] Fix MTP no logprobs when enable_logprob ( #5499 )
2025-12-11 19:57:22 +08:00
bukejiyu
4066dfb4a6
RL fix ( #5503 )
2025-12-11 19:25:27 +08:00
Ryan
e58fed3665
[Graph Optimization][BugFix][CI] Fix 0size bug && add unitest ( #5495 )
2025-12-11 16:25:26 +08:00
周周周
ff353b922f
[Others] update tbo related code ( #5485 )
2025-12-11 12:34:46 +08:00
chen
6289cbc434
[BugFix] fix hung when n>1 and --enable-logprob ( #5492 )
2025-12-11 10:46:27 +08:00
Jiang-Jia-Jun
4b3e41c665
[Optim] Improve task-checking performance in engine-worker-queue ( #5376 )
...
* [Optim] Optimize costtime in checking tasks in engine-worker-queue
* Update fastdeploy/engine/common_engine.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update fastdeploy/inter_communicator/engine_worker_queue.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* [Docs] Add docstring to set_exist_tasks method (#5382 )
* Initial plan
* Add docstring to set_exist_tasks method
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* [Docs] Add docstring documentation to exist_tasks() method (#5381 )
* Initial plan
* Add comprehensive docstring to exist_tasks() method
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* [Optimization] Conditionally initialize shared memory for single-node deployments only (#5383 )
* Initial plan
* Conditionally initialize exist_tasks_intra_signal for single-node deployments
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Use is_single_node flag for consistent deployment type checking
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* Remove redundant None checks in exist_tasks methods
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com >
* format code
---------
Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com >
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com >
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com >
2025-12-11 10:33:32 +08:00
Yonghua Li
2ec76352da
[BugFix] fix instability after clearing weight ( #5493 )
...
* [BugFix] fix instability after clearing weight
* [chore] add todo
2025-12-11 10:22:35 +08:00
qwes5s5
d79438bb86
add detoken switch ( #5463 )
2025-12-10 21:44:02 +08:00
Daci
a2ab1f4462
[BugFix] fix mix splitwise pickle load error ( #5488 )
...
* RouterArgs port str -> int
* fix race condition [is_fetching] causing multiple fetch requests
* bugfix: Delete duplicate input_ids tensor creation
* mm pd splitwise json -> pickle5; multimodal_inputs only pos id; debuglog f to %s
* fix ENABLE_V1_KVCACHE_SCHEDULER=0 mm model lack pos_id, ...
* update cr
* Apply suggestions from code review
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* pre-commit fix
* rm multimodal_inputs deepcopy & fix rdma_cache_transfer.py tpsize=0
* fix mix splitwise pickle dump
---------
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-10 19:05:50 +08:00
Neil Zhu
4403a21d4b
[Metax] refactor cutlass moe and optimize flash attention ( #5361 )
...
* [Metax] refactor moe and flash attention backend
---------
Co-authored-by: zhangchenyi_dl <16219492+zhangchenyidl@user.noreply.gitee.com >
2025-12-10 17:15:17 +08:00
luukunn
fbc9bce1e9
[Feature]Optimization of Thinking Pattern Framework ( #4302 )
...
* add model status in vl
* add x1 parser
* add model_status
* fix parser
* fix parser
* fix parser
* fix parser
* Revert "fix parser"
This reverts commit 300f446d8a .
* fix parser
* fix
* fix
* fix
* fix
* fix parser
* fix unit test
* fix unit test
* add unit test
* fix
* fix
* add unit test
* fix unit test
* add unit test
* add unit test
* fix unit test
* fix unit test
* fix bug
* fix unit test
* x1 tool parser
* fix unit test
* fix unit test
* fix unit test
* fix n
* fix unit test
* add unit test
* add unit test
* remove print
2025-12-10 16:17:06 +08:00