周周周 | 95243f012c | [Others] add PADDLE_ENFORCE (#5288) | 2025-11-28 14:23:35 +08:00
Ryan | e25c067f70 | [OP] Add InferShape&InferDtype for per_token_quant_padding (#4667): add InferShape&InferDtype for per_token_quant_padding; fix codestyle | 2025-10-30 10:28:26 +08:00
周周周 | 76513f6416 | Support 45t fp8 8 GPU (#3659) | 2025-08-28 10:52:53 +08:00
RichardWooSJTU | e39159f3bd | Add switch to apply fine-grained per token quant fp8 (#3192); Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com> | 2025-08-04 19:54:03 -07:00
Jiang-Jia-Jun | 05c670e593 | [Sync] Update to latest code (#2679): [Sync] Update to latest code; Add new code files; Add new code files; update code; Try to fix build.sh; Try to fix build.sh; Update code; Update requirements.txt; Update code; Co-authored-by: Jiang-Jia-Jun <jiangjiajun@baidu.com> | 2025-07-03 15:43:53 +08:00
MARD1NO | ac5f860536 | use shfl_xor_sync to reduce redundant shfl broadcast | 2025-06-30 13:12:21 +08:00
jiangjiajun | 684703fd72 | [LLM] First commit the llm deployment code | 2025-06-09 19:20:15 +08:00