FastDeploy

Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2025-12-24 13:28:13 +08:00

Files

fmiao2372 404cf0ece4 [Intel HPU] enable tensor_wise_fp8 (#5324)

* [Intel HPU] enable tensor_wise_fp8

* update code based on comments

* fix code style issue

* fix bug about PR 5138

* mv kv_cache modifications to HPU backend

* fix FP8 Precision Issues

* fix FP8 Precision Issues

* Add quantization UT

---------

Co-authored-by: yanfeich <yanfei.cheng@intel.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>

2025-12-17 16:45:03 +08:00

test_kv_cache.py

[BugFix] Fix load kv cache quant scale (#4077)

2025-09-12 17:44:03 +08:00

test_tensor_wise_fp8.py

[Intel HPU] enable tensor_wise_fp8 (#5324)

2025-12-17 16:45:03 +08:00

test_w4a8.py

[Docs] Add License in Unittest (#4957)

2025-11-12 10:44:09 +08:00

test_w4afp8.py

[Others] remove add_bias option (#5425)

2025-12-09 17:39:35 +08:00