FastDeploy/fastdeploy/model_executor/layers/quantization
AIbin 41aee08982 【Inference Optimize】Update MergedReplicatedLinear for DSK qkv_a_proj_with_mqa. (#3673)
* support MergedReplicatedLinear

* update MergedReplicatedLinear to support DSK_wint4 V1_load

* update model name

* update linear class

* fix

* fix v0 moe_bias load

---------

Co-authored-by: bukejiyu <52310069+bukejiyu@users.noreply.github.com>
2025-09-04 21:16:05 -07:00