* w4afp8: support per-group
* code style
* fix transpose
* revert fast Hadamard

---------

Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
Co-authored-by: plusNew001 <95567040+plusNew001@users.noreply.github.com>
MoeFastHardamardImplWrapper
* autogen MoeFastHardamardImplWrapper template_instantiation
* fix codestyle
* fix codestyle
* add impl cu files