* add qk norm for speculative decoding C16
* support mtp in v1_scheduler mode
* support mtp rope_3d
* support mtp features
* add unit tests and remove some logs
---------
Co-authored-by: yuanxiaolan <yuanxiaolan01@baidu.com>
Co-authored-by: xiaoxiaohehe001 <hiteezsf@163.com>