Yuanle Liu
|
2f74e93d7e
|
use dist.all_reduce(min) to sync num_blocks_local (#2933)
* pre-commit all files check
* reduce min num_blocks_local
* fix nranks=1
* pre-commit when commit-msg
|
2025-07-21 01:23:36 -07:00 |
|
lizexu123
|
67990e0572
|
[Feature] support min_p_sampling (#2872)
* Fastdeploy support min_p
* add test_min_p
* fix
* min_p_sampling
* update
* delete vl_gpu_model_runner.py
* fix
* Align usage of min_p with vLLM
* fix
* modified unit test
* fix test_min_sampling
* pre-commit all files
* fix
* fix
* fix
* fix xpu_model_runner.py
|
2025-07-20 23:17:59 -07:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
ming1753
|
1f15ca21e4
|
[Feature] support prompt repetition_penalty (#2806)
|
2025-07-17 12:05:52 +08:00 |
|
littledgg
|
59071268b6
|
[Executor] Move forward_meta.py to fastdeploy/model_executor (#2774)
* Use PEP 563 in attention.py and fix conflict
* merge commit
* Change what was left out last time
|
2025-07-10 20:36:51 +08:00 |
|
RAM
|
03a74995b8
|
Clear dead code and add supplementary notes (#2757)
* 1.supplementary notes 2.delete dead code
* fix bug of forward meta
* Global modification of forward meta
* fix vl model_runner bug
|
2025-07-09 16:17:34 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|
jiangjiajun
|
684703fd72
|
[LLM] First commit the llm deployment code
|
2025-06-09 19:20:15 +08:00 |
|