Commit Graph

15 Commits

Author SHA1 Message Date
chen
ce9c0917c5 [Precision] Support lm_head layer running in float32 (#3597)
Some checks failed
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
* support lm_head fp32 bf16 fp16

* support lm_head fp32 bf16 fp16

* add doc and check code

* lm_head_fp32 specify lm_head as fp32

* code check

* check doc
2025-08-27 11:34:53 +08:00
zhink
df7c31012b Modified to support custom all reduce by default (#3538) 2025-08-22 16:59:05 +08:00
luukunn
371fb3f853 [Feature] add tool parser (#3483)
* add tool parser

* add x1 enable_thinking

* restart ci

* fix vl reasoning parser

* modify call style

* modify call style

* add offline enablethinking

* fix completion

* fix

* fix unit test

* fix unit test

* fix unit test

* fix vl reasoning parser

* fix vl reasoning parser
2025-08-21 17:25:44 +08:00
Yzc216
466cbb5a99 [Feature] Models api (#3073)
* add v1/models interface related

* add model parameters

* default model verification

* unit test

* check model err_msg

* unit test

* type annotation

* model parameter in response

* modify document description

* modify document description

* unit test

* verification

* verification update

* model_name

* pre-commit

* update test case

* update test case

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update tests/entrypoints/openai/test_serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/entrypoints/openai/serving_models.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: LiqinruiG <37392159+LiqinruiG@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-21 17:02:56 +08:00
luukunn
9c129813f9 [Feature] add custom chat template (#3251)
* add custom chat_template

* add custom chat_template

* add unittest

* fix

* add docs

* fix comment

* add offline chat

* fix unit test

* fix unit test

* fix

* fix pre commit

* fix unit test

* add unit test

* add unit test

* add unit test

* fix pre_commit

* fix enable_thinking

* fix pre commit

* fix pre commit

* fix unit test

* add requirements
2025-08-18 16:34:08 +08:00
RAM
154308102e [Docs]Updata docs of graph opt backend (#3442)
* Updata docs of graph opt backend

* update best_practices
2025-08-15 21:30:32 +08:00
ltd0924
31d4fcb425 [BugFix] fix too many open files problem (#3256)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* Update cache_messager.py

* fix too many open files problem

* fix too many open files problem

* fix too many open files problem

* fix ci bugs

* Update api_server.py

* add parameter

* format

* format

* format

* format

* Update parameters.md

* Update parameters.md

* Update serving_completion.py

* Update serving_chat.py

* Update envs.py

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-08 20:10:11 +08:00
ApplEOFDiscord
b71cbb466d [Feature] remove dependency on enable_mm and refine multimodal's code (#3014)
* remove dependency on enable_mm

* fix codestyle check error

* fix codestyle check error

* update docs

* resolve conflicts on model config

* fix unit test error

* fix code style check error

---------

Co-authored-by: shige <1021937542@qq.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-08-01 20:01:18 +08:00
Zero Rains
25698d56d1 polish code with new pre-commit rule (#2923) 2025-07-19 23:19:27 +08:00
RAM
bbe2c5c968 Update GraphOptimizationBackend docs (#2898) 2025-07-17 21:38:18 +08:00
zhenwenDang
5fc659b900 [Docs] add enable_logprob parameter description (#2850)
Some checks failed
Deploy GitHub Pages / deploy (push) Has been cancelled
* add enable_logprob parameter description

* add enable_logprob parameter description

* add enable_logprob parameter description

* add enable_logprob parameter description

* add enable_logprob parameter description

* add enable_logprob parameter description

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2025-07-15 19:47:45 +08:00
zhink
b89180f1cd [Feature] support custom all-reduce (#2758)
* [Feature] support custom all-reduce

* add vllm adapted
2025-07-09 16:00:27 +08:00
kevin
3d3bccdf79 [doc] update docs (#2690) 2025-07-03 19:33:19 +08:00
Jiang-Jia-Jun
92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00
jiangjiajun
684703fd72 [LLM] First commit the llm deployment code 2025-06-09 19:20:15 +08:00