mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2025-12-24 13:28:13 +08:00
Remove `--load-choices "default_v1"` since it is already the default (#4980)
@@ -15,8 +15,6 @@ export FD_MODEL_SOURCE=AISTUDIO # "AISTUDIO", "MODELSCOPE" or "HUGGINGFACE"
 export FD_MODEL_CACHE=/ssd1/download_models
 ```
 
-> ⭐ **Note**: Models marked with an asterisk can directly use **HuggingFace Torch weights** and support **FP8/WINT8/WINT4** as well as **BF16**. When running inference, you need to enable **`--load-choices "default_v1"`**.
-
 > Example launch Command using baidu/ERNIE-4.5-21B-A3B-PT:
 ```
 python -m fastdeploy.entrypoints.openai.api_server \
@@ -25,8 +23,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
 --metrics-port 8181 \
 --engine-worker-queue-port 8182 \
 --max-model-len 32768 \
---max-num-seqs 32 \
---load-choices "default_v1"
+--max-num-seqs 32
 ```
 
 ## Large Language Models