# FastDeploy CLI User Guide

## Introduction

**FastDeploy CLI** is a command-line tool provided by the FastDeploy inference framework, designed for **running, deploying, and testing AI model inference tasks**. It lets developers load models, call APIs, deploy services, run performance benchmarks, and collect environment information directly from the command line.

With FastDeploy CLI, you can:

* 🚀 **Run and validate model inference**: Generate chat responses or text completions directly in the command line (`chat`, `complete`).
* 🧩 **Deploy models as services**: Start an OpenAI-compatible API service with a single command (`serve`).
* 📊 **Benchmark performance and accuracy**: Measure latency, throughput, and task-level accuracy (`bench`).
* ⚙️ **Collect environment information**: Output system, framework, GPU, and FastDeploy version information (`collect-env`).
* 📁 **Run batch inference tasks**: Read batch inputs from, and write results to, files or URLs (`run-batch`).
* 🔡 **Manage model tokenizers**: Encode/decode text and tokens, or export vocabulary (`tokenizer`).
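Each capability above maps to a subcommand of the single `fastdeploy` entry point. As a minimal illustration, the environment report can be generated straight away (shown below without extra arguments; see [collect-env.md](collect-env.md) for the full details):

```bash
# Print system, framework, GPU, and FastDeploy version information,
# e.g. to verify an installation or attach to a bug report.
fastdeploy collect-env
```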
---
### View Help Information

```bash
fastdeploy --help
```
### Available Commands

```bash
fastdeploy {chat, complete, serve, bench, collect-env, run-batch, tokenizer}
```
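Each subcommand takes its own arguments. Assuming the usual `--help` convention (which the usage line above suggests), the detailed options for any single command can be listed directly, for example:

```bash
# Show the full option list for one subcommand, e.g. the benchmark tool
fastdeploy bench --help
```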
---
| Command Name | Description | Detailed Documentation |
| ------------ | ----------- | ---------------------- |
| `chat` | Run interactive chat generation tasks in the command line to verify chat model inference results | [View chat command details](chat.md) |
| `complete` | Perform text completion tasks and test various language model outputs | [View complete command details](complete.md) |
| `serve` | Launch a local inference service compatible with the OpenAI API protocol | [View serve command details](serve.md) |
| `bench` | Evaluate model performance (latency, throughput) and accuracy | [View bench command details](bench.md) |
| `collect-env` | Collect and print system, GPU, dependency, and FastDeploy environment information | [View collect-env command details](collect-env.md) |
| `run-batch` | Run batch inference tasks with file or URL input/output | [View run-batch command details](run-batch.md) |
| `tokenizer` | Encode/decode text and tokens, and export vocabulary | [View tokenizer command details](tokenizer.md) |
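To see how the commands fit together: once `serve` is running, its OpenAI-compatible endpoint can be exercised with any OpenAI-style client. The sketch below assumes the standard `/v1/chat/completions` route from the OpenAI API protocol; the host, port, and model name are placeholders, and the actual startup options are documented in [serve.md](serve.md).

```bash
# Hypothetical request against a locally running `fastdeploy serve` instance.
# localhost:8000 and "my-model" are placeholders for illustration only.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "my-model",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```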