diff --git a/docs/supported_models.md b/docs/supported_models.md
index 6dd601666..7eaac75df 100644
--- a/docs/supported_models.md
+++ b/docs/supported_models.md
@@ -13,7 +13,7 @@ export FD_MODEL_CACHE=/ssd1/download_models
 
 | Model Name | Context Length | Quantization | Minimum Deployment Resources | Notes |
 | :--------- | :------------- | :----------- | :-------------------------- | :---- |
-| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT2 | 1*96G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
+| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT2 | 1*141G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
 | baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT4 | 4*80G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
 | baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT8 | 8*80G GPU VRAM/1T RAM | Chunked Prefill required for 128K |
 | baidu/ERNIE-4.5-300B-A47B-Paddle | 32K/128K | WINT4 | 4*64G GPU VRAM/600G RAM | Chunked Prefill required for 128K |
diff --git a/docs/zh/supported_models.md b/docs/zh/supported_models.md
index a7164bd14..48fa8f05d 100644
--- a/docs/zh/supported_models.md
+++ b/docs/zh/supported_models.md
@@ -14,7 +14,7 @@ export FD_MODEL_CACHE=/ssd1/download_models
 
 | 模型名 | 上下文长度 | 量化方式 | 最小部署资源 | 说明 |
 | :----- | :-------------- | :----------- |:----------- |:----------- |
-| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT2 | 1卡*96G显存/1T内存 | 128K需要开启Chunked Prefill |
+| baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT2 | 1卡*141G显存/1T内存 | 128K需要开启Chunked Prefill |
 | baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT4 | 4卡*80G显存/1T内存 | 128K需要开启Chunked Prefill |
 | baidu/ERNIE-4.5-VL-424B-A47B-Paddle | 32K/128K | WINT8 | 8卡*80G显存/1T内存 | 128K需要开启Chunked Prefill |
 | baidu/ERNIE-4.5-300B-A47B-Paddle | 32K/128K | WINT4 | 4卡*64G显存/600G内存 | 128K需要开启Chunked Prefill |