# tokenizer ## Description The **Tokenizer** subcommand provides encoding and decoding functionality between text and token sequences. It also allows viewing or exporting model vocabulary information. Both text and multimodal models are supported. ## Usage ``` fastdeploy tokenizer --model MODEL (--encode TEXT | --decode TOKENS | --vocab-size | --info) ``` ## Parameters | Parameter | Description | Default | | ----------------------------- | ------------------------------------------------------------------------------ | ------- | | --model, -m | Model path or name | None | | --encode, -e | Encode text into a list of tokens | None | | --decode, -d | Decode a list of tokens back into text | None | | --vocab-size, -vs | Display the vocabulary size | None | | --info, -i | Display detailed tokenizer information (special tokens, IDs, max length, etc.) | None | | --vocab-export FILE, -ve FILE | Export the vocabulary to a file | None | ## Examples ``` # 1. Encode text into tokens # Convert input text into a token sequence recognizable by the model fastdeploy tokenizer --model baidu/ERNIE-4.5-0.3B-Paddle --encode "Hello, world!" # 2. Decode tokens into text # Convert a token sequence back into readable text fastdeploy tokenizer --model baidu/ERNIE-4.5-0.3B-Paddle --decode "[1, 2, 3]" # 3. View vocabulary size # Output the total number of tokens in the model’s vocabulary fastdeploy tokenizer --model baidu/ERNIE-4.5-0.3B-Paddle --vocab-size # 4. View tokenizer details # Includes special symbols, ID mappings, max token length, etc. fastdeploy tokenizer --model baidu/ERNIE-4.5-0.3B-Paddle --info # 5. Export vocabulary to a file # Save the tokenizer’s vocabulary to a local file fastdeploy tokenizer --model baidu/ERNIE-4.5-0.3B-Paddle --vocab-export ./vocab.txt # 6. Support for multimodal models # Decode tokens for a multimodal model fastdeploy tokenizer --model baidu/EB-VL-Lite-d --decode "[5300, 96382]" # 7. Combine multiple functions # Encode, decode, view vocabulary, and export vocabulary in a single command fastdeploy tokenizer \ -m baidu/ERNIE-4.5-0.3B-PT \ -e "你好哇" \ -d "[5300, 96382]" \ -i \ -vs \ -ve vocab.json ```