FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-12-24 13:28:13 +08:00

Author	SHA1	Message	Date
lizexu123	c563eca791	[Feature] support reward model (#5301) * Your commit message here * add test * update develop * support reward * support enable_chunk_prefill * support bingfa * support convert is reward * update test * delete print * fix enable_thinking * add document * fix place * fix test * fix * support enable_prefix_caching * add no-enable_prefix-caching test * fix * support enable_prefix_caching * delete print * fix document * fix * fix test * fix document and delete chinese * udpate * enable_thinking * fix test	2025-12-02 14:55:31 +08:00
Echo-Nie	c18b177f21	fix the get_act_fn,_load_st_projector (#4824)	2025-11-06 16:13:35 +08:00
lizexu123	4ac6de9a3c	[Feature] support pooling model runner (#4590) * support qwen3-embedding * support qwen3-embedding-0.6b * fix * fix bug * fix test_return_token_ids.py and update enable_thinking * fix mtp dummy_run * merge develop * fix np.float32 * delete FD_DISABLE_CHUNKED_PREFILL and FD_USE_GET_SAVE_OUTPUT_V1 * delete and build_stream_transfer_data * fix test_update_v1: * fix * fix * update dummy_run post_process * delete test_update_v1 * fix * fix dummy_run * fix model_path * fix model_path * fix dummy_run	2025-10-31 22:32:05 +08:00
lizexu123	c234b995ab	[Feature] support pooling model dummy_run (#4345) * support qwen3-embedding * fix ci bug * support pooling dummy_run * fix * delete print * parallel_config.max_model_len * delete is_pooling_model in dummy_run * fix * fd_model * fix embedding load * fix * fix post_process	2025-10-17 13:30:55 +08:00
SunLei	b4b579a7ed	Feature：Add support for Pooling Model Embedding and provide an OpenAI-compatible API. (#4344) * feat: add OpenAIServing * feat: add ZmqOpenAIServing & OpenAIServingEmbedding * feat: Refine the basic ServingEngine class and introduce ServingContext * fix: codestyle * fix: request * fix: pooling_params * feat: _process_chat_template_kwargs * feat: support batch request * feat: pooling_params verify & default parameters --------- Co-authored-by: sunlei1024 <sunlei1024@example.com>	2025-10-15 19:42:59 +08:00
lizexu123	c86945ef49	[Feature] support pool (#3827) * support pool * update pooling * add pooler_config and check * update * support AutoWeightsLoader load weight * fix * update * delete print * update pre-commit * fix * fix xpu * fix ModelRegistry->model_registry * fix Copilot review * fix pooler.py * delete StepPooler * fix abstract * fix default_loader_v1 * fix Pre Commit * support torch qwen3 dense * add test and fix torch-qwen * fix * fix * adapter ci: * fix review * fix pooling_params.py * fix * fix tasks.py 2025 * fix print and logger * Modefy ModelRegistry and delete AutoWeightsLoader * fix logger * fix test_embedding * fix ci bug * ernie4_5 model_registry * fix test * support Qwen3-Embedding-0.6B tp=1 load * fix extra code * fix * delete fix vocab_size * delete prepare_params_dict * fix:	2025-09-22 14:09:09 +08:00

6 Commits