FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2025-09-27 12:52:29 +08:00

Author	SHA1	Message	Date
RAM	870364b547	[CUDAGraph]CUDA Graph support unique memory pool (#4230 ) * cuda graph use unique memory pool * fix custom device import bug * refine code * refine code * refine code	2025-09-24 19:45:22 +08:00
Yuanle Liu	b1b33211e8	[CUDAGraph] Support multi output buffers and merge some fixes from feature/exp_0908 (#4062 ) * refine cudagraph * refine cudagraph * typo * fix * fix plugins * fix * update * update * update	2025-09-15 16:21:30 +08:00
RAM	d3e4ae3d49	[Executor] Adjust signal sending order in RL training (#3773 ) * Adjust processing order * fix bug * fix update_parameters bug * refine code	2025-09-10 13:24:20 +08:00
RAM	205b706ef8	[Executor] Fix bug of import paddle with RLHF (#3781 )	2025-09-02 17:32:13 +08:00
co63oc	d6369b4d51	fix typos (#3684 )	2025-09-01 17:50:17 +08:00
zyfncg	f677c032c0	[CudaGraph] [SOT] Support spliting static graph into piecewise graph with cuda_graph (#3478 ) * support spliting static graph into piecewise graph with cuda_graph * Update fastdeploy/model_executor/graph_optimization/cudagraph_piecewise_backend.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix merge conflict * fix bug --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-29 16:28:01 +08:00
RAM	00898603c8	[CUDAGraph]Add debug func (#3616 ) * add print dot files * refine code	2025-08-26 16:43:48 +08:00
RAM	2fa173e327	[Executor] CUDAGraph support RL training (#3265 ) * add clear graph opt backend * cuda graph support rl * add branch * 1.fix dynamic_weight_manager bug 2.add clear api for CasualLM * open test case * fix typo * update mkdocs.yaml * [Docs]Update mkdocs.yml * update test case * use unittest in graph test case	2025-08-25 20:59:30 +08:00
Jundong Liu	70ee910cd5	[Excutor] Change cudagraph hashkey from batch size to num_tokens (#3454 )	2025-08-18 16:16:48 +08:00
Zero Rains	0fb37ab7e4	update flake8 version to support pre-commit in python3.12 (#3000 ) * update flake8 version to support pre-commit in python3.12 * polish code	2025-07-24 01:43:31 -07:00
zhink	0262ef7eb3	custom all reduce support cuda graph (#2938 ) * Support enabling cuda graph and custom all reduce at the same time, and fix the overwritten custom all reduce flag * rename communication_op to communication	2025-07-21 22:52:03 +08:00
Zero Rains	25698d56d1	polish code with new pre-commit rule (#2923 )	2025-07-19 23:19:27 +08:00
RAM	0fad10b35a	[Executor] CUDA Graph support padding batch (#2844 ) * cuda graph support padding batch * Integrate the startup parameters for the graph optimization backend and provide support for user - defined capture sizes. * Do not insert max_num_seqs when the user specifies a capture list * Support set graph optimization config from YAML file * update cuda graph ci * fix ci bug * fix ci bug	2025-07-15 19:49:01 -07:00
RAM	e3768c5a83	[Executor] Fix bug of logger.debug (#2778 )	2025-07-09 04:13:43 -07:00
RAM	03a74995b8	Clear dead code And supplementary notes (#2757 ) * 1.supplementary notes 2.delete dead code * fix bug of forward meta * Global modification of forward meta * fix vl model_runner bug	2025-07-09 16:17:34 +08:00
Jiang-Jia-Jun	92c2cfa2e7	Sync v2.0 version of code to github repo	2025-06-29 23:29:37 +00:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00

17 Commits