# Lmdeploy Source Code Analysis 01
Based on my own understanding and an actual production deployment, this series walks through the inference flow of the lmdeploy serving framework, to make troubleshooting and optimization easier.
- Startup command and entry point: lmdeploy.serve.openai.api_server
The API server is an OpenAI-compatible API built on the FastAPI framework and is called over HTTP. Its main endpoints are:
- /v1/models
- /health
- /v1/chat/completions
- /v1/chat/completions_qos
- /v1/completions
- /v1/completions_qos
- `serve` startup parameters

```python
from typing import List, Literal, Optional, Union

from lmdeploy.messages import PytorchEngineConfig, TurbomindEngineConfig
from lmdeploy.model import ChatTemplateConfig


def serve(model_path: str,
          model_name: Optional[str] = None,
          backend: Literal['turbomind', 'pytorch'] = 'turbomind',
          backend_config: Optional[Union[PytorchEngineConfig,
                                         TurbomindEngineConfig]] = None,
          chat_template_config: Optional[ChatTemplateConfig] = None,
          server_name: str = '0.0.0.0',
          server_port: int = 23333,
          tp: int = 1,
          allow_origins: List[str] = ['*'],
          allow_credentials: bool = True,
          allow_methods: List[str] = ['*'],
          allow_headers: List[str] = ['*'],
          log_level: str = 'ERROR',
          api_keys: Optional[Union[List[str], str]] = None,
          ssl: bool = False,
          qos_config_path: str = '',
          **kwargs):
    pass
```
Based on these startup parameters, an AsyncEngine is initialized; internally, the `backend` argument determines whether a turbomind engine or a pytorch engine is created.
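The backend dispatch described above can be sketched roughly as follows. This is a simplified illustration, not the actual AsyncEngine code; the function name `pick_engine_cls` and the returned module paths are hypothetical placeholders:

```python
from typing import Literal


def pick_engine_cls(backend: Literal['turbomind', 'pytorch']) -> str:
    """Toy illustration of selecting an engine from the `backend` string.

    In lmdeploy the real dispatch happens inside AsyncEngine's constructor;
    here we only return an illustrative class path to show the branching.
    """
    if backend == 'turbomind':
        return 'TurboMindEngine'
    elif backend == 'pytorch':
        return 'PytorchEngine'
    raise ValueError(f'unsupported backend: {backend}')


print(pick_engine_cls('turbomind'))  # → TurboMindEngine
```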
- /v1/chat/completions endpoint
This endpoint implements the OpenAI chat completions API.
```python
# Create the response generator
result_generator = VariableInterface.async_engine.generate(
    request.messages,
    request.session_id,
    gen_config=gen_config,
    stream_response=True,  # always use stream to enable batching
    sequence_start=True,
    sequence_end=True,
    do_preprocess=not isinstance(request.messages,
                                 str),  # text completion for string input
    adapter_name=adapter_name,
)
```

Inside the generator logic, the message is first converted into model input (chat-template preprocessing plus tokenization):

```python
prompt_input = await self._get_prompt_input(prompt, do_preprocess,
                                            sequence_start,
                                            adapter_name)
```
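The generator returned by `generate` is then consumed chunk by chunk and streamed back to the client as server-sent events. A minimal sketch of that consumption pattern, using a toy async generator rather than lmdeploy's real types:

```python
import asyncio
from typing import AsyncIterator


async def fake_generate(prompt: str) -> AsyncIterator[str]:
    """Stand-in for AsyncEngine.generate: yields incremental text chunks."""
    for token in prompt.split():
        await asyncio.sleep(0)  # yield control, as real async inference would
        yield token + ' '


async def stream_response(prompt: str) -> str:
    """Mimic the api_server loop: collect chunks as they arrive."""
    pieces = []
    async for chunk in fake_generate(prompt):
        pieces.append(chunk)  # api_server would wrap each chunk in an SSE event
    return ''.join(pieces)


print(asyncio.run(stream_response('hello streaming world')))
```

Because `stream_response=True` is always passed, even non-streaming HTTP responses are assembled from these incremental chunks, which lets the engine batch concurrent requests.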
3.1 PyTorch Engine

```python
async def async_stream_infer(
        self,
        session_id: int,
        input_ids: List[int],
        gen_config: GenerationConfig = None,
        adapter_name: str = None,
        input_embeddings: InputEmbeddingType = None,
        input_embedding_ranges: InputEmbeddingRangeType = None,
        **kwargs):
    pass
```
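A hedged sketch of how a caller might drive a per-session streaming interface with this shape. `ToyEngine` is a stand-in written for illustration; lmdeploy's real engine yields richer output objects rather than raw token lists:

```python
import asyncio
from typing import AsyncIterator, List


class ToyEngine:
    """Minimal stand-in for a per-session streaming inference interface."""

    async def async_stream_infer(self, session_id: int,
                                 input_ids: List[int],
                                 **kwargs) -> AsyncIterator[List[int]]:
        # Echo the input ids one at a time, mimicking token-by-token decode.
        generated: List[int] = []
        for tok in input_ids:
            await asyncio.sleep(0)
            generated.append(tok)
            yield list(generated)  # each step yields the tokens so far


async def run(session_id: int, input_ids: List[int]) -> List[int]:
    engine = ToyEngine()
    last: List[int] = []
    async for step in engine.async_stream_infer(session_id, input_ids):
        last = step  # the final yield holds the complete generation
    return last


print(asyncio.run(run(1, [10, 20, 30])))  # → [10, 20, 30]
```

The `session_id` keys the request to its KV-cache state inside the engine, which is what allows multi-turn sessions to reuse previously computed context.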
Apache License 2.0 | Copyright © 2022 by xueliang.wu 苏ICP备15016087号