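Protobuf (Protocol Buffers) is a compact, efficient binary serialization format developed by Google, commonly used for service-to-service communication and data storage. You describe your data structures in a .proto file, generate code for your target language (Python, Go, C++, and others), and then read and write structured data directly through that generated code.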

The following example defines user.proto:

syntax = "proto3";

package example;

message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  repeated string tags = 4;
}

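To generate the Python code, run the command below. The gen/ directory must already exist, because protoc does not create the output directory for you: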

protoc --python_out=./gen user.proto

After it runs, user_pb2.py appears in the gen/ directory:

❯ lsd --tree --depth 2 gen
gen
└── user_pb2.py

This file is where the User class gets defined. Its contents look roughly like this:

# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# NO CHECKED-IN PROTOBUF GENCODE
# source: user.proto
# Protobuf Python Version: 6.33.0
"""Generated protocol buffer code."""
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import runtime_version as _runtime_version
from google.protobuf import symbol_database as _symbol_database
from google.protobuf.internal import builder as _builder
_runtime_version.ValidateProtobufRuntimeVersion(
    _runtime_version.Domain.PUBLIC,
    6,
    33,
    0,
    '',
    'user.proto'
)
# @@protoc_insertion_point(imports)

_sym_db = _symbol_database.Default()




DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\nuser.proto\x12\x07\x65xample\"=\n\x04User\x12\n\n\x02id\x18\x01 \x01(\x05\x12\x0c\n\x04name\x18\x02 \x01(\t\x12\r\n\x05\x65mail\x18\x03 \x01(\t\x12\x0c\n\x04tags\x18\x04 \x03(\tb\x06proto3')

_globals = globals()
_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, _globals)
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'user_pb2', _globals)
if not _descriptor._USE_C_DESCRIPTORS:
  DESCRIPTOR._loaded_options = None
  _globals['_USER']._serialized_start=23
  _globals['_USER']._serialized_end=84
# @@protoc_insertion_point(module_scope)

Example usage:

from gen import user_pb2

user = user_pb2.User()
user.id = 1234
user.name = "Neo"
user.email = "neo@example.com"  # placeholder address
user.tags.extend(["admin", "tester"])

print(user.SerializeToString())
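
To make the "binary format" concrete, here is a minimal sketch that hand-encodes the same User message using only the standard library. It follows the documented wire format (a varint tag of `(field_number << 3) | wire_type`, followed by the value), but the function names are my own, the email address is a placeholder, and it only handles non-negative integers and strings. The protobuf runtime does all of this for you; the sketch is only meant to show what ends up in the bytes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field tag is the field number and wire type packed into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Strings use wire type 2: tag, byte length, then the UTF-8 payload."""
    payload = text.encode("utf-8")
    return encode_tag(field_number, 2) + encode_varint(len(payload)) + payload


def encode_user(user_id: int, name: str, email: str, tags: list) -> bytes:
    out = bytearray()
    if user_id:  # proto3 omits fields that still hold their default value
        out += encode_tag(1, 0) + encode_varint(user_id)
    if name:
        out += encode_string_field(2, name)
    if email:
        out += encode_string_field(3, email)
    for tag in tags:  # a repeated string is written as one entry per element
        out += encode_string_field(4, tag)
    return bytes(out)


encoded = encode_user(1234, "Neo", "neo@example.com", ["admin", "tester"])
print(encoded.hex(" "))
```

The first three bytes are `08 d2 09`: tag `08` means field 1 with wire type 0 (varint), and `d2 09` is 1234 as a varint. Field names never appear in the output, which is a large part of why the format is compact.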

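There is a catch, though: Protobuf builds the User class dynamically at runtime, so static analyzers cannot infer its structure. As a result, type hints and autocompletion in tools such as VS Code work poorly.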

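
The effect is easy to reproduce without Protobuf at all. The sketch below is a simplified analogy, not the actual protobuf implementation: `FIELDS` and `build_message_class` are hypothetical names, and the field table stands in for the serialized descriptor. What it shares with the generated module is that the class is created by a function call and injected into `globals()`, so the source file never contains a `class User` statement for an analyzer to read.

```python
# Hypothetical field table standing in for the serialized descriptor.
FIELDS = {"id": 0, "name": "", "email": ""}


def build_message_class(class_name: str, fields: dict) -> type:
    """Create a class at runtime with one attribute per field."""

    def __init__(self, **kwargs):
        for field, default in fields.items():
            setattr(self, field, kwargs.get(field, default))

    namespace = {"__init__": __init__, "__slots__": tuple(fields)}
    return type(class_name, (), namespace)


# The name "User" only comes into existence when this line runs,
# so a tool that reads the file without executing it knows nothing about it.
globals()["User"] = build_message_class("User", FIELDS)

user = globals()["User"](id=1234, name="Neo")
print(user.id, user.name, repr(user.email))
```

At runtime everything works, but an editor has no way to know that `user.id` exists or that it is an integer.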

The fix is to emit .pyi stub files alongside the Python code. All it takes is an extra --pyi_out flag:

protoc --python_out=./gen --pyi_out=./gen user.proto

The directory now looks like this:

❯ lsd --tree --depth 2 gen
gen
├── user_pb2.py
└── user_pb2.pyi

The generated user_pb2.pyi declares the type information, so the editor can now offer complete type hints and autocompletion.
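
Since the stub is only useful while it matches the .proto file, it may be worth scripting the regeneration so that both outputs are always produced together. The following is one possible sketch: `build_protoc_command` and `generate` are hypothetical helper names, and it assumes `protoc` is on your PATH. The only flags it uses are the two shown above.

```python
import subprocess
from pathlib import Path


def build_protoc_command(proto_file: str, out_dir: str, with_stubs: bool = True) -> list:
    """Assemble the protoc invocation as an argument list."""
    command = ["protoc", f"--python_out={out_dir}"]
    if with_stubs:
        command.append(f"--pyi_out={out_dir}")  # also emit the .pyi stub
    command.append(proto_file)
    return command


def generate(proto_file: str, out_dir: str) -> None:
    """Create the output directory if needed, then run protoc."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_protoc_command(proto_file, out_dir), check=True)


# Print the command instead of running it, so this works without protoc installed.
print(" ".join(build_protoc_command("user.proto", "./gen")))
```

Calling `generate("user.proto", "./gen")` from a build step or pre-commit hook would then refresh user_pb2.py and user_pb2.pyi in one go.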