Protobuf Benchmarks by Zserio is an independent benchmark which uses the `zserio-benchmarks-datasets` repository to compare the performance of Google's Protocol Buffers to Zserio on the same sets of data.
Google's Protocol Buffers are very popular and in widespread use. One of the many questions we always have to answer is: "Why don't you use Protobuf? It is already there."
The fact is that it wasn't open sourced when we would have needed it. Maybe we would have used it back then. But even today we think we came up with something more tailored to our needs. This is also the reason why we open sourced Zserio after such a long time.
So let's see how Zserio performs in comparison to Protobuf. To be fair, we have also chosen the example that is used on Google's Protobuf documentation page (addressbook). This example does not really help to promote a binary - and thus smaller - representation of data, because it mostly uses strings.
Make sure you have the following prerequisites installed:
- Protocol Buffers Compiler
- CMake
- ZIP utility
- Supported Compiler (gcc, clang, mingw, msvc)
Also do not forget to fetch the datasets with `git submodule update --init`.
Now you are ready to run the `benchmark.sh` script, which accepts the required platform as a parameter (e.g. `cpp-linux64-gcc`):
scripts/benchmark.sh <PLATFORM>
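For example, to run the C++ benchmarks on 64-bit Linux with gcc (the platform used for the numbers below), the call would be:

```
scripts/benchmark.sh cpp-linux64-gcc
```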
The `benchmark.sh` script automatically generates a simple performance test for each benchmark. The performance test uses the generated Protocol Buffers API to read the appropriate dataset from JSON format, serialize it into the Protocol Buffers binary format and then read it again. Both the reading time and the BLOB size are reported. The BLOB size after zip compression is reported as well.
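Conceptually, the generated test for the addressbook benchmark boils down to something like the following sketch. This is only an illustration: the message type `tutorial::AddressBook`, the file name and the timing/reporting details are assumptions, not the actual generated code.

```cpp
#include <chrono>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

#include <google/protobuf/util/json_util.h>

#include "addressbook.pb.h"  // assumed name of the generated Protobuf header

int main()
{
    // Read the JSON dataset (file name assumed for this sketch).
    std::ifstream jsonFile("addressbook.json");
    std::stringstream jsonStream;
    jsonStream << jsonFile.rdbuf();

    // Parse the JSON into the generated top level message (name assumed).
    tutorial::AddressBook addressBook;
    (void)google::protobuf::util::JsonStringToMessage(jsonStream.str(), &addressBook);

    // Serialize into the Protocol Buffers binary format - this is the reported BLOB.
    std::string blob;
    addressBook.SerializeToString(&blob);

    // Read the BLOB again and measure the reading time.
    const auto start = std::chrono::steady_clock::now();
    tutorial::AddressBook readBack;
    readBack.ParseFromString(blob);
    const auto stop = std::chrono::steady_clock::now();

    std::cout << "Time: "
              << std::chrono::duration<double, std::milli>(stop - start).count() << "ms, "
              << "Blob Size: " << blob.size() << "B" << std::endl;
    return 0;
}
```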
- Used platform: 64-bit Linux Mint 21.1, Intel(R) Core(TM) i7-9850H CPU @ 2.60GHz
- Used compiler: gcc 11.3.0
Protocol Buffers results:

Benchmark | Dataset | Target | Time | Blob Size | Zip Size |
---|---|---|---|---|---|
addressbook.proto | addressbook.json | C++ (linux64-gcc) | 1.731ms | 356.292kB | 193kB |
apollo.proto | apollo.proto.json | C++ (linux64-gcc) | 0.641ms | 286.863kB | 136kB |
carsales.proto | carsales.json | C++ (linux64-gcc) | 2.053ms | 399.779kB | 242kB |
simpletrace.proto | prague-groebenzell.json | C++ (linux64-gcc) | 0.386ms | 113.152kB | 54kB |
Zserio results:

Benchmark | Dataset | Target | Time | Blob Size | Zip Size |
---|---|---|---|---|---|
addressbook.zs | addressbook.json | C++ (linux64-gcc) | 1.478ms | 305.838kB | 222kB |
addressbook_align.zs | addressbook.json | C++ (linux64-gcc) | 0.844ms | 311.424kB | 177kB |
apollo.zs | apollo.zs.json | C++ (linux64-gcc) | 0.244ms | 226.507kB | 144kB |
carsales.zs | carsales.json | C++ (linux64-gcc) | 1.374ms | 280.340kB | 259kB |
carsales_align.zs | carsales.json | C++ (linux64-gcc) | 0.925ms | 295.965kB | 205kB |
simpletrace.zs | prague-groebenzell.json | C++ (linux64-gcc) | 0.221ms | 87.042kB | 66kB |
To be fair, it is necessary to note that Protobuf encodes more information, which is used for encoder/decoder compatibility when the proto file is changed:
- Protobuf encodes each field ID (i.e. the `= 1`, `= 2` in the following message example) to preserve compatibility when new fields are added or existing fields are reordered in messages:
  `message Road { int32 id = 1; string name = 2; }`
  These IDs have an encoding cost, which zserio does not pay (see the tag sketch after this list). In zserio, it would merely be:
  `struct Road { int32 id; string name; };`
- Protobuf also encodes the size of length-delimited fields (strings, bytes, nested messages), so that old decoders can skip fields they do not know about. This is useful for forward/backward compatibility, but it has a cost which zserio does not pay.
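To make the field ID cost from the first point concrete, here is a small stand-alone sketch of how a Protobuf field tag is formed on the wire. The helper function is purely illustrative and not part of any generated API:

```cpp
#include <cstdint>
#include <iostream>

// A Protobuf field tag is a varint of (field_number << 3) | wire_type,
// so every field that is present costs at least one extra byte on the wire.
uint32_t makeTag(uint32_t fieldNumber, uint32_t wireType)
{
    return (fieldNumber << 3) | wireType;
}

int main()
{
    const uint32_t WIRETYPE_VARINT = 0;           // e.g. int32 id = 1
    const uint32_t WIRETYPE_LENGTH_DELIMITED = 2; // e.g. string name = 2

    // For "message Road { int32 id = 1; string name = 2; }" each encoded Road
    // pays one tag byte per present field before any payload bytes.
    std::cout << "tag(id)   = " << makeTag(1, WIRETYPE_VARINT) << "\n";
    std::cout << "tag(name) = " << makeTag(2, WIRETYPE_LENGTH_DELIMITED) << "\n";
    return 0;
}
```

Field numbers 1 through 15 fit into a single tag byte, so each present field in `Road` costs at least one extra byte compared to the zserio `struct Road`, which writes the field values back to back without any tags.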
On the other hand, the zserio encoding is more compact:
- Zserio can have fields of arbitrary bit size, not byte aligned, unlike Protobuf which has fewer possible types, all byte aligned. Structures (messages) are not byte aligned in general either (although explicit alignment is possible, e.g. `align(8)`).
- Zserio has constraint expressions to indicate whether a field is encoded or not, based on previously decoded information. The constraint expression has zero cost in encoding size since it is only present in the generated encoding/decoding code. In the following example, the box field is only encoded if the expression following it is true, based on previously decoded info, which helps compactness (see the sketch after this list):
  `struct Foo { int8 type; BoundingBox box if type == 1; };`
  When the box is not encoded, the encoding size of such a structure in zserio is only 1 byte (which may not even be byte aligned) for the type field.
- Arrays in zserio do not need to encode the size of the array. It is known from the generated encoding/decoding code, even for arrays of variable size as in the following example. In particular, when the size is zero, the array has zero encoding cost:
  `struct Foo { int8 num_items; Items list[num_items]; };`
  In Protobuf, this would be a repeated field, and a repeated field always has an encoding cost for its length, as for every other field in Protobuf, so that it can be skipped:
  `message Foo { int32 num_items = 1; repeated Items list = 2; }`
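To illustrate why the conditional field and the variable-size array cost nothing when absent or empty, the bit-size computation that zserio's generated C++ code performs can be sketched by hand roughly as follows. The type names, the `BoundingBox` layout and the element type of `Items` are assumptions; this is not the real generated API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hand-written mirrors of the two zserio Foo examples above
// (names and the BoundingBox layout are assumptions for this sketch).
struct BoundingBox { int32_t minX; int32_t minY; int32_t maxX; int32_t maxY; };

struct FooWithBox  { int8_t type; BoundingBox box; /* encoded only if type == 1 */ };
struct FooWithList { int8_t numItems; std::vector<int8_t> list; };

// Conceptual bit size of "BoundingBox box if type == 1":
// when the condition is false, the box contributes nothing to the encoding.
size_t bitSizeOf(const FooWithBox& foo)
{
    size_t bits = 8;        // int8 type
    if (foo.type == 1)
        bits += 4 * 32;     // assumed BoundingBox payload, no tag bytes
    return bits;
}

// Conceptual bit size of "Items list[num_items]":
// the array length is taken from num_items, so no length prefix is encoded
// and an empty array costs zero bits.
size_t bitSizeOf(const FooWithList& foo)
{
    size_t bits = 8;                // int8 num_items
    bits += 8 * foo.list.size();    // one element after another, Items assumed to be int8 here
    return bits;
}
```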
To add a new benchmark:

- Add a new dataset (e.g. `new_benchmark`) in JSON format into the datasets repository
- Add a new schema (e.g. `new_benchmark`) in Protobuf format into the benchmarks directory
- Make sure that the first message in the schema file is the top level message
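As a purely hypothetical example, a `new_benchmark.proto` schema could start like this; the message and field names are placeholders, and the only requirement taken from the steps above is that the top level message comes first:

```proto
syntax = "proto3";

package new_benchmark;

// The first message in the file is taken as the top level message.
message NewBenchmark {
  repeated Record records = 1;
}

message Record {
  int32 id = 1;
  string name = 2;
}
```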