Parquet is a columnar storage format, which is usually much faster for queries that only touch a few fields. With protobuf you typically deserialize row by row, and that has a performance cost: you have to deserialize the whole message and can't pull out just the field you want.
So if you want to query a giant collection of protobufs, you end up reading and deserializing every record. With Parquet, you get much closer to reading only what you need.
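To make the difference concrete, here is a rough toy sketch in plain Python. It is not real protobuf or Parquet: the row layout uses length-prefixed JSON blobs as a stand-in for serialized messages, the column layout is just one packed blob per field, and the record fields (`id`, `name`, `score`) are made up for illustration. Real Parquet also adds row groups, encodings, and compression, which this ignores. The point is only how many bytes each layout has to decode to sum a single field.

```python
import json
import struct

# Hypothetical records; the field names are assumptions for illustration only.
records = [{"id": i, "name": f"user{i}", "score": i * 1.5} for i in range(1000)]


def write_rows(rows):
    """Row layout: one length-prefixed blob per record (stand-in for protobuf messages)."""
    out = bytearray()
    for row in rows:
        blob = json.dumps(row).encode()
        out += struct.pack("<I", len(blob)) + blob
    return bytes(out)


def sum_scores_rows(data):
    """Decode every whole record just to read one field. Returns (total, bytes_decoded)."""
    total, decoded, pos = 0.0, 0, 0
    while pos < len(data):
        (n,) = struct.unpack_from("<I", data, pos)
        pos += 4
        row = json.loads(data[pos:pos + n])  # the whole record is parsed
        pos += n
        decoded += n
        total += row["score"]
    return total, decoded


def write_columns(rows):
    """Column layout: one contiguous blob per field."""
    count = len(rows)
    return {
        "id": struct.pack(f"<{count}q", *(r["id"] for r in rows)),
        "name": json.dumps([r["name"] for r in rows]).encode(),
        "score": struct.pack(f"<{count}d", *(r["score"] for r in rows)),
    }


def sum_scores_columns(cols):
    """Decode only the 'score' column. Returns (total, bytes_decoded)."""
    blob = cols["score"]
    values = struct.unpack(f"<{len(blob) // 8}d", blob)
    return sum(values), len(blob)


row_total, row_bytes = sum_scores_rows(write_rows(records))
col_total, col_bytes = sum_scores_columns(write_columns(records))
print(f"row layout:    total={row_total}, bytes decoded={row_bytes}")
print(f"column layout: total={col_total}, bytes decoded={col_bytes}")
```

Both paths compute the same sum, but the column path never touches the `id` or `name` data, which is the effect you get from Parquet when a query only needs a few columns.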