Week 2 & 3 — Taking first steps… Understanding and writing protobuf for the range diff RPC
In my previous blog post, I shared my plan to begin working on the .proto file for the git range-diff command. Over the course of these 2 week, I dedicated my time to gaining a comprehensive understanding of protocol buffers and the process of writing a .proto file. Furthermore, I explored the intriguing realm of Gitaly and how it leverages the power of protocol buffers.
What are Protocol Buffer?
Protocol Buffers (or Protobuf) is a very amazing way of organising and sharing data between different programs and languages. Think of them as a universal translator for data.
At it’s core, a Protocol Buffer is a schema that defines the structure of the data. This schema is defined in .proto file, where we can specify the data fields and their types. The magic happens when we use Protocol Buffer compiler, which takes the .proto file as input and generates code in your desired programming language. This generated code provides convenient APIs for working with the structured data. So, whether we’re using JavaScript, Go or any other supported language, we can generate the corresponding code and effortlessly work with the serialized data.
Let’s take an easy example of using Protocol Buffers (protobuf) with a small .proto file and generate the corresponding code in Golang
| 1 | syntax = "proto3"; | 
- To start, I have created a file called - person.proto. This file define the structure of the data we want to serialize.
- To generate code from the - person.protofile we have to download Protobuf Compiler (- protoc). We can download it from here accordingly https://github.com/protocolbuffers/protobuf.
- Since we want to generate code in Golang we should use this compiler https://github.com/protocolbuffers/protobuf-go and run the following commands - 1 - protoc --go_out=. person.proto - So this command tells - protocto generate Golang code (- - - go_out) and output it in the current directory (- .).
- After generating the code, we’ll find a - person.pb.gofile in the current directory. The file is a Golang source code file that contains the generated code for working with the Protocol Buffers data defined in the- person.protofile.
Gitaly‘s back story
As GitLab’s infrastructure scaled and the number of users accessing the .git directory grew, the company encountered several challenges. Initially, they implemented horizontal scaling by deploying multiple server instances behind a load balancer. However, this approach eventually became a bottleneck. Another problem arose from the shared access to the critical .git directory, which caused potential downtime if the hosting server failed and led to performance issues due to high read/write activity. To tackle these issues, GitLab introduced Gitaly. This innovative solution involved creating a fail-safe and optimized layer around the .git directory. Instead of direct interaction, Gitaly acted as a server-side proxy, managing all requests and performing operations on the server-side before transmitting the results over the network. By centralizing operations and optimizing server-side processes, Gitaly significantly improved GitLab’s performance and efficiency.
Current Progress 🏗️
- I have started working on the - rangediff.protofile. So my initial set of tasks is to compare two versions of a merge request using- git range-diff. When implementing this functionality as an RPC, we need to define the necessary parameters to perform the comparison. These parameters will include:- start_oid_old: Starting commit OID of the old version.
- head_oid_old: Latest commit OID of the old version.
- start_oid_new: Starting commit OID of the new version.
- head_oid_new: Latest commit OID of the new version.
 - By obtaining these commit OIDs for both versions, we can pass them as parameters to the RPC implementation, specifically to the - git range-diffcommand. This command is responsible for executing the comparison between the two versions and generating a comprehensive report detailing the changes made. The report includes information on- added,- removed,- modified, and- unmodifiedlines, providing valuable insights into the differences between the two versions of the merge request.- Here is the link to the MR: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/5918 
- How Gitaly compiles the proto file to generate corresponding golang code. - Thanks to ZheNing for asking me to go through it! Gitaly compiles proto files to generate corresponding Golang code using the protoc compiler. The Makefile in the Gitaly repository provides the necessary instructions and tools for this process. - Let’s focus on the part related to compiling the proto file and generating the corresponding Go code. - To generate Go code from a proto file, Gitaly uses the - protoccompiler along with the- protoc-gen-goand- protoc-gen-grpcplugins. Here is a code snippet of Makefile:- 1 
 2
 3
 4
 5
 6
 7
 8
 9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29- # Directories 
 SOURCE_DIR := $(abspath $(dir $(lastword ${MAKEFILE_LIST})))
 PROTO_DEST_DIR := ${SOURCE_DIR}/proto/go
 # Tools
 PROTOC := ${TOOLS_DIR}/protoc
 PROTOC_GEN_GO := ${TOOLS_DIR}/protoc-gen-go
 PROTOC_GEN_GO_GRPC := ${TOOLS_DIR}/protoc-gen-go-grpc
 # protoc target
 PROTOC_VERSION ?= v23.1
 PROTOC_BUILD_DIR ?= ${DEPENDENCY_DIR}/protobuf/build
 PROTOC_INSTALL_DIR ?= ${DEPENDENCY_DIR}/protobuf/install
 # ...
 # protoc installation
 ${PROTOC_INSTALL_DIR}/.installed:
 # Download, build, and install protoc
 # ...
 # Generate Go code from proto files
 ${PROTO_DEST_DIR}/%.pb.go: ${SOURCE_DIR}/proto/%.proto ${PROTOC_INSTALL_DIR}/.installed
 @mkdir -p $(@D)
 ${PROTOC} \
 --proto_path=${SOURCE_DIR}/proto \
 --go_out=paths=source_relative:${PROTO_DEST_DIR} \
 --go-grpc_out=paths=source_relative:${PROTO_DEST_DIR} \
 $<- Here’s how the process works: - The root directory of the Gitaly source code is identified as the source directory (SOURCE_DIR)
- The Go code is generated and placed in the proto/godirectory within the source directory.
- The necessary tools and plugins for the compilation process are specified, which includes protoc(the protobuf compiler) as well as theprotoc-gen-goandprotoc-gen-go-grpcplugins.
- The target for generating Go code from a proto file is defined using a pattern rule. It specifies that for any .protofile in theprotodirectory, a corresponding.pb.gofile should be generated in thePROTO_DEST_DIR.
- The PROTOCcommand is executed with the appropriate arguments to generate the Go code. The--proto_pathflag specifies the directory where the.protofile is located, and the--go_outand--go-grpc_outflags specify the output directories and options for the generated Go code.
 - To trigger the generation of Go code from proto files, we can run the following command in the Gitaly source directory: - 1 - make proto - This will invoke the - prototarget defined in the Makefile, which will compile the proto files and generate the corresponding Go code. The generated Go files will be placed in the- proto/godirectory.- Additionally, the - Makefileincludes other build and test configurations, such as Go package tests, test coverage reports, and build options for dependencies like- Gitand- libgit2.
- The root directory of the Gitaly source code is identified as the source directory (
Next step
- Work on the merge request for adding rangediff.protoin Gitaly.
Till next time,
Siddharth 🖖🏻