Week 4 & 5 — Exploring Rangediff Proto, Feature Flags, and Mailmap Changes!

In my previous blog post, I outlined my plan to work on the .proto file for the git range-diff command. During weeks 4 and 5 of my GSoC project, my progress was somewhat limited by a series of back-to-back interviews, but I still made headway in a few key areas: enhancing the mailmap functionality, improving test coverage, introducing a feature flag to control when the new behaviour is enabled, and refining the structure of the proto file.

Rangediff

The rangediff.proto file defines the messages and the service needed for performing a range-diff comparison between different versions of a Merge Request. The RangeDiffRequest message lets the caller specify the repository and the commit object IDs (OIDs) for the old and new versions.

message RangeDiffRequest {
  // This comment is left unintentionally blank.
  Repository repository = 1 [(target_repository) = true];
  // Base commit OID of the old version.
  string base_oid_old = 2;
  // Starting commit OID of the old version.
  string start_oid_old = 3;
  // Base commit OID of the new version.
  string base_oid_new = 4;
  // Starting commit OID of the new version.
  string start_oid_new = 5;
}

These parameters serve as the foundation for comparing the specified commit ranges. The old and new versions may have different base commits, for example after a rebase. In such cases, it is necessary to use the git range-diff <base_oid_old>..<start_oid_old> <base_oid_new>..<start_oid_new> syntax so that each range carries its own base commit. This allows a more precise comparison that takes into account the changes from each base commit to its corresponding head commit.
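
To make the mapping from the request fields to the command line concrete, here is a minimal Go sketch. The function name rangeDiffArgs and the placeholder OIDs are my own for illustration and are not part of Gitaly; the sketch only assembles the arguments and does not run Git.

```go
package main

import (
	"fmt"
	"strings"
)

// rangeDiffArgs is a hypothetical helper that turns the four OIDs from a
// RangeDiffRequest into the arguments for
// `git range-diff <base_old>..<start_old> <base_new>..<start_new>`.
func rangeDiffArgs(baseOld, startOld, baseNew, startNew string) []string {
	return []string{
		"range-diff",
		baseOld + ".." + startOld, // commit range of the old version
		baseNew + ".." + startNew, // commit range of the new version
	}
}

func main() {
	// Placeholder OIDs, used only to show the shape of the command.
	args := rangeDiffArgs("aaa111", "bbb222", "ccc333", "ddd444")
	fmt.Println("git " + strings.Join(args, " "))
}
```

Keeping the two ranges as separate arguments is what lets each version carry its own base commit.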

RangeDiffResponse is the message returned by the RPC. It describes a pair of old and new commits being compared, how they relate to each other, and the diff lines between them.

message RangeDiffResponse {
  // The old commit.
  Commit old_commit = 1;
  // The new commit.
  Commit new_commit = 2;
  // The comparison relationship between the old and new commits.
  enum comparison {
    // Indicates that the content remains unmodified between the old and new versions (=).
    UNMODIFIED = 0;
    // Indicates that the content has been modified between the old and new versions (!).
    MODIFIED = 1;
    // Indicates that the content was removed in the new version (<).
    REMOVED = 2;
    // Indicates that the content was added in the new version (>).
    ADDED = 3;
  }
  // Diff lines between the old and new commits.
  bytes diff_lines = 4;
}
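
The symbols in the enum comments (=, !, <, >) are the markers that git range-diff prints between the old and new commit. As a rough sketch of how the implementation might translate them, the Go snippet below maps a marker to a comparison value. The Comparison type and comparisonFromSymbol are hypothetical names that only mirror the enum above; they are not the generated protobuf code.

```go
package main

import "fmt"

// Comparison is a hypothetical stand-in for the proto enum above.
type Comparison int

const (
	Unmodified Comparison = iota // "=": unchanged between old and new
	Modified                     // "!": changed between old and new
	Removed                      // "<": only present in the old version
	Added                        // ">": only present in the new version
)

// comparisonFromSymbol maps a range-diff marker to a Comparison value and
// returns an error for anything it does not recognise.
func comparisonFromSymbol(symbol string) (Comparison, error) {
	switch symbol {
	case "=":
		return Unmodified, nil
	case "!":
		return Modified, nil
	case "<":
		return Removed, nil
	case ">":
		return Added, nil
	default:
		return 0, fmt.Errorf("unknown range-diff symbol %q", symbol)
	}
}

func main() {
	for _, s := range []string{"=", "!", "<", ">"} {
		c, err := comparisonFromSymbol(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %d\n", s, c)
	}
}
```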

To put these concepts into action, we’ve introduced the RangeDiffService, which hosts the RPC method called CompareRanges:

// Service for range difference comparison
service RangeDiffService {
  // RPC method to compare ranges and retrieve the difference
  rpc CompareRanges(RangeDiffRequest) returns (stream RangeDiffResponse) {
    option (op_type) = {
      op: ACCESSOR
    };
  }
}

The CompareRanges method accepts a RangeDiffRequest as a parameter and returns a stream of RangeDiffResponse. Through this implementation, we can efficiently compare the specified commit ranges and retrieve comprehensive information about the range differences.

Next step

  • The next step is to start implementing the RPC and to keep improving the .proto file.

Mailmap changes under Feature Flag!

The git cat-file command is used to retrieve information about Git objects, such as commits and files. We wanted this command to take into account a Git feature called mailmap, which maps the names and email addresses recorded in commits to their canonical versions.

To incorporate the mailmap feature into the git cat-file command, we made changes to a few files so that the --use-mailmap flag is passed whenever git cat-file is executed. This flag tells Git to apply the mailmap mappings and report the canonical information about the commit authors.
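
To give an intuition for what applying the mailmap means, here is a simplified Go sketch. It is not how Git or Gitaly implements it: it only understands entries of the form "Proper Name <proper@email> <commit@email>", while real .mailmap files support more forms. The identity type, parseMailmap, applyMailmap, and the example addresses are all made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// identity is a hypothetical name/email pair as recorded in a commit.
type identity struct {
	Name  string
	Email string
}

// parseMailmap reads entries of the form
// "Proper Name <proper@email> <commit@email>" and indexes them by the
// lower-cased commit email. Other mailmap forms are skipped.
func parseMailmap(contents string) map[string]identity {
	mapping := make(map[string]identity)
	for _, line := range strings.Split(contents, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // blank line or comment
		}
		open, end := strings.Index(line, "<"), strings.Index(line, ">")
		if open < 0 || end < open {
			continue
		}
		rest := line[end+1:]
		open2, end2 := strings.Index(rest, "<"), strings.Index(rest, ">")
		if open2 < 0 || end2 < open2 || strings.TrimSpace(rest[:open2]) != "" {
			continue // not the single form this sketch supports
		}
		commitEmail := strings.ToLower(rest[open2+1 : end2])
		mapping[commitEmail] = identity{
			Name:  strings.TrimSpace(line[:open]),
			Email: line[open+1 : end],
		}
	}
	return mapping
}

// applyMailmap returns the canonical identity for an author, or the author
// unchanged when no entry matches the commit email.
func applyMailmap(mapping map[string]identity, author identity) identity {
	mapped, ok := mapping[strings.ToLower(author.Email)]
	if !ok {
		return author
	}
	if mapped.Name == "" {
		mapped.Name = author.Name // entry only replaces the email
	}
	return mapped
}

func main() {
	mapping := parseMailmap("Jane Doe <jane@example.com> <jd@old.example.com>\n")
	fmt.Println(applyMailmap(mapping, identity{Name: "jd", Email: "jd@old.example.com"}))
}
```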

John suggested that the mailmap changes should be implemented behind a feature flag. He referenced a Merge Request as an example, which demonstrates how to add a feature flag, use it, and test it. By putting the --use-mailmap flag behind a feature flag, we ensure that the mailmap feature is only enabled when desired. This gives us more control over when and how the mailmap functionality is applied. It’s like having a switch that we can turn on or off based on our needs.

In the catfile_info.go file, we modified the command execution by adding the --use-mailmap flag to the git cat-file command options when the feature flag is enabled:

options := []git.Option{
	batchCheckOption,
	git.Flag{Name: "--batch-all-objects"},
	git.Flag{Name: "--buffer"},
	git.Flag{Name: "--unordered"},
}
if featureflag.MailmapOptions.IsEnabled(ctx) {
	options = append([]git.Option{git.Flag{Name: "--use-mailmap"}}, options...)
}

// ...

cmd, err := repo.Exec(ctx, git.Command{
	Name:  "cat-file",
	Flags: options,
}, git.WithStderr(&stderr))

Similarly, in the object_content_reader.go and object_info_reader.go files, we added the --use-mailmap flag to the git cat-file command options:

if featureflag.MailmapOptions.IsEnabled(ctx) {
	flags = append([]git.Option{git.Flag{Name: "--use-mailmap"}}, flags...)
}

// ...

batchCmd, err := repo.Exec(ctx, git.Command{
	Name:  "cat-file",
	Flags: flags,
}, git.WithSetupStdin())

By including the --use-mailmap flag under the MailmapOptions feature flag check, we ensure that the mailmap feature is only enabled when the corresponding feature flag is set. This provides flexibility and control over when the mailmap functionality is applied during the execution of the git cat-file command.
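
The gating pattern itself can be shown in a small, self-contained Go sketch. This is not Gitaly's featureflag package: here the flag is simply stored as a boolean in the context, and withMailmapFlag, mailmapEnabled, catFileArgs, and demoArgs are hypothetical names used only to illustrate the idea.

```go
package main

import (
	"context"
	"fmt"
)

// flagKey is a private context key for the hypothetical mailmap flag.
type flagKey struct{}

// withMailmapFlag returns a context that records whether the flag is on.
func withMailmapFlag(ctx context.Context, enabled bool) context.Context {
	return context.WithValue(ctx, flagKey{}, enabled)
}

// mailmapEnabled reports whether the flag is set; it defaults to false.
func mailmapEnabled(ctx context.Context) bool {
	enabled, _ := ctx.Value(flagKey{}).(bool)
	return enabled
}

// catFileArgs prepends --use-mailmap only when the flag is enabled.
func catFileArgs(ctx context.Context, args []string) []string {
	if mailmapEnabled(ctx) {
		return append([]string{"--use-mailmap"}, args...)
	}
	return args
}

// demoArgs builds a context with the given flag state and returns the
// resulting cat-file arguments.
func demoArgs(enabled bool) []string {
	ctx := withMailmapFlag(context.Background(), enabled)
	return catFileArgs(ctx, []string{"--batch", "--buffer"})
}

func main() {
	fmt.Println(demoArgs(false)) // [--batch --buffer]
	fmt.Println(demoArgs(true))  // [--use-mailmap --batch --buffer]
}
```

Because the check happens at the point where the arguments are built, turning the flag off restores the previous behaviour without any code change.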

Adding tests for mailmap changes!

As part of the development process, I wrote a test for the mailmap changes to validate the behaviour of the object content reader when reading objects from a Git repository with a mailmap file. Let’s dive into the details of the test.

First, I set up a test repository and created a commit on the “main” branch with a .mailmap file containing specific email address mappings. Here’s the code snippet from object_content_reader_test.go:

commitID := gittest.WriteCommit(t, cfg, repoPath,
	gittest.WithTreeEntries(
		gittest.TreeEntry{Path: ".mailmap", Mode: "100644", Content: mailmapContents},
	),
	gittest.WithBranch("main"),
)

Next, using the object content reader, I retrieved the contents of an object from the “main” branch. Here’s the code snippet:

reader, err := newObjectContentReader(ctx, newRepoExecutor(t, cfg, repoProto), nil)
require.NoError(t, err)

object, err := reader.Object(ctx, "refs/heads/main")
require.NoError(t, err)

To validate the correctness of the object content reader, I compared the retrieved contents with the expected results obtained by executing a Git command with the --use-mailmap option on the commit. This ensured that the reader properly applied the mailmap mappings. Here’s the code snippet:

commitContentsWithMailmap := gittest.Exec(t, cfg, "-C", repoPath, "cat-file",
	"--use-mailmap", "-p", commitID.String())

data, err := io.ReadAll(object)
require.NoError(t, err)
require.Equal(t, commitContentsWithMailmap, data)

This test ensured that the object content reader accurately handles mailmap mappings, providing reliable and consistent commit authorship information.

Here is the link to the MR: https://gitlab.com/gitlab-org/gitaly/-/merge_requests/4822

So yeah, that was weeks 4 & 5. Thanks a lot for reading 🙂

Till next time,

Siddharth 🖖🏻