The other correction I would make is that this post does not mention Nilton Volpato, who had written an earlier Buildifier and graciously accepted replacing his implementation with a new one and then taking over ownership for that new implementation as well. (Eventually ownership moved to Laurent's team.)
It looks like it was just under 2,000 commits. We did pretty extensive testing, by having Blaze load a BUILD file and its transitive closure and then dump that parsed form back out to a binary format. Any automated commit had to preserve that parsed-and-dumped binary format bit for bit. The slowest part of the testing was waiting for Blaze to do all the loads.
Every day I would prepare and test as many files as I could, break them into CLs (think PRs), mail Rob a shell script he could run to approve them all, and go to bed. Then I'd get up early in the morning (5am ET) to submit the changes, because there were various cached indexes that got updated when BUILD files got submitted, and it seemed better to send them when not many people would be working.
That scheme worked until a system did fall over and someone got paged, and then after that I agreed to only submit the large changes during business hours. :-)
Rosie existed but very much wanted to break up the CL into independent per-directory CLs, and since I was editing one file in every directory in the entire tree, that would have been 200,000 independent CLs. I broke the list up by top-level directory or sub-directory and hit 100+ directories at a time.
Rosie also really wants to run each affected directory's tests, and I did not, because at scale flaky tests and such would be a significant source of false positives. The bit-for-bit check on the internal parsed representation of the meaning of the BUILD file proved that the changes were no-ops. That was better than any tests of the code in the directory.
I was already automating everything else, including deciding which files to change, reverting edits in files that were concurrently modified (they got swept into the next attempt), and the testing. Running a shell command to actually make the CLs was not difficult. And generating the approval script was trivial too.
Rosie is great but it wasn't the right tool for this job.
Reminds me of all the impenetrable jargon around me when I was new
I also found this related quote from Russ Cox intriguing: "Most people think that we format Go code with gofmt to make code look nicer or to end debates among team members about program layout. But the most important reason for gofmt is that if an algorithm defines how Go source code is formatted, then programs, like goimports or gorename or go fix, can edit the source code more easily, without introducing spurious formatting changes when writing the code back. This helps you maintain code over time."
One thing I really love about gofmt is that it has no configuration at all. I think that was a major "innovation" and I'd love to see more languages adopt this approach.
Someday I want to study this and really understand what happened.
I actually like how gofumpt formats stuff but ... nobody else on the team would have it, so it would make things worse.
I've never seen anyone with a workflow like this (lots of people have the second part, of course, but not the first one), nor tooling that makes it a really natural thing to do, but wouldn't it work? There are some pain points if you ever want to pair program or if you use multiple tools to collaborate on code.
If we ignore the fact that switching between those two formatters would "break" the formatting: There exist clean&smudge filters in Git, which could accomplish this technically.
https://git-scm.com/docs/gitattributes#_filter https://git-scm.com/book/en/v2/Customizing-Git-Git-Attribute...
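A minimal sketch of how that wiring could look (the filter name `myfmt` and the choice of formatters are hypothetical; Git feeds each file's content to the filter on stdin and takes the result from stdout, which is how gofmt and gofumpt behave when given no file arguments):

```shell
# One-time setup per clone. The "clean" command rewrites content on its
# way INTO the repository; "smudge" rewrites it on its way OUT into the
# working tree.
git config filter.myfmt.clean  "gofmt"
git config filter.myfmt.smudge "gofumpt"

# Tell Git which files the filter applies to.
echo '*.go filter=myfmt' >> .gitattributes
```

With this in place the repository stores gofmt-canonical code while your checkout shows gofumpt's style, modulo the caveat above that round-tripping between two formatters is lossy.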
By removing the maker's marks, these tools make my code less readable while, in my opinion, adding practically no value. I'm more than happy for every line of code to have consistent indentation (of course, but it did already). I also don't have a problem with silly but arbitrary formatting choices, like sorting my import lines. But these tools push so hard for consistency that it costs readability.
That’s a nope for me. No debate.
If you're the tenth person, and you work alone, or with other fastidious people, you won't like the formatter. That's fine, you don't need to use it.
What about when someone leaves the company? Is it free game to reformat everything they wrote?
Why do you need to put your mark on code at work? It's not _your_ code. It belongs to the employer. The best work is work that is useful and not an irritant long after you're gone.
Sometimes it matters: indentation, naming_style and bracing should match throughout a codebase.
Sometimes it makes no difference: I really don't care about the order of your import statements. It simply doesn't need to be consistent throughout a program. It doesn't matter.
And sometimes making code "all formatted the same way" makes it all worse. I think that's true for spacing between functions. Functions simply shouldn't all have the same spacing between them, nor should lines of code within a function. Whitespace is a wonderful tool for telling the reader how lines of code group together. Gofmt erases all of that to make sure "code is formatted in the same way" - but in doing so, readability is actively decreased.
> If I move a function in a file with Person A's style to a file with Person B's style, do I reformat it?
That's up to you! Why does everything have to have a right and a wrong answer? Obsessing over this stuff is a pointless waste of time. I guess that's the point of gofmt & friends - that you don't need to think about it. But you can also not think about it by just not thinking about it, and letting your codebase be a bit inconsistent. It's not a crime. There are no consistency police. You won't go to jail.
> Why do you need to put your mark on code at work? It's not _your_ code.
You have an identifiable style whether you like it or not. It's evident in how you name your functions and variables, in how you write your comments and where you put them, in how you order functions, and in where and when you split code between files, classes, and modules.
Your style is inescapably everywhere in your work. And it will always have been written by you, long after you're gone.
Are you ashamed of how you write code? Why go out of your way to write and run tools that delete your mark on your work? It doesn't make the code better. Your team will not be more productive as a result. And it doesn't improve quarterly profits.
Like it or not, we're "creatives": That is, we're people who create. The software we write is distinctly our own. Having a little pride in our work is a very healthy thing.
There's no need for lengthy discussions with co-workers then. Just for bigger projects there's a need for Statement Macro Declarations, but that's arguably a bug/limitation in clang-format
It really is dumb to be arguing over tabs vs spaces, after all.
Blank lines is an obvious one: where do you insert them to "group" sections of a 30 line function? Or are there no blank lines at all?
Line length: just "wrap at columns X" (or never wrap) is not enough, because people can and do wrap at specific locations for specific reasons, because that makes more semantic sense or looks nicer than cramming as much as possible.
Whitespace gives readers subtle information about the structure of a function before they read any of it. It's a powerful tool. Dismissing or - worse - deleting whitespace wholesale sounds profoundly misguided to me.
Why make code harder to read? Where's the benefit to your approach? I can't see any.
https://doc.rust-lang.org/src/core/slice/mod.rs.html#2786-28...
The function is pretty short - 40 lines including comments. Despite how short the function is, it still uses whitespace to separate and group adjacent lines of code. Personally, I find the code more readable like this. Indentation makes syntactic blocks obvious (the while loop and if statement). But there are also conceptual groupings between lines that mean nothing to the compiler, but are semantically meaningful to humans. I can tell at a glance that the comment about safety is most associated with that one line below the comment. And so on.
I think this code would be worse if we deleted the whitespace. How would you improve this code?
At a minimum, the blank lines implicitly scope the comments that sit above connected code blocks. Those comments - especially the comment about safety - would be harder to understand and audit without their context being so clear.
And again, can you name any benefit to removing all the blank lines? If we can both read the code easily with some empty lines to space it out, that seems like the best option.
On top of that, now you need to write a parser and compiler for your AST file. It's probably very simple and does rudimentary validation, but that defeats the point of the AST - it's a valid, canonical representation of a program by construction.
All in all, it seems like a good idea, and people have done it. But there's also good reasons to be apprehensive.
And at the end of the day, you need to ingest text as input, and you need to do it as fast as possible. There's not a ton of benefit to keeping an AST around on disk and in sync with the text that generated it when you are already able to compute it faster than you can read and deserialize it.
In an in-house dialect of Haskell I used to work with, we solved this problem by just making tabs a syntax error. Never had any problems.
(I think tabs might have been allowed inside strings.)
“We’ll just force everyone to do things one way” kind of ignores the point I was making. It shouldn’t be necessary for you to care how anyone else formats their code, same as you don’t care what font their code displays in or what text editor they use. It feels like a vestigial aspect of programming that we have to concern ourselves with it in 2024.
Wasn't a problem in practice. Just like we never had any problems with anyone wanting to use eg Pascal or so.
FWIW, just happy to have a chance to unload this thought finally: it had surprisingly little impact on code reviews, in that the "personal preference I need to enforce" just ascended abstraction levels.
And yes, that doesn't help you if ex. your style is a blank line following every code line.
In practice, it works, I surmise because people are fine with someone else's code being in a different style, but they want to write in their style.
(I don't really know what happened to it since.)
In a language like C your outermost layers of leading whitespace are always indentation, and then you might have some alignment inside.
But in Haskell you might want to align arguments to a function, but some of the arguments can have blocks inside of them.
Mixing up tabs and spaces is technically possible, but it's too much of a pain in practice to bother.
More technically, language servers usually have a CST that they use to build the AST incrementally, and the AST contains references back to the CST that generated it. This is what allows you to handle incremental text edits and compile small deltas to the AST instead of the typical batch compiler design that attempts to parse everything all at once.
I've seen language servers that completely ignore the parts with errors. I much prefer error nodes, because then I still know there is something there, and those error nodes can still have children.
There is also the problem that an error returned by a language server is one class of "diagnostic", a category that includes syntax errors, semantic errors, warnings, lints, etc., each associated with a span in the source code. It's much easier to think of diagnostics as a separate data structure that gets filled up during lexical/semantic analysis and associated with spans in the full syntax tree (you can even store them there as fields, but they don't necessarily have children). Then it's obvious how the structure gets created and fed back to the user.
And finally, the whole point of an AST is to be a valid canonical representation of a program so the compiler query it drives doesn't have to do additional input validation. So it just makes the queries/compiler passes easier to write.
You're right, ex. https://dev.to/jillesvangurp/comment/6gnb/
Sadly there's enough bitrot that ex. intentsoft.com is offline.
More here, but article is v opinionated/judgement oriented, comments are useful, but again, bitrot :( https://wiki.c2.com/?IntentionalProgramming
It strikes me that what I am describing as "bitrot" may also be "never really shipped, so the vagueness isn't accidental"
This always sounds more difficult on paper than just wrestling dependencies till dawn, upgrading from JDK 11 to JDK 17, for example. So I usually give up the mental exercise there.
Plus, following a file of transforms is mind-bending: someone may follow a method definition with a pattern seek, and then start appending some more code. Context is lost. It would be literate programming only with enough empathy for comments.
Which is all to say, would it be easier to move between Spring versions if the app's commit history were a series of transforms instead of changes to static files?
Suppose a commit establishes a framework version, and then follow a bunch of commits for domain objects, a skeleton controller, and so on. If we could play those decisions forward, but edit the transform instead of the source, would it be easier to dissect which next dependency to manage?
This loops back to ASTs: we would still edit and change files, but the history would be ed(1) macros (or something better, like ASTs). Somehow, it feels like there could be reconciliation between source control and "manipulating a timeline of changes."
Git may already have this, or a simple while loop with some decisions about how far to play the changes, like editing a cassette tape. A list of patches to apply, with pre- and post- hooks for rules scripts.
This saves you the trouble of authoring a separate programming language, or finding a way to preserve all of the niceties of the original syntax and formatting that wouldn't directly translate to an AST (like how many newlines are after a particular stanza or function).
Case in point: Recast (https://github.com/benjamn/recast) is of particular interest with regards to JS/TS in this vein, because it does preserve a lot of the spirit of the source in its conception of the AST. But also last time I used it (couple years ago now) it would explode on any code with an emoji in it. It's genuinely not an easy problem.
It works for Go because gofmt was there from the start, so even if you are returning a multi-dimensional array and elements come out unaligned, that's just accepted as how it is and nobody cares. For other languages, people will have to either accept "not caring" as the norm, or actively fight the autoformatter steamrolling over their code.
For people who would give more thought to how their code would be read, autoformatters were often more frustrating than "nice".
IMHO autoformatters are awesome until there's something you care about that the implementers didn't care about, then they are horrible. Problem is, the people who put a lot of thought into those decisions are often in the minority and tend to lose the argument.
It's why we can't have nice things... Autoformatters do help though.
https://help.eclipse.org/latest/index.jsp?topic=%2Forg.eclip...
When I led the conversion of a decently-sized codebase from Flow-typed JS to TypeScript, I ensured that a code formatting tool that we were already using on pre-commit and CI called `prettier` was executed after each step. We took a git snapshot of each step of our automated conversion pipeline, and the diff was much clearer with `prettier` in place at each of those steps.
We've since used codemods frequently to make huge changes to the codebase in an automatic, reproducible, iterable way. They're very comfortable, very fun, and (thanks to the use of a formatter on all code) rarely produce incomprehensible diffs.
> end debates among team members about program layout
(Except in this case of course it's not "team members")
I'm definitely not getting this interpretation from the quote.
Typically, code authors would create a proposal by filling out a doc template. It's usually lightweight and accompanied by examples or the full set of pending code changes. Then 1-3 of us review and LGTM the proposal. As part of the review, we also determine whether the changes should be sent to local code owners or "globally approved" by one of us. The default option is "global approval", unless the changes need local code owners' knowledge during the code review. Said another way, when changes are sent to local code owners, their role is not gatekeeping the changes, but providing necessary local knowledge that we as global approvers don't have.
Refactoring changes, such as formatting or API migrations, shouldn't bother local code owners because 1) it would just be a waste of their time to review and approve; 2) in practice, we find a central code reviewer for the same large set of code changes is more likely to catch bugs (with review automation tooling) than local reviewers.
We consider ourselves as facilitators rather than approvers or gatekeepers of the code changes. Our goal is to make these changes done more efficiently and save engineering time when possible.
If you like stats: over the past 5 years, I have reviewed ~300 such proposals and ~40K changelists (equivalent to PRs). One changelist/PR typically contains 10s to 100s of files depending on the nature of the change. When I was most active, I was about ~5th-ish when ranking the number of changes we were approving. There are many global approvers who have approved more than 100K changelists, which is a milestone we celebrate with a cake. Too bad I didn't have the chance to have my cake.
Examples of global changes include:
- changes to Buildifier that require updating existing files
- rename/refactor a function used everywhere in the repository
- fix the existing code before turning a lint warning into an error
- fix code that will break with a compiler update
Anyone in the company can propose this kind of change. The proposal will be reviewed by a committee (to ensure the change is worthwhile, that there are mechanisms to prevent regressions, etc.) and by a domain expert (the team that owns the area).
Global approvers are people who often deal with this kind of change. They usually come from the language teams (e.g. I knew the specificities that come with global changes touching BUILD/Starlark files).
New approvers are nominated by an existing member and then LGTM'ed by three others. Usually they have gained a lot of large-scale-change experience on the other side, and we recognize that we could use more help on the committee side. In particular, we want good coverage across languages, tech stacks, and time zones.
The purpose of global approvers was exactly things like this. If you want to do a mechanical change to an insanely huge number of files, they can potentially approve it.
In my experience, global approvers were used extremely rarely, only in cases like this where the transformation was purely mechanical and it was possible to verify that there were no logic changes.
Most of the time rather than global approvers, you were encouraged to use a system that would automatically split your change into a bunch of smaller CLs (PRs), automatically send those to owners of each module, then automatically merge the changes if approved. It would even nag owners daily to please review. If you had trouble getting approval for some files you could escalate to owners of a parent directory, but it'd rarely be necessary to go all the way up to global approvers.
Basically if there was even the slightest chance that your change could break something, it's always safer to ask individual code owners to approve the change.
But where you are nearly the sole owner of a small library and you are crafting that library to be beautiful and understandable... there is something pleasurable about structuring concepts so you have each on a single line, or creating similar functions so the concepts are structured by column.
I know not everyone will hold this view and that is fine, but when you are writing your own hobby library in your favorite language for your own purposes I recommend you try it out.
Once, a product launch depended on me urgently kludging a device driver in Python (long story). And this involved a large hand-maintained mapping table. I wrote it quickly but carefully, and found some formatting that made the table readable enough, without implementing a minilanguage in Python.
But the Black formatter had been rigged to run automatically on commit, so... poof! :)
https://black.readthedocs.io/en/stable/usage_and_configurati...
Other formatters have similar functionality; e.g.:
- /* prettier-ignore */: https://prettier.io/docs/en/ignore.html#javascript
- #[rustfmt::skip]: https://github.com/rust-lang/rustfmt?tab=readme-ov-file#tips
I guess I've never gotten into an actual serious debate with someone over formatting, so I don't know what I'm avoiding, but sometimes auto-formatted code makes it harder to skim.
I want to write syntactically valid code without worrying about the visual presentation of it. (I want a good presentation, but I don't want to put forth the effort to create it.)
And my IDE makes it very easy to indent a snippet.
As soon as anyone else looks at your code or wants to participate in the development process, what is meaningful to you about the arrangement simply won't translate into their view; what is the use of a programming language if not to use its existing definitions and concepts to communicate the concepts of programming?
You can use fancy syntax tricks, but domain-specific languages are a much better way to handle the same problem. You can express things with them that other humans can understand while still retaining access to your existing formatting tools.
Considering I get pleasure from formatting the code beautifully I'm not sure how it's even possible to be doing _myself_ a disservice. It's like saying I'm doing a disservice learning guitar instead of listening to Coldplay on my speakers.
As far as others are concerned... I agree somewhat, but I don't think it's by any means proven. I think we are always better off than with cobbled-together code where formatting hasn't been thought about. But well-designed, hand-crafted formatting can be expressive in its own way, IMO.
I do want to say that I have the opposite view. People who use formatters want their code to be consistent and go the extra mile to ensure it is. It's like manual testing vs automated testing to me. Sure, with manual testing you can test many more corner cases as they come up, since an intelligent person is in the loop. But there will be mistakes made, tests forgotten, etc. Just like there will always be inconsistencies when you manually format the code.
Many people who use formatters (myself included) just want consistent code and don't want to bicker with others about it. When the code is solely owned by me and I'm doing it for fun, those reasons fall away.
Enabling tools like these was exactly the point of the enforced formatting. It worked extremely well.
For example: https://github.com/Calsign/gazelle_rust
Unfortunately, the flip side of this coin is harder to deal with; sometimes truly important issues or things that people do deeply care about have disproportionately too little verbiage. Finding those can be very difficult.
The meta says that it is. There are only 86,400 seconds in most days, and you get to choose how you spend them, so choose wisely. If that's arguing over tabs vs spaces, then that's your choice.
This is basically the same claim that economics can treat humans as perfectly rational actors perfectly rationally pursuing their perfectly rational goals. It is not a good model of humanity.
"All models are wrong, but some are useful." -G. Box
(I think Go was already installed on every Google developer machine at that point anyway.)
But back to the point... formatting rules without firm, incredibly strict enforcement ends up being a tax on the janitors - the people who clean the code base and do large scale changes. That makes me sad. These are the people who care a great deal about code health, and their work is hindered by the lint checks that we have imposed.
Let me give an example.
I'm trying to eliminate a constraint in the build system. It's a "small" large change - only O(30K) instances. (Yeah, Google scale is different). I have an incredible wealth of tools available to me to automate the process. For the benefit of the Googlers, I can identify Blaze targets to change, use buildozer to fix them, and ship off CLs to review. But the changes I want to make are often ones which should be reviewed by the code owners, and not globally approved. So possibly O(10K) individuals might be involved in reviews.
Let's explore the problem. First, shouts to y2mango for bringing up incremental formatters. This should be the default for all tools. And another to flymasterv for raising the question of "why not just format as each person touches a BUILD file". Here's the situation.
1. buildozer is really good at rewriting BUILD files syntactically correctly.
2. It has an unfortunate side effect of not being incremental: it calls buildifier to rewrite the entire file.
3. We update the formatting rules to make them stricter over time. That means a "correct" BUILD file on January 1 might require changes on March 1.
4. Buildifier findings are advisory, rather than mandatory.
5. No team is staffed with repeating the monumental work this post started with.
The reality on the ground is that little touched BUILD files become stale, and would require a formatting update over time. It is actually worse than that, because many teams take the path of ignoring buildifier warnings and committing their working code anyway. Without continual BUILD file reformatting there is a lot of stale floating around. [Root cause: We could fix this by promoting people for doing that repeat work. But we don't. We promote for the initial sprint.]
And then a janitor comes along.
I use buildozer to fix a problem. It reformats an ancient BUILD file completely (not incrementally). I send it to the code owner. They see changes far beyond my 2-line fix. They reject it, or ask for a change to only the two lines that actually mattered. Sure, I can hand-build the change once or twice. But not for a few hundred, or thousands of, files. So... I have to hack up an incremental format. Or, it turns out that users are very happy if I don't bother with formatting at all, and just change single lines. It's not that any individual is right or wrong. It is that they all have a choice and a preference, and Google created a policy that allowed individual teams to choose between strict compliance or not. That is the failure.
If you are going to have a policy about code formatting:
- make it hard mandatory for everything except a "break glass" situation
- if the policy can evolve, staff a team with enforcing it globally
The fact that Google, as a company, does not reward this behavior does not take away from any individual's accomplishments. This post may sound grumpy to an outsider, but I am constantly amazed at the tools I have available to fix things on an enormous scale. The friction is usually only where we have good intentions, without the policy teeth to enforce alignment with the intentions. That's a management problem, not a technical one.
If that means it's too hard to change the format rules, then don't change the format rules. And if you don't reformat, then it has to be a clear rule known to everyone (or written down somewhere you can point to) that incidental formatting changes are acceptable and not something you are allowed to push back on.
I can speak to Go and gofmt, and there we are VERY reluctant to change formatting rules. It does happen for the odd corner case once in a while, but nothing that would cause "changes far beyond my 2-line fix".
Auto-format those O(30k) files and get global approval. Then, separately, make your two-line semantic change and seek approval from local owners.
Wait! I thought Google only promoted and rewarded people who do new development, not maintenance. What gives?
> [Root cause: We could fix this by promoting people for doing that repeat work. But we don't. We promote for the initial sprint.]
I keep a quick little scriptlet in my bookmark bar for cases like this:
javascript:(function(){ $('head').append('<style>*{color:#101010 !important; background:#f0f0f0 !important;}</style>'); }());
(A ten-second hack job; suggested improvements from front-end friends are welcome.)

javascript:(function() { for (var n of document.querySelectorAll('a, p, li, div')) { n.style.color = (n.nodeName == 'A' ? 'LinkText' : 'CanvasText'); n.style.backgroundColor = 'Canvas'; n.style.font = '500 16px/1.4em sans-serif'; }})();
It uses system colors and thus, if your browser supports them, should adapt to dark mode automatically. Using .style has the advantage that sites can't override the style themselves using .style. (You'd think looping over all these elements would be slow, but it's not.) This version also works on sites that aren't using jQuery, although it wouldn't be hard to use `var s = document.head.appendChild(document.createElement("style")); s.innerText = "...";` for that.

I was surprised at how helpful forcing the font face and spacing is. There's a lot of sites out there with bad-looking fonts or huge line spacing on top of unreadably-light gray.
I added the background color part based on your version. Thanks for prompting me to try that; the way my bookmarklet didn't work on black backgrounds was occasionally a problem. I also added a bit to force link colors, since neither of our versions handled those well.
Perhaps the next step is a "multistage" bookmarklet that applies more rules the more times you click on it, so the more forceful rules (like background color, which often messes up other parts of the site design) can be optional.
I tend to prefer interfaces in dark mode and content in light mode, so I'll see how I feel about the conditional logic there, I may eventually wind up going back to hardcoding some colours.
> Perhaps the next step is a "multistage" bookmarklet that applies more rules the more times you click on it, so the more forceful rules (like background color, which often messes up other parts of the site design) can be optional.
That's a really neat idea. I can imagine it stretching from slight readability tweaks all the way to a pseudo-reader mode. It would definitely be a bit more of an undertaking than either of our quick snippets, though.
The "source code is text" thing seems like a "legacy" concept.
At the very least I think this is an interesting thing to explore. Maybe it leads nowhere...
I have long felt that Google’s strength has always been making a bad architectural choice and then executing on it flawlessly. So many systems are designed in ways that require incredible technical execution to make them workable, and they do it.
Putting the time of submitting the changes on a small team (mostly me, with approvals from Rob and help from Laurent) was absolutely the right tradeoff. It avoided the "unfunded mandate" and tech debt of making everyone else deal with it.
Update: I found the FAQ we wrote back then. It was very short. These were the last two questions:
Q: Who will update all the existing BUILD files?
A: We will. There are nearly 200,000 of them, and we’ll take care of that. We’re sending CLs out now. If you want to do it yourself, that’s fine: see go/buildifiernow for a tool that can help.
Q: You’re creating a lot more work for me.
A: We are creating significant amounts of work for ourselves, including reformatting all 193,000 BUILD files in google3. For the rest of the engineers in the company, we intend to make the transition as smooth as possible, with integration in Eclipse, Emacs, and Vim, as well as tools like Rosie and GenJsDeps. It is an explicit goal not to create significant work for other engineers. If, as we roll this out, you find that we’ve created noticeable work in your workflow, please let us know so that we can address that.
Your suggestion would allow people to bypass the code review by just saying "oh it's just cleanup don't worry".
I've always been against "reformat the whole code base" but it's an interesting example where it seems to have been the right choice.
Then, there are also other formatters that support "incremental formatting", meaning it only formats lines that are changed in your commit.
Disclaimer: I authored https://github.com/google/pyink and replaced Google Python's YAPF formatter with this Black fork and also implemented the "incremental formatting" feature in Pyink and upstreamed to Black.
When we were rolling out the formatter change, we chose to NOT format the Python files mainly because 1) not all teams at Google enforce Python formatting at presubmit time; 2) the formatter supports "incremental formatting" to minimize the diffs introduced by the formatter.
There are of course less ideal cases where even incremental formatting has to touch not-changed-lines, such as a large Python dictionary/list/set literal that spans across dozens or even hundreds of lines. It's a tradeoff in the end.
And Google uses P4(ish) because monorepo, so they build further abstractions over P4 to enable git and hg in user space which erases most of the potential benefits of either which is also all really, really good software, but it’s all effort necessitated by monorepo. CitC is a work of art, but it is also something necessitated by a stack of other choices that forced their hand into inventing something miraculous to keep hacking around a previous limitation that nobody else has.
That seems like way more complexity than just doing it once and for all. Now the commit log is littered by a bunch of automatic commits that format one file at a time.
In unrelated news, OP was suggesting no 100K file CL, and a presubmit. They were not disputing what the article said. They were suggesting sharding out the initial formatting change to 100K individual CLs.