rtpg 16 hours ago [-]
> The issue was labeled p-critical and i-miscompile, out of +61K rust issues there only 7 (including this one) that are both p-critical and i-miscompile, to me those are the most dangerous kind of bugs a compiler can have, given that they violate the contract between the programmer and the language. They show that not every safe code you write is safe. And also more generally there are only 247 p-critical issues to begin with.
I don't mean this in a snarky way or the like... and I guess the fact that there are so few of these is a good indicator that the current processes are working decently well... but I would be very curious to see a post-mortem from the maintainers on how this bug got through. Miscompiles feel like the scariest sort of thing.
Maybe the deep and dark secret is simply that compiler optimizations are just extremely prone to mistakes and we all are just lucky enough that most messed up optimizations will break _something somewhere_ early enough to not get merged.
IshKebab 2 minutes ago [-]
> compiler optimizations are just extremely prone to mistakes
This is definitely true - there has been quite a lot of formal work done to prove that optimisation passes don't change semantics. I guess it hasn't made it into Rust yet. Or maybe even LLVM.
wyldfire 2 hours ago [-]
> how this bug got through.
Simplest answer is that there's no representative test case in the test suite yet.
Unfortunately the problem of compiler testing is a very challenging one IMO. You can't test it exhaustively, or perhaps if you did write such a suite it would take longer to execute than is feasible.
> we all are just lucky enough that most messed up optimizations will break _something somewhere_ early enough to not get merged
Your compiler is only as good as the set of code people routinely use it on. Rust has exploded in popularity but it's still much less popular than C, C++.
Semaphor 6 hours ago [-]
Still looking back fondly to when I was studying in ~2006: my C program on some (I think?) ATMega device (probably not with the latest version of the compiler? I really can’t remember the details anymore) didn’t work the way I thought it should. Called the professor over. He looked at the assembly and realized that, for some reason, the compiler was modifying outside the loop a variable that the C code modified inside it. So before I had even heard of "It’s never the compiler", I learned that sometimes it actually is the compiler ;)
norir 20 hours ago [-]
This feels like it should have been a warning rather than an optimization in the first place. In my opinion, dead code elimination should only be done during link time optimization where it can be proven that branches are not taken given the whole program information. If there is an unused assignment regardless of the branch, the compiler could emit a warning so the user can do their own dead code elimination, or choose to suppress/ignore the warning. The worst thing a compiler can do is silently incorrectly apply an optimization.
Of course, I also feel this way about the vast majority of optimizations. If the compiler can optimize a piece of code, it can also show the user what it thinks the optimal code would be so that they can rewrite it themselves, if they so choose. This both prevents these kinds of miscompiles and prevents compilation times from exploding because the compiler doesn't need to do much work, it primarily just translates the human readable code into machine code.
dzaima 20 hours ago [-]
Such source-level warnings do exist in various forms in various languages, with various levels of fixed analysis done for determining them.
Tying such warnings to optimizations largely just does not work: functions with an unused return value exist and become dead code after inlining, and compilers can emit dead code themselves (e.g. duplicating a piece of code, and then DCEing unused things in one copy; or dead branches of inlined functions). Never mind that the complete unpredictability of various compiler heuristics would now be able to change warning behavior (gcc has some optimization-dependent warnings of this type, and it annoys the hell out of me).
Copying the compiler's work into your code falls apart the moment you target multiple architectures, as different architectures can often benefit from quite-different implementations.
And there's the whole thing that most compiler optimization stages do not translate well, or at all, to the source language (e.g. LLVM's poison semantics do not exist in C, nor in any language afaik; goto spam!; and there are optimizations that can be applied to safe code that cannot be translated back to safe code without entirely undoing the optimization, e.g. replacing known-unused variables or array elements with undefined ones).
rcxdude 9 hours ago [-]
> If the compiler can optimize a piece of code, it can also show the user what it thinks the optimal code would be so that they can rewrite it themselves, if they so choose
This is not straightforward. Apart from the mapping from a several-layers-deep optimization to the source level being very difficult, it may not be even representable in the original language. And even if it is, it may require complicating the code significantly. Part of the point of compiler optimization is so that you can write straightforward code and still have it be fast.
Compilers will often warn on dead code, but only at fairly early stages of translation where it's obvious that something is definitely dead code in all possible contexts and the fix is obvious. These rules are different to what the optimizer actually uses much later on in the pipeline.
LtWorf 20 hours ago [-]
How would you rewrite some code to use SSE vector instructions and still be readable?
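To make the question concrete, here is a rough sketch (not from the article) of the same four-element addition written once as plain code and once with explicit SSE intrinsics from `std::arch`. The function names `add_scalar` and `add_sse` are made up for illustration, and the non-x86_64 fallback is only there so the sketch builds everywhere.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

/// Straightforward version: an optimizer is free to vectorize this itself.
pub fn add_scalar(a: &[f32; 4], b: &[f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        out[i] = a[i] + b[i];
    }
    out
}

/// Hand-written SSE version (hypothetical name). SSE is part of the x86_64
/// baseline, so no runtime feature detection is needed on that target.
#[cfg(target_arch = "x86_64")]
pub fn add_sse(a: &[f32; 4], b: &[f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    // SAFETY: both inputs and `out` are exactly four f32s, and the
    // unaligned load/store intrinsics have no alignment requirement.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        _mm_storeu_ps(out.as_mut_ptr(), _mm_add_ps(va, vb));
    }
    out
}

/// Other architectures would need their own intrinsics; this sketch just
/// falls back to the scalar code there.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_sse(a: &[f32; 4], b: &[f32; 4]) -> [f32; 4] {
    add_scalar(a, b)
}
```

Both functions return the same result for the same inputs, but the second one is tied to one instruction set and needs `unsafe`, which is the readability and portability cost being asked about.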
Agingcoder 6 hours ago [-]
Compiler bugs are surprisingly common - most people simply never notice them. Whether these bugs are major or not is a different topic.
If you have a very large-scale test suite for your application, a large codebase, and exercise most compiler features, including the optimizer, you’ll probably find a few every time you upgrade.
recursivedoubts 21 hours ago [-]
interesting story
as an aside, I think the original code, with the if statement, was clearer and would be easier to debug, etc, even if it was a bit longer
jpollock 17 hours ago [-]
I prefer the second. :)
It's about the cognitive load and having to follow branches.
The second version minimizes the cyclomatic complexity, taking it to 1. The reader doesn't have to keep the if statement in mind when reading through the code, and doesn't have to worry about all the ways the code can get there if they want to modify it (e.g. to add logging, metrics, other logic).
Whether or not consume should be a separate function depends on how often it's called. Here, I'm guessing it's once. :)
dzaima 5 hours ago [-]
..except, if you want to add logging/metrics/other logic, it's quite possible you'll want it to be conditional on the boolean anyway, bringing branching back, now mixed with the non-branching code.
And even if you don't need to keep an if statement in mind, you still need to keep the variable in mind anyway.
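For readers without the article open, the two styles being compared look roughly like the sketch below. This is not the author's code: the `Cursor` type and both method names are hypothetical, and the only assumption carried over from the thread is that a boolean decides whether an index advances.

```rust
/// Hypothetical stand-in for the type discussed in the article.
pub struct Cursor(pub usize);

impl Cursor {
    /// Branching style: the condition is explicit, which gives an obvious
    /// place to hang conditional logging or metrics.
    pub fn advance_branching(&mut self, consume: bool) {
        if consume {
            self.0 += 1;
        }
    }

    /// Branchless style: `usize::from(bool)` is 1 for `true` and 0 for
    /// `false`, so the function body has a single path.
    pub fn advance_branchless(&mut self, consume: bool) {
        self.0 += usize::from(consume);
    }
}
```

Both methods leave the index in the same state for the same input. The disagreement above is only about which one is easier to read and to extend.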
bayesnet 20 hours ago [-]
Especially interesting because (as one of the compiler team members noted on Zulip) this is a miscompilation that is relatively easy to stumble into in safe code (yikes). Looks like the cause was a late pass in the MIR optimization pipeline. I would think these are carefully vetted for soundness, so I am surprised that this slipped in there.
Panzerschrek 12 hours ago [-]
For me this happens from time to time. But it's for a good reason - I develop my own compiler. And it can be pretty tricky to identify and track such bugs.
goodwillhunting 20 hours ago [-]
It's a good story, it takes a bit of courage to post a compiler issue - so well done. As an aside - I absolutely love the left-hand progress color indicator on your blog!
WJW 20 hours ago [-]
If the author is here, typo in the second paragraph: "eariler", should probably be "earlier".
fusl 14 hours ago [-]
There are probably a couple more typos in this ("momenet" being another one). It's very refreshing to see blog posts not being written by LLMs for once.
Surac 11 hours ago [-]
Can someone explain what the self.0 means? I'm not a Rust speaker.
In most other languages this would be written as "this.current_index" or some other property name.
tialaramex 4 hours ago [-]
self.0 is the zeroth field of self.
self is the name for a value of this type, apparently a type named LexerConsumer; this function took "&mut self", a mutable reference to a LexerConsumer. In Rust, using any variety of "self" in this way means the function can (if you want) be called as a method on a LexerConsumer like this:
foo.consume(); // If foo is a LexerConsumer, we will call that `consume` function and pass it a mutable reference to foo.
Presumably the LexerConsumer doesn't give its fields names, hence self.0 meaning just the zeroth field. Perhaps the author couldn't think of a good name for it.
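A minimal sketch of what that looks like, assuming LexerConsumer is a tuple struct wrapping a single index. The field type and the body of `consume` are guesses for illustration, and `NamedLexerConsumer` is a made-up counterpart showing the named-field spelling asked about above.

```rust
/// Tuple struct: its one field has no name and is accessed by position.
pub struct LexerConsumer(pub usize);

impl LexerConsumer {
    /// Takes `&mut self`, so it can be called as `foo.consume()`.
    pub fn consume(&mut self) {
        self.0 += 1; // `.0` is the zeroth field of the tuple struct
    }
}

/// The same idea with a named field, closer to `this.current_index`.
pub struct NamedLexerConsumer {
    pub current_index: usize,
}

impl NamedLexerConsumer {
    pub fn consume(&mut self) {
        self.current_index += 1;
    }
}
```

With this definition, `foo.consume()` and `LexerConsumer::consume(&mut foo)` are two spellings of the same call.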
bitwize 18 hours ago [-]
I was experimenting with an early, early version of the Java SDK, v1.0.1, about 30 years ago. I was trying to build a calendar using a GridBagLayout, and all the date squares got scrambled and drew on top of each other. I posted to mailing lists and USENET, asking what exactly I was doing wrong, since—recalling the first rule of compiler bugs, to wit, "it's probably not a compiler bug"—while I thought I was "holding it correctly" (i.e., using the APIs in the manner recommended by the documentation), I was willing to concede that I was missing something important.
Turns out it was a bug in the Java runtime, one that was fixed in v1.0.3.
jackmott42 19 hours ago [-]
I started programming at ~8 and as a kid you often are sure the compiler is wrong, and it never is. (maybe once, have a vague memory of it)
Then years later I started playing with Nim when it was still in beta, I think I found 3 compiler bugs in a few weeks! Reported them all, all got fixed!
And then a few years later found a bug in LLVM, doing weird stuff with Rust and SIMD intrinsics. Was difficult to even communicate what was going wrong but did get it reported and confirmed eventually!