Automatic translation like this shifts the trust to the tool. coq-of-rust itself is written in Rust, not in Coq. The recursive nature is somewhat mind-boggling, but I think a proof of its correctness is possible through a process similar to David A. Wheeler's “Countering Trusting Trust through Diverse Double-Compiling” (2009) [0] (which circumvents Ken Thompson's Trusting Trust attack by using a second compiler), mixed with a CompCert-style approach.
To verify it, you'd use coq-of-rust to convert the coq-of-rust translator from Rust to Coq. That translation is not trusted, because it was performed in Rust, but it doesn't matter. Once in Coq, you prove the desired correctness properties—crucially, that it preserves the semantics of the Rust program when it translates a program to Coq.
As in the article, it is likely easier to work with more functional definitions in proofs instead of generated ones, so you'd undertake the same process as they do with the stdlib: proving equivalence between the hand-written and generated definitions. Since the current line count for the coq-of-rust translator (specifically, lib/ [1]) is 6350 lines of Rust, it even seems feasible to write a full translator in Coq and prove its equivalence to the generated one.
Then, you execute the proven-correct Coq coq-of-rust translator on the Rust source of the coq-of-rust translator. The Coq definitions it outputs should match the output of the Rust coq-of-rust translator that you started with.
As an aside, it's nice to see industry funding for work like this. I'm often cynical of cryptocurrency, but its correctness constraints really push for improvements in areas I like (Rust, Coq, funding for masters students I know, etc.).
[0]: https://dwheeler.com/trusting-trust/wheelerd-trust.pdf
[1]: https://github.com/formal-land/coq-of-rust/tree/main/lib
Indeed this would be a nice process to verify coq-of-rust. Also, although the code is rather short, we depend on the Rust compiler to parse and type-check the input Rust code. So that would also need to be verified, or at least formally specified without doing the proofs, and the API of rustc is rather large and unstable. It could still be a way to get more assurance.
Since we don't have a verified rustc (à la CompCert [0]), I wonder if an approach like the translation validation of seL4 [1] would work. They prove that the artifact (ARM machine code) produced by an existing compiler (gcc) for a chosen program (seL4) matches the source semantics (C). Thus you could circumvent trusting rustc, but it only works to verify a specific output of a chosen program. If the chosen program was coq-of-rust, I don't think this would be easier than the approach I detailed above. The seL4 kernel is 9,500 lines of C, while their Isabelle/HOL specification is over 200,000 lines, so the technique doesn't seem to scale to a large chosen source like rustc.
Isn't bootstrapping fun?
[0]: Xavier Leroy. 2008. “Formal verification of a realistic compiler”. https://xavierleroy.org/publi/compcert-CACM.pdf
[1]: Thomas Sewell, Magnus Myreen, and Gerwin Klein. PLDI 2013. “Translation Validation for a Verified OS Kernel”. https://sci-hub.st/10.1145/2491956.2462183
However, we also lose some information going to MIR, as there are no expressions/loops from the original code anymore. There are still ways to reconstruct these, but we preferred to use the THIR representation directly.
If a program has a loop we show that it terminates by constructing an execution trace. Note that we do not yet consider concurrency, so programs are deterministic.
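To illustrate the trace-based termination argument, here is a sketch of the common "fuel" idiom (my own illustration in Rust, not coq-of-rust's actual encoding): the loop becomes a step function, and a terminating run is evidenced by producing a result within a finite step budget.

```rust
// Sketch: model a loop as a step function and run it under a finite
// "fuel" budget. `Some(result)` is evidence of a terminating
// execution trace of at most `fuel` steps.

enum Step<S, R> {
    Continue(S), // run another iteration with the new state
    Done(R),     // exit the loop with a result
}

fn run_with_fuel<S, R>(mut state: S, mut fuel: u32, step: impl Fn(S) -> Step<S, R>) -> Option<R> {
    while fuel > 0 {
        match step(state) {
            Step::Continue(next) => state = next,
            Step::Done(r) => return Some(r),
        }
        fuel -= 1;
    }
    None // out of fuel: no termination evidence within this budget
}

fn main() {
    // Euclid's gcd as the loop body.
    let gcd_step = |(a, b): (u64, u64)| {
        if b == 0 { Step::Done(a) } else { Step::Continue((b, a % b)) }
    };
    assert_eq!(run_with_fuel((252, 105), 100, gcd_step), Some(21));
    // Too little fuel: no trace can be exhibited.
    assert_eq!(run_with_fuel((252, 105), 1, gcd_step), None);
}
```

In a deterministic setting, exhibiting one such trace per input is enough to show the loop terminates on that input.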
It obviously introduced issues around verifying the output of the transpiler, but I think the resulting C code was quite readable and limited in style for verification purposes, and the tools had a high degree of trust.
The SPARK static analysis was only ever a part of the whole verification and validation process, and there was plenty of testing and other activities that provided additional levels of trust.
The whole approach seemed to work pretty well.
That's part of it, but another part, I think, is that cryptocurrencies simply transferred a lot of wealth to people interested in CS (and a wouldn't-it-be-cool-if mindset in general). And they are spending that money on CS research not just because it benefits them but because it's philanthropy aligned with their hobby.
Another approach is proof carrying code: The rust compiler would emit a coq proof that its executable output matches the semantics of the source code input.
Of course, you could also write a compiler of a subset of rust (perhaps without the borrow checker or optimizations) in machine code for one architecture, then bootstrap everything with that.
The process is:

1. Compile mrustc with a C++ compiler.
2. Compile the rustc sources with the untrusted rustc binary, and compile the rustc sources with mrustc (these have identical behavior, but different codegen).
3. Compile the rustc sources with rustc-by-rustc, and compile the rustc sources with rustc-by-mrustc (these will have identical behavior and codegen). Those will match, and if you compile once more, they will match again.

Since mrustc is never compiled by rustc, such a backdoor would have to also exist in gcc/clang and propagate with exactly identical behavior into mrustc. The process could be repeated for gcc/clang.
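The comparison logic of this diverse double-compiling process can be sketched with a toy model (my own illustration; `Binary`, `compile`, and the seed values are made up). A binary is reduced to two attributes: the semantics it implements, determined by its source, and the exact code it emits, determined by whichever compiler built it:

```rust
// Toy model of diverse double-compiling. Assumes honest compilers;
// a Thompson backdoor would correspond to `compile` special-casing
// certain sources, which the diverse seed would expose at stage 2.

#[derive(Clone, PartialEq, Debug)]
struct Binary {
    semantics: String, // behavior, determined by the source it was built from
    codegen: String,   // byte-level output, determined by the builder's behavior
}

fn compile(builder: &Binary, source: &str) -> Binary {
    Binary {
        semantics: source.to_string(),
        codegen: format!("{}({})", builder.semantics, source),
    }
}

fn main() {
    // Seeds: mrustc built by a C++ compiler, plus an untrusted
    // rustc binary downloaded from the internet.
    let cxx = Binary { semantics: "c++".into(), codegen: "c++(c++)".into() };
    let mrustc = compile(&cxx, "mrustc-src");
    let untrusted_rustc = Binary { semantics: "rustc-src".into(), codegen: "?unknown?".into() };

    // Stage 1: same semantics, different codegen.
    let rustc_by_rustc = compile(&untrusted_rustc, "rustc-src");
    let rustc_by_mrustc = compile(&mrustc, "rustc-src");
    assert_eq!(rustc_by_rustc.semantics, rustc_by_mrustc.semantics);
    assert_ne!(rustc_by_rustc.codegen, rustc_by_mrustc.codegen);

    // Stage 2: both builders now have identical semantics,
    // so the outputs match bit-for-bit.
    let a = compile(&rustc_by_rustc, "rustc-src");
    let b = compile(&rustc_by_mrustc, "rustc-src");
    assert_eq!(a, b);
}
```

The point of stage 2 is that both builders now have identical semantics, so honest compilers produce bit-identical output; a backdoor in the untrusted rustc would have to be mirrored, with exactly identical behavior, in mrustc's C++-built lineage to survive the comparison.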
The level of confidence that software requires (as opposed to an abstract algorithm) is not that much higher than one minus the probability of an "unaccounted" failure (such as a hardware failure). Sometimes the extra confidence beyond that point is free, and sometimes it can be very expensive.
We don't. We will always have to trust _something_ (the hardware, the compiler, the specification, Coq's trusted kernel, ...). Formal verification is sadly often discussed as removing the possibility of bugs in their entirety. In practice, formal verification only makes bugs much less likely.
That said, yes, automatic translation between Rust and Coq massively reduces the complexity of the things you have to trust and should therefore be preferred over manual translation.
The most common bugs, those in business logic, often escape verification simply because of wrong assumptions.
Simple example: some websites use the browser's default language to set your locale, some use your IP. Thus I travel to Poland and suddenly YouTube shows me Polish content and ads (this can often be changed, but not always, see the Google Docs idiocy). They will all unavoidably lead to business logic bugs because the assumptions are wrong in the first place.
Verification is checking against formal or at least known requirements like those being discussed in this thread.
Validation is a set of processes for ensuring the specifications actually correspond to how the product ought to work. "How it ought to work" is open-ended and meant to include real-world aspects you didn't know beforehand. For example business logic assumptions. But despite being open-ended it is possible to formalise and track the process of looking for issues systematically.
In your example of YouTube showing Polish content when you travel, verification is the process of checking the website uses the IP correctly because using the IP is specified somewhere, but validation is the process of asking "what's the right thing to do when a user travels, is it really to use the IP?" and "let's systematically research what issues are affecting users due to incorrect specifications that we should change".
In many cases, a bug in the translation simply makes the proof impossible. Somebody then investigates why the proof does not go through and finds the bug in the translation.
We only have a problem if the bug in the translation specifically cancels a bug in the original code. If there is no systematic risk, it is quite unlikely that two bugs cancel each other in this way.
I doubt the translation would be that bad but it could have more subtle issues of the same kind.
This is used by compcert: https://compcert.org/
My question was whether it can help detecting translation errors from the first step.
I know people raise questions about trust in Coq itself and in the translation (and I question how it will stay synchronized with Rust updates), but even perfect formal verification doesn't mean 100% correctness from start to finish.
Rust has its own definition of what "safe" means, which is arguably a subset of memory safety.
You can have perfectly safe Rust code with data races, deadlocks, memory exhaustion, etc. Not even talking about logical errors.
No, you can't. Data races are explicitly one type of race condition that Rust protects you from. Anything else would indicate an unsound library or compiler bug. For reference, you can take a look at the non-exhaustive list of undefined behaviors[0]. None of those behaviours should be possible to trigger from safe Rust.
[0] https://doc.rust-lang.org/reference/behavior-considered-unde...
> However Rust does not prevent general race conditions.
> This is mathematically impossible in situations where you do not control the scheduler, which is true for the normal OS environment.
> For this reason, it is considered "safe" for Rust to get deadlocked or do something nonsensical with incorrect synchronization: this is known as a general race condition or resource race. Obviously such a program isn't very good, but Rust of course cannot prevent all logic errors.
> Are "data races" and "race condition" actually the same thing in context of concurrent programming
> No, they are not the same thing. They are not a subset of one another. They are also neither the necessary, nor the sufficient condition for one another.
The really curious thing here is that the Nomicon page also describes this distinction in great detail.
Data races are not possible in safe Rust. Race conditions are. The distinction is made clear in the Nomicon article, but commenters here are really muddying the waters...
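A minimal example of the distinction, in entirely safe Rust (hypothetical code, just to illustrate): every access goes through a Mutex, so there is no data race, yet the read-modify-write is split across two lock acquisitions, so updates can be lost, which is a race condition.

```rust
// Safe Rust: no data races, but race conditions are still possible.
// Two threads do a non-atomic read-modify-write through a Mutex,
// releasing the lock between the read and the write. Every memory
// access is synchronized (no data race), yet updates can be lost.

use std::sync::{Arc, Mutex};
use std::thread;

fn run(n: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..n {
                    let v = *counter.lock().unwrap(); // read, then release the lock
                    *counter.lock().unwrap() = v + 1; // write under a *new* lock
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    let n = 1000;
    let total = run(n);
    // Data-race-free, but increments can be lost between the two
    // lock acquisitions, so `total` may fall short of 2 * n.
    assert!(total <= 2 * n);
    println!("total = {total} (2000 if no updates were lost)");
}
```

The compiler accepts this without complaint, which is exactly the point: Rust rules out the data race, not the logic error of splitting the critical section.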
I could have also belittled your comment as "bunch of possibly-irrelevant content" since most of the content was and still is unnecessary snark.
But then it would have said more about my own etiquette and capability to debate objectively than about the topic at hand.
Our definition of data race seems to differ, and because you don't seem to be able to separate objective discussion from personal attacks, I'll stop here.
That doesn't really make sense because there are other witnesses, so everybody who knows about this topic can see immediately that you're wrong and the other person is right.
So one could have written just the same about what they wrote because of what it also contained. "bunch of possibly-irrelevant content".
This kind of behavior is not justifiable on technical merits alone. At least it shouldn't be.
Rust can only offer its safety for the lifetimes it tracks, and this tracking is limited to a single instance of your process.
In other words, if you perform concurrent accesses through multiple threads spawned from the same process, you're safe from data races but at risk of race conditions. If you perform concurrent accesses through multiple processes, you're at risk of both.
That implies that even in a world where everything is safe Rust, you cannot guarantee that two Rust processes will not data race with each other.
How would you do that in safe code? If I understand correctly, the problem you're referring to is the exact reason that mmap APIs in Rust are unsafe: https://docs.rs/memmap2/latest/memmap2/struct.Mmap.html#safe...
In highly concurrent/long-lived systems, you often end up using a multiprocess approach between your producers and consumers rather than multithreading. This is because it allows you to dynamically spawn new processes to consume from shared memory without going through a bastion process managing threaded consumers, e.g. it allows you to spawn at will newly developed consumers without redeploying and restarting the producers. Note that this does not mean a different codebase.
As for the expectations, I think it's fair to highlight it. Because people tend to think that a world of applications fully developed in Rust would somehow imply no data races. That's not the case: On a fully fault-free operating system, with only 100% safe Rust applications running, you can, and will, still run into data races, because in the real world applications cross process boundaries.
While I'm aware of some differences in opinion in the details, all of the definitions I'm aware of exclusively refer to multiple threads of execution writing to the same memory location, which makes the concept of a multi-process data race impossible. For example, Regehr: https://blog.regehr.org/archives/490
A data race happens when there are two memory accesses in a program where both:

- target the same location
- are performed concurrently by two threads
- are not reads
- are not synchronization operations
> The means by which this memory location was shared don't really matter.
It does in the context of this conversation, because you are claiming that safe Rust can create data races. Is there a safe API that lets you map addresses between processes? The closest thing I'm aware of is mmap, which is unsafe, for exactly this reason.
"Safe" and "unsafe" are scoped properties. They apply to the block of code that they encompass, by very definition of the fact that they are scoped. As long as an unsafe block respects the constraints that would be enforced if Rust could understand them, then that unsafe block can be wrapped in a safe block and exposed.
In other words, "unsafe" is not transitive as long as what is done in the unsafe block respects the constraints. Mapping a block of memory is not unsafe. AFAIK Rust is able to allocate memory, which means Rust calls mmap, and that is safe.
Thus, mmap being marked unsafe is not relevant here, because the actual unsafe block using mmap can be safe by knowledge of the programmer, while still inducing a data race later on (or not).
If your argument is that unsafety is transitive by virtue of calling something that might have unsafe side effects later on, then that is seriously broken. Because by the same logic, every file access could use /proc to overwrite your process memory or change your memory maps, so should we mark them all unsafe? Or worse, you propose a "maybe unsafe" approach, where things are maybe transitively unsafe, in which case why bother; let's rename "unsafe" to "shady", or unsafety just becomes some qualitative measure.
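For what it's worth, the scoping argument above can be illustrated with a standard pattern (a sketch; `first_half` is a made-up example): an `unsafe` block whose invariant is locally guaranteed gets wrapped in a function that is safe to expose.

```rust
// An `unsafe` block is sound here because the surrounding code
// establishes its precondition, so the wrapper can be a safe API.

fn first_half(slice: &[u8]) -> &[u8] {
    let mid = slice.len() / 2;
    // SAFETY: `mid <= slice.len()` always holds, so `..mid` is in bounds.
    unsafe { slice.get_unchecked(..mid) }
}

fn main() {
    assert_eq!(first_half(&[1, 2, 3, 4]), &[1, 2]);
    assert_eq!(first_half(&[9]), &[] as &[u8]);
}
```

Callers never see the `unsafe`; the programmer's proof obligation is discharged once, inside the block, which is the standard reading of how `unsafe` scoping is meant to work.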
It is not. I am in full agreement with this:
> As long as an unsafe block respects the constraints that would be enforced if Rust could understand them, then that unsafe block can be wrapped in a safe block and exposed.
But I am not sure what you mean by
> AFAIK Rust is able to allocate memory, which means Rust calls mmap, and that is safe.
liballoc is not required to call mmap, and even if it did, that mmap isn't shared with other processes, as far as I'm aware.
> while still inducing a data race later on
If this is possible, then it is not safe, otherwise you lose the properties we're in agreement about above: no data races are allowed in safe Rust.
> Because with the same logic, every file access could use /proc to overwrite your process memory
It is true that Linux specifically exposes the ability to mess with process memory via the filesystem API, and you can use that to subvert Rust's guarantees. But this one specific hole on one specific platform is considered to be outside of the scope of the ability of any programming language to guarantee. An operating system corrupting your memory is a completely different thing.
I doubt we're going to come to an agreement here, but I appreciate you elaborating, as now I think I understand your position here. It's not the one in general use though, so you're going to encounter these sorts of arguments in the future.
1. You need to formalize the semantics of unsafe Rust. Ralf Jung's pioneering RustBelt[1] work was a huge step in that direction, but it is not yet complete. One element that's emerging as tricky is "pointer provenance."
2. Part of that is making a formal model of the borrow checker. Stacked borrows[2] was a good attempt, but has some flaws. Tree borrows[3] may correct those flaws, but it's possible something even more refined may emerge.
3. Another part of that is a formal memory model, which is mostly about the behavior of atomics and synchronization (thus, is critically important for standard library components such as Arc). It is widely understood that Rust's memory model should be similar to the C++ one (and that they should interoperate), but there are some remaining flaws in that, such as the "out of thin air" problem and also some missing functionality like seqlocks (part of why the Linux kernel still has its own).
4. You need a proof that the safety guarantees compose well; in particular if you have sound code written in unsafe Rust, composed with other code written in safe Rust, the safety guarantees still hold. There are good results so far but it needs to be proved over the entire system.
5. You do need to close all the remaining soundness bugs in Rust[4], for such a proof to hold for all code. Progress on this is slow, as many of these problems are theoretical, or are only really relevant for adversarial code[5].
When all this is done, you still only have a partial guarantee. Huge amounts of the surface area to interface with the system are based on unsafe code. If you're doing only pure computation, that's one thing, but if you're building, say, a graphical UI, there's a massive amount of stuff that can still go wrong.
Even so, there is a practical path forward, and that leads to a much better situation than the usual state of things today, which is of course systems riddled with vulnerabilities.
[1]: https://people.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf
[2]: https://plv.mpi-sws.org/rustbelt/stacked-borrows/
[3]: https://www.ralfj.de/blog/2023/06/02/tree-borrows.html
[4]: https://github.com/rust-lang/rust/issues?q=is%3Aissue+is%3Ao...
It's interesting to me that serious people propose "bag of bits" semantics (basically "What is provenance? Let's just say pointers are raw addresses") for C++, which seems like a terrible idea, and I can't tell whether they really mean that. P2414R3: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p24...
https://github.com/search?q=repo%3Arust-lang%2Frust+unsafe+l...
I don't think Coq, as being presented here, can verify all unsafe calls.
Compared to Aeneas the goal is very similar as we want to verify Rust programs using interactive theorem provers. However, with coq-of-rust we write the purely functional version of the code (the one on which we make proofs) by hand, or with the help of some GitHub Copilot as this is rather repetitive, and prove it equivalent with the automatic translation. In Aeneas the aim is to directly generate a functional version.
We handle all the pointers as if they were mutable pointers (the `*` type). We do not use any information from Rust's borrow checker, which simplifies the translation, but we pay that at proof time.
To reason about pointers in the proofs, we let the user provide a custom allocator that can be designed depending on how the memory will be used. For example, if the program uses three global mutable variables, the memory can be a record with three entries. These entries are initially `None` to represent when they are not yet allocated. We do not yet know how this technique can scale, but at least we can avoid separation logic reasoning for now. We hope that most of the programs we will verify have a rather "simple" memory discipline, especially on the application side.
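The idea can be sketched in Rust terms (the actual development is in Coq, and the field names here are hypothetical): the whole memory is a single record with one `Option` field per global, threaded through state-passing functions, so allocation status is tracked without separation logic.

```rust
// Sketch of the "memory as a record" discipline: one field per
// global mutable variable, `None` until allocated, the whole record
// passed in and out of each function instead of using real pointers.

#[derive(Default, Debug, PartialEq)]
struct Memory {
    counter: Option<u64>, // hypothetical global #1
    name: Option<String>, // hypothetical global #2
    flag: Option<bool>,   // hypothetical global #3
}

fn alloc_counter(mut m: Memory, init: u64) -> Memory {
    m.counter = Some(init);
    m
}

fn bump_counter(mut m: Memory) -> Memory {
    // A use-before-allocate bug surfaces here as a `None`.
    m.counter = Some(m.counter.expect("counter not allocated") + 1);
    m
}

fn main() {
    let m = Memory::default(); // nothing allocated yet
    let m = alloc_counter(m, 41);
    let m = bump_counter(m);
    assert_eq!(m.counter, Some(42));
    assert_eq!(m.name, None); // untouched globals stay unallocated
}
```

In the proofs, reasoning about such a record is plain reasoning about a finite product type, which is what lets this approach sidestep separation logic for programs with a simple memory discipline.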
Thus we can provide a path for people who are ready to sacrifice performance for proofs. I guess that immutable Rust is simpler to verify with the other systems too.
I don't think that's the case (it's not harder either). The type system of Rust makes handling mutability fairly trivial in verification; in fact, it could be translated to immutable code in a continuation monad, like the Tardis monad from Haskell, I think.
This will look a lot like what you do with formal verification at the same level. Formal verification tools can go deeper, though, like answering questions about the state of the system internal to a computation. Like "I have this loop invariant, can I prove (or have a system prove) that it actually holds across all iterations?" or "Can I prove that while doing this computation there is never underflow/overflow in any intermediate computation?" The former could be done with property testing if you pull the loop kernel out into its own procedure and then test the invariant property by executing the kernel with a large number of intermediate states. The latter is harder to do without really, really tearing your program apart down to the level of making each line separately executable.
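A sketch of the loop-kernel idea (hypothetical code; a running-sum loop whose invariant is `acc == sum(a[..i])`): pull one iteration out as a function and property-test that it preserves the invariant from arbitrary invariant-satisfying states.

```rust
// Property-testing a loop invariant by extracting the loop kernel.
// A tiny LCG stands in for a property-testing crate so the example
// stays dependency-free.

fn kernel(a: &[u64], i: usize, acc: u64) -> (usize, u64) {
    (i + 1, acc + a[i]) // one iteration of the running-sum loop
}

fn invariant(a: &[u64], i: usize, acc: u64) -> bool {
    acc == a[..i].iter().sum::<u64>()
}

fn main() {
    let mut seed: u64 = 0x2545_F491_4F6C_DD1D;
    let mut next = move || {
        seed = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        seed >> 33
    };

    for _ in 0..1000 {
        // Arbitrary intermediate state: random array, random position,
        // accumulator chosen so the invariant holds at step i.
        let a: Vec<u64> = (0..(next() % 16 + 1)).map(|_| next() % 1000).collect();
        let i = (next() as usize) % a.len();
        let acc = a[..i].iter().sum::<u64>();
        assert!(invariant(&a, i, acc));
        let (i2, acc2) = kernel(&a, i, acc);
        assert!(invariant(&a, i2, acc2), "invariant broken at step {i2}");
    }
}
```

This gives probabilistic evidence that the invariant is inductive; a formal tool would instead prove the same implication (invariant at step i entails invariant at step i+1) for all states at once.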
If you use these kinds of systems you can be more judicious in what you try to prove. Whole program proofs are infeasible (or at least impractical) generally, but you pick the subsets you care about most and you can add annotations and prove them (automatically ideally, but you can bring in an external proof assistant as well). You leave the rest of the system to testing which has proven remarkably effective over the decades (when taken seriously, at least).
This isn't an all-or-nothing proposition, you choose how to spend your time and effort with formal verification just like you do with testing or other work.
Otherwise you can outsource this work to specialized companies such as Formal Land or others!
It is. Formal verification is and will be (for a very long time) a thing you only do for airplanes and nuclear reactors.
I wonder if efforts like this could speed up the adoption of Rust in key parts of the kernel.
Our main hope with the translation to Coq of the standard library is to get a formalization of it, so that we can precisely verify Rust programs calling core and alloc.
Also, people don't have a problem with the 'git' brand, somehow?
Do you have a source? As of now this is still in the roadmap[1] and I saw no information about removing it from there.
[1]: https://github.com/coq/ceps/blob/coq-roadmap/text/069-coq-ro...
Outside of France, most users of Coq are in the US, where it is taught in many universities. I am sure there were no female professors or TAs whose job was made harder purely due to being "the coq teacher" (this is sarcastic; multiple such professors and TAs have mentioned having this type of issue on the Coq mailing list).
I don't doubt this is true, but I question the worth of complaints.
And in my opinion multiple such accounts coming up are a good reason to proceed with a name change: the problem shows up as a pattern rather than isolated incidents, and a reasonable fix is identified.
And besides lame jokes when 14-year-olds hear it for the first time (both male and female, btw), it's fine.
It's essentially a problem of American puritanism (and likely also a problem of misogyny among students, but the solution to that really isn't to change the project name!)
But in my experience this simply causes some giggling in the classroom for a few days and that's it, which is not the experience I have seen recounted regarding Coq in the US.
There are multiple differences between these cases:
- The word "bit" is mostly used when speaking English, not in a French sentence (we use "octet" - I don't know the history here but I wouldn't be surprised that this is specifically because of the French meaning of "bit"). Coq, being the name of the tool, is used as-is in English sentences. [This is wrong, I somehow confused bit and byte here]
- Even when used in a French sentence, the gender is different ("un bit" vs "une bite"), removing ambiguity
- Bits are just one fraction of the curriculum, not the name of the tool used in every single lesson of the course
I will refrain from commenting further on this topic, as it has been rehashed many times already and distracts from the work on rust-to-coq translation.
GIMP and Git are words that most non-native English speakers don't have a strong association with. So they get a pass in most of the world.
But cock is as universal a swear word as you will find on this planet.
If I do a retreat in the village of Fucking, and write a compiler there, and then call it TFC -- the Fucking compiler, I know what I am doing.
Funnily enough, they have renamed it to Fugging in 2021 because of annoying tourists.
is it?
Are you confused with Dick? the name of course...
Ironically the word dick comes from the German language, meaning "dominant ruler"
Frankly I'd welcome more rejection of corporate safetyism. We're adults, not kids in a "Christian Minecraft server"
It's not, though. Come to France, say "coq" - people will wonder why you're talking about roosters all of a sudden. Holy Anglo-centrism, Batman...
I think you missed my point. "As universal as you will find" does not mean universal. It means There is nothing "more universal" - no word understood as a swear word by more people. There is no absolutely universal swear word. Obviously.
But English is spoken by a lot of people, most of them not native, most of whom don't have a native word that sounds like cock as a first association.
Now.. do I think Python should change its name? No.
Rocq is similar to the Roc programming language, which is confusing naming (and they sound identical, as far as I know).
As opposed to popular theorem provers.