The --enable-native-access option mentioned in the article is part of a large effort we call "Integrity by Default"[1]. The idea is that a library module can violate invariants established by another module (e.g. access to private fields and methods, mutation of final fields etc.) requires approval by the application, so that a library will not be able to have a global effect on the application without its knowledge, and the correctness of each module could be verfied in isolation.
Now, --enable-native-access is also required to use JNI, but JNI can violate the integrity of Java invariants in a much more extensive way than FFM can. For example, JNI gives native code access to private fields of classes in arbitrary modules, while FFM does not. The only invariant FFM can break is freedom from undefined behaviour in the C sense. This is dangerous, but not nearly as dangerous as what JNI can do.
For the time being, we decided to enable both FFM and JNI with the same flag, but, given how more dangerous JNI is, in the future we may introduce a more fine-grained flag that would allow the use of FFM but not of JNI.
Where does the "final means final" effort fit in? Can the JVM prevent modification of final fields via JNI, or is --enable-native-access also going to require (or imply) the flag which enables setAccessible() and friends?
When running with -Xcheck:jni, you'll get a warning when trying to mutate a final field with JNI.
Now, enabling this check by default without harming JNI performance proved to be too much of an effort. However, mutating final fields with JNI even today can already lead to undefined behaviour, including horrible miscompilation, where different Java methods can read different values of the field, for final fields that the JVM already trusts to be immutable, such as static finals, record components, or a few other cases (indeed, there are non-final fields that the JVM trusts to be assigned only once, and mutating those with JNI is also undefined behaviour). As the compiler starts trusting more final fields after this change, mutating almost all final fields will lead to undefined behaviour. Then again, using JNI can lead to undefined behaviour in many ways.
So to make sure your JNI code isn't mutating finals, test with -Xcheck:jni (as of JDK 26).
This brings back memories debugging an azul zing bug where an effectively final optimization ended up doing the wrong thing with zstd-jni. It was painful enough that I couldn’t convince the team to enable the optimization again for years after it was fixed.
> I’m always a bit shocked how seriously people take concerns over the install script for a binary executable they’re already intending to trust.
The issue is provenance. Where is the script getting the binary from? Who built that binary? How do we know that binary wasn't tampered with? I'll lay odds the install script isn't doing any kind of GPG/PGP signature check. It's probably not even doing a checksum check.
I'm prepared to trust an executable built by certain organisations and persons, provided I can trace a chain of trust from what I get back to them.
The thing that gets installed, if it is an executable, usually also has permissions to do scary things. Why is the installation process so scrutinized?
I think there's a fundamental psychological reason for this - people want to feel like some ritual has been performed that makes at least some level of superficial sense, after which they don't have to worry.
You see this in all the obvious examples of physical
security.
In the case of software it's the installation that's the ritual I guess. Complete trust must be conferred in the software itself by definition, so people just feel better knowing for near certain that the software installed is indeed 'the software itself'.
I'm so thankful for nixos for making it hard for me to give in to that temptation. you always think "oh just this once". but with nixos I either have to do it right or not bother.
It sort of does actually, at least if you don't have nix-ld enabled. A lot of programs simply won't start if they're not static-linked, and so a lot of the time if you download a third-party script, or try to install it when the `curl somesite.blah | sh`, it actually will not work. Moreover, it also is likely that it won't be properly linked in your path unless you do it thr right way.
$ ./Downloads/tmp/xpack-riscv-none-elf-gcc-15.2.0-1/bin/riscv-none-elf-cpp
Could not start dynamically linked executable: ./Downloads/tmp/xpack-riscv-none-elf-gcc-15.2.0-1/bin/riscv-none-elf-cpp
NixOS cannot run dynamically linked executables intended for generic
linux environments out of the box. For more information, see:
https://nix.dev/permalink/stub-ld
You have to go out of your way to make something like that run in an fhs env. By that point, you've had enough time to think, even with ADHD.
Equally I don't like how many instructions and scripts everywhere use shorthands.
Sometimes you see curl -sSLfO. Please, use the long form. It makes life easier for everybody. It makes it easier to verify, and to look up. Finding --silent in curl's docs is easier than reading through every occurrence of -s.
curl --silent --show-error --location --fail --remote name https://example.com/script.sh
For a small flight of fancy, imagine if each program had a --for-docs argument, which causes it to simply spit out the canonical long-form version equivalent to whatever else it has been called with.
agreed. i get if you're great at cli usage or have your own scripts, but if you're publishing for general use, it should be long form. that includes even utility scripts for a small team.
also, putting it out long-form you might catch some things you do out of habit, rather than what's necessary for the job.
LLVM IR is quite fun to play with from many programming languages. The Java example is rather educational, but there are several practical example,s such as in Go Lang:
I made a poster showing how one might write a Hello World program in 39 different programming languages, and even different versions of some common languages like Java:
I believe that is a C-ism, where the C runtime calls your main() and exits the process with the return value. The Java equivalent is System.exit(int status).
Cool poster! If you don't mind me asking, would you share what tools you use to create this poster? You've got syntax highlighting going on there too. What did you use for that?
This article specifically discusses calling external C ABI libraries via the FFM API.
GraalVM is for compiling JVM bytecode to native, architecture-specific binaries.
FFM is like "[DllImport]" in .NET, or "extern" definitions in other languages.
The article shows how to auto-generate JVM bindings from C headers, and then allocate managed memory + interact with externally linked libs via the FFM API passing along said managed memory.
BTW: We (the GraalVM team) maintain a full-blown LLVM bitcode runtime that can be embedded in Spring or any other JVM application and compiled to native: https://github.com/oracle/graal/tree/master/sulong
One of the neatest things I've been able to do is compile a .dll library "plugin" for an application which loads plug-ins by invoking a special exported symbol name like "int plugin_main()" using GraalVM and @CEntryPoint
The entrypoint function starts a Graal isolate via annotation params and no native code was needed
people still use make for things. how many stand-alone utilities require npm?
i don't know graalvm, but I've used too much ant, buldr, gradle and maven. I'm not really convinced Graal VM would make anything better just because you are more familiar with it.
The author even says to just use what you like because that part doesn't matter.
I was using it while dabbling on compiler stuff, it was useful to have a set of concise compilation examples. I haven't touched it much lately, unfortunately, and I added the eBPF because the target was there but had no way to validate it (standalone eBPF validator where?) so I think it's probably somewhat wrong... or invalid at least, maybe that's a separate concern for people who would want this.
LLVM is such an amazing piece of software, the amount of uses for it are unlimited especially when it comes to obfuscation. The IR is also really fun for compiling bytecode to native code since it's pretty trivial to translate it into IR (opposite of what is done in this article)
I've been playing with a very basic compiler for a language that looks a bit like go -> llvm ir, but I'm finding myself constantly revising my AST implementation as I progressively add more things that it needs to represent. Is anyone aware of any kind of vaguely standardized AST implementation used by more than one project? I've been searching this morning for one and am coming up empty. My thinking is that if I can find some reasonably widely used implementation, then hopefully that implementation has thought out lots of the corner cases that I haven't gotten to yet.
I can appreciate this answer, but I don't think it's really what I'm asking.
I think I'm more looking for some kind of standardized struct definition that translates easily to llvm IR and is flexible enough for a wide variety of languages to target.
It's more an fun educational overview of the new FFM API.
I can't think of many actual use-cases where you'd want to use the LLVM JIT over those built-in to HotSpot.
Interfacing with existing LLVM-based systems, writing a very tight inner loop using LLVM where you absolutely need LLVM-like performance, or creating a compiler that targets LLVM using Java would be the main "real-world" use-cases.
This is interop glue to cross language boundaries in the JVM without the problems that come with JNI. The natural goal/use-case being that you can call pre-existing code in other languages that target LLVM IR.
The article is presenting something different entirely. This is the precursor to what it would take to create a compiler written in java that produces native code.