For doing 'bare metal' embedded work in C you need crt0, the weirdly named C startup code that satisfies the assumptions the C compiler made when it compiled your code, plus a set of primitives to do what the I/O drivers of an operating system would otherwise have done for you. And voila, your C program runs on 'bare metal.'
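A minimal sketch of what that startup code does, written in C for readability; the _sidata/_sdata/_edata/_sbss/_ebss symbol names are assumptions from a typical bare-metal linker script, not a standard:

    /* Hypothetical minimal crt0 in C; the symbols below come from an
       assumed linker script. */
    extern unsigned long _sidata, _sdata, _edata, _sbss, _ebss;
    extern int main(void);

    void Reset_Handler(void)
    {
        unsigned long *src = &_sidata, *dst = &_sdata;
        while (dst < &_edata)
            *dst++ = *src++;          /* copy .data from flash to RAM */
        for (dst = &_sbss; dst < &_ebss; dst++)
            *dst = 0;                 /* zero .bss, as the compiler assumed */
        main();                       /* no OS, so nothing to return to */
        for (;;) ;                    /* trap here if main() returns */
    }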
Another good topic associated with this is setting up hooks to make stdin and stdout work for your particular setup, so that when you call printf() it just automagically works.
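With newlib, for example, that hook is usually the _write() syscall stub; a sketch, assuming a board-specific uart_putc() that you provide:

    /* Sketch: retargeting newlib's stdout/stderr to a UART.
       uart_putc() is an assumed board-specific primitive. */
    extern void uart_putc(char c);

    int _write(int fd, char *buf, int len)
    {
        (void)fd;                     /* same sink for stdout and stderr here */
        for (int i = 0; i < len; i++)
            uart_putc(buf[i]);
        return len;                   /* printf() now "just works" */
    }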
This will also introduce you to the concept of a basic input/output system, or BIOS, which exports those primitives. Then you can put that code in flash/EPROM, load a compiled binary into memory, and start it, and now you've got a monitor, or a primitive one-application-at-a-time OS like CP/M or DOS.
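The "load it and start it" step can be as small as casting the load address to a function pointer; a sketch, where the address is made up for illustration:

    /* Sketch: after copying a program image into RAM, jump to its entry
       point. 0x20000000 is a made-up load address for illustration. */
    void (*entry)(void) = (void (*)(void))0x20000000;
    entry();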
It's a fun road to go down for students who really want to understand computer systems.
It is a small kernel: everything from the bootloader up to running ELF files.
It has like 10 syscalls if I remember correctly.
It is very fun, and it really makes you understand the ton of legacy support still in modern x86_64 CPUs and what the OS underneath is doing with privilege levels and task switching.
I even implemented a small ROM for it with an interactive ocarina from Ocarina of Time.
LSE is the Systems Laboratory of EPITA (https://www.lse.epita.fr/)
[1] https://github.com/ChuckM/
[2] https://github.com/ChuckM/nucleo/blob/master/f446re/uart/uar...
[3] https://github.com/ChuckM/nucleo/blob/master/f446re/common/u...
Also, regardless of what others say, you can have a go at feeling what it was like to use BASIC on 8-bit computers to do everything their hardware exposed, or even on 16-bit systems like MS-DOS, but with Python.
Get an ESP32 board and have a go at it with MicroPython or CircuitPython:
https://docs.micropython.org/en/latest/esp32/quickref.html
https://learn.adafruit.com/circuitpython-with-esp32-quick-st...
I remember trying to program the Atari 8-bit with a C compiler, and writing characters directly to the ANTIC memory range, with charcode translation, was 100x faster than using printf.
However, I'm not sharing this code because it won't work over a UART... laughs nervously
I have used part of newlib with many different kinds of microcontrollers, and its build process has remained essentially the same as a quarter of a century ago, so the script that I wrote the first time, before 2000, has always worked without problems, regardless of the target CPU.
The only tricky part that I had to figure out the first time was how to split the compilation of the gcc cross-compiler into a part that is built before newlib and a part that is built after it.
However, that is not specific to newlib; it is the method that must be used when compiling a cross-gcc with any standard C library, and it has been simplified over the years, so that now there is little more to it than choosing the appropriate make targets when executing the make commands.
I have never needed to change the build process of newlib for a new system; I just needed to replace a few functions, for things like I/O peripherals or memory allocation. However, I have never used much of newlib, mostly only stdio and the memory/string functions.
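For the memory-allocation side, the usual replacement is a tiny _sbrk(); a sketch, assuming the conventional linker-script symbol `end` marking the start of free RAM:

    /* Sketch of a newlib _sbrk() stub backing malloc(); `end` is the
       conventional linker-script symbol placed after .bss. */
    extern char end;
    static char *heap_ptr = &end;

    void *_sbrk(int incr)
    {
        char *prev = heap_ptr;
        heap_ptr += incr;             /* no stack-collision check in this sketch */
        return prev;
    }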
I was talking about the migration effort and usage complexity, not what the compiler or linker actually sees. It may well be that Newlib can be configured for every conceivable application, but it was more important to me not to have such a behemoth and bag full of surprises in the project, with preprocessor rules and dependencies that a single developer can hardly understand or keep track of. My solution is lean, complete, and works with standard-conforming compilers on every platform I need.
So whatever preprocessor rules and dependencies may be needed to build the toolchain, they have no influence on the build process of the software projects used to develop applications.
The toolchain is rebuilt only when new tool versions become available, not during the development of applications.
I assume that you encountered problems because you wanted to build newlib with something other than gcc + binutils, with which it builds immediately, as delivered.
Even if for some weird reason the use of gcc is avoided for the intended application, that should not have required building newlib with something other than gcc, as it should link without problems against any other ELF object files.
Why not? Have a look at https://github.com/rochus-keller/Eigen.
> because you have desired to build newlib with something else than gcc + binutils
Well, the whole point was to make it compatible with my own C compilers.
For its intended purpose, i.e. as the piece that must be added to gcc and binutils to obtain a complete toolchain usable for cross-compiling and linking executable applications for any embedded computer, newlib works fine, with minimal headaches.
If, instead of using it as intended, you want to integrate it as a component in a new and different toolchain, then I completely agree with what you have found: it is not a good choice.
I reacted to your first comment because it seemed to imply that newlib is not fit for its purpose of being used in embedded programming applications, which is definitely false.
You tried to use it for something very different, and in that context you are right, but you should have explained more of that in order to avoid confusion.
Your project seems interesting, but, as with most such projects, you should add some rationale for its existence on the initial page, i.e. which features make it different from the better-known alternatives based on gcc or clang.
Following the links, one eventually reaches this succinct explanation:
"The Eigen Compiler Suite is a completely self-contained collection of software development tools. It exists to be recognized and adopted as a free development toolchain which is hopefully as useful and easy to use as its source code is intended to be approachable and comprehensible for developers and students wanting to learn, maintain, and customize a complete toolchain."
This does not mention any attempt to be better than the alternatives in any direction, except for being much easier to modify for someone who wants to implement some kind of compiler/linker customization.
That recommends it mostly for experimental projects, not for production projects. The former are important too, but it is good to know what it is suitable for.
It's a compiler kit, and I also added two C compilers, and of course I needed a standard library for those; it wouldn't make sense to have a separate project just for the standard library. Anyway, Newlib was not a good match for this, for the said reasons; so far that was my own conclusion. My compilers are expected to also work on embedded systems, even on bare metal.
> like in most such projects you should add on the initial page some rationale for the existence of the project
Have a look at the readmes; there is one in the root and most subdirectories.
    // QEMU UART registers - these addresses are for QEMU's 16550A UART
    #define UART_BASE 0x10000000
    #define UART_THR (*(volatile char *)(UART_BASE + 0x00)) // Transmit Holding Register
    #define UART_RBR (*(volatile char *)(UART_BASE + 0x00)) // Receive Buffer Register
    #define UART_LSR (*(volatile char *)(UART_BASE + 0x05)) // Line Status Register
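For what it's worth, a polled transmit routine on top of these registers only needs the LSR's THRE bit (bit 5 on a 16550); a sketch:

    #define UART_LSR_THRE 0x20        /* bit 5: transmit holding register empty */

    /* Sketch: busy-wait until the 16550 can take a byte, then write it */
    static void uart_putc(char c)
    {
        while (!(UART_LSR & UART_LSR_THRE))
            ;                         /* spin until THR is empty */
        UART_THR = c;
    }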
This looks odd. Why are the receive and transmit buffers at the same address, and why would you use such a weird offset? IIRC RISC-V allows that, but my gut says I'd still align this to the word size.

Backwards compatibility aside, why bother implementing additional register address decoding? Since the host never needs to read THR or write RBR, they can safely be combined. Some UARTs call this a DATA register instead.
…from 1978.
https://en.m.wikipedia.org/wiki/8250_UART
The definitions are correct; look up a 16550 datasheet if you want to lose some sanity :)
It seems like one thing to get a bare-bones printf() working to get you started on a bit of hardware, but as the complexity of the system grows you might want to move on from (say) pushing characters out of a serial interface to pushing them onto a bitmapped display.
Does newlib allow you to put different hooks in there as the complexity of the system increases?
The latter is small enough that I have used it in the past with various small microcontrollers, from ancient types based on PowerPC or ARM7TDMI to more recent MCUs with Cortex-M0+.
You just need to make the right configuration choice.
That way you can print to a serial port, an LCD display, or a log.
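A hedged sketch of that idea, with one newlib-style _write() hook fanning out by file descriptor; lcd_write() and uart_write() are assumed names, not library functions:

    /* Sketch: one output hook, several sinks; the helper names are assumed */
    extern int uart_write(const char *buf, int len);
    extern int lcd_write(const char *buf, int len);

    int _write(int fd, char *buf, int len)
    {
        if (fd == 2)
            return lcd_write(buf, len);   /* e.g. stderr to the display */
        return uart_write(buf, len);      /* everything else to the serial port */
    }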
Meaning, seriously, the standard printf is late-1970s hot garbage and no one should use it.
That is normally enough to reduce the footprint of printf by more than an order of magnitude, making it compatible with small microcontrollers.
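With newlib specifically, one such configuration choice is its integer-only printf family, which keeps the floating-point formatting code out of the link; a sketch:

    #include <stdio.h>

    /* newlib's integer-only printf variant: no FP formatting code gets linked */
    void report(int count)
    {
        iprintf("count = %d\n", count);
    }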
And it all falls apart as soon as a format string cannot be known at compile time.
Compilers do that, at least for the simple case of constant strings; gcc can compile a printf call as puts. See https://stackoverflow.com/questions/60080021/compiler-change...
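A minimal case to try; with gcc at -O1 or higher, the call typically shows up as puts in the generated assembly:

    #include <stdio.h>

    int main(void)
    {
        printf("hello\n");   /* gcc usually emits puts("hello") here; check with -S */
        return 0;
    }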
Did you mean "once I learned that no, embedded libc variants have printf"?
To clarify, as I had to check: embedded libc variants do indeed have some (possibly stripped-down) implementation of printf, and as you say they just lack the output path (hence custom output backends like UART, etc.).
    char buffer[100];
    printf("Type something: ");
    scanf("%s", buffer);
Come on, it's 2025, there's no need to write trivial buffer overflows anymore.

You see, actually, the printf() family of functions doesn't require _any_ metal, bare or otherwise, beyond the ability to print individual characters.
For this reason, a popular approach when you don't have a full-fledged standard library is a fully cross-platform implementation of the family which "exposes" a symbol dependency on a character-printing function, e.g.:
    void putchar_(char c);

and variants of the printf functions which take the character-printing function as a runtime parameter:

    int fctprintf(void (*out)(char c, void* extra_arg), void* extra_arg, const char* format, ...);
    int vfctprintf(void (*out)(char c, void* extra_arg), void* extra_arg, const char* format, va_list arg);
This is the approach taken in the standalone printf implementation I maintain, originally by Marco Paland. Cf. https://news.ycombinator.com/item?id=43811191 for other notes.
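A usage sketch for the callback variant; only fctprintf() itself comes from the library, the sink and its state are made-up names:

    /* Sketch: fctprintf() with a buffer sink threaded through extra_arg */
    struct sink { char *buf; int pos; };

    static void buf_out(char c, void *extra_arg)
    {
        struct sink *s = extra_arg;
        s->buf[s->pos++] = c;         /* no bounds check in this sketch */
    }

    /* usage:
         char tmp[64];
         struct sink s = { tmp, 0 };
         fctprintf(buf_out, &s, "x = %d", 42);
    */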
And as a preprocessor I use a simple C preprocessor (I don't want to tie the code to the preprocessor of a specific assembler): I did that for x86_64 assembly, and I could assemble with gas, nasm and fasmng (fasm2) transparently.
That said, I know in some cases it could increase performance since the code would use less memory (and certainly more things which I don't know because I am not into modern advanced hardware CPU micro-architecture design).
If you are writing assembly you are probably using compressed instructions already, since your assembler can do the substitutions transparently, e.g.

    addi a0,a0,10  ->  c.addi a0,10

Example: https://godbolt.org/z/MG3v3jx7P (the disassembly shows addi but the instruction is only two bytes). They offer a nice reduction in code size with basically no downsides :)
I'd have called it "Bare metal puts()" or "Bare metal write()" or something along those lines instead.
(FWIW, FreeBSD's printf() is quite easy to pluck out of its surrounding libc infrastructure and adapt/customize.)
(The problem with '%D' hexdumps is that it breaks compiler format checking… and also 'D' is a length modifier for _Decimal64 starting in ISO C23… that's why our hexdump is hooked in as '%.*pHX' instead [which still gives a warning because %p is not supposed to have a precision, but at least it's not entirely broken.])
What customization would it support? Say, compared to these options:
https://github.com/eyalroz/printf?tab=readme-ov-file#cmake-o...
https://github.com/FRRouting/frr/tree/master/lib/printf
Disclaimer: my work.
Customised to support %pHX, %pI4, %pFX, etc. - docs at https://docs.frrouting.org/projects/dev-guide/en/latest/logg... for what these do.
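Hedged examples of what such calls look like, going by the linked docs; the argument names and types are assumptions here:

    /* Assumed usage per the docs above; printfrr() is FRR's printf frontend,
       the variables are made-up examples */
    printfrr("peer %pI4 announced %pFX\n", &peer_addr, &prefix);
    printfrr("payload: %.*pHX\n", (int)len, data);   /* hex dump, length via precision */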
> What customization would it support?
I don't understand your question. It's reasonably readable and understandable source code. You edit the source code. That's the customisation?
> Say, compared to these options: https://github.com/eyalroz/printf?tab=readme-ov-file#cmake-o...
First, it is customary etiquette to indicate when linking your own code/work.
Second, that is not a POSIX-compatible printf: it lacks support for '%n$' positional arguments, which are used primarily for localisation (see the sketch after this list). Arguably it can make sense to omit that for tiny embedded platforms, but then why is there FP support?
Third, CMake and build options really seem to be overkill for something like this. Copy the code into the target project and edit it. If you use your own printf, you probably need a bunch of other custom stuff anyway.
Fourth, the output callback is a reasonable idea, but somewhat self-contradictory. You're bringing in your own printf; just adapt it to your own I/O backend, the way libc has FILE*.
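Regarding '%n$', a small illustration of why translators want it: the format string can reorder arguments without the call site changing (a POSIX extension, e.g. in glibc, not plain C99):

    #include <stdio.h>

    int main(void)
    {
        const char *name = "table";   /* sample data for illustration */
        int count = 3;
        /* same varargs, order chosen by the (possibly translated) format string */
        printf("%1$s has %2$d entries\n", name, count);
        printf("%2$d entries live in %1$s\n", name, count);
        return 0;
    }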
I meant customization where you don't have to write the customized code yourself; you just choose some build options, or at most set preprocessor variables.
> First, it is customary etiquette to indicate when linking your own code/work.
You're right, although I was only linking to the table of CMake options. And it's only partially my code, since I'm the maintainer rather than the original author.
> You're bringing in your own printf; just adapt it to your own I/O backend, the way libc has FILE*.
One can always do that, but - with the output callback - you can bring in an already-compiled object, which is sometimes convenient.
> If you use your own printf, you probably need a bunch of other custom stuff anyway.
My personal use case (and the reason I adopted the library) was printf deficiencies in CUDA GPU kernels, and I really needed nothing other than the printf functions. Other people just use sprintf to format the output of their mostly, or wholly, self-contained functions which write to buffers and such. Different strokes for different folks, etc.
But - I will definitely check out the link.
> Second, that is not a POSIX-compatible printf: it lacks support for '%n$' positional arguments, which are used primarily for localisation.
That is true, but C99 printf and C++ printf do not support that either. ATM the aim is completing C99 printf support (when I actually work on the library, which is not that often), so my priority would be FP denormals and binary FP (with "%a") before other things.
> Arguably it can make sense to omit that for tiny embedded platforms, but then why is there FP support?
It's there because people wanted it / needed it; and so far, there's been no demand for numbered position specification.
Honestly, if you're shying away from customising a 1-2 kloc piece of code, you probably shouldn't be using a custom printf().
Case in point: function pointers are either costly or even plain unsupported on GPU architectures. I would speculate that you aren't using the callbacks there?
Well, it was good enough for the arduino SDK to adopt: https://github.com/embeddedartistry/arduino-printf
> function pointers are either costly or even plain unsupported on GPU architectures
When you printf() from a GPU kernel, your performance is shot anyway, so performance is not a consideration. And function pointers work, as long as they all get resolved before runtime and you don't try to cross CPU <-> GPU boundaries.
> Well, it was good enough for the arduino SDK to adopt: https://github.com/embeddedartistry/arduino-printf
Well, they didn't shy away from customizing it quite a bit ;)
To be clear, I was trying to say that it doesn't make too much sense to package this as an independent, "easy to use" "library" with a handful of build options; not that it's somehow not "good enough".
Put another way: a situation where you need or want a custom printf is probably a situation where a package like this doesn't exactly help you anyway, and you'll need to muck with it regardless. But the code can be used, which is exactly what the repo you linked did.
TBH I'd use qemu if I had to make something work for arbitrary code.
https://wiki.gentoo.org/wiki/Crossdev
But there are others.
Poopy garbage dog poop.
glibc is a dumpster fire of bad design. If you want to cross-compile for an arbitrarily old version of glibc then... good luck. It can be done, but it's nightmare fuel.
but I can answer with reasonable confidence "musl surely has other problems, but not this one". It's a nice, clean, simple, single set of headers and source files. Very nice.
It should be trivial to compile glibc with an arbitrary build system for any target Linux platform from any OS. For example, if I'm on Windows and I want to build a program that targets glibc 2.23 for Ubuntu on x86_64, that should be easy peasy. It is not.
glibc should have ONE set of .c and .h files for the entire universe. There should be a small number of #define macros that users specify to build whatever weird-ass flavor they need, and these macros should be plainly defined in a single header file that anyone can look at.
But glibc is a pile of garbage that has generated files for every damn permutation in the universe. This is not necessary; it's a function of bad design. Code should NEVER EVER EVER have a ./configure step that generates files for the local platform. EVER.
Read this blog post to understand the mountains that Zig moved to enable cross-compilation. It's insane. https://andrewkelley.me/post/zig-cc-powerful-drop-in-replace...
Code generation aside, this is not really a great way to do it either. The build should include target-specific files, as opposed to creating a maze of ifdefs inside the code.
Hard disagree. What makes you think it's a maze of ifdefs?
Compiling a library should be as simple as "compile every .c/.cpp file and link them into a static/shared lib". The nightmare maze is when you don't do that and you need to painfully figure out which files you should and shouldn't include in this particular build. It's horrible.
Far, far simpler is to stick with the rule "always compile all files". It's very simple to ifdef out an entire file with a single, easily understood ifdef at the top, as in the sketch below.
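A hypothetical illustration of that pattern; the macro name is made up, the build defines it, and every build still compiles every file:

    /* time_win32.c -- compiled on every platform, active on one.
       PLATFORM_WIN32 is a hypothetical macro supplied by the build. */
    #ifdef PLATFORM_WIN32
    #include <windows.h>

    unsigned long monotonic_ms(void)
    {
        return GetTickCount();        /* Win32 implementation */
    }
    #endif /* PLATFORM_WIN32 */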
I do agree you don't want the middle of a file to be full of 10 different ifdef cases. There's an art to whether to switch within a file or produce different files; no hard and fast rule there.
Fundamentally, you put your branching logic either in the code or in the build system. And given that source files should be compatible with an unbounded array of potential build systems, it is therefore superior to solve the problem once, in the code itself.
I am currently trying to get the Zig glibc source/headers to compile directly via clang and trying to figure out which files to include or not include is infuriatingly unclear and difficult. So no, I strongly and passionately disagree that build-system is where this logic should occur. It's fucking awful.
It's like it was done on purpose.