- org -- bootstrappable.org project #bootstrappable on libera.chat
A full source bootstrapping for all free software programs (starting with GuixSD)
A package in GNU Guix is uniquely identified by the hash of its source code, its dependencies, and its build recipe.
Every package can be built from source, except for the bootstrap binaries.
The distribution is fully “bootstrapped” and “self-contained”: each package is built based solely on other packages in the distribution.
The root of this dependency graph is a small set of “bootstrap binaries”, provided by the ‘(gnu packages bootstrap)’ module
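To make the "identified by the hash of its source code, its dependencies, and its build recipe" idea concrete, here is a minimal sketch. It is not Guix's actual derivation hashing; the function name `package_id` and the field layout are assumptions made up for illustration.

```python
import hashlib
from typing import Iterable


def package_id(source_hash: str, dependency_ids: Iterable[str], recipe: str) -> str:
    """Hypothetical identifier derived from source, dependencies and recipe."""
    h = hashlib.sha256()
    # Sort the dependencies so that their listing order does not matter.
    for part in [source_hash, *sorted(dependency_ids), recipe]:
        data = part.encode("utf-8")
        # Length-prefix each field so field boundaries stay unambiguous.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()
```

Changing any one input (a byte of source, one dependency, one line of the recipe) yields a different identifier, which is why the bootstrap binaries at the root matter so much: everything above them inherits their identity.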
$ du -schx
“recipe for yoghurt: add yoghurt to milk”
Because nation states work to compromise critical infrastructure and undermine critical freedoms (including software), we don’t know if we can even trust our compilers.
We can’t collaborate to verify if we are running the same software if our build processes don’t produce identical results.
We need to stop ignoring the problem and, as a community, work to fulfill our responsibilities to our community and promote the trust and good will that made our communities places we wanted to become members of.
We need to stop feeling powerless to the problems in this world. Pick up a shovel and do some serious damage to the problems we want gone.
I want code that is easy to reason about at the heart of this bootstrap, so that everyone will be able to sit down in the morning and be done by lunch time, understanding how every piece of it works.
Working up from the smallest possible source, create the foundation upon which we depend, in a clean bootstrapping process that is auditable and stable.
A universal core bootstrap that produces identical results across arbitrary hardware and software foundations. https://savannah.nongnu.org/projects/stage0
Using a hardware specification that was implemented back in the 1970s in TTL and reduced down to the essentials, we give ourselves an alien hardware platform to verify the stage0 steps for x86 bootstraps and to know if a Nexus Intruder attack occurred in any of the steps.
- hex0 monitor (280 bytes) ensures you don’t need a text editor or any other
software, period. Trivial to make by hand (toggling in bytes if you want) or using a trivial program of your own written in any language you desire.
- hex0 assembler (260 bytes) Only supports line comments (# and ;) [Could be
smaller and if you trust your text editor, you can use this as the bootstrap instead of the hex monitor]
- hex1 assembler (488 bytes) written in hex0 and provides single char labels and
relative displacements only (16bit for knight-vm, 32bit for i386 and amd64)
- hex2 assembler (1036 bytes) written in hex1 and provides long labels, adds
absolute addresses and the missing set (8bit relative, 16bit absolute and 16bit relative and 32bit absolute)
- M0 macro assembler (1792 bytes) written in hex2; it allows arbitrary
definitions (like DEFINE ADD 05000 or DEFINE ADDI32_to_RAX 4805), which can then be used to write programs (thus it can support Knight, x86 and ARM assembly)
- cc_x86 (16,370 bytes) written in M0 and now allows C syntax, structs, unions,
inline assembly, gotos and other standard C goodies.
- M2-Planet (64,011 bytes) written in the subset of C that cc_x86 can compile and
is capable of self-hosting. Weighing in at 1910 lines of C Code and slowly expanding in terms of functionality.
- M2-Mesoplanet (128,366 bytes) written in the subset of C that M2-Planet can
compile is a more standard C preprocessor and simplifies compilation, assembly and linking.
- mescc-tools (M1 [67,186 bytes], hex2 [67,109 bytes], blood-elf [49,601 bytes],
get_machine [34,125 bytes] and kaem [79,340 bytes]) a crossplatform assembler, linker, dwarf stub generator, a tool for detecting what architecture we are running on and a basic shell which provides enough functionality to drive any further bootstrap required.
- mescc-tools-extra (catm [22,649 bytes], chmod [36,610 bytes], cp [43,447 bytes],
match [33,717 bytes], mkdir [34,913 bytes], replace [39,649 bytes], rm [34,062 bytes], sha256sum [50,008 bytes], unbz2 [61,587 bytes], ungz [62,265 bytes] and untar [46,574 bytes]) finishing off the stage0 steps, we provide some tools to enable a semi-comfortable development basis. The ability to combine files, mark them as executable, copy files around, support conditional paths in kaem scripts, make folders, perform sed like file alterations, delete files, checksum everything, unpack bz2, gz and tar archives (allowing much more standard package sources).
- Stage0 FORTH (4008 bytes) written in M0 macro assembly and extends itself in
its own FORTH Primitives and has a slowly growing initial library (approaching GFORTH parity thanks to reepca). No FORTH programs of real use in bootstrapping have been created.
- Stage0 Lisp (8400 bytes) written in M0 macro assembly. Supports all of the
LISP primitives defined in McCarthy’s 1960 paper [Turns out he missed many essential things] with some modern improvements like Lexical scope, let expressions and raw string support. Turns out you need proper LISP macros in order to produce something useful in bootstrapping. Adding LISP macros in assembly simply is a task no one wants to do.
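The first step of the list above is small enough to sketch in a few lines. This is an illustration of the hex0 idea (hex digit pairs become bytes, `#` and `;` start line comments), not the stage0 implementation; skipping every other character silently is an assumption of this sketch.

```python
def hex0(source: str) -> bytes:
    """Toy hex0: turn pairs of hex digits into bytes, ignoring comments."""
    out = bytearray()
    high = None          # pending high nibble, if any
    in_comment = False
    for ch in source:
        if in_comment:
            # A line comment runs until the end of the line.
            if ch == "\n":
                in_comment = False
            continue
        if ch in "#;":
            in_comment = True
            continue
        if ch in "0123456789abcdefABCDEF":
            nibble = int(ch, 16)
            if high is None:
                high = nibble
            else:
                out.append((high << 4) | nibble)
                high = None
        # Anything else (whitespace and so on) is skipped in this sketch.
    if high is not None:
        raise ValueError("odd number of hex digits")
    return bytes(out)
```

For example, `hex0("7F 45 4C 46 # ELF magic\n")` yields the four bytes `b"\x7fELF"`. Everything later in the chain (hex1, hex2, M0) only adds conveniences on top of this.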
- VHDL Knight-vm on FPGA
- Knight on TTL with manually punched paper tape (Game over Trusting trust
attack/Nexus Intruder attack)
- Simply verify our sha256sum’d steps produce identical binaries on your weird
shit (git clone https://git.savannah.nongnu.org/git/stage0.git && cd stage0 && make && make test)
- Or help porting your architecture to stage0-posix or stage0-uefi
(https://github.com/oriansj/stage0-posix and https://git.stikonas.eu/andrius/stage0-uefi)
- Find/report bugs
- Audit stage0
- Do something cool
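The checksum verification mentioned in the list above boils down to comparing digests of what you built against published ones. A minimal sketch, assuming the artifacts are already in memory; `verify` and the dictionary layout are hypothetical names, and a real run would read the built files and the project's own checksum lists instead.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify(artifacts: dict, expected: dict) -> list:
    """Return the names whose digest is missing or differs from the expected one."""
    bad = []
    for name, want in sorted(expected.items()):
        data = artifacts.get(name)
        if data is None or sha256_of(data) != want:
            bad.append(name)
    return bad
```

An empty result means every step reproduced bit for bit; any name in the list is worth reporting.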
A port of Stage0 to Linux (i386, AMD64, armv7l, AArch64, RISC-V 32 and 64bit) using ELF format binaries https://github.com/oriansj/mescc-tools
legacy piece no longer required.
A universal cross-platform linker buildable via M2-Planet, with support for absolute addressing, long labels, multiple offset sizes, arbitrary base addresses and of course line comments. It is written in a subset of C, is bootstrapped by M2-Planet in stage0-posix and supports Knight, x86, AMD64, armv7l and AArch64.
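Label resolution with a configurable base address can be sketched as a two-pass walk over the tokens. This is a toy, not hex2: it only knows `:name` for defining a label and `&name` for a 32-bit absolute address, and the little-endian output is an assumption of the sketch.

```python
def link(source: str, base: int = 0) -> bytes:
    """Toy linker: hex bytes, ':label' definitions, '&label' 32-bit absolute addresses."""
    tokens = source.split()

    # Pass 1: record the address of every label.
    labels = {}
    address = base
    for tok in tokens:
        if tok.startswith(":"):
            labels[tok[1:]] = address
        elif tok.startswith("&"):
            address += 4
        else:
            address += len(tok) // 2

    # Pass 2: emit bytes, replacing references with resolved addresses.
    out = bytearray()
    for tok in tokens:
        if tok.startswith(":"):
            continue
        if tok.startswith("&"):
            out += labels[tok[1:]].to_bytes(4, "little")
        else:
            out += bytes.fromhex(tok)
    return bytes(out)
```

Because addresses are collected before anything is emitted, forward references such as `&end 90 :end` resolve just as well as backward ones.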
A universal cross-platform macro assembler, with support for DEFINE line-macros, raw strings, hex literals, numerics, alignment operations, padding operations and arbitrary byte and bit endianness of output. It is written in a subset of C, is bootstrapped by M2-Planet in stage0-posix and supports Knight, x86, AMD64, armv7l and AArch64.
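The DEFINE line-macro idea is plain token substitution. A minimal sketch, assuming whitespace-separated tokens and `#` line comments only; raw strings, numerics, alignment and padding are left out, and `expand` is a made-up name.

```python
def expand(source: str) -> str:
    """Toy macro pass: collect 'DEFINE NAME HEX' lines, then substitute names."""
    macros = {}
    out = []
    for line in source.splitlines():
        # Drop a trailing '#' line comment, then split into tokens.
        words = line.split("#", 1)[0].split()
        if len(words) == 3 and words[0] == "DEFINE":
            macros[words[1]] = words[2]
            continue
        # Unknown tokens pass through untouched for the next stage.
        out.extend(macros.get(w, w) for w in words)
    return " ".join(out)
```

With the two definitions quoted earlier, the input `ADDI32_to_RAX 2A000000` followed by `ADD` expands to `4805 2A000000 05000`, which is the kind of hex stream a linker like hex2 then consumes.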
Since debugging is painful when gdb and objdump have no idea how to handle M1-macro files, blood-elf creates a DWARF footer segment, itself in M1-macro format, from an M1-macro input. Not actually needed in bootstrapping but rather helpful for those wishing to develop in M1-macro assembly. It is written in a subset of C, is bootstrapped by M2-Planet in stage0-posix and supports all 32-bit and 64-bit ELF targets.
Since mescc-tools is cross-platform and hardware neutral, automatic tests that execute generated binaries will always fail on some host; this program exists to allow hardware specific tests to be run on generated binaries, e.g. have your i386 tests run on your i386 hardware but not on your ARM, SPARC or RISC-V board. Not actually needed in bootstrapping but rather helpful for those wishing to have proper tests for their M1-macro programs. It is written in a subset of C, is bootstrapped by M2-Planet in stage0-posix and supports all POSIX hosts (if we don’t support yours let us know).
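The gating described here can be sketched as "ask the host what it is, then only run the tests built for it". This illustrates the idea rather than get_machine itself; `should_run`, `select_tests` and the architecture strings are assumptions, and `platform.machine()` stands in for the real detection.

```python
import platform


def should_run(test_arch: str, host_arch: str = "") -> bool:
    """Run a test only when it was built for the architecture we are on."""
    host = host_arch or platform.machine()
    return test_arch.lower() == host.lower()


def select_tests(tests: dict, host_arch: str = "") -> list:
    """tests maps a test name to the architecture its binary targets."""
    return sorted(name for name, arch in tests.items()
                  if should_run(arch, host_arch))
```

Passing `host_arch` explicitly keeps the selection logic testable on any machine; leaving it empty uses whatever the current host reports.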
- Add support for more architectures
- Port mescc-tools to your weird hardware/Operating system combinations.
- Write tests for alternate hardware targets
- Find bugs
A PLAtform NEutral Transpiler that happens to look and behave enough like C that you can do development in GCC and use M2-Planet to compile the result. https://github.com/oriansj/M2-Planet
knight-native, knight-posix, x86, AMD64, ARMv7l, AArch64, RISC-V 32 and 64bit
- void
- void*
- int
- int*
- unsigned
- unsigned*
- char
- char*
- char const
- char const*
- long
- long*
- SCM (unsigned long)
- SCM* (unsigned long*)
- FUNCTION (void (FUNCTION) ()
- FUNCTION (void* (*FUNCTION) ())
- any struct you wish to define (with unions or arrays supported as well)
- Pointers to any struct you wish to define
- typedef statements
All in a trivial to understand implementation https://github.com/oriansj/M2-Planet/blob/master/cc_types.c
All in a trivial to understand implementation https://github.com/oriansj/M2-Planet/blob/master/cc_strings.c
M2-Planet in --bootstrap-mode supports 2 types of comments: /* Stuff */ block comments and # Stuff line comments
and in order to maximize compatibility with C, M2-Planet does something funny with C line comments: // code; is actually compiled by M2-Planet, thus allowing M2-Planet specific code to be embedded in your C sources.
It and any other odd parsing behavior can be found in the rather trivial parser https://github.com/oriansj/M2-Planet/blob/master/cc_reader.c
M2-Planet is written using only a subset of the features that it supports https://github.com/oriansj/M2-Planet/blob/master/cc_core.c
The only parts of the C language not supported are C macros, switch statements and features that are not useful in bootstrapping and thus are ignored (until someone comes up with a reasonable use case)
M2-Planet supports inlining M1-macro assembly within functions via asm(..); and support for CONSTANT FOO 4 statements (replacing #define FOO 4) and CONSTANT CELL_SIZE sizeof(struct cell) (replacing #define CELL_SIZE 1) eliminates the need for a C preprocessor entirely.
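When porting existing C to this style, the simple object-like defines can be rewritten mechanically. A hypothetical helper (not part of M2-Planet) that sketches the mapping from `#define NAME value` to `CONSTANT NAME value`; function-like macros and valueless defines are deliberately left untouched.

```python
import re

# Matches only object-like macros: a name, whitespace, then a value.
_DEFINE = re.compile(r"^\s*#define\s+([A-Za-z_]\w*)\s+(.+?)\s*$")


def to_constant(line: str) -> str:
    """Rewrite '#define NAME value' as 'CONSTANT NAME value'; pass other lines through."""
    m = _DEFINE.match(line)
    if m is None:
        return line
    return "CONSTANT {} {}".format(m.group(1), m.group(2))
```

A line such as `#define MAX(a, b) ...` has no whitespace between the name and the parenthesis, so it does not match and is returned unchanged for a human to deal with.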
- Porting to SPARC
- Porting to MIPS
- Porting to PowerPC
- Porting to z80 (maybe)
- Porting to 6502 (maybe)
- Port to your personal architecture
- Find bugs
- Improve documentation
- Send patches
- Port to your weird hardware
A late stage bootstrap core component that ensures that once you have achieved a certain minimal floor, you have a solid path to producing GCC and thus everything you desire. https://gitlab.com/janneke/mes https://git.savannah.gnu.org/git/mes.git
A scheme interpreter prototyped in C (~5000 lines) that stands as our baseline target of minimal functionality. If you can build this or provide equivalent functionality, you are good to go. This is buildable by M2-Planet or a better C compiler.
Provided a reasonable scheme exists and is functional, we leverage that to provide a C compiler written in Scheme (uses Nyacc C99 parser in Scheme) that is the core of this project and is the path to full GCC bootstrapping. mescc along with mescc-tools are capable of self bootstrapping.
Not Yet Another Compiler Compiler is a set of Guile modules for generating parsers and lexical analyzers. https://savannah.nongnu.org/projects/nyacc
A Guile replacement for shell+binutils that can in the future run on GNU Mes https://savannah.gnu.org/projects/gash
- awk.scm
- basename.scm
- cat.scm
- chmod.scm
- cmp.scm
- command.scm
- compress.scm
- cp.scm
- cut.scm
- diff.scm
- dirname.scm
- expr.scm
- false.scm
- find.scm
- grep.scm
- gzip.scm
- head.scm
- ln.scm
- ls.scm
- mkdir.scm
- mv.scm
- printf.scm
- pwd.scm
- reboot.scm
- rmdir.scm
- rm.scm
- sed.scm
- sleep.scm
- sort.scm
- tar.scm
- test.scm
- touch.scm
- tr.scm
- true.scm
- uname.scm
- uniq.scm
- wc.scm
- which.scm
A lovingly crafted work of art by Fosslinux and Stikonas with a good deal of help from numerous great individuals which took stage0 and Gnu Mes and ran it all the way to a complete Linux distribution without requiring any generated files of any kind.
https://github.com/fosslinux/live-bootstrap
This solid piece of art can be the basis of your distro, bootstrap yourself today.
A very clever route to bootstrapping Gnu Guile without requiring pregenerated files by Michael Schierl. Definitely worth a read and a solid reminder that even GNU tools might have bootstrapping problems.
https://github.com/schierlm/guile-psyntax-bootstrapping
A brilliant effort by Richard R. Masters: a 384 byte bootloader (written in hex0) builds an elegant 3.5KB POSIX kernel written in hex0, which builds and spawns hex0 and kaem-optional and runs all of the steps of stage0-posix/live-bootstrap until it builds TCC and bootstraps a bigger and more standard kernel written in the C that TCC can build. https://github.com/ironmeld/builder-hex0
See Current\ bootstrap\ map.pdf
done
- stage0/stage0-posix/stage0-uefi are able to build mes.c
- mescc can build TCC and the path to modern software is done
- builder-hex0 is making progress on sorting out the zero to Linux path
- Gash is progressing and growing nicely
- programmers to port the steps to more architectures
- report bugs, issues, concerns or recommendations
- testing and finding issues with our documentation (we are human after all)
- We need people willing to improve documentation (art would be nice)
- People to tell us all the ways things are broken and we can make it better
- Every possible port of mescc-tools is buildable by every other possible
mescc-tool port and thus forces any hardware/software trusting trust attack to compromise all past, present and future hardware platforms, including those that are made for fun out of TTL logic: http://cpuville.com/Projects/Original-CPU/Original-CPU-home.html or even those made out of individual transistors: https://monster6502.com/ or should someone wish http://web.cecs.pdx.edu/~harry/Relay/ using electromechanical relays.
- Porting of stage0 and mescc-tools to alternate platforms becomes a
straightforward mechanical exercise.
- M2-Planet is trivial to modify to support alternate hardware platforms and
thus function as a cross-platform, self-hosting compiler.
- M2-Planet’s output is 100% deterministic and easily predictable; even major
code changes result only in differences directly related to the changed code block.
- no operating system is required until long after we bootstrap some
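The first point in the list above (every port can build every other port) suggests a simple check: build the same target on several unrelated platforms and see whether the outputs agree. A minimal sketch with made-up names; the builder labels and the in-memory outputs are assumptions for illustration.

```python
import hashlib


def group_by_digest(outputs: dict) -> dict:
    """outputs maps a builder name to the bytes it produced."""
    groups = {}
    for name, data in outputs.items():
        digest = hashlib.sha256(data).hexdigest()
        groups.setdefault(digest, []).append(name)
    return {digest: sorted(names) for digest, names in groups.items()}


def all_agree(outputs: dict) -> bool:
    """True when every builder produced byte-identical output."""
    return len(group_by_digest(outputs)) <= 1
```

More than one group means at least one platform produced something different, and the grouping shows which builders side with which result. That is where an attacker would have to compromise every platform at once to stay hidden.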
- Low level encoding details need to be figured out for various architectures
- Poorly thought out instruction encodings make for low density binaries (AArch64, RISC-V, etc)
- Requires large amounts of largely mechanical effort
- Very very few developers or contributors
#bootstrappable and #guix on libera.chat, via bootstrappable.org, via our mailing list: [email protected] or, if issues are entirely MesCC only, [email protected]
- Because FORTH developers have not contributed more.
- FORTH programmers would find collapseOS and duskos more to their liking: https://git.sr.ht/~vdupras/duskos
- Because you did not write it yet or make any useful bootstrapping programs in
it either.
- Probably because it sucks at bootstrapping or requires a great deal of work that
no one wants to do.
- The good news is this is simple to port to arbitrary hardware, so the cost
needed to bootstrap hardware you designed yourself is lower than ever.
- There is nothing we can do in terms of software that eliminates the risk of
Nexus Intruder program class hardware subversion; as the only solution to that risk is to have your own trusted lithography fabrication plant that is run using only hardware/software that you know is trusted and uncompromised.
- libresilicon is honestly the only path forward currently
- If you want us to support your hardware platform, you need to have a reasonable
hardware target and provide the documentation and testing required.
- We can’t port to hardware we don’t have
- stage0-posix is the operating system/hardware specific port of stage0
- ELF is not actually required for stage0
- BIOS level versions of stage0 are possible by simply rewriting the
syscalls into BIOS calls, removing the ELF header, adjusting the base address and adding the standard PC boot signature (0xAA55)
- The Knight implementations found in stage0, which run on bare metal, lack such trivia.
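The boot signature step mentioned above is mechanical: a classic PC boot sector is 512 bytes, and its last two bytes hold 0xAA55 in little-endian order (0x55 then 0xAA). A minimal sketch of that padding step only; `make_boot_sector` is a hypothetical helper, and rewriting syscalls into BIOS calls is outside its scope.

```python
SECTOR_SIZE = 512
# 0xAA55 stored little-endian: byte 0x55 followed by byte 0xAA.
BOOT_SIGNATURE = (0xAA55).to_bytes(2, "little")


def make_boot_sector(code: bytes) -> bytes:
    """Pad raw machine code to one sector and append the boot signature."""
    if len(code) > SECTOR_SIZE - 2:
        raise ValueError("code does not fit in one boot sector")
    padding = bytes(SECTOR_SIZE - 2 - len(code))  # zero fill
    return code + padding + BOOT_SIGNATURE
```

For instance, `make_boot_sector(b"\xeb\xfe")` wraps a two-byte x86 "jump to self" loop into a sector a BIOS would accept as bootable.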
- We completely agree; however writing 79,000+ custom bootstraps isn’t viable yet.
- So that is why we are bootstrapping hardware (Knight and x86 later)
- Because Jeremiah wrote cc_x86 before we found any
- Because BDS-c is the only other C compiler ever written in assembly
and it supports fewer features useful in bootstrapping than cc_* and is DOS/CPM only.
- Even the original Unix C compiler was not written in assembly.
- It doesn’t address the problem of the trusting trust attack
- It would take far, far longer than we are willing to spend.
It was the old name for stage0-posix as originally it was just binary blobs created by janneke to start work on MesCC’s bootstrap in Guix. Jeremiah took it over and converted it first to generated M1 programs, then handwritten pieces and expanded it until it covered all the steps from hex0 to M2-Planet+mescc-tools; effectively becoming a full POSIX port of stage0, hence the name change.
- For the current work (stage0-posix), you are absolutely correct.
- But not for the stage0 work running on Knight as all those pieces run on bare
metal or stage0-uefi.
- Builder-hex0 is 4KB of hex0, you can audit that.
- You are absolutely correct.
- However writing the pieces without depending on the bios is VERY hardware
specific
- Common hardware also has firmware measured in MB which we can’t yet replace.
- Well one needs a bootloader on modern hardware (I can’t avoid this)
- A minimal posix is needed to provide syscalls enabling easier bootstrapping
(don’t want to avoid this)
- A minimal shell is needed to execute the bootstrapping steps while running as init
(Can’t avoid this on a posix)
- A hex0 assembler (Can’t avoid this in this sort of bootstrap)
- We can try to reduce their size and enable the generation of those files
locally from source.
- Yep, there currently is nothing we can do about it
- Anyone who wants to work on that is more than free to do so.