Hello everyone! This post is about why executables are so big, and how to make them smaller. Before we start, a few notes on the setup: I am testing on a 64-bit Linux machine, using the official GCC, .NET and NASM toolchains.

Let's begin with a high-level language. I will start with a self-contained C# executable that looks like this:

    using System;

    class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Hello, world!\n");
        }
    }

If I compile it with .NET 8.0, the output is about 56 MB. That's pretty large. Let's try the .NET native (ahead-of-time) compiler that came out recently. It gives an output of about 1.40 MB! That's very low, given that C# is a very high-level language. But 1.40 MB for Hello World is still a lot, so let's go smaller!

I rewrote the same program in C, and it looks like this:

    #include <stdio.h>

    int main() {
        printf("Hello, world!\n");
    }

The resulting ./a.out file is about 15.1 KB. Why is a simple executable like this over 15 kilobytes? The reason lies in its dependencies, mainly the dependency on the C standard library (libc). Libc is a very big library, and even though we only used stdio, linking against it still adds up to those 15 KB: the library itself is loaded at run time, but the startup code and dynamic-linking information that come with it end up in the file.

So, how can we make it smaller? We can try a different compiler first. I tried clang from LLVM, but its output was about 20 bytes bigger, so that doesn't help. What else can we do? Well, we can discard libc altogether by using Linux syscalls and assembly. I wrote this simple hello world program in assembly:

    SECTION .data
    ; the label is msg rather than str, because str is also an x86 instruction mnemonic
    msg: db "Hello, World!", 0
    len: equ $ - msg            ; length, including the trailing 0

    SECTION .text
    global _start
    _start:
        mov eax, 4              ; sys_write (32-bit int 0x80 interface)
        mov ebx, 1              ; file descriptor 1 = stdout
        mov ecx, msg
        mov edx, len
        int 0x80

        mov eax, 1              ; sys_exit
        mov ebx, 0              ; exit status 0
        int 0x80

As you can see, the source code itself is now longer, but what about the file size? After assembling and linking it, the file size is about 8.5 KB.
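For comparison, the write system call used in the assembly listing can also be issued from C without going through stdio. This is only a minimal sketch, assuming a POSIX system: `say_hello` is a hypothetical helper name, and `write` here is libc's thin wrapper around the syscall, so this alone does not remove the libc dependency.

```c
#include <assert.h>
#include <unistd.h>   /* write(2), ssize_t */

/* Hypothetical helper: sends the greeting straight to a file descriptor,
 * bypassing printf and stdio buffering. Returns the number of bytes
 * written, or -1 on error, exactly as write(2) does. */
ssize_t say_hello(int fd)
{
    static const char msg[] = "Hello, world!\n";

    assert(fd >= 0);  /* a negative descriptor is never valid */

    /* sizeof counts the terminating NUL, which should not be written. */
    return write(fd, msg, sizeof msg - 1);
}
```

Calling `say_hello(1)` from `main` prints the greeting on standard output; 1 is the same file descriptor the assembly version passes in `ebx`.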
But wait, shouldn't the size of the assembly executable be measured in bytes, now that libc is gone? Well, yes, but not really. The thing taking up most of the space right now is the ELF structure. Sadly I can't show you the binary itself due to the text format of this post, but the executable contains about 8 KB of zero padding. And that's a lot. Without the padding, the size comes to about 480 bytes.

But how could we make it smaller? First of all, we can try changing platforms. I compiled the same C# program you saw earlier, but for the Windows PE format, using Mono this time. The executable was just about 3 KB! Looking at the binary, though, the amount of zero padding is also much smaller, so the code itself without padding is about 2.1 KB. That is still more than the 480 bytes we had on Linux. This means we cannot rely on any OS anymore, because the 480-byte file is about the smallest we will get on Linux this way.

So, let's go to real mode! The most we can use here is 512 bytes, the size of a boot sector, before we would need to load more from disk. So, let's write a 16-bit program for it! First of all, we need the string itself. I will define it like this:

    msg: db "Hello, world!"

This defines the hello world string. I wrote the rest of the program like this:

    bits 16
    org 0x7c00                  ; the BIOS loads the boot sector at address 0x7C00

        xor ax, ax
        mov ds, ax              ; DS = 0, so DS:SI addresses the string correctly
        mov si, msg             ; SI points at the current character
    loop1:
        mov ah, 0x0e            ; BIOS teletype output
        mov al, byte [si]
        int 0x10
        inc si                  ; advance to the next character
        cmp si, msgend
        jne loop1
        jmp $                   ; hang forever

    msg: db "Hello, world!"
    msgend:

    times 510 - ($-$$) db 0     ; pad the sector up to 510 bytes
    dw 0xAA55                   ; boot signature

And what about the file size (the actual content, not counting the zero padding)? It's 35 bytes. And this is truly the smallest you can go with a hello world program.

But can you go smaller? Yes, in fact you can go down to just 2 bytes. Obviously it's not hello world anymore; the entire code consists of one instruction:

    jmp $

This is just an infinite loop, with nothing else. Is this truly the smallest executable you can ever run? Not really. I took my breadboard and 2 transistors, and made the simplest instruction: the NOT instruction, consisting of only 1 bit. And that's it!
You simply can't go below 1 bit with the program still being run by electricity. So, I can now comfortably say I made the smallest program that will ever be achieved.
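As a side note, the "size without padding" figures quoted earlier can be approximated by counting the non-zero bytes of a file. Here is a rough sketch in C. Both function names are hypothetical helpers, and treating every zero byte as padding is an assumption that will undercount somewhat, since real headers and code contain zero bytes too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes in buf[0..len) that are not zero. */
size_t count_nonzero_bytes(const unsigned char *buf, size_t len)
{
    size_t count = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            count++;
    }
    return count;
}

/* Hypothetical helper: streams a file through count_nonzero_bytes.
 * Returns the non-zero byte count, or -1 if the file cannot be read. */
long count_nonzero_in_file(const char *path)
{
    unsigned char chunk[4096];
    size_t got;
    long total = 0;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
        total += (long)count_nonzero_bytes(chunk, got);
    if (ferror(fp)) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return total;
}
```

Calling `count_nonzero_in_file` on an executable and comparing the result with the size on disk gives a feel for how much of the file is padding.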