Hello everyone!
Welcome to another of my posts. This one is about computer (mainly CPU and GPU) architectures.

So, what exactly is a computer architecture? It's difficult to explain in one sentence, so let's get there from the basics. First of all, it should be noted that computer architecture isn't singular. There's probably more than one architecture inside a single computer, and there's pretty much no upper limit. So there's, for example, the CPU architecture, the GPU architecture, and the architectures of all the other _PUs. For example, NVIDIA GPUs are built around what NVIDIA calls the CUDA architecture, while CPUs commonly use something like x86 or ARM.

And what exactly does a computer architecture consist of? Well, for developers it's mostly the instruction set plus some specs of the architecture. Roughly speaking, the smaller the instruction set, the fewer transistors the CPU needs to decode and execute it, and therefore the less power it tends to consume. That's why reduced instruction set (RISC) architectures such as ARM and RISC-V were created. For example, x86 contains somewhere between 1,500 and 6,000 instructions, depending on how you count and on which extensions are included, while RISC-V (the fifth generation of the Berkeley RISC designs) has only about 40 in its base integer set. That undersells a real chip, though, because RISC-V is modular: a normal RISC-V CPU with the common extensions ends up at around 200 instructions. That's still far fewer than x86, which can make the core simpler and cheaper to manufacture.

If that's the case, why not make as few instructions as possible? Well, then the CPU would have much less functionality! For example, if you look for an easy way to divide two numbers on ARM, you'll find that some cores (older ones, or the small Cortex-M0) have no hardware divide instruction at all. Division then goes through a software routine, and when dividing by a constant the compiler multiplies by some stupidly large "magic" number and shifts the result instead. So it's ideal to have all the important instructions, but none of the unimportant ones.
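To show what that "stupidly large number" trick looks like, here is a minimal sketch in C of dividing by 10 without a divide instruction. The function names div10 and check_div10 are made up for this post, and the constant is the usual reciprocal for unsigned 32-bit division by 10. A real compiler picks its own constants per divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Divide an unsigned 32-bit number by 10 using only a multiply and a shift.
 * 0xCCCCCCCD is 2^35 / 10 rounded up, so (x * 0xCCCCCCCD) >> 35 gives the
 * same result as x / 10 for every 32-bit x. */
uint32_t div10(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical self-check: compare the trick against the normal '/'
 * operator on a spread of values, including the largest one. */
void check_div10(void)
{
    for (uint64_t x = 0; x <= UINT32_MAX; x += 9973) {
        assert(div10((uint32_t)x) == (uint32_t)x / 10);
    }
    assert(div10(UINT32_MAX) == UINT32_MAX / 10);
}
```

Calling check_div10() once is enough to convince yourself the multiply-and-shift version agrees with ordinary division on the sampled values.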

Now, why would we even care about CPU instructions? Well, for example, compiling a file for ARM is a different process than compiling it for x86. Let's try to run an ARM executable on x86, shall we? Let's make a simple empty program like this: int main(){} Now, let's compile it! I'll use arm-none-eabi-gcc for the job, and I can use it like any other gcc compiler. That brings me to another point: cross compilation. Unless you own a computer of every single architecture in history, it's pretty much impossible to avoid. Not only does it take a lot of time to download all the compilers, but you'll most likely run into a lot of trouble setting them up. This is a big advantage of languages that aren't compiled all the way to machine code, like Java: you don't have to do pretty much any per-architecture setup, and the program will probably just work on any machine that has a Java runtime.
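An empty main doesn't show much, so here is a slightly bigger sketch that reports which architecture it was compiled for. It relies on the predefined macros that GCC and Clang (and, for the _M_ ones, MSVC) set for each target. The function names target_arch and print_target are just ones I made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Returns a name for the architecture this file was compiled for.
 * The choice is made by the compiler at build time, not at run time. */
const char *target_arch(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#elif defined(__i386__) || defined(_M_IX86)
    return "x86 (32-bit)";
#elif defined(__aarch64__) || defined(_M_ARM64)
    return "ARM (64-bit)";
#elif defined(__arm__) || defined(_M_ARM)
    return "ARM (32-bit)";
#elif defined(__riscv)
    return "RISC-V";
#else
    return "unknown";
#endif
}

/* Prints the target name and the pointer width for that target. */
void print_target(void)
{
    const char *arch = target_arch();
    assert(arch != NULL); /* every branch above returns a string literal */
    printf("compiled for %s, pointers are %zu bits wide\n",
           arch, sizeof(void *) * 8);
}
```

The same source gives a different answer depending on which compiler built it, which is the whole point of cross compilation: the source is portable, the resulting executable is not.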

CPU architecture is by far not the only one to be aware of, though. What about CUDA, for example? It's used on GPUs, which are built for doing lots of things in parallel, so the more cores, the better. Modern GPUs can have more than one type of core, though. For example, some combine CUDA cores with ML (machine learning) cores and some even weirder specialized cores. Luckily, we don't need to struggle with programming all of them in assembly. There are easy-to-use APIs like OpenGL, Vulkan, and many others that support not just one GPU type but pretty much all of them and make everything work together nicely, at the cost of only some additional file size if the executable is statically linked.

Anyway, back to the ARM file. When I compiled it with the arm-none-eabi compiler, it compiled but failed to run because of the incompatibility, which was entirely expected.

So why don't we just make one computer architecture for everything, with the ideal number of instructions, fast and all that? Well, it's not possible to do that job well. Why? We can't make a universal architecture, because every use case requires something different. x86 chips have been in development for probably over 40 years now, so it makes sense that they're so bloated, with every new generation adding tens of instructions. CUDA cores, on the other hand, are designed to use as few transistors as possible, because the more cores you can fit on a single silicon chip, the better. Trouble starts when the CPU tries to do everything, including tasks it shouldn't. For example, developers often run highly parallel tasks on the CPU instead of the GPU because it's easier, which makes the whole program slower and the experience of using it worse.
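As for why the ARM executable from earlier refused to run: on Linux-like systems executables are usually ELF files, and the ELF header has a field called e_machine that says which architecture the code is for. Here is a rough sketch of reading that field from the first bytes of a file. The names elf_machine and machine_name and the MACHINE_ constants are made up for this post, only a handful of architectures are listed, and actually loading the bytes from disk is left out.

```c
#include <assert.h>
#include <stddef.h>

/* A few e_machine values from the ELF specification. */
enum {
    MACHINE_X86     = 3,
    MACHINE_ARM     = 40,
    MACHINE_X86_64  = 62,
    MACHINE_AARCH64 = 183,
    MACHINE_RISCV   = 243
};

/* Returns the e_machine field of an ELF header, or -1 if the buffer
 * is too short or does not start with the ELF magic bytes. */
int elf_machine(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);
    if (len < 20)
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    /* Byte 5 gives the byte order: 1 = little-endian, 2 = big-endian.
     * e_machine is the 16-bit value stored at offset 18. */
    if (hdr[5] == 1)
        return hdr[18] | (hdr[19] << 8);
    if (hdr[5] == 2)
        return (hdr[18] << 8) | hdr[19];
    return -1;
}

/* Human-readable name for the values listed above. */
const char *machine_name(int machine)
{
    switch (machine) {
    case MACHINE_X86:     return "x86";
    case MACHINE_ARM:     return "ARM";
    case MACHINE_X86_64:  return "x86-64";
    case MACHINE_AARCH64: return "AArch64";
    case MACHINE_RISCV:   return "RISC-V";
    default:              return "something else";
    }
}
```

An operating system running on x86-64 sees an e_machine value it can't execute and rejects the file before a single instruction runs. The file command on Linux reads the same header if you just want to check an executable quickly.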

Well, that will probably be all for today. See you next (or previous, in case of another time issue) time!