What happens when you type gcc main.c?

The big C.

C is a programming language designed by Dennis Ritchie in the early 1970s. It is often described as a low-level language, meaning that it provides little abstraction from a computer’s instruction set architecture: commands or functions in the language map closely to processor instructions. Strictly speaking, the term refers to machine code or assembly language, and C sits one step above them. Because of the low (hence the word) abstraction between the language and machine language, low-level languages are sometimes described as being “close to the hardware”.

C is also a compiled language, as opposed to an interpreted one, meaning that files written in C must be compiled before they can be executed.

Compiled?

The compilation process has four different steps:

  • The pre-processing
  • The compiling
  • The assembling
  • The linking

The compiler we will be using as an example is GCC (GNU Compiler Collection). The GNU project is a free-software, mass-collaboration project launched by Richard Stallman in 1983, giving developers access to powerful tools for free.

The letters “gcc” are also the Unix command that launches the compilation process on any “.c” file we might have written. Unless explicitly instructed otherwise, gcc always runs the four steps mentioned above in the same order: pre-processing, compilation proper, assembly and linking.

So, let’s assume we have written a C program and saved it into a file called main.c. By the way, the “.c” extension means that this is a file containing C code. This is what the raw code stored in main.c would look like:

So, let’s do this!

The pre-processor takes the source code as input and removes all the comments from it. It also interprets the pre-processor directives, the lines that start with ‘#’. For example, if the program contains the directive #include <stdio.h>, the pre-processor replaces that directive with the content of the ‘stdio.h’ file. Macros created with #define are expanded in the same way.

The code expanded by the pre-processor is passed to the compiler, which converts it into assembly code.

The assembler converts the assembly code into object code. The object file generated by the assembler has the same base name as the source file, with a different extension: ‘.obj’ in DOS and ‘.o’ in UNIX. If the source file is ‘main.c’, the object file is ‘main.obj’ in DOS or ‘main.o’ in UNIX.

Almost all programs written in C use library functions. These library functions are pre-compiled, and their object code is stored in library files with the ‘.lib’ (DOS) or ‘.a’ (UNIX) extension. The main job of the linker is to combine the object code of the library files with the object code of our program. For example, if we use the printf() function in a program, the linker adds its associated code to the output file. The linker also plays an important role when our program refers to functions defined in other files: it links the object code of those files to our program.

The output of the linker is the executable file. In DOS, the executable typically has the same name as the source file, with the ‘.exe’ extension. In UNIX, gcc names the executable ‘a.out’ unless told otherwise.