Detailed Compilation Process with C Program Example

We will follow the journey of a C program from source code to executable with the help of the GCC compiler, looking at the input and output of each of the four steps in the compilation process.

The plan of action for this blog is as follows:

  1. Source C code
  2. Preprocessing
  3. Compiling
  4. Assembling
  5. Linking

Compiling is the second of the four steps that follow the source code, yet the whole process is also commonly called compilation. 😕

Source C Code

We will create two libraries named simple_math and algebra. simple_math consists of two functions, sum and minus, while algebra consists of a single function, evaluate. You can find the source code in the GitHub repository.

/* simple_math.h */
#ifndef __SIMPLE_MATH__
#define __SIMPLE_MATH__

int sum(int a, int b);

int minus(int a, int b);

#endif

/* algebra.c */
#include"algebra.h"
#include<stdio.h>
#include<string.h>

int evaluate(char exp[]) {
  int length = strlen(exp);

  // considering 0+exp
  int left = 0, right = 0;

  // function pointer so that 
  // we can call correct function 
  int (*last_sign)(int, int) = &sum;

  for(int i=0;i<length;i++){
    
    if(exp[i] == '+') {
      left = (*last_sign)(left, right);
      right = 0;
      last_sign = &sum;
    } else if(exp[i] == '-') {
      left = (*last_sign)(left, right);
      right = 0;
      last_sign = &minus;
    } else {
      right = (exp[i] - '0') + (right * 10);
    }
  }
  
  return (*last_sign)(left, right);
}

/* main.c */
#include"algebra.h"
#include"simple_math.h"

#include<stdio.h>

#define EXP "12-19+33+57"

int main(){
  printf("12-19+33+57=%d\n",evaluate(EXP));

  printf("Sum of 2 and 3: %d\n", sum(2, 3));
  return 0;
}

/* simple_math.c */
int sum(int a, int b) {
  return a + b;
}

int minus(int a, int b) {
  return a - b;
}

/* algebra.h */
#ifndef __ALGEBRA__
#define __ALGEBRA__

#include"simple_math.h"

int evaluate(char a[]);

#endif

Preprocessing

At the preprocessing stage, the contents of header files are inserted recursively. In our example, main.c includes algebra.h and simple_math.h, and algebra.h in turn includes simple_math.h. The include guards (#ifndef ... #endif) ensure that the declarations from simple_math.h appear only once in the output. The preprocessor also replaces the macro EXP with its value wherever it appears.

The preprocessor handles include files, conditional compilation directives, and macros. You can preprocess our code with GCC as follows:

# -E Preprocess only; do not compile, assemble or link.
gcc -E main.c -o main_pre.c

You can do the same with simple_math.c and algebra.c to create simple_math_pre.c and algebra_pre.c respectively.

Compilation

In this stage, the compiler translates the preprocessed files into assembly code.

# -S Compile only; do not assemble or link.
# this will create assembly code
gcc -S main_pre.c -o main.s

You can open main.s in any text editor and inspect the generated assembly code. Create similar assembly files for the other two source files.

Assembling

During this stage, an assembler translates the assembly instructions into object code. The output consists of the actual machine instructions to be run by the target processor.

# -c Compile and assemble, but do not link.
# given an assembly file, gcc only assembles it;
# gcc -c main.c -o main.o would preprocess, compile, and assemble in one go
gcc -c main.s -o main.o

# an object file is binary, so a text editor cannot display it meaningfully
# on Linux it is in ELF (Executable and Linkable Format)
# objdump -D disassembles the contents of all sections
objdump -D main.o

Linking

The linker takes one or more object files or libraries as input and combines them to produce a single executable file. In this stage, it resolves references to external symbols, assigns final addresses to functions and variables, and revises code and data to reflect the new addresses (a process called relocation).

# link all object files and libraries
# and create a single executable
gcc main.o algebra.o simple_math.o -o main

Linking static & dynamic libraries

This step is not necessary for our example because we are not using any external library apart from the C standard library, which GCC links by default.

If you are using static libraries, you can link them as follows:

# -L specifies the directory to search for libraries
# in our case, the current folder
# -ldll refers to libdll.a here
# you can also pass libdll.a directly instead of -ldll
gcc <application_object_files> -L . -ldll
gcc <application_object_files> -L . libdll.a

Or, if you are using dynamic (shared) libraries, you can link them as follows:

# -ldll refers to the libdll shared library
# it should be present in a standard library directory such as /usr/lib
gcc <application_object_files> -ldll

Parikshit Patil

Currently working as a Software Engineer at Siemens Industry Software Pvt. Ltd. AWS Certified SysOps Administrator - Associate.
Kavathe-Ekand, MH India