This article has been translated from Chinese by ChatGPT, and the wording may not be entirely accurate.
Section GC Analysis - Part 1: An Introduction to the Principles
Overview
This article is part of the Addressing Linux Kernel Section GC Failure Issues series.
- Section GC Analysis - Part 1 Introduction to the Principle
- Section GC Analysis - Part 2 Gold Source Code Analysis
- Section GC Analysis - Part 3 Reference Construction Process
- Addressing Linux Kernel Section GC Failure Issues - Part 1
- Addressing Linux Kernel Section GC Failure Issues - Part 2
This article will provide a brief introduction to the --gc-sections
feature.
The --gc-sections
option in GCC can trim unused functions and variables during linking.
During compilation and linking, the compiler and linker create a symbol table. When the --gc-sections
option is enabled, the linker will analyze the symbol table to determine which code and data are not used, and then remove them from the final output. This pruning has several benefits:
- Reduces the size of the executable file
- Optimizes loading time
- More cache-friendly for instructions and data
- Reduces attack surface
The prerequisite for performing GC on sections is that every function and data segment should have its own section before linking. However, by default, GCC places all functions in the .text
section. We can use the -ffunction-sections
parameter to give each function its own section.
Introduction to -ffunction-sections
By default, compilers place data into different sections according to the following rules:
Section | Data Type | Description |
---|---|---|
.text |
Executable code | Contains the program’s machine instructions |
.rodata |
Read-only data | Contains immutable constant data, such as string literals and global constants |
.data |
Initialized read-write data | Contains initialized global and static variables that can be read and written during program execution |
.bss |
Uninitialized data | Contains uninitialized global and static variables |
With all the code placed in the code section, the linker does not know which functions and variables are used and cannot perform trimming. To enable garbage collection, each function needs to have its own section.
The -ffunction-sections
and -fdata-sections
options in GCC make each function or variable have its own section. Here we will only discuss -ffunction-sections
in detail; -fdata-sections
works similarly.
Here is an example code that includes a used function fun()
and an unused function unused()
:
void fun(){
return;
}
void unused(){
return;
}
int main(){
fun();
}
Enable the -ffunction-sections
option, compile the file without linking.
gcc -c test.c
Examine the symbol table of the object file.
$ readelf -s test.o
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS test.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1 .text
5: 0000000000000000 11 FUNC GLOBAL DEFAULT 1 fun
6: 0000000000000000 11 FUNC GLOBAL DEFAULT 1 unused
7: 0000000000000000 25 FUNC GLOBAL DEFAULT 1 main
We can see the functions fun()
and unused()
do not have individual sections.
Enable the -ffunction-sections
option again and compile, but do not link.
gcc -c -ffunction-sections test.c
Examine the symbol table of the object file.
$ readelf -s test.o
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS test.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 4 .text.fun
3: 0000000000000000 0 SECTION LOCAL DEFAULT 5 .text.unused
4: 0000000000000000 0 SECTION LOCAL DEFAULT 6 .text.main
5: 0000000000000000 11 FUNC GLOBAL DEFAULT 4 fun
6: 0000000000000000 11 FUNC GLOBAL DEFAULT 5 unused
7: 0000000000000000 25 FUNC GLOBAL DEFAULT 6 main
Now we can observe that both fun()
and unused()
functions have their own sections.
–gc-sections in Practice
Compile the example program without enabling the --gc-sections
option and check the object file size:
$ gcc test.c
$ size a.out
text data bss dec hex filename
1340 544 8 1892 764 a.out
The --print-gc-sections
option can print the sections that are trimmed. These arguments have to be passed to the linker with -Wl
.
Compile the example program with the --gc-sections
option enabled, print the trimmed sections, and check the object file size:
$ gcc -ffunction-sections -Wl,--gc-sections,--print-gc-sections test.c
/usr/bin/ld: removing unused section '.rodata.cst4' in file '/usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o'
/usr/bin/ld: removing unused section '.data' in file '/usr/lib/gcc/x86_64-linux-gnu/11/../../../x86_64-linux-gnu/Scrt1.o'
/usr/bin/ld: removing unused section '.text.unused' in file '/tmp/cc9O4Y8L.o'
$ size a.out
text data bss dec hex filename
1285 536 8 1829 725 a.out
As we can see, the code section is reduced.
Read the symbol table:
$ readelf -s a.out | grep FUNC
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND _[...]@GLIBC_2.34 (2)
5: 0000000000000000 0 FUNC WEAK DEFAULT UND [...]@GLIBC_2.2.5 (3)
4: 0000000000001070 0 FUNC LOCAL DEFAULT 14 deregister_tm_clones
5: 00000000000010a0 0 FUNC LOCAL DEFAULT 14 register_tm_clones
6: 00000000000010e0 0 FUNC LOCAL DEFAULT 14 __do_global_dtors_aux
9: 0000000000001120 0 FUNC LOCAL DEFAULT 14 frame_dummy
18: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main[...]
21: 0000000000001150 0 FUNC GLOBAL HIDDEN 15 _fini
22: 0000000000001129 11 FUNC GLOBAL DEFAULT 14 fun
26: 0000000000001040 38 FUNC GLOBAL DEFAULT 14 _start
28: 0000000000001134 25 FUNC GLOBAL DEFAULT 14 main
31: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GL[...]
32: 0000000000001000 0 FUNC GLOBAL HIDDEN 11 _init
It is evident that unused()
is no longer in the symbol table, while the used function fun()
remains.
Conclusion
Section GC is a method of trimming during the compilation and linking of binary files. It works by creating independent Sections for each function and variable during the compilation phase through the -ffunction-sections
and -fdata-sections
options. During linking, the --gc-sections
option then traverses all Sections, linking those functions and variables that are used into the target binary file, and removing the unused portions, thereby achieving the goal of reducing the program size.
References
- Tiny Linux Kernel Project: Section Garbage Collection Patchset