Next: , Previous: LTO object file layout, Up: LTO


24.3 Using summary information in IPA passes

Programs are represented internally as a callgraph (a multi-graph where nodes are functions and edges are call sites) and a varpool (a list of static and external variables in the program).

The inter-procedural optimization is organized as a sequence of individual passes, which operate on the callgraph and the varpool. To make the implementation of WHOPR possible, every inter-procedural optimization pass is split into several stages that are executed at different times during WHOPR compilation:

The implementation of the inter-procedural passes are shared between LTO, WHOPR and classic non-LTO compilation.

To simplify development, the GCC pass manager differentiates between normal inter-procedural passes and small inter-procedural passes. A small inter-procedural pass (SIMPLE_IPA_PASS) is a pass that does everything at once and thus it can not be executed during WPA in WHOPR mode. It defines only the Execute stage and during this stage it accesses and modifies the function bodies. Such passes are useful for optimization at LGEN or LTRANS time and are used, for example, to implement early optimization before writing object files. The simple inter-procedural passes can also be used for easier prototyping and development of a new inter-procedural pass.

24.3.1 Virtual clones

One of the main challenges of introducing the WHOPR compilation mode was addressing the interactions between optimization passes. In LTO compilation mode, the passes are executed in a sequence, each of which consists of analysis (or Generate summary), propagation (or Execute) and Transform stages. Once the work of one pass is finished, the next pass sees the updated program representation and can execute. This makes the individual passes dependent on each other.

In WHOPR mode all passes first execute their Generate summary stage. Then summary writing marks the end of the LGEN stage. At WPA time, the summaries are read back into memory and all passes run the Execute stage. Optimization summaries are streamed and sent to LTRANS, where all the passes execute the Transform stage.

Most optimization passes split naturally into analysis, propagation and transformation stages. But some do not. The main problem arises when one pass performs changes and the following pass gets confused by seeing different callgraphs between the Transform stage and the Generate summary or Execute stage. This means that the passes are required to communicate their decisions with each other.

To facilitate this communication, the GCC callgraph infrastructure implements virtual clones, a method of representing the changes performed by the optimization passes in the callgraph without needing to update function bodies.

A virtual clone in the callgraph is a function that has no associated body, just a description of how to create its body based on a different function (which itself may be a virtual clone).

The description of function modifications includes adjustments to the function's signature (which allows, for example, removing or adding function arguments), substitutions to perform on the function body, and, for inlined functions, a pointer to the function that it will be inlined into.

It is also possible to redirect any edge of the callgraph from a function to its virtual clone. This implies updating of the call site to adjust for the new function signature.

Most of the transformations performed by inter-procedural optimizations can be represented via virtual clones. For instance, a constant propagation pass can produce a virtual clone of the function which replaces one of its arguments by a constant. The inliner can represent its decisions by producing a clone of a function whose body will be later integrated into a given function.

Using virtual clones, the program can be easily updated during the Execute stage, solving most of pass interactions problems that would otherwise occur during Transform.

Virtual clones are later materialized in the LTRANS stage and turned into real functions. Passes executed after the virtual clone were introduced also perform their Transform stage on new functions, so for a pass there is no significant difference between operating on a real function or a virtual clone introduced before its Execute stage.

Optimization passes then work on virtual clones introduced before their Execute stage as if they were real functions. The only difference is that clones are not visible during the Generate Summary stage.

To keep function summaries updated, the callgraph interface allows an optimizer to register a callback that is called every time a new clone is introduced as well as when the actual function or variable is generated or when a function or variable is removed. These hooks are registered in the Generate summary stage and allow the pass to keep its information intact until the Execute stage. The same hooks can also be registered during the Execute stage to keep the optimization summaries updated for the Transform stage.

24.3.2 IPA references

GCC represents IPA references in the callgraph. For a function or variable A, the IPA reference is a list of all locations where the address of A is taken and, when A is a variable, a list of all direct stores and reads to/from A. References represent an oriented multi-graph on the union of nodes of the callgraph and the varpool. See ipa-reference.c:ipa_reference_write_optimization_summary and ipa-reference.c:ipa_reference_read_optimization_summary for details.

24.3.3 Jump functions

Suppose that an optimization pass sees a function A and it knows the values of (some of) its arguments. The jump function describes the value of a parameter of a given function call in function A based on this knowledge.

Jump functions are used by several optimizations, such as the inter-procedural constant propagation pass and the devirtualization pass. The inliner also uses jump functions to perform inlining of callbacks.