Using and Porting GNU Fortran

Options That Control Optimization

Most Fortran users will want to use no optimization when developing and testing programs, and use -O or -O2 when compiling programs for late-cycle testing and for production use. However, note that certain diagnostics--such as for uninitialized variables--depend on the flow analysis done by -O, i.e. you must use -O or -O2 to get such diagnostics.

The following flags have particular applicability when compiling Fortran programs:

-malign-double

(Intel x86 architecture only.)

Noticeably improves performance of g77 programs making heavy use of REAL(KIND=2) (DOUBLE PRECISION) data on some systems. In particular, systems using Pentium, Pentium Pro, 586, and 686 implementations of the i386 architecture execute programs faster when REAL(KIND=2) (DOUBLE PRECISION) data are aligned on 64-bit boundaries in memory.

This option can, at least, make benchmark results more consistent across various system configurations, versions of the program, and data sets.

Note: The warning in the gcc documentation about this option does not apply, generally speaking, to Fortran code compiled by g77.

See Aligned Data, for more information on alignment issues.

Also note: The negative form of -malign-double is -mno-align-double, not -benign-double.

-ffloat-store

Might help a Fortran program that depends on exact IEEE conformance on some machines, but might slow down a program that doesn't.

This option is effective when the floating-point unit is set to work in IEEE 854 `extended precision'--as it typically is on x86 and m68k GNU systems--rather than IEEE 754 double precision. -ffloat-store tries to remove the extra precision by spilling data from floating-point registers into memory, which typically involves a big performance hit. However, it affects only values assigned to variables, not intermediate results within an expression, so it is only partially effective. `Excess precision' is avoided in code like:

          a = b + c
          d = a * e

but not in code like:

          d = (b + c) * e

For another, potentially better, way of controlling the precision, see Floating-point precision.
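
The two code forms shown above for -ffloat-store can be combined into one small test program. The following is a minimal sketch, not taken from the manual: the program name FSTORE and the values of B, C, and E are assumptions, chosen so that the sum B + C is not exactly representable in double precision. Whether the two results actually differ depends on the target, the precision setting of the floating-point unit, and whether the optimizer evaluates the constants at compile time.

```fortran
      PROGRAM FSTORE
C     Illustrative only: names and values are assumptions.
      DOUBLE PRECISION A, B, C, D1, D2, E
      B = 1.0D0
      C = 2.0D0**(-53)
      E = 3.0D0
C     Named intermediate: with -ffloat-store, A is stored to
C     memory, which rounds the sum to double precision.
      A = B + C
      D1 = A * E
C     Unnamed intermediate: (B + C) may stay in a register that
C     holds extra precision, so -ffloat-store does not round it.
      D2 = (B + C) * E
      PRINT *, 'D1 = ', D1
      PRINT *, 'D2 = ', D2
      IF (D1 .EQ. D2) THEN
         PRINT *, 'SAME RESULT'
      ELSE
         PRINT *, 'RESULTS DIFFER'
      END IF
      END
```

Comparing the output with and without -ffloat-store on an x86 system is one way to see whether the option changes anything for a given build.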

-fforce-mem

-fforce-addr

Might improve optimization of loops.

-fno-inline

Don't compile statement functions inline. Might reduce the size of a program unit--which might be at the expense of some speed (though it should compile faster). Note that if you are not optimizing, no functions can be expanded inline.

-ffast-math

Might allow some programs designed to not be too dependent on IEEE behavior for floating-point to run faster, or die trying. Sets -funsafe-math-optimizations and -fno-trapping-math.

-funsafe-math-optimizations

Allow optimizations that may give incorrect results for certain IEEE inputs.
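
To see why a rewrite that is valid for real numbers can change a floating-point result, consider the regrouping of a sum. The sketch below is only an illustration of that general hazard; it makes no claim about which specific transformations g77 performs under this option. The program name ASSOC and the values are assumptions.

```fortran
      PROGRAM ASSOC
C     Illustrative only: names and values are assumptions.
      DOUBLE PRECISION A, B, C, S1, S2
      A = 1.0D20
      B = -1.0D20
      C = 1.0D0
C     Evaluated as written, A and B cancel exactly, leaving C.
      S1 = (A + B) + C
C     Regrouped, C is lost when added to the much larger B.
      S2 = A + (B + C)
      PRINT *, 'S1 = ', S1
      PRINT *, 'S2 = ', S2
      END
```

When each statement is evaluated as written in IEEE arithmetic, S1 is 1 and S2 is 0, even though the two expressions are algebraically equal.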

-fno-trapping-math

Allow the compiler to assume that floating-point arithmetic will not generate traps on any inputs. This is useful, for example, when running a program using IEEE "non-stop" floating-point arithmetic.

-fstrength-reduce

Might make some loops run faster.
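
Strength reduction generally means replacing a costly operation inside a loop, such as a multiplication by the loop index, with a cheaper one, such as a running addition. The following sketch performs that rewrite by hand to show the idea; it does not show what g77 actually generates, and the program name STRENG, the array names, and the stride are hypothetical.

```fortran
      PROGRAM STRENG
C     Illustrative only: names, sizes, and stride are assumptions.
      INTEGER N, STRIDE
      PARAMETER (N = 5, STRIDE = 3)
      REAL A(N*STRIDE), B(N*STRIDE)
      INTEGER I, K
      DO 5 I = 1, N*STRIDE
         A(I) = 0.0
         B(I) = 0.0
    5 CONTINUE
C     As written: one integer multiplication per iteration.
      DO 10 I = 1, N
         A(STRIDE*I) = 1.0
   10 CONTINUE
C     Strength-reduced by hand: the multiplication is replaced
C     by a running sum that is incremented each iteration.
      K = 0
      DO 20 I = 1, N
         K = K + STRIDE
         B(K) = 1.0
   20 CONTINUE
C     Check that both loops touched the same elements.
      DO 30 I = 1, N*STRIDE
         IF (A(I) .NE. B(I)) STOP 'MISMATCH'
   30 CONTINUE
      PRINT *, 'BOTH LOOPS SET THE SAME ELEMENTS'
      END
```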

-frerun-cse-after-loop

-fexpensive-optimizations

-fdelayed-branch

-fschedule-insns

-fschedule-insns2

-fcaller-saves

Might improve performance on some code.

-funroll-loops

Typically improves performance on code using iterative DO loops by unrolling them and is probably generally appropriate for Fortran, though it is not turned on at any optimization level. Note that outer loop unrolling isn't done specifically; decisions about whether to unroll a loop are made on the basis of its instruction count.

Also, no `loop discovery'¹ is done, so only loops written with DO benefit from loop optimizations, including--but not limited to--unrolling. Loops written with IF and GOTO are not currently recognized as such. This option unrolls only iterative DO loops, not DO WHILE loops.
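
The distinction between loops written with DO and loops written with IF and GOTO can be sketched as follows. Both loops compute the same sum, but according to the text above only the first is recognized as a loop. The program name LOOPS, the array size, and the variable names are assumptions made for the example.

```fortran
      PROGRAM LOOPS
C     Illustrative only: array size and names are assumptions.
      INTEGER N
      PARAMETER (N = 8)
      REAL X(N), S1, S2
      INTEGER I
      DO 10 I = 1, N
         X(I) = REAL(I)
   10 CONTINUE
C     Iterative DO loop: recognized as a loop, so it is a
C     candidate for -funroll-loops and other loop optimizations.
      S1 = 0.0
      DO 20 I = 1, N
         S1 = S1 + X(I)
   20 CONTINUE
C     The same summation built from IF and GOTO: not currently
C     recognized as a loop.
      S2 = 0.0
      I = 1
   30 IF (I .GT. N) GOTO 40
         S2 = S2 + X(I)
         I = I + 1
      GOTO 30
   40 CONTINUE
      PRINT *, S1, S2
      END
```

Both sums come out as 36; only the form of the loop differs, and with it the optimizations the loop can benefit from.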

-funroll-all-loops

Probably improves performance on code using DO WHILE loops by unrolling them in addition to iterative DO loops. In the absence of DO WHILE, this option is equivalent to -funroll-loops but possibly slower.
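
For comparison, here is a minimal sketch of a DO WHILE loop, the kind of loop that -funroll-loops leaves alone and -funroll-all-loops considers. The program name HALVE and the values are assumptions chosen for illustration.

```fortran
      PROGRAM HALVE
C     Illustrative only: names and values are assumptions.
      REAL X
      INTEGER NSTEPS
      X = 1000.0
      NSTEPS = 0
C     DO WHILE loop: the number of iterations is governed by a
C     condition tested on each pass, not by an iteration count.
      DO WHILE (X .GT. 1.0)
         X = X / 2.0
         NSTEPS = NSTEPS + 1
      END DO
      PRINT *, NSTEPS, X
      END
```

The loop runs ten times, leaving X at 0.9765625.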

-fno-move-all-movables

-fno-reduce-all-givs

-fno-rerun-loop-opt

Version info: These options are not supported by versions of g77 based on gcc version 2.8.

Each of these might improve performance on some code.

Analysis of Fortran code optimization and the resulting optimizations triggered by the above options were contributed by Toon Moene.

These three options are intended to be removed someday, once they have helped determine the efficacy of various approaches to improving the performance of Fortran code.

Please let us know how use of these options affects the performance of your production code. We're particularly interested in code that runs faster when these options are disabled, and in non-Fortran code that benefits when they are enabled via the above gcc command-line options.

See Options That Control Optimization in the gcc documentation, for more information on options to optimize the generated machine code.

Footnotes

¹ `Loop discovery' refers to the process by which a compiler, or indeed any reader of a program, determines which portions of the program are more likely to be executed repeatedly as it is being run. Such discovery typically is done early when compiling using optimization techniques, so the `discovered' loops get more attention--and more run-time resources, such as registers--from the compiler. It is easy to `discover' loops that are constructed out of looping constructs in the language (such as Fortran's DO). For some programs, `discovering' loops constructed out of lower-level constructs (such as IF and GOTO) can lead to generation of better code than otherwise.