Using and Porting GNU Fortran

Arrays (DIMENSION)

Fortran uses "column-major ordering" in its arrays. This differs from other languages, such as C, which use "row-major ordering". The difference is that, with Fortran, array elements adjacent to each other in memory differ in the first subscript instead of the last; A(5,10,20) immediately follows A(4,10,20), whereas with row-major ordering it would follow A(5,10,19).
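The difference between the two orderings can be sketched in C. The following is only an illustration, not part of g77: the function names are hypothetical, subscripts are treated as zero-based for simplicity, and the extents passed in are assumptions chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major (Fortran-style) offset: the first subscript varies fastest.
 * Subscripts are zero-based here; n1 and n2 are the extents of the first
 * two dimensions (the extent of the last dimension is not needed). */
size_t column_major_offset(size_t i, size_t j, size_t k, size_t n1, size_t n2)
{
    assert(i < n1 && j < n2);
    return i + n1 * (j + n2 * k);
}

/* Row-major (C-style) offset: the last subscript varies fastest.
 * n2 and n3 are the extents of the last two dimensions. */
size_t row_major_offset(size_t i, size_t j, size_t k, size_t n2, size_t n3)
{
    assert(j < n2 && k < n3);
    return k + n3 * (j + n2 * i);
}
```

With assumed extents of 10, 21, and 30, column_major_offset(5, 10, 20, 10, 21) is exactly one more than column_major_offset(4, 10, 20, 10, 21), while row_major_offset(5, 10, 20, 21, 30) is exactly one more than row_major_offset(5, 10, 19, 21, 30).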

This consideration affects not only interfacing with and debugging Fortran code; it can also greatly affect how code is designed and written, especially when code speed and size are a concern.

Fortran also differs from C, a popular language for interfacing and one that debuggers often support directly, in the way arrays are treated. In C, arrays are single-dimensional and have interesting relationships to pointers, neither of which is true for Fortran. As a result, dealing with Fortran arrays from within an environment limited to C concepts can be challenging.

For example, accessing the array element A(5,10,20) is easy enough in Fortran (use A(5,10,20)), but in C some difficult machinations are needed. First, C would treat the A array as a single-dimension array. Second, C does not understand low bounds for arrays as does Fortran. Third, C assumes a low bound of zero (0), while Fortran defaults to a low bound of one (1) and can support an arbitrary low bound. Therefore, calculations must be done to determine what the C equivalent of A(5,10,20) would be, and these calculations require knowing the dimensions of A.

For DIMENSION A(2:11,21,0:29), the calculation of the offset of A(5,10,20) would be:

       (5-2)
     + (10-1)*(11-2+1)
     + (20-0)*(11-2+1)*(21-1+1)
     = 4293

So the C equivalent in this case would be a[4293].
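The calculation above generalizes to any rank and any set of bounds. The following C sketch shows one way to write it; the struct dim_bounds type and the fortran_offset function are hypothetical names introduced here for illustration and are not provided by g77 or f2c.

```c
#include <assert.h>
#include <stddef.h>

/* One dimension of a Fortran array: inclusive lower and upper bounds. */
struct dim_bounds {
    long lower;
    long upper;
};

/* Zero-based, column-major offset of the element with the given
 * subscripts. Each subscript has its lower bound subtracted, then is
 * scaled by the product of the extents of all preceding dimensions. */
size_t fortran_offset(int rank, const struct dim_bounds *bounds,
                      const long *subs)
{
    size_t offset = 0;
    size_t stride = 1;

    for (int d = 0; d < rank; d++) {
        /* Reject subscripts outside the declared bounds. */
        assert(subs[d] >= bounds[d].lower && subs[d] <= bounds[d].upper);
        offset += (size_t)(subs[d] - bounds[d].lower) * stride;
        stride *= (size_t)(bounds[d].upper - bounds[d].lower + 1);
    }
    return offset;
}
```

For DIMENSION A(2:11,21,0:29), passing the bounds {2,11}, {1,21}, {0,29} and the subscripts 5, 10, 20 yields 4293, matching the hand calculation.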

When using a debugger directly on Fortran code, the C equivalent might not work, because some debuggers cannot understand the notion of low bounds other than zero. However, unlike f2c, g77 does inform the GBE that a multi-dimensional array (like A in the above example) is really multi-dimensional, rather than a single-dimensional array, so at least the dimensionality of the array is preserved.

Debuggers that understand Fortran should have no trouble with non-zero low bounds, but for non-Fortran debuggers, especially C debuggers, the above example might have a C equivalent of a[4305]. This calculation is arrived at by eliminating the subtraction of the lower bound in the first parenthesized expression on each line--that is, for (5-2) substitute (5), for (10-1) substitute (10), and for (20-0) substitute (20). In practice, this can mean that the expression *(&a[2][1][0] + 4293) works fine, but that a[20][10][5] produces the equivalent of *(&a[0][0][0] + 4305) because of the missing lower bounds.
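The relationship between the two indices can be checked with a short C sketch. The function below is a hypothetical helper, not something a debugger or g77 exposes; it simply repeats the offset calculation with the lower-bound subtraction left out.

```c
#include <assert.h>

/* Hypothetical helper: column-major offset computed WITHOUT subtracting
 * the lower bounds from the subscripts, as a debugger unaware of lower
 * bounds might do. The bounds are still used to compute the extents. */
long offset_ignoring_lower_bounds(int rank, const long *lower,
                                  const long *upper, const long *subs)
{
    long offset = 0;
    long stride = 1;

    assert(rank > 0);
    for (int d = 0; d < rank; d++) {
        offset += subs[d] * stride;
        stride *= upper[d] - lower[d] + 1;
    }
    return offset;
}
```

For DIMENSION A(2:11,21,0:29), the subscripts 5, 10, 20 give 4305, and the subscripts of the first element, 2, 1, 0, give 12. The difference, 4305 - 12, is 4293, the true offset from the first element.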

Come to think of it, perhaps the behavior is due to the debugger internally compensating for the lower bounds by offsetting the base address of a, leaving &a set lower, in this case, than &a[2][1][0] (the address of its first element as identified by subscripts equal to the corresponding lower bounds).

You know, maybe nobody really needs to use arrays.