The code for the data dependence analysis can be found in
tree-data-ref.c and its interface and data structures are
described in tree-data-ref.h. The function that computes the
data dependences for all the array and pointer references for a given
loop is compute_data_dependences_for_loop
. This function is
currently used by the linear loop transform and the vectorization
passes. Before calling this function, one has to allocate two vectors:
a first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references. Thus if the
vector of data references is of size n
, the vector containing the
dependence relations will contain n*n
elements. However if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single chrec_dont_know
in the dependence relation
array.
The data references are discovered in a particular order during the scanning of the loop body: the loop body is analyzed in execution order, and the data references of each statement are pushed at the end of the data reference array. Two data references syntactically occur in the program in the same order as in the array of data references. This syntactic order is important in some classical data dependence tests, and mapping this order to the elements of this array avoids costly queries to the loop body representation.
Three types of data references are currently handled: ARRAY_REF,
INDIRECT_REF and COMPONENT_REF. The data structure for the data reference
is data_reference
, where data_reference_p
is a name of a
pointer to the data reference structure. The structure contains the
following elements:
base_object_info
: Provides information about the base object
of the data reference and its access functions. These access functions
represent the evolution of the data reference in the loop relative to
its base, in keeping with the classical meaning of the data reference
access function for the support of arrays. For example, for a reference
a.b[i][j]
, the base object is a.b
and the access functions,
one for each array subscript, are:
{i_init, + i_step}_1, {j_init, +, j_step}_2
.
first_location_in_loop
: Provides information about the first
location accessed by the data reference in the loop and about the access
function used to represent evolution relative to this location. This data
is used to support pointers, and is not used for arrays (for which we
have base objects). Pointer accesses are represented as a one-dimensional
access that starts from the first location accessed in the loop. For
example:
for1 i for2 j *((int *)p + i + j) = a[i][j];
The access function of the pointer access is {0, + 4B}_for2
relative to p + i
. The access functions of the array are
{i_init, + i_step}_for1
and {j_init, +, j_step}_for2
relative to a
.
Usually, the object the pointer refers to is either unknown, or we can't prove that the access is confined to the boundaries of a certain object.
Two data references can be compared only if at least one of these two representations has all its fields filled for both data references.
The current strategy for data dependence tests is as follows:
If both a
and b
are represented as arrays, compare
a.base_object
and b.base_object
;
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both a
and b
are represented as pointers, compare
a.first_location
and b.first_location
;
if they are equal, apply dependence tests (use access functions based on
first location).
However, if a
and b
are represented differently, only try
to prove that the bases are definitely different.
The structure describing the relation between two data references is
data_dependence_relation
and the shorter name for a pointer to
such a structure is ddr_p
. This structure contains:
are_dependent
that is set to chrec_known
if the analysis has proved that there is no dependence between these two
data references, chrec_dont_know
if the analysis was not able to
determine any useful result and potentially there could exist a
dependence between these data references, and are_dependent
is
set to NULL_TREE
if there exist a dependence relation between the
data references, and the description of this dependence relation is
given in the subscripts
, dir_vects
, and dist_vects
arrays,
subscripts
that contains a description of each
subscript of the data references. Given two array accesses a
subscript is the tuple composed of the access functions for a given
dimension. For example, given A[f1][f2][f3]
and
B[g1][g2][g3]
, there are three subscripts: (f1, g1), (f2,
g2), (f3, g3)
.
dir_vects
and dist_vects
that contain
classical representations of the data dependences under the form of
direction and distance dependence vectors,
loop_nest
that contains the loops to
which the distance and direction vectors refer to.
Several functions for pretty printing the information extracted by the
data dependence analysis are available: dump_ddrs
prints with a
maximum verbosity the details of a data dependence relations array,
dump_dist_dir_vectors
prints only the classical distance and
direction vectors for a data dependence relations array, and
dump_data_references
prints the details of the data references
contained in a data reference array.