The library contains a single comprehensive regression test.
For a given container type in this library, the test creates
an object of the container type and an object of the
corresponding standard type (e.g., std::set
). It
then performs a random sequence of methods with random
arguments (e.g., inserts, erases, and so forth) on both
objects. At each operation, the test checks the return value of
the method, and optionally both compares this library's
object with the standard's object as well as performing other
consistency checks on this library's object (e.g.,
order preservation, when applicable, or node invariants, when
applicable).
Additionally, the test integrally checks exception safety and resource leaks. This is done as follows. A special allocator type, written for the purpose of the test, both randomly throws an exceptions when allocations are performed, and tracks allocations and deallocations. The exceptions thrown at allocations simulate memoryallocation failures; the tracking mechanism checks for memoryrelated bugs (e.g., resource leaks and multiple deallocations). Both this library's containers and the containers' valuetypes are configured to use this allocator.
For granularity, the test is split into the several sources, each checking only some containers.
For more details, consult the files in
testsuite/ext/pb_ds/regression
.
This test inserts a number of values with keys from an
arbitrary text ([biblio.wickland96thirty]) into a container,
then performs a series of finds using
find
. It measures the average
time for find
as a function of
the number of values inserted.
It uses the test file:
performance/ext/pb_ds/text_find_timing_test.cc
And uses the data file:
filethirty_years_among_the_dead_preproc.txt
The test checks the effect of different rangehashing functions, trigger policies, and cachehashing policies.
The graphic below show the results for the native
and collisionchaining hash types the function
applied being a text find timing test using
find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_sth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
cc_hash_mask_exp_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

In this setting, the rangehashing scheme affects performance more than other policies. As the results show, containers using modbased rangehashing (including the native hashbased container, which is currently hardwired to this scheme) have lower performance than those using maskbased rangehashing. A modulobased rangehashing scheme's main benefit is that it takes into account all hashvalue bits. Standard string hashfunctions are designed to create hash values that are nearlyuniform as is ([biblio.knuth98sorting]).
Trigger policies, i.e. the loadchecks constants, affect performance to a lesser extent.
Perhaps surprisingly, storing the hash value alongside each
entry affects performance only marginally, at least in this
library's implementation. (Unfortunately, it was not possible to run
the tests with std::tr1::unordered_map
's
cache_hash_code = true
, as it appeared to
malfuntion.)
This test inserts a number of values with uniform
integer keys into a container, then performs a series of finds
using find
. It measures the average time
for find
as a function of the number of values
inserted.
It uses the test file:
performance/ext/pb_ds/random_int_find_timing.cc
The test checks the effect of different underlying hashtables, rangehashing functions, and trigger policies.
There are two sets of results for this type, one for collisionchaining hashes, and one for generalprobe hashes.
The first graphic below shows the results for the native and
collisionchaining hash types. The function applied being a random
integer timing test using find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mod_prime_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
cc_hash_mask_exp_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

And the second graphic shows the results for the native and
generalprobe hash types. The function applied being a random
integer timing test using find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
gp_hash_mod_quadp_prime_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Probe_Fn

quadratic_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
gp_hash_mask_linp_exp_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Probe_Fn

linear_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

In this setting, the choice of underlying hashtable affects performance most, then the rangehashing scheme and, only finally, other policies.
When comparing probing and chaining containers, it is
apparent that the probing containers are less efficient than the
collisionchaining containers (
std::tr1::unordered_map
uses
collisionchaining) in this case.
HashBased Integer Subscript Insert Timing Test shows a different case, where the situation is reversed;
Within each type of hashtable, the rangehashing scheme
affects performance more than other policies; HashBased Text
find
Find Timing Test also shows this. In the
above graphics should be noted that
std::tr1::unordered_map
are hardwired
currently to modbased schemes.
This test inserts a number of values with uniform
integer keys into a container, then performs a series of finds
using operator[]
. It measures the average time
for operator[]
as a function of the number of
values inserted.
It uses the test file:
performance/ext/pb_ds/random_int_subscript_find_timing.cc
The test checks the effect of different underlying hashtables, rangehashing functions, and trigger policies.
There are two sets of results for this type, one for collisionchaining hashes, and one for generalprobe hashes.
The first graphic below shows the results for the native
and collisionchaining hash types, using as the function
applied an integer subscript timing test with
find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mod_prime_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
cc_hash_mask_exp_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

And the second graphic shows the results for the native and
generalprobe hash types. The function applied being a random
integer timing test using find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
gp_hash_mod_quadp_prime_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Probe_Fn

quadratic_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
gp_hash_mask_linp_exp_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Probe_Fn

linear_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of values with uniform i.i.d.
integer keys into a container, using
operator[]
. It measures the average time for
operator[]
as a function of the number of
values inserted.
It uses the test file:
performance/ext/pb_ds/random_int_subscript_insert_timing.cc
The test checks the effect of different underlying hashtables.
There are two sets of results for this type, one for collisionchaining hashes, and one for generalprobe hashes.
The first graphic below shows the results for the native
and collisionchaining hash types, using as the function
applied an integer subscript timing test with
insert
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mod_prime_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
cc_hash_mask_exp_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

And the second graphic shows the results for the native and
generalprobe hash types. The function applied being a random
integer timing test using find
.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
gp_hash_mod_quadp_prime_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Probe_Fn

quadratic_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
gp_hash_mask_linp_exp_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Probe_Fn

linear_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

In this setting, as in HashBased Text
find
Find Timing test and HashBased
Integer find
Find Timing test , the choice
of underlying hashtable underlying hashtable affects performance
most, then the rangehashing scheme, and
finally any other policies.
There are some differences, however:
In this setting, probing tables function sometimes more efficiently than collisionchaining tables. This is explained shortly.
The performance graphs have a "sawtooth" shape. The average insert time rises and falls. As values are inserted into the container, the load factor grows larger. Eventually, a resize occurs. The reallocations and rehashing are relatively expensive. After this, the load factor is smaller than before.
Collisionchaining containers use indirection for greater flexibility; probing containers store values contiguously, in an array (see Figure Motivation::Different underlying data structures A and B, respectively). It follows that for simple data types, probing containers access their allocator less frequently than collisionchaining containers, (although they still have less efficient probing sequences). This explains why some probing containers fare better than collisionchaining containers in this case.
Within each type of hashtable, the rangehashing scheme affects
performance more than other policies. This is similar to the
situation in HashBased Text
find
Find Timing Test and HashBased
Integer find
Find Timing Test.
Unsurprisingly, however, containers with lower α_{max} perform worse in this case,
since more rehashes are performed.
This test inserts a number of values with a markedly
nonuniform integer keys into a container, then performs
a series of finds using find
. It measures the average
time for find
as a function of the number of values in
the containers. The keys are generated as follows. First, a
uniform integer is created. Then it is then shifted left 8 bits.
It uses the test file:
performance/ext/pb_ds/hash_zlob_random_int_find_timing.cc
The test checks the effect of different rangehashing functions and trigger policies.
The graphic below show the results for the native, collisionchaining, and generalprobing hash types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
gp_hash_mod_quadp_prime_1div2_nsth_map  
gp_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Probe_Fn

quadratic_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

In this setting, the distribution of keys is so skewed that
the underlying hashtable type affects performance marginally.
(This is in contrast with HashBased Text
find
Find Timing Test, HashBased
Integer find
Find Timing Test, HashBased
Integer Subscript Find Timing Test and HashBased
Integer Subscript Insert Timing Test.)
The rangehashing scheme affects performance dramatically. A
maskbased rangehashing scheme effectively maps all values
into the same bucket. Access degenerates into a search within
an unordered linkedlist. In the graphic above, it should be noted that
std::tr1::unordered_map
is hardwired currently to modbased and maskbased schemes,
respectively.
When observing the settings of this test, it is apparent
that the keys' distribution is far from natural. One might ask
if the test is not contrived to show that, in some cases,
modbased range hashing does better than maskbased range
hashing. This is, in fact just the case. A
more natural case in which modbased range hashing is better was not encountered.
Thus the inescapable conclusion: reallife key distributions are handled better
with an appropriate hash function and a maskbased
rangehashing function. (pb_ds/example/hash_shift_mask.cc
shows an example of handling this apriori known skewed
distribution with a maskbased rangehashing function). If hash
performance is bad, a χ^{2} test can be used
to check how to transform it into a more uniform
distribution.
For this reason, this library's default rangehashing function is maskbased.
This test inserts a number of uniform integer keys into a container, then erases all keys except one. It measures the final size of the container.
It uses the test file:
performance/ext/pb_ds/hash_random_int_erase_mem_usage.cc
The test checks how containers adjust internally as their logical size decreases.
The graphic below show the results for the native, collisionchaining, and generalprobing hash types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details 

n_hash_map_ncah  
std::tr1::unordered_map

cache_hash_code

false
 
cc_hash_mod_prime_1div1_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mod_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_prime_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/1
 
cc_hash_mask_exp_1div2_nsth_map  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
gp_hash_mask_linp_exp_1div2_nsth_set  
gp_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Probe_Fn

linear_probe_fn
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The standard's hashbased containers act very differently than trees in this respect. When erasing numerous keys from an standard associativecontainer, the resulting memory user varies greatly depending on whether the container is treebased or hashbased. This is a fundamental consequence of the standard's interface for associative containers, and it is not due to a specific implementation.
This test inserts a number of values with keys from an arbitrary
text ([ wickland96thirty ]) into a container
using insert
. It measures the average time
for insert
as a function of the number of
values inserted.
The test checks the effect of different underlying data structures.
It uses the test file:
performance/ext/pb_ds/tree_text_insert_timing.cc
The three graphics below show the results for the native tree and this library's nodebased trees, the native tree and this library's vectorbased trees, and the native tree and this library's PATRICIAtrie, respectively.
The graphic immediately below shows the results for the native tree type and several nodebased tree types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_map  
std::map
 
splay_tree_map  
tree

Tag

splay_tree_tag

Node_update

null_node_update
 
rb_tree_map  
tree

Tag

rb_tree_tag

Node_update

null_node_update

The graphic below shows the results for the native tree type and a vectorbased tree type.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_map  
std::map
 
ov_tree_map  
tree

Tag

ov_tree_tag

Node_update

null_node_update

The graphic below shows the results for the native tree type and a PATRICIA trie type.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_map  
std::map
 
pat_trie_map  
tree

Tag

pat_trie_tag

Node_update

null_node_update

Observing the first graphic implies that for this setting, a splay tree
(tree
with Tag
= splay_tree_tag
) does not do
well. See also the BranchBased
Text find
Find Timing Test. The two
redblack trees perform better.
Observing the second graphic, an orderedvector tree
(tree
with Tag
= ov_tree_tag
) performs
abysmally. Inserting into this type of tree has linear complexity
[ austern00noset].
Observing the third and last graphic, A PATRICIA trie
(trie
with Tag
= pat_trie_tag
) has abysmal
performance, as well. This is not that surprising, since a
largefanout PATRICIA trie works like a hash table with
collisions resolved by a subtrie. Each time a collision is
encountered, a new "hashtable" is built A large fanout PATRICIA
trie, however, doe does well in lookups (see BranchBased
Text find
Find Timing Test). It may be
beneficial in semistatic settings.
This test inserts a number of values with keys from an
arbitrary text ([wickland96thirty]) into
a container, then performs a series of finds using
find
. It measures the average time
for find
as a function of the number of
values inserted.
It uses the test file:
performance/ext/pb_ds/text_find_timing.cc
The test checks the effect of different underlying data structures.
The graphic immediately below shows the results for the native tree type and several other tree types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_map  
std::map
 
splay_tree_map  
tree

Tag

splay_tree_tag

Node_Update

null_node_update
 
rb_tree_map  
tree

Tag

rb_tree_tag

Node_Update

null_node_update
 
ov_tree_map  
tree

Tag

ov_tree_tag

Node_Update

null_node_update
 
pat_trie_map  
tree

Tag

pat_trie_tag

Node_Update

null_node_update

For this setting, a splay tree (tree
with Tag
= splay_tree_tag
) does not do
well. This is possibly due to two reasons:
A splay tree is not guaranteed to be balanced [motwani95random]. If a splay tree contains n nodes, its average rootleaf path can be m >> log(n).
Assume a specific rootleaf search path has length m, and the searchtarget node has distance m' from the root. A redblack tree will require m + 1 comparisons to find the required node; a splay tree will require 2 m' comparisons. A splay tree, consequently, can perform many more comparisons than a redblack tree.
An orderedvector tree (tree
with Tag
= ov_tree_tag
), a redblack
tree (tree
with Tag
= rb_tree_tag
), and the
native redblack tree all share approximately the same
performance.
An orderedvector tree is slightly slower than redblack trees, since it requires, in order to find a key, more math operations than they do. Conversely, an orderedvector tree requires far lower space than the others. ([austern00noset], however, seems to have an implementation that is also faster than a redblack tree).
A PATRICIA trie (trie
with Tag
= pat_trie_tag
) has good
lookup performance, due to its large fanout in this case. In
this setting, a PATRICIA trie has lookup performance comparable
to a hash table (see HashBased Text
find
Timing Test), but it is order
preserving. This is not that surprising, since a largefanout
PATRICIA trie works like a hash table with collisions resolved
by a subtrie. A largefanout PATRICIA trie does not do well on
modifications (see TreeBased and TrieBased
Text Insert Timing Test). Therefore, it is possibly beneficial in
semistatic settings.
This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
a container, then performs a series of finds using
find
. It is different than TreeBased and
TrieBased Text find
Find Timing Test in the
sequence of finds it performs: this test performs multiple
find
s on the same key before moving on to the next
key. It measures the average time for find
as a
function of the number of values inserted.
It uses the test file:
performance/ext/pb_ds/tree_text_lor_find_timing.cc
The test checks the effect of different underlying data structures in a localityofreference setting.
The graphic immediately below shows the results for the native tree type and several other tree types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_map  
std::map
 
splay_tree_map  
tree

Tag

splay_tree_tag

Node_Update

null_node_update
 
rb_tree_map  
tree

Tag

rb_tree_tag

Node_Update

null_node_update
 
ov_tree_map  
tree

Tag

ov_tree_tag

Node_Update

null_node_update
 
pat_trie_map  
tree

Tag

pat_trie_tag

Node_Update

null_node_update

For this setting, an orderedvector tree
(tree
with Tag
= ov_tree_tag
), a redblack tree
(tree
with Tag
= rb_tree_tag
), and the native redblack
tree all share approximately the same performance.
A splay tree (tree
with Tag
= splay_tree_tag
) does
much better, since each (successful) find "bubbles" the
corresponding node to the root of the tree.
This test a container, inserts into a number of values, splits
the container at the median, and joins the two containers. (If the
containers are one of this library's trees,
it splits and joins with the split
and
join
method; otherwise, it uses the erase
and
insert
methods.) It measures the time for splitting
and joining the containers as a function of the number of
values inserted.
It uses the test file:
performance/ext/pb_ds/tree_split_join_timing.cc
The test checks the performance difference of join
as opposed to a sequence of insert
operations; by
implication, this test checks the most efficient way to erase a
subsequence from a treelikebased container, since this can
always be performed by a small sequence of splits and joins.
The graphic immediately below shows the results for the native tree type and several other tree types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_set  
std::set
 
splay_tree_set  
tree

Tag

splay_tree_tag

Node_Update

null_node_update
 
rb_tree_set  
tree

Tag

rb_tree_tag

Node_Update

null_node_update
 
ov_tree_set  
tree

Tag

ov_tree_tag

Node_Update

null_node_update
 
pat_trie_map  
tree

Tag

pat_trie_tag

Node_Update

null_node_update

In this test, the native redblack trees must be split and
joined externally, through a sequence of erase
and
insert
operations. This is clearly
superlinear, and it is not that surprising that the cost is
high.
This library's treebased containers use in this test the
split
and join
methods,
which have lower complexity: the join
method
of a splay tree (tree
with Tag
= splay_tree_tag
) is quadratic in the
length of the longest rootleaf path, and linear in the total
number of elements; the join
method of a
redblack tree (tree
with Tag
= rb_tree_tag
) or an orderedvector tree
(tree
with Tag
= ov_tree_tag
) is linear in the number of
elements.
Asides from orders of growth, this library's trees access their
allocator very little in these operations, and some of them do not
access it at all. This leads to lower constants in their
complexity, and, for some containers, to exceptionfree splits and
joins (which can be determined
via container_traits
).
It is important to note that split
and
join
are not esoteric methods  they are the most
efficient means of erasing a contiguous range of values from a
tree based container.
This test creates a container, inserts random integers into the
the container, and then checks the orderstatistics of the
container's values. (If the container is one of this
library's trees, it does this with
the order_of_key
method of
tree_order_statistics_node_update
; otherwise, it uses the find
method and
std::distance
.) It measures the average
time for such queries as a function of the number of values
inserted.
It uses the test file:
performance/ext/pb_ds/tree_order_statistics_timing.cc
The test checks the performance difference of policies based on nodeinvariant as opposed to a external functions.
The graphic immediately below shows the results for the native tree type and several other tree types.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_set  
std::set
 
splay_tree_ost_set  
tree

Tag

splay_tree_tag

Node_Update

tree_order_statistics_node_update
 
rb_tree_ost_set  
tree

Tag

rb_tree_tag

Node_Update

tree_order_statistics_node_update

In this test, the native redblack tree can support
orderstatistics queries only externally, by performing a
find
(alternatively, lower_bound
or
upper_bound
) and then using std::distance
.
This is clearly linear, and it is not that surprising that the
cost is high.
This library's treebased containers use in this test the
order_of_key
method of tree_order_statistics_node_update
.
This method has only linear complexity in the length of the
rootnode path. Unfortunately, the average path of a splay tree
(tree
with Tag =
splay_tree_tag
) can
be higher than logarithmic; the longest path of a redblack
tree (tree
with Tag =
rb_tree_tag
) is
logarithmic in the number of elements. Consequently, the splay
tree has worse performance than the redblack tree.
This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform i.i.d.integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key (see Motivation::Associative Containers::Alternative to Multiple Equivalent Keys). There are 400 distinct primary keys, and the ratio of secondary keys to primary keys ranges from 1 to 5.
The test measures the average findtime as a function of the
number of values inserted. For this library's containers, it
finds the secondary key from a container obtained from finding
a primary key. For the native multimaps, it searches a range
obtained using std::equal_range
on a primary key.
It uses the test file:
performance/ext/pb_ds/multimap_text_find_timing_small.cc
The test checks the findtime scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key. There are 400 distinct primary keys, and the ratio of secondary keys to primary keys ranges from 1 to 5.
The test measures the average findtime as a function of the
number of values inserted. For this library's containers, it
finds the secondary key from a container obtained from finding
a primary key. For the native multimaps, it searches a range
obtained using std::equal_range
on a primary key.
It uses the test file:
performance/ext/pb_ds/multimap_text_find_timing_large.cc
The test checks the findtime scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key. There are 400 distinct primary keys, and the ratio of secondary keys to primary keys ranges from 1 to 5.
The test measures the average inserttime as a function of
the number of values inserted. For this library's containers,
it inserts a primary key into the primary associative
container, then a secondary key into the secondary associative
container. For the native multimaps, it obtains a range using
std::equal_range
, and inserts a value only if it was
not contained already.
It uses the test file:
performance/ext/pb_ds/multimap_text_insert_timing_small.cc
The test checks the inserttime scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key. There are 400 distinct primary keys, and the ratio of secondary keys to primary keys ranges from 1 to 5.
The test measures the average inserttime as a function of
the number of values inserted. For this library's containers,
it inserts a primary key into the primary associative
container, then a secondary key into the secondary associative
container. For the native multimaps, it obtains a range using
std::equal_range
, and inserts a value only if it was
not contained already.
It uses the test file:
performance/ext/pb_ds/multimap_text_insert_timing_large.cc
The test checks the inserttime scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key. There are 100 distinct primary keys, and the ratio of secondary keys to primary keys ranges to about 20.
The test measures the memory use as a function of the number of values inserted.
It uses the test file:
performance/ext/pb_ds/multimap_text_insert_mem_usage_small.cc
The test checks the memory scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of pairs into a container. The first item of each pair is a string from an arbitrary text [wickland96thirty], and the second is a uniform integer. The container is a "multimap"  it considers the first member of each pair as a primary key, and the second member of each pair as a secondary key. There are 100 distinct primary keys, and the ratio of secondary keys to primary keys ranges to about 20.
The test measures the memory use as a function of the number of values inserted.
It uses the test file:
performance/ext/pb_ds/multimap_text_insert_mem_usage_large.cc
The test checks the memory scalability of different "multimap" designs.
The graphic below show the results for "multimaps" which use a treebased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_mmap  
std::multimap
 
rb_tree_mmap_lu_mtf_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
tree

Tag

rb_tree_tag
 
Node_Update

null_node_update
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

The graphic below show the results for "multimaps" which use a hashbased container for primary keys.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details  Parameter  Details  Parameter  Details 

n_hash_mmap  
std::tr1::unordered_multimap
 
rb_tree_mmap_lu_mtf_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

list_update

Update_Policy

lu_move_to_front_policy
 
rb_tree_mmap_cc_hash_mask_exp_1div2_nsth_set  
cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2
 
Mapped

cc_hash_table

Comb_Hash_Fn

direct_mask_range_hashing
 
Resize_Policy

hash_standard_resize_policy

Size_Policy

hash_exponential_size_policy
 
Trigger_Policy

hash_load_check_resize_trigger with
α_{min} = 1/8 and α_{max} = 1/2

This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
a container using push
. It measures the average time
for push
as a function of the number of values
pushed.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_push_timing.cc
The test checks the effect of different underlying data structures.
The two graphics below show the results for the native priority_queues and this library's priority_queues.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The graphic below shows the results for the binaryheap based native priority queues and this library's pairingheap priority_queue data structures.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

Pairing heaps (priority_queue
with
Tag
= pairing_heap_tag
)
are the most suited for sequences of push
and
pop
operations of nonprimitive types (e.g.
std::string
s). (See Priority Queue
Text push
and pop
Timing Test.) They are
less constrained than binomial heaps, e.g., and since
they are nodebased, they outperform binary heaps. (See
Priority
Queue Random Integer push
Timing Test for the case
of primitive types.)
The standard's priority queues do not seem to perform well in
this case: the std::vector
implementation needs to
perform a logarithmic sequence of string operations for each
operation, and the deque implementation is possibly hampered by
its need to manipulate a relativelycomplex type (deques
support a O(1) push_front
, even though it is
not used by std::priority_queue
.)
This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
a container using push
, then removes them using
pop
. It measures the average time for push
as a function of the number of values.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_push_pop_timing.cc
The test checks the effect of different underlying data structures.
The two graphics below show the results for the native priority_queues and this library's priority_queues.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairingheap priority_queue data structures.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue adapting std::vector

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

pairing_heap  
priority_queue

Tag

pairing_heap_tag

These results are very similar to Priority Queue Text
push
Timing Test. As stated there, pairing heaps
(priority_queue
with
Tag
= pairing_heap_tag
) are most suited
for push
and pop
sequences of nonprimitive types such as strings. Observing these
two tests, one can note that a pairing heap outperforms the others
in terms of push
operations, but equals
binary heaps (priority_queue
with
Tag
= binary_heap_tag
) if the number
of push
and pop
operations is equal. As the number of pop
operations is at most equal to the number
of push
operations, pairing heaps are better
in this case. See Priority Queue Random
Integer push
and pop
Timing Test for a case which is different.
This test inserts a number of values with integer keys
into a container using push
. It
measures the average time for push
as a
function of the number of values.
It uses the test file:
performance/ext/pb_ds/priority_queue_random_int_push_timing.cc
The test checks the effect of different underlying data structures.
The two graphics below show the results for the native priority_queues and this library's priority_queues.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The graphic below shows the results for the binaryheap based native priority queues and this library's priority_queue data structures.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue adapting std::vector

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

Binary heaps are the most suited for sequences of
push
and pop
operations of primitive types
(e.g. ints). They are less constrained
than any other type, and since it is very efficient to store
such types in arrays, they outperform even pairing heaps. (See
Priority
Queue Text push
Timing Test for the case of
nonprimitive types.)
This test inserts a number of values with integer keys
into a container using push
, then removes them
using pop
. It measures the average time for
push
and pop
as a function
of the number of values.
It uses the test file:
performance/ext/pb_ds/priority_queue_random_int_push_pop_timing.cc
The test checks the effect of different underlying data structures.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

Binary heaps are the most suited for sequences of
push
and pop
operations of primitive types
(e.g. ints). This is explained in
Priority
Queue Random Int push
Timing Test. (See Priority Queue
Text push
Timing Test for the case of primitive
types.)
At first glance it seems that the standard's vectorbased priority queue is approximately on par with this library's corresponding priority queue. There are two differences however:
The standard's priority queue does not downsize the underlying
vector (or deque) as the priority queue becomes smaller
(see Priority Queue
Text pop
Memory Use Test). It is therefore
gaining some speed at the expense of space.
From Priority Queue Random
Integer push
and pop
Timing Test, it seems that the standard's priority queue is
slower in terms of push
operations. Since
the number of
pop
operations is at most that of push
operations, the test here is the "best" for the standard's
priority queue.
This test inserts a number of values with keys from an arbitrary text ([ wickland96thirty ]) into a container, then pops them until only one is left in the container. It measures the memory use as a function of the number of values pushed to the container.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_pop_mem_usage.cc
The test checks the effect of different underlying data structures.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The priority queue implementations (excluding the standard's) use memory proportionally to the number of values they hold: nodebased implementations (e.g., a pairing heap) do so naturally; this library's binary heap deallocates memory when a certain lower threshold is exceeded.
Note from Priority Queue Text push
and pop
Timing Test and Priority Queue
Random Integer push
and pop
Timing Test that this does not
impede performance compared to the standard's priority
queues.
See HashBased Erase Memory Use Test for a similar phenomenon regarding priority queues.
This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
two containers, then merges the containers. It uses
join
for this library's priority queues; for
the standard's priority queues, it successively pops values from
one container and pushes them into the other. The test measures
the average time as a function of the number of values.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_join_timing.cc
The test checks the effect of different underlying data structures.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

In this test the nodebased heaps perform join
in
either logarithmic or constant time. The binary heap requires
linear time, since the wellknown heapify algorithm [clrs2001] is linear.
It would be possible to apply the heapify algorithm to the
standard containers, if they would support iteration (which they
don't). Barring iterators, it is still somehow possible to perform
lineartime merge on a std::vector
based
standard priority queue, using top()
and size()
(since they are enough to expose
the underlying array), but this is impossible for
a std::deque
based standard priority queue.
Without heapify, the cost is superlinear.
This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
into a container then modifies each one "up" (i.e., it
makes it larger). It uses modify
for this library's
priority queues; for the standard's priority queues, it pops values
from a container until it reaches the value that should be
modified, then pushes values back in. It measures the average
time for modify
as a function of the number of
values.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_modify_up_timing.cc
The test checks the effect of different underlying data structures for graph algorithms settings. Note that making an arbitrary value larger (in the sense of the priority queue's comparison functor) corresponds to decreasekey in standard graph algorithms [clrs2001].
The two graphics below show the results for the native priority_queues and this library's priority_queues.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairing and thin heap priority_queue data structures.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

As noted above, increasing an arbitrary value (in the sense of
the priority queue's comparison functor) is very common in
graphrelated algorithms. In this case, a thin heap
(priority_queue
with
Tag
= thin_heap_tag
)
outperforms a pairing heap (priority_queue
with
Tag
= pairing_heap_tag
).
Conversely, Priority Queue Text
push
Timing Test, Priority Queue
Text push
and pop
Timing Test, Priority
Queue Random Integer push
Timing Test, and
Priority
Queue Random Integer push
and pop
Timing
Test show that the situation is reversed for other
operations. It is not clear when to prefer one of these two
different types.
In this test this library's binary heaps
effectively perform modify in linear time. As explained in
Priority Queue Design::Traits, given a valid pointtype iterator,
a binary heap can perform
modify
logarithmically. The problem is that binary
heaps invalidate their find iterators with each modifying
operation, and so the only way to obtain a valid pointtype
iterator is to iterate using a rangetype iterator until
finding the appropriate value, then use the rangetype iterator
for the modify
operation.
The explanation for the standard's priority queues' performance
is similar to that in Priority Queue Text
join
Timing Test.
This test inserts a number of values with keys from an
arbitrary text ([ wickland96thirty ]) into
into a container then modifies each one "down" (i.e., it
makes it smaller). It uses modify
for this library's
priority queues; for the standard's priority queues, it pops values
from a container until it reaches the value that should be
modified, then pushes values back in. It measures the average
time for modify
as a function of the number of
values.
It uses the test file:
performance/ext/pb_ds/priority_queue_text_modify_down_timing.cc
The main purpose of this test is to contrast Priority Queue
Text modify
Up Timing Test.
The two graphics below show the results for the native priority_queues and this library's priority_queues.
The graphic immediately below shows the results for the native priority_queue type instantiated with different underlying container types versus several different versions of library's priority_queues.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

n_pq_vector  
std::priority_queue

Sequence

std::vector

n_pq_deque  
std::priority_queue

Sequence

std::deque

binary_heap  
priority_queue

Tag

binary_heap_tag

binomial_heap  
priority_queue

Tag

binomial_heap_tag

rc_binomial_heap  
priority_queue

Tag

rc_binomial_heap_tag

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

The graphic below shows the results for the native priority queues and this library's pairing and thin heap priority_queue data structures.
The abbreviated names in the legend of the graphic above are instantiated with the types in the following table.
Name/Instantiating Type  Parameter  Details 

thin_heap  
priority_queue

Tag

thin_heap_tag

pairing_heap  
priority_queue

Tag

pairing_heap_tag

Most points in these results are similar to Priority Queue
Text modify
Up Timing Test.
It is interesting to note, however, that as opposed to that
test, a thin heap (priority_queue
with
Tag
= thin_heap_tag
) is
outperformed by a pairing heap (priority_queue
with
Tag
= pairing_heap_tag
).
In this case, both heaps essentially perform an erase
operation followed by a push
operation. As the other
tests show, a pairing heap is usually far more efficient than a
thin heap, so this is not surprising.
Most algorithms that involve priority queues increase values
(in the sense of the priority queue's comparison functor), and
so Priority Queue
Text modify
Up Timing Test  is more interesting
than this test.
In general, hashbased containers have better timing performance than containers based on different underlyingdata structures. The main reason to choose a treebased or triebased container is if a byproduct of the treelike structure is required: either orderpreservation, or the ability to utilize node invariants. If memoryuse is the major factor, an orderedvector tree gives optimal results (albeit with high modificiation costs), and a listbased container gives reasonable results.
Hashbased containers are typically either collision chaining or probing. Collisionchaining containers are more flexible internally, and so offer better timing performance. Probing containers, if used for simple valuetypes, manage memory more efficiently (they perform far fewer allocationrelated calls). In general, therefore, a collisionchaining table should be used. A probing container, conversely, might be used efficiently for operations such as eliminating duplicates in a sequence, or counting the number of occurrences within a sequence. Probing containers might be more useful also in multithreaded applications where each thread manipulates a hashbased container: in the standard, allocators have classwise semantics (see [meyers96more]  Item 10); a probing container might incur less contention in this case.
In hashbased containers, the rangehashing scheme seems to affect performance more than other considerations. In most settings, a maskbased scheme works well (or can be made to work well). If the keydistribution can be estimated apriori, a simple hash function can produce nearly uniform hashvalue distribution. In many other cases (e.g., text hashing, floatingpoint hashing), the hash function is powerful enough to generate hash values with good uniformity properties [knuth98sorting]; a modulobased scheme, taking into account all bits of the hash value, appears to overlap the hash function in its effort.
The rangehashing scheme determines many of the other policies. A maskbased scheme works well with an exponentialsize policy; for probingbased containers, it goes well with a linearprobe function.
An orthogonal consideration is the trigger policy. This presents difficult tradeoffs. E.g., different load factors in a loadcheck trigger policy yield a space/amortizedcost tradeoff.
In general, there are several families of treebased underlying data structures: balanced nodebased trees (e.g., redblack or AVL trees), highprobability balanced nodebased trees (e.g., random treaps or skiplists), competitive nodebased trees (e.g., splay trees), vectorbased "trees", and tries. (Additionally, there are diskresiding or networkresiding trees, such as BTrees and their numerous variants. An interface for this would have to deal with the execution model and ACID guarantees; this is out of the scope of this library.) Following are some observations on their application to different settings.
Of the balanced nodebased trees, this library includes a redblack tree, as does standard (in practice). This type of tree is the "workhorse" of treebased containers: it offers both reasonable modification and reasonable lookup time. Unfortunately, this data structure stores a huge amount of metadata. Each node must contain, besides a value, three pointers and a boolean. This type might be avoided if space is at a premium [austern00noset].
Highprobability balanced nodebased trees suffer the drawbacks of deterministic balanced trees. Although they are fascinating data structures, preliminary tests with them showed their performance was worse than redblack trees. The library does not contain any such trees, therefore.
Competitive nodebased trees have two drawbacks. They are
usually somewhat unbalanced, and they perform a large number of
comparisons. Balanced trees perform one comparison per each
node they encounter on a search path; a splay tree performs two
comparisons. If the keys are complex objects, e.g.,
std::string
, this can increase the running time.
Conversely, such trees do well when there is much locality of
reference. It is difficult to determine in which case to prefer
such trees over balanced trees. This library includes a splay
tree.
Orderedvector trees use very little space [austern00noset]. They do not have any other advantages (at least in this implementation).
Largefanout PATRICIA tries have excellent lookup performance, but they do so through maintaining, for each node, a miniature "hashtable". Their space efficiency is low, and their modification performance is bad. These tries might be used for semistatic settings, where order preservation is important. Alternatively, redblack trees crossreferenced with hash tables can be used. [okasaki98mereable] discusses smallfanout PATRICIA tries for integers, but the cited results seem to indicate that the amortized cost of maintaining such trees is higher than that of balanced trees. Moderatefanout trees might be useful for sequences where each element has a limited number of choices, e.g., DNA strings.
Different mapping semantics were discussed in the introduction and design sections.Here the focus will be on the case where a keys can be composed into primary keys and secondary keys. (In the case where some keys are completely identical, it is trivial that one should use an associative container mapping values to size types.) In this case there are (at least) five possibilities:
Use an associative container that allows equivalentkey
values (such as std::multimap
)
Use a uniquekey value associative container that maps each primary key to some complex associative container of secondary keys, say a treebased or hashbased container.
Use a uniquekey value associative container that maps each primary key to some simple associative container of secondary keys, say a listbased container.
Use a uniquekey value associative container that maps
each primary key to some nonassociative container
(e.g., std::vector
)
Use a uniquekey value associative container that takes into account both primary and secondary keys.
Stated simply: there is a simple answer for this. (Excluding option 1, which should be avoided in all cases).
If the expected ratio of secondary keys to primary keys is small, then 3 and 4 seem reasonable. Both types of secondary containers are relatively lightweight (in terms of memory use and construction time), and so creating an entire container object for each primary key is not too expensive. Option 4 might be preferable to option 3 if changing the secondary key of some primary key is frequent  one cannot modify an associative container's key, and the only possibility, therefore, is erasing the secondary key and inserting another one instead; a nonassociative container, conversely, can support inplace modification. The actual cost of erasing a secondary key and inserting another one depends also on the allocator used for secondary associativecontainers (The tests above used the standard allocator, but in practice one might choose to use, e.g., [boost_pool]). Option 2 is definitely an overkill in this case. Option 1 loses out either immediately (when there is one secondary key per primary key) or almost immediately after that. Option 5 has the same drawbacks as option 2, but it has the additional drawback that finding all values whose primary key is equivalent to some key, might be linear in the total number of values stored (for example, if using a hashbased container).
If the expected ratio of secondary keys to primary keys is large, then the answer is more complicated. It depends on the distribution of secondary keys to primary keys, the distribution of accesses according to primary keys, and the types of operations most frequent.
To be more precise, assume there are m primary keys, primary key i is mapped to n_{i} secondary keys, and each primary key is mapped, on average, to n secondary keys (i.e., E(n_{i}) = n).
Suppose one wants to find a specific pair of primary and
secondary keys. Using 1 with a tree based container
(std::multimap
), the expected cost is
E(Θ(log(m) + n_{i})) = Θ(log(m) +
n); using 1 with a hashbased container
(std::tr1::unordered_multimap
), the expected cost is
Θ(n). Using 2 with a primary hashbased container
and secondary hashbased containers, the expected cost is
O(1); using 2 with a primary treebased container and
secondary treebased containers, the expected cost is (using
the Jensen inequality [motwani95random])
E(O(log(m) + log(n_{i})) = O(log(m)) +
E(O(log(n_{i})) = O(log(m)) + O(log(n)),
assuming that primary keys are accessed equiprobably. 3 and 4
are similar to 1, but with lower constants. Using 5 with a
hashbased container, the expected cost is O(1); using 5
with a tree based container, the cost is
E(Θ(log(mn))) = Θ(log(m) +
log(n)).
Suppose one needs the values whose primary key matches some given key. Using 1 with a hashbased container, the expected cost is Θ(n), but the values will not be ordered by secondary keys (which may or may not be required); using 1 with a treebased container, the expected cost is Θ(log(m) + n), but with high constants; again the values will not be ordered by secondary keys. 2, 3, and 4 are similar to 1, but typically with lower constants (and, additionally, if one uses a treebased container for secondary keys, they will be ordered). Using 5 with a hashbased container, the cost is Θ(mn).
Suppose one wants to assign to a primary key all secondary keys assigned to a different primary key. Using 1 with a hashbased container, the expected cost is Θ(n), but with very high constants; using 1 with a treebased container, the cost is Θ(nlog(mn)). Using 2, 3, and 4, the expected cost is Θ(n), but typically with far lower costs than 1. 5 is similar to 1.
The following table shows the complexities of the different
underlying data structures in terms of orders of growth. It is
interesting to note that this table implies something about the
constants of the operations as well (see Amortized push
and pop
operations).
push  pop  modify  erase  join  

std::priority_queue
 Θ(n) worst Θ(log(n)) amortized  Θ(log(n)) Worst  Θ(n log(n)) Worst _{[std note 1]}  Θ(n log(n)) _{[std note 2]}  Θ(n log(n)) _{[std note 1]} 
priority_queue
<Tag =
pairing_heap_tag >
 O(1)  Θ(n) worst Θ(log(n)) amortized  Θ(n) worst Θ(log(n)) amortized  Θ(n) worst Θ(log(n)) amortized  O(1) 
priority_queue
<Tag =
binary_heap_tag >
 Θ(n) worst Θ(log(n)) amortized  Θ(n) worst Θ(log(n)) amortized  Θ(n)  Θ(n)  Θ(n) 
priority_queue
<Tag =
binomial_heap_tag >
 Θ(log(n)) worst O(1) amortized  Θ(log(n))  Θ(log(n))  Θ(log(n))  Θ(log(n)) 
priority_queue
<Tag =
rc_binomial_heap_tag >
 O(1)  Θ(log(n))  Θ(log(n))  Θ(log(n))  Θ(log(n)) 
priority_queue <Tag =
thin_heap_tag >
 O(1)  Θ(n) worst Θ(log(n)) amortized  Θ(log(n)) worst O(1) amortized, or Θ(log(n)) amortized _{[thin_heap_note]}  Θ(n) worst Θ(log(n)) amortized  Θ(n) 
[std note 1] This
is not a property of the algorithm, but rather due to the fact
that the standard's priority queue implementation does not support
iterators (and consequently the ability to access a specific
value inside it). If the priority queue is adapting an
std::vector
, then it is still possible to reduce this
to Θ(n) by adapting over the standard's adapter and
using the fact that top
returns a reference to the
first value; if, however, it is adapting an
std::deque
, then this is impossible.
[std note 2] As
with [std note 1], this is not a
property of the algorithm, but rather the standard's implementation.
Again, if the priority queue is adapting an
std::vector
then it is possible to reduce this to
Θ(n), but with a very high constant (one must call
std::make_heap
which is an expensive linear
operation); if the priority queue is adapting an
std::deque
, then this is impossible.
[thin_heap_note] A thin heap has
Θ(log(n)) worst case modify
time
always, but the amortized time depends on the nature of the
operation: I) if the operation increases the key (in the sense
of the priority queue's comparison functor), then the amortized
time is O(1), but if II) it decreases it, then the
amortized time is the same as the worst case time. Note that
for most algorithms, I) is important and II) is not.
In many cases, a priority queue is needed primarily for
sequences of push
and pop
operations. All of
the underlying data structures have the same amortized
logarithmic complexity, but they differ in terms of
constants.
The table above shows that the different data structures are
"constrained" in some respects. In general, if a data structure
has lower worstcase complexity than another, then it will
perform slower in the amortized sense. Thus, for example a
redundantcounter binomial heap (priority_queue
with
Tag
= rc_binomial_heap_tag
)
has lower worstcase push
performance than a binomial
heap (priority_queue
with Tag
= binomial_heap_tag
),
and so its amortized push
performance is slower in
terms of constants.
As the table shows, the "least constrained" underlying data structures are binary heaps and pairing heaps. Consequently, it is not surprising that they perform best in terms of amortized constants.
Pairing heaps seem to perform best for nonprimitive
types (e.g., std::string
s), as shown by
Priority
Queue Text push
Timing Test and Priority
Queue Text push
and pop
Timing
Test
binary heaps seem to perform best for primitive types
(e.g., ints), as shown by Priority
Queue Random Integer push
Timing Test and
Priority
Queue Random Integer push
and pop
Timing
Test.
In some graph algorithms, a decreasekey operation is
required [clrs2001];
this operation is identical to modify
if a value is
increased (in the sense of the priority queue's comparison
functor). The table above and Priority Queue
Text modify
Up Timing Test show that a thin heap
(priority_queue
with
Tag
= thin_heap_tag
)
outperforms a pairing heap (priority_queue
with
Tag
= Tag
= pairing_heap_tag
),
but the rest of the tests show otherwise.
This makes it difficult to decide which implementation to use in
this case. Dijkstra's shortestpath algorithm, for example, requires
Θ(n) push
and pop
operations
(in the number of vertices), but O(n^{2})
modify
operations, which can be in practice Θ(n)
as well. It is difficult to find an apriori characterization of
graphs in which the actual number of modify
operations will dwarf the number of push
and
pop
operations.