Trie Maps¶
Example¶
The following code example illustrates how to populate a trie_map
instance and perform prefix searches. The later part of the example also
shows how you can create a packed version of the content for faster lookups
and reduced memory footprint.
#include <mdds/global.hpp> // for MDDS_ASCII
#include <mdds/trie_map.hpp>
#include <iostream>
using namespace std;
typedef mdds::trie_map<mdds::trie::std_string_trait, int> trie_map_type;
int main()
{
// Cities in North Carolina and their populations in 2013.
trie_map_type nc_cities;
// Insert key-value pairs.
nc_cities.insert(MDDS_ASCII("Charlotte"), 792862);
nc_cities.insert(MDDS_ASCII("Raleigh"), 431746);
nc_cities.insert(MDDS_ASCII("Greensboro"), 279639);
nc_cities.insert(MDDS_ASCII("Durham"), 245475);
nc_cities.insert(MDDS_ASCII("Winston-Salem"), 236441);
nc_cities.insert(MDDS_ASCII("Fayetteville"), 204408);
nc_cities.insert(MDDS_ASCII("Cary"), 151088);
nc_cities.insert(MDDS_ASCII("Wilmington"), 112067);
nc_cities.insert(MDDS_ASCII("High Point"), 107741);
nc_cities.insert(MDDS_ASCII("Greenville"), 89130);
nc_cities.insert(MDDS_ASCII("Asheville"), 87236);
nc_cities.insert(MDDS_ASCII("Concord"), 83506);
nc_cities.insert(MDDS_ASCII("Gastonia"), 73209);
nc_cities.insert(MDDS_ASCII("Jacksonville"), 69079);
nc_cities.insert(MDDS_ASCII("Chapel Hill"), 59635);
nc_cities.insert(MDDS_ASCII("Rocky Mount"), 56954);
nc_cities.insert(MDDS_ASCII("Burlington"), 51510);
nc_cities.insert(MDDS_ASCII("Huntersville"), 50458);
nc_cities.insert(MDDS_ASCII("Wilson"), 49628);
nc_cities.insert(MDDS_ASCII("Kannapolis"), 44359);
nc_cities.insert(MDDS_ASCII("Apex"), 42214);
nc_cities.insert(MDDS_ASCII("Hickory"), 40361);
nc_cities.insert(MDDS_ASCII("Goldsboro"), 36306);
cout << "Cities that start with 'Cha' and their populations:" << endl;
auto results = nc_cities.prefix_search(MDDS_ASCII("Cha"));
for (const trie_map_type::key_value_type& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
cout << "Cities that start with 'W' and their populations:" << endl;
results = nc_cities.prefix_search(MDDS_ASCII("W"));
for (const trie_map_type::key_value_type& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
// Create a compressed version of the container. It works nearly identically.
auto packed = nc_cities.pack();
cout << "Cities that start with 'C' and their populations:" << endl;
auto packed_results = packed.prefix_search(MDDS_ASCII("C"));
for (const trie_map_type::key_value_type& kv : packed_results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
// Individual search.
auto it = packed.find(MDDS_ASCII("Wilmington"));
cout << "Population of Wilmington: " << it->second << endl;
// You get an end position iterator when the container doesn't have the
// specified key.
it = packed.find(MDDS_ASCII("Asheboro"));
cout << "Population of Asheboro: ";
if (it == packed.end())
cout << "not found";
else
cout << it->second;
cout << endl;
return EXIT_SUCCESS;
}
One thing to note in the above example is the use of MDDS_ASCII
macro,
which expands a literal string definition into a literal string and its length
as two parameters. This macro comes in handy when you need to define a
literal and immediately pass it to a function that expects a pointer to a
string and its length.
You’ll get the following output when compiling the above code and executing it:
Cities that start with 'Cha' and their populations:
Chapel Hill: 59635
Charlotte: 792862
Cities that start with 'W' and their populations:
Wilmington: 112067
Wilson: 49628
Winston-Salem: 236441
Cities that start with 'C' and their populations:
Cary: 151088
Chapel Hill: 59635
Charlotte: 792862
Concord: 83506
Population of Wilmington: 112067
Population of Asheboro: not found
Here is a version that uses packed_trie_map
:
#include <mdds/global.hpp> // for MDDS_ASCII and MDDS_N_ELEMENTS
#include <mdds/trie_map.hpp>
#include <iostream>
using namespace std;
typedef mdds::packed_trie_map<mdds::trie::std_string_trait, int> trie_map_type;
int main()
{
// Entries must be known prior to creating the instance, and they must be
// sorted by the key in ascending order.
trie_map_type::entry entries[] = {
{ MDDS_ASCII("Apex"), 42214 },
{ MDDS_ASCII("Asheville"), 87236 },
{ MDDS_ASCII("Burlington"), 51510 },
{ MDDS_ASCII("Cary"), 151088 },
{ MDDS_ASCII("Chapel Hill"), 59635 },
{ MDDS_ASCII("Charlotte"), 792862 },
{ MDDS_ASCII("Concord"), 83506 },
{ MDDS_ASCII("Durham"), 245475 },
{ MDDS_ASCII("Fayetteville"), 204408 },
{ MDDS_ASCII("Gastonia"), 73209 },
{ MDDS_ASCII("Goldsboro"), 36306 },
{ MDDS_ASCII("Greensboro"), 279639 },
{ MDDS_ASCII("Greenville"), 89130 },
{ MDDS_ASCII("Hickory"), 40361 },
{ MDDS_ASCII("High Point"), 107741 },
{ MDDS_ASCII("Huntersville"), 50458 },
{ MDDS_ASCII("Jacksonville"), 69079 },
{ MDDS_ASCII("Kannapolis"), 44359 },
{ MDDS_ASCII("Raleigh"), 431746 },
{ MDDS_ASCII("Rocky Mount"), 56954 },
{ MDDS_ASCII("Wilmington"), 112067 },
{ MDDS_ASCII("Wilson"), 49628 },
{ MDDS_ASCII("Winston-Salem"), 236441 },
};
// Cities in North Carolina and their populations in 2013.
trie_map_type nc_cities(entries, MDDS_N_ELEMENTS(entries));
cout << "Cities that start with 'Cha' and their populations:" << endl;
auto results = nc_cities.prefix_search(MDDS_ASCII("Cha"));
for (const trie_map_type::key_value_type& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
cout << "Cities that start with 'W' and their populations:" << endl;
results = nc_cities.prefix_search(MDDS_ASCII("W"));
for (const trie_map_type::key_value_type& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
cout << "Cities that start with 'C' and their populations:" << endl;
results = nc_cities.prefix_search(MDDS_ASCII("C"));
for (const trie_map_type::key_value_type& kv : results)
{
cout << " " << kv.first << ": " << kv.second << endl;
}
// Individual search.
auto it = nc_cities.find(MDDS_ASCII("Wilmington"));
cout << "Population of Wilmington: " << it->second << endl;
// You get an end position iterator when the container doesn't have the
// specified key.
it = nc_cities.find(MDDS_ASCII("Asheboro"));
cout << "Population of Asheboro: ";
if (it == nc_cities.end())
cout << "not found";
else
cout << it->second;
cout << endl;
return EXIT_SUCCESS;
}
This code generates exactly the same output as the first example that uses
trie_map
. The only difference is that you need to provide
the list of entries pre-sorted prior to instantiating the map object.
This example uses another useful macro MDDS_N_ELEMENTS
which
computes the length of an array by dividing the size of the whole array by the
size of its first element. This macro is useful when the array definition is
given in the same compilation unit and therefore its size is known at the call
site where the macro is used.
API Reference¶
Trie Map¶
-
template <typename _KeyTrait, typename _ValueT>
classtrie_map
¶ Trie map provides storage for multiple key-value pairs where keys are either strings, or otherwise consist of arrays of characters. The keys are stored in an ordered tree structure known as trie, or sometimes referred to as prefix tree.
Public Types
-
typedef packed_trie_map<_KeyTrait, _ValueT>
packed_type
¶
-
typedef _KeyTrait
key_trait_type
¶
-
typedef key_trait_type::key_type
key_type
¶
-
typedef key_trait_type::key_buffer_type
key_buffer_type
¶
-
typedef key_trait_type::key_unit_type
key_unit_type
¶
-
typedef _ValueT
value_type
¶
-
typedef size_t
size_type
¶
-
typedef std::pair<key_type, value_type>
key_value_type
¶
Public Functions
-
trie_map
()¶ Default constructor.
-
const_iterator
begin
() const¶
-
const_iterator
end
() const¶
-
void
insert
(const key_type &key, const value_type &value)¶ Insert a new key-value pair.
- Parameters
key
: key value.value
: value to associate with the key.
-
void
insert
(const key_unit_type *key, size_type len, const value_type &value)¶ Insert a new key-value pair.
- Parameters
key
: pointer to the first character of a character array that stores key value.len
: length of the character array storing the key.value
: value to associate with the key.
-
bool
erase
(const key_unit_type *key, size_type len)¶ Erase a key and the value associated with it.
- Return
- true if a key is erased, false otherwise.
- Parameters
key
: pointer to the first character of a character array that stores key value.len
: length of the character array storing the key.
-
const_iterator
find
(const key_type &key) const¶ Find a value associated with a specified key.
- Return
- iterator that references a value associated with the key, or the end position in case the key is not found.
- Parameters
key
: key to match.
-
const_iterator
find
(const key_unit_type *input, size_type len) const¶ Find a value associated with a specified key.
- Return
- iterator that references a value associated with the key, or the end position in case the key is not found.
- Parameters
input
: pointer to an array whose value represents the key to match.len
: length of the matching key value.
-
search_results
prefix_search
(const key_type &prefix) const¶ Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.
- Return
- results object containing all matching key-value pairs.
- Parameters
prefix
: prefix to match.
-
search_results
prefix_search
(const key_unit_type *prefix, size_type len) const¶ Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.
- Return
- results object that contains all matching key-value pairs. The results are sorted by the key in ascending order.
- Parameters
prefix
: pointer to an array whose value represents the prefix to match.len
: length of the prefix value to match.
-
size_type
size
() const¶ Return the number of entries in the map.
- Return
- the number of entries in the map.
-
void
clear
()¶ Empty the container.
-
packed_type
pack
() const¶ Create a compressed and immutable version of the container which provides better search performance and requires much less memory footprint.
- Return
- an instance of mdds::packed_trie_map with the same content.
Friends
-
friend
mdds::trie_map::trie::detail::iterator_base< trie_map >
-
friend
mdds::trie_map::trie::detail::search_results< trie_map >
-
typedef packed_trie_map<_KeyTrait, _ValueT>
Packed Trie Map¶
-
template <typename _KeyTrait, typename _ValueT>
classpacked_trie_map
¶ An immutable trie container that packs its content into a contiguous array to achieve both space efficiency and lookup performance. The user of this data structure must provide a pre-constructed list of key-value entries that are sorted by the key in ascending order, or construct from an instance of mdds::trie_map.
Note that, since this container is immutable, the content of the container cannot be modified once constructed.
Public Types
-
typedef _KeyTrait
key_trait_type
¶
-
typedef key_trait_type::key_type
key_type
¶
-
typedef key_trait_type::key_buffer_type
key_buffer_type
¶
-
typedef key_trait_type::key_unit_type
key_unit_type
¶
-
typedef _ValueT
value_type
¶
-
typedef size_t
size_type
¶
-
typedef std::pair<key_type, value_type>
key_value_type
¶
-
typedef trie::detail::packed_iterator_base<packed_trie_map>
const_iterator
¶
-
typedef trie::detail::packed_search_results<packed_trie_map>
search_results
¶
Public Functions
-
packed_trie_map
()¶ Not implemented.
-
packed_trie_map
(const entry *entries, size_type entry_size)¶ Constructor that initializes the content from a static list of key-value entries. The caller must ensure that the key-value entries are sorted in ascending order, else the behavior is undefined.
- Parameters
entries
: pointer to the array of key-value entries.entry_size
: size of the key-value entry array.null_value
: null value to return when the find method fails to find a matching entry.
-
packed_trie_map
(const trie_map<key_trait_type, value_type> &other)¶ Constructor to allow construction of an instance from the content of a mdds::trie_map instance.
- Parameters
other
: mdds::trie_map instance to build content from.
-
const_iterator
begin
() const¶
-
const_iterator
end
() const¶
-
const_iterator
find
(const key_type &key) const¶ Find a value associated with a specified key.
- Return
- iterator that references a value associated with the key, or the end position in case the key is not found.
- Parameters
key
: key to match.
-
const_iterator
find
(const key_unit_type *input, size_type len) const¶ Find a value associated with a specified key.
- Return
- iterator that references a value associated with the key, or the end position in case the key is not found.
- Parameters
input
: pointer to an array whose value represents the key to match.len
: length of the matching key value.
-
search_results
prefix_search
(const key_type &prefix) const¶ Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.
- Return
- results object containing all matching key-value pairs.
- Parameters
prefix
: prefix to match.
-
search_results
prefix_search
(const key_unit_type *prefix, size_type len) const¶ Retrieve all key-value pairs whose keys start with specified prefix. You can also retrieve all key-value pairs by passing a null prefix and a length of zero.
- Return
- results object that contains all matching key-value pairs. The results are sorted by the key in ascending order.
- Parameters
prefix
: pointer to an array whose value represents the prefix to match.len
: length of the prefix value to match.
Friends
-
friend
mdds::packed_trie_map::trie::detail::packed_iterator_base< packed_trie_map >
-
friend
mdds::packed_trie_map::trie::detail::packed_search_results< packed_trie_map >
-
typedef _KeyTrait
String Trait¶
-
struct
std_string_trait
¶ String trait that uses std::string as the key type. This trait can be used with mdds::trie_map or mdds::packed_trie_map.
Public Types
-
typedef std::string
key_type
¶ type used to store a key value.
-
typedef std::string
key_buffer_type
¶ type used to build an intermediate key value, from which a final key value is to be created. It is expected to be an array structure whose content is contiguous in memory. Its elements must be of key_unit_type.
-
typedef char
key_unit_type
¶ type that represents a single character inside a key or a key buffer object. A key object is expected to store a series of elements of this type.
Public Static Functions
-
static key_buffer_type
to_key_buffer
(const key_unit_type *str, size_t length)¶ Function called to create and initialize a buffer object from a given initial key value.
- Return
- buffer object containing the specified key value.
- Parameters
str
: pointer to the first character of the key value.length
: length of the key value.
-
static key_buffer_type
to_key_buffer
(const key_type &key)¶ Function called to create and initialize a buffer object from a given initial key value.
- Return
- buffer object containing the specified key value.
- Parameters
key
: key value
-
static const key_unit_type *
buffer_data
(const key_buffer_type &buf)¶
-
static size_t
buffer_size
(const key_buffer_type &buf)¶
-
static void
push_back
(key_buffer_type &buffer, key_unit_type c)¶ Function called to append a single character to the end of a key buffer.
- Parameters
buffer
: buffer object to append character to.c
: character to append to the buffer.
-
static void
pop_back
(key_buffer_type &buffer)¶ Function called to remove a single character from the tail of an existing key buffer.
- Parameters
buffer
: buffer object to remove character from.
-
static key_type
to_key
(const key_buffer_type &buf)¶ Function called to create a final string object from an existing buffer.
- Return
- string object whose content is created from the buffer object.
- Parameters
buf
: buffer object to create a final string object from.
-
typedef std::string