Tree data structure tutorial 7. TRIE Data structure

In this chapter we shall look at following things:
7.1. Introduction to TRIE
7.2. Insertion 
7.3. Search
7.4. Auto complete

7.1. Introduction to TRIE

TRIE stands for reTrival. It is based on tree data structure, where a single node will store a single alphabet, and we can search for strings or words by traversing down a branch of the tree.
TRIE data structure is very efficient in storing a key value pair. The nodes in a path will act as the key and once the key is found, we get the value.
For example, in designing a phonebook, to retrieve phone number for a person, we use TRIE data structure. Here we can use Name of the person as key, and the mobile number as value.
When we are doing a google search, it will automatically suggest the search string. This feature employs TRIE implementation.

7.2 So how is TRIE represented?

Similar to tree data structure, TRIE will also have an empty root to start with.
The number of child nodes will always depend on the number of the total possible values.
Suppose we want to store words; we have 26 available alphabets. Hence each node will hold 26 possible values.

7.3 Visual representation of TRIE:

As we discussed above, each node will have placeholder for 26 letter in an alphabet.
TRIE
As you see in the image above, there is a root node to start with. Each node will have elements from A to Z. But It will store only 1 element.
Below is the image after inserting the words
Ant
All
Also
Bat
Ball
TRIE
From the image above, we can see that, each node will have 26 different places. And each node can hold one character.

So how the words will be inserted into TRIE?

In the above words “all” “also”, these 2 share the same prefix. Similarly, “bat” “ball” will share the same prefix. Hence trie will take this as an advantage, and we build a tree as shown in the image.
Now let us look at the implementation
The structure of the tree is as shown below:
struct node
{
	bool isEnd;
	struct node *child[ALPHABETS];
}*head
As you can see in the structure above, we have a child node that holds 26 characters. Then a Boolean variable to check if the node is the end node or not.
The working of insert, search and auto complete has been provided in the code.

TRIE implementation in C++

#include<iostream>
#include<string>
#include<vector>
#include<fstream>
#define ALPHABETS 26

using namespace std;

//structure to hold trie
struct node
{
	bool isEnd;
	struct node *child[ALPHABETS];
}*head;


//initialize the head node
void init()
{
	head = new node();
	head->isEnd = false;
}


//function to insert whole word
void insert(string word)
{
	if(word.empty())
		return;

	node *current = head;

	for(int i =0; i < word.length(); i++)
	{
		if(current->child[word[i] - 'a'] == NULL)
			current->child[word[i]-'a'] = new node();

		current = current->child[word[i]-'a'];
	}
	current->isEnd = true;           
}

//function to check if the word is present or not
bool search(string word)
{
	node *current = head;
	for(int i=0; i < word.length(); i++)
	{
		if(current->child[word[i]-'a'] == NULL)
			return false;
		else
			current = current->child[word[i]-'a'];
	}
	return current->isEnd;
}


//function to print all the words with a given prefix
void print_all(node *tmp, vector<char> word)
{
	if(tmp == NULL)
		return; 
	if(tmp->isEnd)
	{
		vector<char>::iterator it = word.begin();
		while(it != word.end())
		{
			cout << *it;
			++it;
		}
		cout << "\n";         
	}
	for(int i=0; i<26; i++)
	{
		if(tmp->child[i] != NULL)
		{         
			word.push_back((char)i+'a');
			print_all(tmp->child[i],word);
			word.pop_back();
		}
	}
}

//function to auto complete
void auto_complete(string prefix)
{ 
	vector<char> word;

	if(prefix.length() == 0)
	{
		cout << "please enter valid prefix " << "\n";  
		return;
	}   
	node *current = head;
	for(int i=0; i < prefix.length(); i++)
	{
		if(current->child[prefix[i]-'a'] == NULL)
		{
			cout << "prefix does not exist" << "\n";
			return;
		}
		word.push_back(prefix[i]);
		current = current->child[prefix[i]-'a'];
	}
	cout << "The words present with the prfix is as follows:" << "\n";    
	print_all(current,word);
}


int main()
{

    init();
	string s = "ajay";
	insert(s);
	s = "tea";
	insert(s);
	s = "bat";
	insert(s);
	s = "cat";
	insert(s);
	s = "ball";
	insert(s);
	s = "background";
	insert(s);
	s = "bachelorette";
	insert(s);
	

	if(search("ajay"))
		printf("found ajay\n");
	else
		printf("Not found ajay\n");

	if(search("background"))
		printf("found background\n");
	else
		printf("not found background\n");

	auto_complete("ba");

    return 0;
}

Output:

found ajay
found background
The words present with the prefix is as follows:
bachelorette
background
ball
bat
Write a Comment

Leave a Comment

Your email address will not be published. Required fields are marked *