# Aho-Corasick – implementation and animation

Today we will learn how to implement the Aho-Corasick algorithm. It is used for finding occurences of words in text and it is faster than other common algorithms. Code will be in ActionScript 3, but implementation in C, C++, C# or Java will be very similar.

You can also try a fullscreen version of an applett.

## Aho-Corasick algorithm

We can divide the algorithm into 2 steps:
1. constructing a finite state automaton
• constructing a Trie with input words
• finding the right fall function for each node of trie
2. moving throuhg automaton – reading each character from input, storing the results

In the visualization above, I have words “fast”, “sofa”, “so”, “take”. Searching text is in the bottom textfield. If an algorithm finds right words in some step, (more words can end in the same place), the nodes light up and the numbers by their indices are incremented.

## Implementation

### Trie

We learned how to implement a Trie in previous post about Trie. For our purpose, we add 5 more properties to the Node class:

   // pointer to parent node
public var parent:TNode;
public var fall:TNode; // pointer - fall function
// some unnecessary params for drawing
public var x:Number;
public var y:Number;
public var width:Number;


### Construction of fall edges

For constructing back edges (“fall” pointer), we need to do BFS (bredth-first search) on Trie and we need a Queue data structure for this. I used the queue from AS3DS package from PolygonLabs (“q” variable).

function MakeFallFunctions():void
{
trie.fall = trie; // fall function for the root is root
q.enqueue(trie);  // I add the root to the queue
while (q.size != 0) // BFS
{
var n:TNode = q.dequeue();
var no:TNode;
var i:int;
for (i = 0; i < 4; i++) // adding children to queue
{
no = n.cn[i];
if (no == null) continue;
while (no.next != null) { q.enqueue(no); no = no.next; }
q.enqueue(no);
}
if (n == trie) continue; // root already has a fall function
// I move on fall function from parent to find the longest suffix
// or I reach the root
var fall:TNode = n.parent.fall;
while (fall.GetChild(n.c) == null && fall != trie) fall = fall.fall;
// if I found suffix, I set it to "fall" pointer
n.fall = fall.GetChild(n.c);
if (n.fall == null) n.fall = trie; // there is no suffix
if (n.fall == n) n.fall = trie;
}
}


### Searching - moving through an automaton

function Search(e:MouseEvent):void
{
var text:String = "I will search for words in this text.";
var cState = trie; // current state
var n:TNode;       // some other pointers
var no:TNode;
var b:String;      // the character which we read
for(var i:int=0; i < text.length; i++)
{
n = cState;
b = text.charAt(i);

// I move on falls until I find the right child or reach the root
while (n.GetChild(b) == null && n != trie) n = n.fall;

// I got into the root
if (n == trie)
{
n = n.GetChild(b);
if (n == null) n = trie;
}
else n = n.GetChild(b); // or I found the right child

no = n; // I move on falls to root and report all sufices I found
while(no!= trie)
{
if(no.isEnd){/* I found the word, which ends in "no"*/}
no=no.fall;
}
cState = n;
}
}


## Visualization and animation

I used very similar algorithm to draw the automaton, as I used in drawing a trie. You should just add some curves for fall edges.

## Conclusion

My first implementation of Aho-Corasick was in C#, so now I just rewrote it to ActionScript 3. You will probably never need to write this algorithm in AS3, because, if you are looking for just few words, you can use String.indexOf("..."). If you are searching many words in a long text, It will be better to use another (faster) platform and another programming language. This post was meant to be an illustration of algorithm and a hint for programming Aho-Corasick in other languages.

1. Daniel said:

great post, thanks for sharing

December 17th, 2010
2. Chet said:

Full screen functionality really helps out. Thanks!

March 19th, 2011
3. Diogo Neves said:

This is brilliant! Thanks a lot! It really helps

August 14th, 2011
4. Slava said:

It’s awesome! I like this visualization especially because of colors and beatifull animation! Good job, very helpfull

September 16th, 2011
5. hernan said:

Do you do implementation of Aho-Corasick in delphi or pascal

November 10th, 2011
6. Ivan Kuckir said:

Thank you very much!

to hernan: No I didn’t, but it will be very similar in Pascal. If you understand an algorithm and you know Pascal, there shouldn’t be any problems.

December 7th, 2011
7. pnss said:

please mail me the aho curosick code in c
thanks.

January 31st, 2012
8. shhhharry32 said:

please mail me the aho-corasick code in java. It’s really an urgent. Thank you!

February 16th, 2012
9. chandu said:

Actually I am looking for the fall function in ahocorasick algorithm. Here I found the info. Good post.

February 22nd, 2012
10. kala said:

i want verilog code of aho corasick algorithm pls send me

February 23rd, 2012
11. Ivan Kuckir said:

To pnss, shhhharry32 and kala: As you maybe noticed, I haven’t send any code to anyone of you. It’s not because you didn’t leave your e-mails. First, it’s because I have this code written only in ActionScript 3 and C#, and second, you have to make a donation if you want some additional code, which is not on this site.

February 23rd, 2012
12. Girish S said:

Hi this is too good….
Could u pls give me the confirmation that will i b able to implement the same AC algorithm in verilog code……… pls do reply m waiting..

thanks

August 22nd, 2012
13. Ivan Kuckir said:

Hi,
I have never heard about verilog and I don’t know, if you will be able to implement an algorithm. If you are asking about “licencing”, ask Aho and Corasick, not me Feel free to use my code in any way you want.

August 22nd, 2012
14. Finding substrings very very fast | koding notes said:

[...] a neat animation of the algorithm in [...]

January 3rd, 2013
15. vardhan said:

excellent

August 21st, 2013
16. rahul said:

Thanks..:)
Can u please mail me the java implementation of this.

September 5th, 2013
17. rahul said:

my emil id is: goyal.1234rahul@gmail.com

September 5th, 2013
18. Souchan said:

Hi you.
Can u send me your code? I am studying this algorithm but i haven’t understood it .
Thanks alot.

October 6th, 2013
19. nikko said:

will this aho corasick algorithm fit for system using RFID

October 14th, 2013
20. A.Pandit said:

Hey Gr8!!!,
This is very useful and best animation for understanding algorithm.Will you please send me C code of aho corasick on my mail. panditanupama91@gmail.com.
Thank you.

October 15th, 2013
21. revathy said:

hi please send me code in java programming

January 7th, 2014