iDevGames Forums
Dictionary Checking for word games - Printable Version

+- iDevGames Forums (http://www.idevgames.com/forums)
+-- Forum: Development Zone (/forum-3.html)
+--- Forum: Game Programming Fundamentals (/forum-7.html)
+--- Thread: Dictionary Checking for word games (/thread-10524.html)



Dictionary Checking for word games - avidgamer101 - Jan 11, 2013 04:43 AM

So I have been working on a scrabble/jumble like word game for a while now, and I was wondering how I should do the dictionary checks for the game.

Should i run through the dictionary I have which has about 180,000 words everytime I wanna check whether the word is legal or not, or should I do some sort anagram testing before, and compare the words made to a smaller list of words?

What ae your suggestions?


RE: Dictionary Checking for word games - sealfin - Jan 11, 2013 06:56 AM

I'd suggest you begin by testing whether a linear search through the dictionary is fast enough for your needs – if it is, you need do nothing more Wink

If a linear search isn't fast enough for your needs, it is possible to optimise a linear search[1], or you could look into using eg. a hash table.

[1] Eg. keep the dictionary in order of word length, and keep a list of the indexes where the first word of length n occurs in the dictionary; then to test whether or not a word is legal you need to only search a portion of the dictionary (beginning at i = indexes[ strlen( word ) ], and – if the word isn't legal – ending when strlen( dictionary[ i ] ) != strlen( word ).)


RE: Dictionary Checking for word games - Skorche - Jan 11, 2013 10:03 AM

A neat hashing algorithm that I can't remember the name of applies here.

Basically you have a gigantic bit vector, say 100KB long. Then you need a hashing function that for each input word spits out say 10 bit positions. To build the hash, run through your dictionary file and set the 10 bit positions for each word to true. To check if a word is *probably* in the dictionary you check if the 10 bit positions are true.

While it does open up the possibility of false positives, even a bitvector that is a fraction of the size of the original dictionary size you can get very close to 100% accuracy. The /usr/share/dict/words file is 2.4MB as a comparison.

Aha! http://en.wikipedia.org/wiki/Bloom_filter


RE: Dictionary Checking for word games - szymczyk - Jan 11, 2013 12:05 PM

Use a trie to store your word list in RAM and check for valid words. Tries are relatively easy to implement and are efficient in using memory. Another benefit of using tries is the speed of finding a word depends on the word length, not the size of the dictionary, so you can determine if a word is valid in a large dictionary quickly.

A trie is a tree-like data structure. At the top is the root of the trie. For a word game, you would have 26 nodes under the root, one for the letters A-Z. These nodes represent the first letter of words. Under those nodes would be nodes for the second letters of words. Then there would be nodes for the third letters and so on. You can find more information on tries, including links to source code implementations, in the following article:

Using Tries to Store Word Lists in RAM


RE: Dictionary Checking for word games - SethWillits - Jan 11, 2013 12:14 PM

NSSet is an extremely simple choice. I threw 180,000 words into an array and a set, then tested both looking for 10,000 random words.

Linear Time: 0.008735 avg 0.024519 max
Set Time: 0.000001 avg 0.000010 max

The answer is on a Mac they're both ridiculously fast. On iOS, some significant factor slower.

However, writing a linear search is more code than a simple [NSSet containsObject:word.lowercaseString] check, so I'd say go with a set. Unless you're purposefully doing this the long way to learn something interesting.


RE: Dictionary Checking for word games - avidgamer101 - Jan 14, 2013 02:28 AM

thanks a ton guys!!