Random Optimization Questions

Moderator
Posts: 1,140
Joined: 2005.07
Post: #31
What Cochrane is saying is NSDictionary won't come up with any problems because, though hash values may be the same, it still sees if the objects themselves are the same. If they aren't, then it just moves on to the next possible place to insert it into the hash table.

For the big-O notation, technically you could say that everything is Î©(1) and O(n^n^n^n^n^n...), but the values I'm giving are the tightest bounds. In the case of the hash table, it's actually O(n^3) to build the table. This is because of the fact that when inserting a single item, it might need to rebuild the entire table if it grows to a point that the current table size is suboptimal, so inserting that single item is O(n^2). (since you're re-hashing each of the n items, each of which can possibly try every position in the hash table) Since that's the worse-case time for each insertion, and you're inserting n items, it's O(n^3) to build the table. To find the total complexity of the algorithm, since you build the table and afterwards search for the vertices, you add the complexities together. In this case it's O(n^3) + O(n^2), which is O(n^3 + n^2). Since n^3 > n^2, when n gets very large the n^3 term overshadows the n^2 term, so it's O(n^3) total. Like I said, though: this is absolute worse case, and on average it will be a lot lower, otherwise the brute-force way of nested for loops would be a faster choice. Whether the hash table or tree method is a better choice is a much harder question. For the tree you can say with certainty that the algorithm is Î˜(n*lg(n)) and it will always lay in that realm. The hash table will vary based on the input, though, since the lower and upper bounds aren't the same...
Moderator
Posts: 3,584
Joined: 2003.06
Post: #32
Awesome! It looks like I get it! Thanks!

I see where you got the O(n^3) now -- for some reason, it didn't quite register that re-hashing is also a worst-case scenario that must be considered, even though you specifically said that, sorry. If everything went to hell in a hand-basket, hashing could be an extremely costly affair, but luckily that's not what happens in reality. But even though in practice hash table use is usually wickedly fast (considered effectively O(1) as I've read in a few places), it still doesn't change the fact that it's evaluated as possibly being O(n^3) to build it and possibly O(n^2) to lookup. And I also see now that one would *add* the insertion and lookup together, not multiply them as I was thinking, since they are completely different operations.

I get the NSDictionary thing now too. It doesn't matter how it hashes. For some reason, I was thinking it would throw away my original NSString key and I'd lose the effect, but that wouldn't make sense for the case when it needs to do a finer-level discrimination during a hash collision. Two identical vertices will still hash to the same value regardless. I couldn't quite get the reason why if the objects are equal they will hash to the same value, but that doesn't need to be true in reverse (two different objects having the same hash), but I get it now. Even if that hash value is already used it'll find the right one regardless by comparing the objects (keys) -- just like the man said. Cool!

I realize I've been a bit of a bother about this today, but I really can't thank you guys enough for helping me out here.
Moderator
Posts: 1,140
Joined: 2005.07
Post: #33
I was looking for a distraction from studying, so don't worry. Regardless, this is a very important thing to understand when deciding how to implement algorithms, as there's often different choices each with their different strengths and weaknesses. Sometimes you can immediately choose or reject a method, but oftentimes not. Hopefully some other people will also read through this as well and find it helps them understand the finer aspects of time complexity.