Saturday, January 31, 2015

Hash Table (Map) Built In Python

Lately, I've been studying up on my data structures and algorithms. I decided it would be fun to implement a HashMap (the map abstract data type). I decided to rely on Python's idiomatic __hash__ function, and use linear probing as the collision resolution mechanism. In the process of building this, I learned a few things and I would like to share them.

Deletion is the Hardest Part


Overall the data structure was not very difficult to build. Well, except for deletion. It's not as simple as shifting all elements back (overwriting the deleted). Instead, I had to flag the item as being "deleted". This wasn't that difficult, however, maintaining quick-access sizing on the data structure was complected by it.

In the end, I had to maintain two states: one that programmers who called __len__ would see and the actual count of stored elements (being actual plus deleted).

Linear Probing is Not as Hard


My "find index" method, boiled down to just
    def __locate_index(self, key):
        i = hash(key) % self.capacity()

        while self.__values[i] and self.__values[i][0] != key:
            i = (i + 1) % self.capacity()

        return i
Where each value is just a tuple where elements 0 and 1 are the key and its value, respectively.

Everything Else is Just Bookkeeping


In the end, the code wasn't too difficult. In the end, the class had a lot more state than I would've liked, however, "getting it right" wasn't that difficult. As always, I hope I'm not missing any edge cases with my tests.

As usual, the code is on github and I'm open to any suggestions or thoughts.