The Importance of Hashing

People who are just beginning to code make a lot of mistakes and do a lot of stupid things. I used to struggle with parsing through every line of a file before I learned that it would be much easier to use split functions and lists. One common mistake is storing large amounts of data in lists and scanning through all of it every time you need to look something up. But, with a bit more expertise, those newbies can throw away those "for" loops and sort methods. There's a new kid in town. And his name is "Hashing."

The following code uses two different lists, indices_start and indices_end. These two lists hold the start and end positions of certain regions in a very long sequence, and the computer must go from one list to the other to identify those regions while parsing through a string "s".

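# c (a position counter) and s2 (the output string) are assumed to be set up earlier.
for i in range(len(s)):
    c += 1
    if c in indices_start:  # scans the list on every pass
        line = indices_start.index(c)  # scans the list again to find the matching row
        for _ in s[i:indices_end[line]+1]:
            s2 += "N"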

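The problem is that this takes forever and a half to run, because every "in" check and every .index() call has to scan through the list. But if we just replace the two lists with a single hash (or dictionary), which we will name indices and which maps each start position to its end position, then the program runs like clockwork. Looking up a key in a dictionary takes roughly the same time no matter how many entries it holds.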

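# indices maps each start position to its end position.
for i in range(len(s)):
    c += 1
    if c in indices:  # hash lookup instead of a list scan
        end = int(indices[c])  # the end position comes straight from the dictionary
        for _ in s[i:end+1]:
            s2 += "N"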
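
If you already have the two lists, you don't have to throw them away to get the dictionary. Here is a minimal sketch of how it could be built and used. The sample sequence, the positions, and the function name mask_regions are made up for illustration; the loop itself mirrors the one above, including its slice.

```python
# Hypothetical sample data; the real sequence and positions are not shown in this post.
indices_start = [3, 10]
indices_end = [5, 12]
s = "ACGTACGTACGTACGT"

# Pair each start position with its end position in a single dictionary.
indices = dict(zip(indices_start, indices_end))


def mask_regions(s, indices):
    """Run the same loop as above and return the string of N's it builds."""
    s2 = ""
    c = 0
    for i in range(len(s)):
        c += 1
        if c in indices:          # hash lookup, no list scan
            end = indices[c]      # end position for the region starting at c
            for _ in s[i:end + 1]:
                s2 += "N"
    return s2


print(indices)
print(mask_regions(s, indices))
```

zip pairs the lists up element by element, so this only works if the two lists are the same length and in matching order.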

Just goes to show how much of a difference code optimization can make. 
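
If you want to see the difference for yourself, something like the following rough sketch should do it. The data is invented (2,000 evenly spaced regions), the helper names are hypothetical, and the actual timings will depend on your machine, so treat it as an experiment rather than a benchmark.

```python
import timeit

# Hypothetical data: 2,000 regions, each ending 5 positions after it starts.
indices_start = list(range(1, 20_001, 10))
indices_end = [start + 5 for start in indices_start]
indices = dict(zip(indices_start, indices_end))


def lookup_with_lists(c):
    # Two list scans: one for the membership test, one for .index().
    if c in indices_start:
        return indices_end[indices_start.index(c)]
    return None


def lookup_with_dict(c):
    # A single hash lookup; returns None when c is not a start position.
    return indices.get(c)


positions = range(1, 20_001, 7)  # a sample of positions to look up

list_time = timeit.timeit(lambda: [lookup_with_lists(c) for c in positions], number=1)
dict_time = timeit.timeit(lambda: [lookup_with_dict(c) for c in positions], number=1)
print(f"lists: {list_time:.4f}s  dict: {dict_time:.4f}s")
```

Both functions return the same answers; only the time they take to get there differs, and the gap grows as the lists get longer.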

END

(Excuse me for that last line. My Fortran is leaking.)