Python Keyword “in” and Its Different Efficiency with Lists and Dictionaries

The efficiency of the Python keyword “in” depends on the data structure it is applied to. In this post I compare the speed of “in” when it is used with lists and with dictionaries (hash tables).

First, I wrote the following two Python scripts to measure the elapsed time when using lists and dictionaries.

The script for the list creates a list from range(10000) and checks, for each number from 0 to 9999, whether it is in the list using the keyword “in”.


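import time

# Build a real list; in Python 3, range() on its own is not a list.
in_list = list(range(10000))
counter = 0
start = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(10000):
    # Membership test on a list compares against the elements one by one.
    if i in in_list:
        counter += 1
end = time.perf_counter()
print("hit number:")
print(counter)
print("elapsed time:")
print(end - start)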

The script for the dictionary first creates a dictionary consisting of 10,000 key:value pairs. It then measures the time needed to check whether each number from 0 to 9999 is in the dictionary.


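import time

# Map each number 0..9999 to itself; only the keys matter for "in".
in_dict = {k: k for k in range(10000)}
counter = 0
start = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(10000):
    # Membership test on a dictionary hashes the key and looks it up.
    if i in in_dict:
        counter += 1
end = time.perf_counter()
print("hit number:")
print(counter)
print("elapsed time:")
print(end - start)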

The results are as follows.

list:

hit number:
10000
elapsed time:
0.694395

dictionary:

hit number:
10000
elapsed time:
0.001719
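
A single timed run can be noisy, so as a cross-check here is a minimal sketch that repeats the measurement with the standard timeit module for several container sizes. It is not the script that produced the numbers above. The helper names time_membership and compare_sizes, the sizes, and the repeat count are illustrative assumptions. It always looks up the last element, which is the worst case for a list.

```python
import timeit


def time_membership(container, probe, repeats=200):
    """Return the average time in seconds of one `probe in container` test."""
    total = timeit.timeit(lambda: probe in container, number=repeats)
    return total / repeats


def compare_sizes(sizes=(1000, 10000, 100000)):
    """Time a worst-case lookup (the last element) in a list and in a dict."""
    rows = []
    for n in sizes:
        as_list = list(range(n))
        as_dict = {k: k for k in range(n)}
        probe = n - 1  # last element: the list is scanned to the end
        rows.append((n,
                     time_membership(as_list, probe),
                     time_membership(as_dict, probe)))
    return rows


if __name__ == "__main__":
    for n, list_time, dict_time in compare_sizes():
        print(f"n={n:>7}  list: {list_time:.2e} s  dict: {dict_time:.2e} s")
```

If the list lookup really is a linear scan, its time should grow roughly in step with n, while the dictionary time should stay about the same. The exact figures will depend on the machine and the Python version.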

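The results show that although the keyword is “in” for both data structures, the efficiency differs greatly: the dictionary test took 0.001719 seconds against 0.694395 seconds for the list, which is roughly 400 times faster. The reason is that “in” on a list compares the value against the elements one by one, so each test takes time proportional to the length of the list, while “in” on a dictionary hashes the key and looks it up directly, which takes roughly constant time on average. In other words, the efficiency of the Python keyword “in” depends on the data structure it is applied to.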
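
One way to see where the gap comes from, without relying on a clock, is to count how many equality comparisons a single “in” test triggers. The sketch below assumes CPython and uses a hypothetical CountingKey class (not part of the original scripts) whose __eq__ increments a shared counter. The probe is a fresh object equal to the last key, so the identity shortcut that “in” tries first cannot short-circuit the search.

```python
class CountingKey:
    """Integer-like key that counts how often == is evaluated on it."""
    comparisons = 0  # shared counter, reset before each experiment

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingKey.comparisons += 1
        return isinstance(other, CountingKey) and self.value == other.value


def count_comparisons(container, probe):
    """Return (found, number of == calls) for one `probe in container` test."""
    CountingKey.comparisons = 0
    found = probe in container
    return found, CountingKey.comparisons


n = 10000
keys = [CountingKey(i) for i in range(n)]
as_list = list(keys)
as_dict = {key: key.value for key in keys}

# A fresh object equal to the last key, so identity checks cannot help.
probe = CountingKey(n - 1)

list_found, list_comparisons = count_comparisons(as_list, probe)
dict_found, dict_comparisons = count_comparisons(as_dict, probe)
print("list:", list_found, list_comparisons)
print("dict:", dict_found, dict_comparisons)
```

On CPython the list needs one comparison per element before it reaches the match, while the dictionary jumps to the right slot through the hash and needs only about one comparison to confirm the key.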
