Evolution of Default Dictionaries in Python
I write a lot of code where I use a dictionary of sets (or lists, counters, etc.).
Method 1
dict_set = {}
if key not in dict_set:
    dict_set[key] = set()
dict_set[key].add(item)
Method 2
dict_set = {}
dict_set.setdefault(key, set()).add(item)
Method 3
from collections import defaultdict
dict_set = defaultdict(set)
dict_set[key].add(item)
setdefault was added in Python 2.0 and I've been using (and loving) it for years.
It was only a month or two ago that I discovered collections.defaultdict. Now I use it almost every day.
UPDATE: I forgot to mention that defaultdict was added in Python 2.5. And because int() returns 0, you can use defaultdict(int) for a dictionary of counters.
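A minimal sketch of the counter idea, assuming a made-up list of words (the `words` data and the variable names are illustrative only, not from any real program):

```python
from collections import defaultdict

# Hypothetical sample data, chosen only for illustration.
words = ["spam", "eggs", "spam", "ham", "spam"]

# int() returns 0, so every missing key starts counting from zero.
counts = defaultdict(int)
for word in words:
    counts[word] += 1

print(dict(counts))
```

One thing to keep in mind: merely reading a missing key (say, counts["bacon"]) also inserts it with a value of 0, so use `in` or `.get()` if you only want to test for membership.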
Comments (10)
Brandon Corfman on Feb. 27, 2008:
James Tauber on Feb. 27, 2008:
Brandon, you're welcome. I didn't know about it either until recently which is why I thought I'd share it.
Eduardo Padoan on Feb. 27, 2008:
Hm, unlikely:
http://www.python.org/dev/peps/pep-3100/
"""
To be removed:
...
# dict.setdefault()? [15] [UNLIKELY]
"""
Tennessee Leeuwenburg on Feb. 27, 2008:
James on Feb. 28, 2008:
Mr Me on March 23, 2008:
What are the benefits?
How is default dict better than the builtin dict? Please, some explanations.
James Tauber on March 23, 2008:
Some of us think Method 3 is cleaner.
Tal Einat on May 19, 2008:
For instance, in the original post, using a defaultdict will cause an empty set to be created only when the dict is accessed with a key it doesn't yet have. With setdefault, on the other hand, a new set will be created on every call to setdefault, since it is just a value passed in to setdefault.
However, if you want to initialize to the number zero, using setdefault may be slightly more efficient, since initializing the integer zero is practically free. In such cases the gain in efficiency will likely be negligible (if it exists at all).
After all of this rambling, I would like to suggest to just use whatever is clearer in your eyes and not consider efficiency at all. If you later realize that you must optimize your code and discover that the setdefault calls are the bottleneck (very unlikely!), then that would be the right time to take efficiency issues into consideration.
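The difference in when the default value gets built can be checked with a small sketch. It assumes a hypothetical `make_set` factory that counts its own calls, plus some made-up (key, item) pairs; neither comes from the original post.

```python
from collections import defaultdict

calls = {"count": 0}

def make_set():
    """Hypothetical factory that records how often a new set is built."""
    calls["count"] += 1
    return set()

# Made-up (key, item) pairs: four insertions across two distinct keys.
items = [("a", 1), ("a", 2), ("b", 3), ("a", 4)]

# setdefault: the default argument is evaluated on every call,
# whether or not the key is already present.
plain = {}
for key, item in items:
    plain.setdefault(key, make_set()).add(item)
setdefault_calls = calls["count"]

# defaultdict: the factory runs only when a key is missing.
calls["count"] = 0
dd = defaultdict(make_set)
for key, item in items:
    dd[key].add(item)
defaultdict_calls = calls["count"]

print(setdefault_calls, defaultdict_calls)
```

Both dictionaries end up with the same contents; only the number of throwaway sets differs, which is rarely worth worrying about in practice.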
nkwyrok on July 16, 2008:
Last Modified: Feb. 27, 2008
Author: jtauber
Michael Foord on Feb. 27, 2008: