Python Tuples are Not Just Constant Lists
Greg Wilson is suggesting things Python 3000 could leave out, and tuples are on his list.
In the comments, Phillip Eby takes him to task for the assumption that tuples are just constant lists:
Tuples are not constant lists -- this is a common misconception. Lists are intended to be homogeneous sequences, while tuples are heterogeneous data structures.
I think it was years into my use of Python that I realised this and stopped thinking about them as just constant lists. It was a powerful revelation for me.
So while I agree with Greg that many (most?) programmers don't understand the distinction, I think they are missing out, and that we'd be better off improving the way lists and tuples are documented than conflating the two and losing what is, in my opinion, a very useful distinction.
One way I'd express it (in addition to Phillip's quote above) is that the index in a tuple has an implied semantic. The point of a tuple is that the i-th slot means something specific. In other words, it's an index-based (rather than name-based) data structure.
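A minimal sketch of that idea, using a made-up (name, age, email) record that is not from the original post: each position carries a fixed meaning, so the tuple is usually unpacked by position rather than looped over.

```python
# Hypothetical record: slot 0 is a name, slot 1 an age, slot 2 an email.
person = ("Ada", 36, "ada@example.com")

# Position carries the meaning, so unpacking by position is natural.
name, age, email = person

# A list, by contrast, is typically a run of like items whose
# positions are interchangeable.
ages = [36, 41, 29]
average_age = sum(ages) / len(ages)
```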
This notion of 'tuple' is very important in relational algebra (as Phillip also points out) and so I've been thinking about it in the context of relational python too.
When I started playing around with relational python (which I need to get back to blogging about), it occurred to me that it might be useful to have the notion of a tuple whose slots could additionally be named and then accessed via name. I implemented it that way in Basic Class for Relations.
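Current Python ships something close to this idea as collections.namedtuple. A small sketch, using a hypothetical Employee heading rather than the class from the linked post:

```python
from collections import namedtuple

# Hypothetical relation heading; the field names are illustrative only.
Employee = namedtuple("Employee", ["name", "dept", "salary"])

row = Employee("Alice", "Engineering", 50000)

# Slots can be read by name or by index, and the value is still a tuple.
by_name = row.dept
by_index = row[1]
is_tuple = isinstance(row, tuple)
```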
Comments (35)
Greg Wilson on April 15, 2006:
Hi James,
I don't think it matters how many times we explain it --- the overwhelming majority of Python programmers will continue to say, "Huh?" Right now, lists vs. tuples conflates two useful-but-separate ideas: homogeneity vs. heterogeneity, and mutability vs. immutability. Unless the first is enforced, the second is what people will notice...
James Tauber on April 15, 2006:
Greg, I agree that we've conflated mutability with homogeneity/heterogeneity.
The first step to fixing that would be to have an immutable list type. (frozenlist?)
The second would be to support item assignment (but still not append) on tuples (and maybe introduce a frozentuple that doesn't support this).
The third would be to change the behaviour of things like tuple + tuple. The fact that it works the way it does in Python today is the sort of thing that causes the confusion.
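For reference, a short sketch of the existing behaviour under discussion: + concatenates tuples into a longer tuple, just as it does for lists, and item assignment is rejected.

```python
a = (1, 2)
b = (3, 4)

# Concatenation treats the tuples as sequences, not as fixed-shape records.
combined = a + b

# Item assignment is not supported on tuples.
try:
    a[0] = 10
    assigned = True
except TypeError:
    assigned = False
```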
Anrie Nord on April 16, 2006:
I think some parts of this article should be in the official Python documentation: http://en.wikipedia.org/wiki/Tuple. Finite state machines, Petri nets, queue systems, etc. are formalized using tuples, so computer scientists are very familiar with them and with their semantics. I agree, tuples are a useful abstraction in Python, but too many programmers think of them as immutable lists.
Madjoe on April 17, 2006:
I don't really understand the fuss. If it looks like a duck, quacks like a duck...most people will treat it like a duck. Tuples, for all practical uses behave like immutable lists. The homogeneity vs. heterogeneity doesn't really mean anything, practically speaking, because python lists can be heterogenous too. What can you do with tuples that you *can't* do with lists (apart from enforcing immutability)?
Even when you explain tuples to someone who asks "What is a tuple?", isn't the simplest explanation something like: "It's a list of ..."? Lists are more general and, imho, more readily understood.
Pat Maupin on April 17, 2006:
Agree with Madjoe -- duck typing says tuples (in their current incarnation) are immutable lists. You can fight this head-on, or attempt to use some jujitsu to add to this understanding, rather than to try to replace it. Characterizing it as a misunderstanding is not nearly as helpful as characterizing it as an incomplete understanding.
Agree that, in general, tuples tend to contain heterogenous elements, and lists homogenous elements, and that teaching this could lead to some "aha" moments, but disagree with suggestion that things need to be "fixed" by taking steps like removing functionality such as '+' from tuples. ('+' is an operation that works fine on other immutable sequences such as strings, and you'd have to have a REALLY good reason to break this sort of orthogonality.)
Agree that item assignment on tuples might be nice, but think it could be handled as a method which returns a new immutable tuple (kind of like string.replace returns a new string). Don't like the idea of separate FrozenTuples or FrozenLists -- barely tolerate this for sets. (BTW, now that FrozenSets have been around for awhile, someone should do a check to see how often they are used...)
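A rough sketch of the method-style replacement described above, written as a free function because tuple has no such method; the name `replaced` is made up for illustration.

```python
def replaced(t, index, value):
    """Return a new tuple equal to t except that t[index] is value."""
    if not -len(t) <= index < len(t):
        raise IndexError("tuple index out of range")
    index %= len(t)  # normalise negative indexes
    return t[:index] + (value,) + t[index + 1:]

point = (1, 2, 3)
moved = replaced(point, 1, 20)  # point itself is unchanged
```

As with str.replace, the original value is left alone and the caller gets a fresh tuple back.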
Agree that indexed/named slots would be incredibly useful (have wanted this for years, and have occasionally rolled my own). The canonical example of this is the built-in slice object.
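A quick look at the slice object mentioned above, as a sketch: its three slots are read by name (the object itself does not support indexing).

```python
s = slice(1, 10, 2)

# The three slots are exposed as named, read-only attributes.
fields = (s.start, s.stop, s.step)

# The same object drives ordinary slicing.
picked = list(range(20))[s]
```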
DoctorPepper on April 17, 2006:
Personally, I really don't care if there is a distinction. I use tuples as immutable lists. They work very well for that. If there is some other use for them, then I will eventually get around to using that as well.
I do not agree with adding or subtracting anything from/to Python in order to make this more clear. The language "works" the way it is, don't muddy it up any further! This is from a guy that worked in Perl for six years prior to switching to Python. In Perl, there is no such thing as a tuple. Lists are arrays are queues are stacks. You could push, pop, shift, unshift, insert, remove to your heart's content. It took about five minutes to wrap your brain around that concept, and from then on, it was gravy.
One of the most difficult things I had to get used to when I switched to Python (not my idea, the company I work for uses Python heavily), was that you had two types of lists: mutable and immutable. I just really couldn't understand the underlying logic as to why you would need two of them, but hey, when in Rome, ...
Do not mung-up Python just to make us "get" an esoteric fact that doesn't really matter in the long run. It won't add anything useful to the language, and will only make it more difficult for a novice to learn.
Amit Patel on April 17, 2006:
The radical change I'd propose is to make tuple fields named. That strongly encourage heterogeneity and it makes them more readable too. Look at the time module's 9-tuple for an example. And nobody would confuse lists and tuples anymore. (They'd confuse tuples and objects ;-) )
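In current Python the time module's 9-tuple is a time.struct_time, which exposes named attributes alongside the positional slots. A small sketch, assuming a platform where the epoch is 1970-01-01 UTC:

```python
import time

# gmtime(0) is the Unix epoch in UTC on the usual platforms.
epoch = time.gmtime(0)

# The same slot is reachable by position or by name.
year_by_index = epoch[0]
year_by_name = epoch.tm_year
field_count = len(epoch)
```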
Etienne Posthumus on April 18, 2006:
And another point: tuples are hashable.
So you can use them as the key in dicts.
Very useful in many places.
To illustrate:
>>> hash( (1,2,3) )
-378539185
>>> hash( [1,2,3] )
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list objects are unhashable
>>> alist = [1,2,3]
>>> atuple = (1,2,3)
>>> adict = { alist: True}
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: list objects are unhashable
>>> adict = { atuple: True}
>>> adict
{(1, 2, 3): True}
Tom Lynn on April 18, 2006:
Lists have order, tuples have structure.
David Niergarth on April 19, 2006:
And tuples require somewhat less memory than lists. If you're working with a hundred thousand little lists, using tuples will save a little memory.
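A rough way to check this on CPython, using sys.getsizeof; the exact numbers vary by version and platform, so none are shown here.

```python
import sys

items = (1, 2, 3)

# getsizeof reports the container's own size, not the elements' sizes.
tuple_size = sys.getsizeof(items)
list_size = sys.getsizeof(list(items))

saving_per_container = list_size - tuple_size
```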
If tuples grew named fields, they'd be about the same as a simple class that declared __slots__.
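A minimal sketch of that comparison, with a hypothetical Point class whose names are illustrative only:

```python
class Point(object):
    # __slots__ fixes the set of attributes and avoids a per-instance dict.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x = 10  # unlike a tuple, the slots stay assignable

try:
    p.z = 3  # but no new attributes can be added
    extended = True
except AttributeError:
    extended = False
```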
Statements like "Lists are intended to be homogeneous sequences, while tuples are heterogeneous data structures" explain Guido's original intention but also seem to me to imply that
for char in ['a', 'b', 'c']:
is preferred over
for char in ('a', 'b', 'c'):
because it's a homogeneous sequence of strings. Should we avoid the latter example because it implies that we failed to grasp the subtle and idealized use of tuples? In reality, it's a little throw-away sequence in either case.
Also, homogeneity may depend on your point of view. What about "for coord in (x, y, z):"? Should that be better written as "for coord in [x, y, z]:"? Is [x, y, z] a sequence of individual coordinates or is the tuple (x, y, z) a single point? For this reason I like James' explanation "The point of a tuple is that the i-th slot means something specific" much better. However, that's just one sensible use of tuples. Saying it's the (only) point goes too far. The hashable dictionary key usage is also sensible, as is the smaller memory footprint usage. The weakest usage seems to me to be the heterogeneous data structure usage, simply because lists can also fill that role.
My larger point is that tuples have several justifiable usages that aren't really related to each other. You can't expect to always use them the same way.
Collin Winter on April 20, 2006:
This mindset (tuples == immutable lists) is something I've had to deal with in my typecheck package. I get email from people all the time who are furious that I chose to treat tuples and lists differently (i.e., more like their equivalents in Haskell).
Alex Garel on April 20, 2006:
I quite like the statement above: Lists have order, tuples have structure.
Tuples are a good way to handle things like
def f():
    return 'a', 'b', 'c'
a, b, c = f()
and that kind of thing, that is, when *structure is static*.
I don't really understand Phillip Eby's statement, though; for me, lists can't be limited to homogeneous elements.
Jayson Vantuyl on April 26, 2006:
I think of the common programmer's experience with tuples much like their experience with super or metaclasses. Until you need them, they're hard to understand.
Lists are made to be modified because that's their usage pattern. If you maintain a list of, for example, table reservations (think a restaurant), a list is good.
Tuples, on the other hand, usually have an identity. If you're working with coordinates, (14,25) is a specific point. The tuple represents it as a unique value (really, a compound value). It has the same identity as the value 5 or 'x'.
My most common run-in with tuples is in network handling. A connection to ('www.slashdot.org',80) concretely identifies the connection.
I also find them in graphs. When working with edges (or more usefully paths), it is possible to represent them as lists, but for many uses, a tuple is much more appropriate. For example, if you need to remember, say, a list of people who have traversed a certain path in a graph, that's a good candidate for a dictionary with tuples as keys and lists as values. The path is an identity--a point in the path space. The value is a list--a structure meant to be modified. Depending on your needs, you might use a heap or something instead of the list--but I think this illustrates where tuples are really necessary.
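The path example can be sketched as follows; the node names, the people, and the `record` helper are all made up for illustration.

```python
# Hypothetical graph paths: each path is a tuple of node names.
travellers = {}

def record(path, person):
    """Remember that person traversed path (a tuple of nodes)."""
    travellers.setdefault(path, []).append(person)

record(("A", "B", "C"), "alice")
record(("A", "B", "C"), "bob")
record(("A", "D"), "carol")
```

The tuple key is the fixed identity of the path, while the list value grows as more people traverse it. A list could not serve as the key, because lists are unhashable.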
Most programmers don't seem to make a distinction between normal lists and identities. Note in the above examples, those tuples aren't really something you change as they really act to identify some abstract point in a space (whether a geometrical space or the connection space). They cannot be modified or they lose their meaning. They fill the niche between a single value and a full-blown class.
Of course, without a mathematics or CS background, it's hard to appreciate the value here. Until you have defined a DFA (for example, or PDAs or Turing Machines) as a tuple of sets and then done proofs relating to sets of these DFAs (thus sets of tuples of sets) it's hard to appreciate what it all means or why it's useful.
I would bet that most programmers don't think that way. I would also bet that there are classes of problems that they have difficulty solving because of it. Not that they're stupid, but you can't blame someone for not knowing something they don't use.
One thing is for sure, they won't disappear any time soon because I'm pretty sure that the BDFL knows *exactly* what they're for.
Peter Fein on April 28, 2006:
I've found (repeatedly) that tuples are faster than lists.
Though this is only an issue when you're writing high-performance Python, as the difference is small.
Eric on April 30, 2006:
I think the problem is inherent in Python. Tuples get treated as a sequence that you can iterate over when they're meant as a data structure. Surprise, surprise, people use them as and think of them as sequences.
Personally, I don't get their presence in the language. They're not supposed to be used as sequences, but Python has dictionaries for nice *explicit* data-structures, as opposed to the admittedly terser, sometimes opaque, and always *implicit* structures tuples offer.
Marc on May 1, 2006:
Having separate types for lists and tuples seems like overkill to me. And with the separate () and [] notations, it seems like the one Perlish thing about Python.
I'm more drawn to the "everything is a list" philosophy of Scheme.
Erik on May 1, 2006:
thanks Eric:
"Tuples get treated as a sequence that you can iterate over when they're meant as a data structure."
As a "non-programmer" that really clears things up for me.
Kitsu (Ed Blake) on May 1, 2006:
I like Jayson Vantuyl's explanation the best. A tuple is a (compound) value, while a list is a bucket.
If you change the contents of a name it is no longer the same name; while no matter what you do to the contents of a bucket it is still the same bucket.
They both hold multiple values, but serve completely separate functions.
JimJJewett on May 1, 2006:
Other than efficiency and history, is there any advantage to a tuple over an explicit object with named fields?
Alex Garel on May 4, 2006:
Jayson Vantuyl you express the concept really well. Thanks
Bryan Eastin on April 27, 2007:
From what I can tell, the only purpose of the tuple data type is to keep people from doing things. It seems like it would be better to use the list type for structured or immutable lists and rely on people to exercise some self-control.
Alternatively, you might expand the functionality of tuples. I would be very excited by the addition of a multi-dimensional list-like data type with immutable structure (not elements). I'd love a tuple that permitted the reassignment of individual elements, facilitated element extraction along any dimension, and behaved according to the rules of linear algebra. Thus, for x=((1,2),(3,4)) and y=((5,6),(7,8)), I would want x[:,0]==(1,3) and x+y==((6,8),(10,12)), not ((1,2),(3,4),(5,6),(7,8)). This would also help to do away with some of the redundancy and confusion associated with tuples and lists.
I do physics for a living, so I spend a lot of time manipulating homogeneous, high dimensional matrices.
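Plain tuples can approximate the requested behaviour with small helpers. This is only a sketch with made-up function names (`add`, `column`); array libraries such as NumPy cover this ground far more completely.

```python
def add(x, y):
    """Element-wise sum of two equally shaped, possibly nested tuples."""
    if isinstance(x, tuple):
        return tuple(add(a, b) for a, b in zip(x, y))
    return x + y

def column(x, j):
    """Column j of a tuple of row tuples, like the wished-for x[:, j]."""
    return tuple(row[j] for row in x)

x = ((1, 2), (3, 4))
y = ((5, 6), (7, 8))
total = add(x, y)
first_column = column(x, 0)
```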
Andrew on June 5, 2007:
I totally agree. (1,2) + (3,4) should be (4,6) not (1,2,3,4). I think that it's great that python allows tuples - but I think that it would be better if it followed through with this.
Masklinn on July 17, 2007:
I think those who consider tuples as immutable lists should try languages that actually use tuples as tuples: Erlang, Haskell, OCaml, ...
No need to become even fluent in them, but try to get a feel of how they work and *how they use lists and tuples*.
This is how I "got" it, and how I finally understood the difference between lists being a sequence and tuples being a lightweight data structure (think lightweight ad hoc object).
dbt on July 17, 2007:
I never really got this until I spent some time in Erlang. The difference really shines through there.
nirs on July 17, 2007:
Tuples are constant lists -- Lists and tuples are homogeneous or heterogeneous sequences. The only important difference is the mutability.
MattB on July 17, 2007:
I've read and re-read this debate for years. I understand the intent of Phillip Eby's comment, but in practice we have not succeeded in clarifying this to most Python developers. Experimenting with the language teaches you that indeed, tuples act as immutable lists, and since Python enforces nothing about their heterogeneity, that's where it stops.
(I do some Erlang, where tuples are an indispensable type. They're not as essential to Python's philosophy, I'd argue, and using them naturally as immutable lists shouldn't be considered "wrong.")
I'd be glad for tuples to disappear in Python 3000. When I've taught "software carpentry" introductions to python, this is one area that inevitably leads to raised eyebrows and the question "why?"
casey on July 17, 2007:
@JimJJewett:
If a few module functions passing tuples around is simple and clear and gets the job done, use that.
If the tuple conventions get too complex, use dicts.
If the data gets passed around too much, move the functions into a class.
Start with tuples, and introduce OO to manage complexity only when necessary. Your code will be highly readable and reusable.
fz on July 17, 2007:
It would be reasonable to assume that accessing the i-th element of a tuple is fast (O(1)), while accessing the i-th element of a list wouldn't be.
On the other hand, it would be reasonable to assume that adding an element at the front or the back of a list is fast (O(1)), while adding one to a tuple requires copying the whole structure.
Rambo Tribble on Feb. 18, 2008:
Okay, to advocate a little deviltry, how immutable is a tuple that has as an element a list, which is mutable? Change the list, you change the tuple.
Rambo Tribble on Feb. 18, 2008:
To rephrase, slightly: Change the list, you change the reported value of the tuple.
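A short sketch of that situation: the tuple keeps pointing at the same list object, so its reported value follows the list, and a tuple holding a list also stops being hashable.

```python
inner = [1, 2]
t = (0, inner)
before = (t == (0, [1, 2]))

# The tuple still refers to the same list object, but that list can change.
inner.append(3)
after = (t == (0, [1, 2, 3]))

# A tuple holding an unhashable element is itself unhashable.
try:
    hash(t)
    hashable = True
except TypeError:
    hashable = False
```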
david on Feb. 21, 2008:
The argument that tuples are different from lists in advanced math, Erlang, Haskell, or OCaml, is not an argument that tuples in Python are anything more than a list, or another object type.
There is an argument that a tuple is a meta-object, but I haven't seen it made here.
Last Modified: April 15, 2006
Author: James Tauber
deepak on April 15, 2006:
That's my understanding too: tuples are like immutable lists. Your explanation (i-th slot means something specific) seems more about convention -- I could use lists to represent co-ordinates in xyz space knowing that l[2] represents z value for point l.
Could you maybe explain a bit more? :-)