>>> Foo = namedtuple("Foo", ["bar"])
>>> Baz = namedtuple("Baz", ["qux"])
>>> Foo(bar="hello") == Baz(qux="hello")
True
This also happens with the "new-style" namedtuples (typing.NamedTuple).I like the convenience of namedtuples but I agree with the author: there are enough footguns to prefer other approaches.
>>> temperatures_fahrenheit = np.array([10, 20, 30])
>>> temperatures_celsius = np.array([10, 20, 30])
>>> temperatures_fahrenheit == temperatures_celsius
array([ True, True, True])
But yes, they're still the least-bad choice.
(It's surprisingly difficult to implement a rigorous way to detect this vs compile-time constant evaluation though; note that identical objects of certain types are pooled already when generating/loading bytecode files. I don't think any current implementation is smart enough to optimize the following though)
$ python3 -c 'o = object(); print(id(o) is id(o))'
False
$ pypy3 -c 'o = object(); print(id(o) is id(o))'
True
$ jython -c 'o = object(); print(id(o) is id(o))'
True
How so?
Tuples are awkward with immutability, if you put mutable things inside them.
In languages with value semantics, nothing about this problem even makes sense, since obviously a dict's key is taken by value if it needs to be stored.
--
Tuple behavior is sensible if you are familiar with discussion of reference semantics (though not as much as if you also support value semantics).
Still, at least we aren't Javascript where the question of using composite keys is answered with "screw you".
from typing import NamedTuple
class Point(NamedTuple):
x: int
y: int
z: int
I don't see "don't use namedtuples in APIs" as a useful rule of thumb, to be honest. Ordered and iterable return-types make sense for a lot of APIs. Use them where it makes sense.There are other reasons to reach for namedtuples, for example when order or iterability matter. I don't see a general rule of not using namedtuples in APIs to be that useful. Use the right tool when the need calls for it.
Mostly people just reach for the tool that does what they want, named tuples for tuply stuff, typed dicts for mapping applications, and dataclasses when you actually want a class.
You get the stuff you get from `NamedTuple`, but you can also easily add helper methods to the class as needed. And there are other dataclass goodies (though some things I find to be a bit anti-feature-y).
I agree.
> You get the stuff you get from `NamedTuple`, but you can also easily add helper methods to the class as needed. And there are other dataclass goodies (though some things I find to be a bit anti-feature-y).
I've seen examples where dataclasses were used when order matters, however, hence why I'm not comfortable with a general rule against namedtuples. Sometimes order and iterability matter, and dogmatically reaching for a different data type that doesn't preserve that information might be the wrong choice.
You can use frozen=true to "simulate" immutability, but that just overwrites the setter with a dummy implementation, something you (or your very clever coworker) can circumvent by using object.__setattr__()
So you neither get the performance benefits nor the invariants of actual immutability.
This fits pretty well with a lot of other stuff in Python (e.g. there’s no real private members in classes). There’s a bunch of escape hatches that you should avoid (but that can still be useful sometimes), and those usually are pretty obvious (e.g. if you see code using object.__setattr__, something is definitely not right).
Can’t tell whether this is good design or not, but personally I like it.
What bugs me more about frozen dataclasses is how post-init methods have to use the setattr hack.
- Everything in Python is mutable, including the definitions of constants like `3` and `True`. It's much like "unsafe" in Rust; you can do stupid things, but when you see somebody reaching for `__setattr__` or `ctypes` then you know to take out your magnifying glass on the PR, find a better solution, ban them from the repo, or start searching for a new job.
- Performance-wise, named tuples are sometimes better because more work happens in C for every line of Python, not because of any magic immutability benefits. It's similar to how you should prefer comprehensions to loops (most of the time) if you're stuck with Python but performance still matters a little bit. Yes, maybe still use named tuples for performance reasons, but don't give the credit to immutability.
If you store all your data is dataclasses, you end-up having to either convert back to dictionaries or having to rebuild all that tooling. Python's abstract syntax trees are an example. If nodes had been represented with native dictionaries, then pprint would work right out the box. But with every node being its own class, a custom pretty printer is needed.
Dataclasses are cool but people should have a strong preference for Python's native types: list, tuple, dict, and set. Those work with just about everything. In contrast, a new dataclass is opaque and doesn't work with any existing tooling.
Also you can easily convert a dataclass to a dict with dataclasses.asdict. Not so easy to go from dict to dataclass though
Tuples are a core concept and fundamental data aggregation tool for Python. However, this post uses a trivial `Point()` class strawman to try to shoot down the idea of using tuples at all. IMO that is fighting the language and every existing API that either accepts tuple inputs or returns tuple outputs. That is a vast ecosystem.
According the glossary a named tuple "any type or class that inherits from tuple and whose indexable elements are also accessible using named attributes." Presumably, no one disputes that having names improves readability. So really this weak post argues against tuples themselves.
I trust that the maintainers of the Python language & the Python Standard Library are not going to change their tuple-using APIs in a breaking way, without a clear signal (like a major-version bump).
I do not extend that same trust to other Python projects. Maybe I extend that same trust to projects that demonstrate proper use of Semantic Versioning, but not to others.
Using something other than tuples trades some performance for some stability, which is a trade I’m OK with.
def __iter__():
yield self.my_field
yield self.my_other_field
Recreates tuple unpacking.there was a point a while back where python added __slots__ to classes to help with this; and in practice these days the largest systems are using numpy if they're in python at all
not sure what modern versions do. but in the olden days, if you were creating lots of small objects, tuples were a low-overhead way to do it