Avoid unreadable text representations of lattices.
Here's something that I come across again and again. What's more, it seems the question will remain relevant even when we redesign systems, so we may as well discuss it here.
The problem is that textual representations of lattices are awfully long. The most awful common case is probably:
>>> lat = kwant.lattice.honeycomb()
>>> print(lat.neighbors())
[HoppingKind((0, 1), kwant.lattice.Monatomic([[1.0, 0.0], [0.5,
0.8660254037844386]], [0.0, 0.0], '0', None), kwant.lattice.Monatomic([[1.0,
0.0], [0.5, 0.8660254037844386]], [0.0, 0.5773502691896258], '1', None)),
HoppingKind((0, 0), kwant.lattice.Monatomic([[1.0, 0.0], [0.5,
0.8660254037844386]], [0.0, 0.0], '0', None), kwant.lattice.Monatomic([[1.0,
0.0], [0.5, 0.8660254037844386]], [0.0, 0.5773502691896258], '1', None)),
HoppingKind((-1, 1), kwant.lattice.Monatomic([[1.0, 0.0], [0.5,
0.8660254037844386]], [0.0, 0.0], '0', None), kwant.lattice.Monatomic([[1.0,
0.0], [0.5, 0.8660254037844386]], [0.0, 0.5773502691896258], '1', None))]
Trying to explain to someone that hopping kinds are actually very simple is doomed to fail!
This could be resolved by printing names of the lattices instead of the lattices. But for that to be useful, names should be unique inside a session, and also survive pickling and unpickling (so that different nodes in a parallel computation share the same lattice names).
But this already shows the problem: what if I try to unpickle two different lattices that share the same name?
There is no fully satisfactory solution to this problem. There is no way to have a globally unique and nice (compact, human readable) naming scheme. But perhaps we can still improve on the current situation? Here's what I propose:
- Introduce a global (at the level of an interpreter instance) registry of names. That could take the form of two weakref dictionaries:
  - One mapping site families to names. Its purpose is to allow automatically assigning names to site families.
  - One mapping names to sets of families. Its purpose is to track multiple uses of the same name.
- Assign names by default, thus enforcing that a site family always has a name. If no name is provided, assign the first unused one of, say, "square0", "square1", etc. If a name is provided, use it, but warn if it is already used for a different family.
- When printing a site family, show only the name, but add a note if that name is not unique, like this: "<site family a (name is not unique!)>"
I would expect that names that are not unique would almost never occur in practice, but I think that we still have to support them to avoid problems in corner cases.
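To make the proposal a bit more concrete, here is a rough sketch of how such a registry might look. Everything in it is hypothetical: `SiteFamily` is a stand-in class (not kwant's), the `kind` argument and the helper `_first_unused_name` are invented for illustration, and families are told apart by object identity, which a real implementation would probably want to replace with a proper notion of "a different family".

```python
import itertools
import warnings
import weakref
from collections import defaultdict


class SiteFamily:
    """Hypothetical stand-in for a site family; NOT kwant's actual class."""

    # Registry 1: family -> name. Entries vanish when a family is collected.
    _name_of = weakref.WeakKeyDictionary()
    # Registry 2: name -> set of live families that currently use that name.
    _families_named = defaultdict(weakref.WeakSet)

    def __init__(self, kind, name=None):
        self.kind = kind  # e.g. "square"; only used to build default names
        if name is None:
            name = self._first_unused_name(kind)
        elif len(self._families_named[name]) > 0:
            # Assumption: any other live object counts as "a different family".
            warnings.warn(f"site family name {name!r} is already in use")
        self._name_of[self] = name
        self._families_named[name].add(self)

    @classmethod
    def _first_unused_name(cls, kind):
        # Try kind0, kind1, ... and return the first name no live family uses.
        for i in itertools.count():
            candidate = f"{kind}{i}"
            if not cls._families_named.get(candidate):
                return candidate

    @property
    def name(self):
        return self._name_of[self]

    def __repr__(self):
        if len(self._families_named[self.name]) > 1:
            return f"<site family {self.name} (name is not unique!)>"
        return f"<site family {self.name}>"


a = SiteFamily("square")                  # automatically named "square0"
b = SiteFamily("square")                  # automatically named "square1"
c = SiteFamily("square", name="square0")  # explicit duplicate: emits a warning

print(b)  # <site family square1>
print(a)  # <site family square0 (name is not unique!)>
```

Because both registries hold only weak references, a name becomes available again once the family using it is garbage collected. Unpickling is not covered by this sketch; an unpickled family would have to go through the same registration step to get the "not unique" note.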
What do you think? Do you have a better idea?