| Version: | ordereddict 0.4.4 |
|---|---|
| Author: | Anthon van der Neut |
| Contact: | anthon@mnt.org |
| Date: | 2009-05-11 |
This is an implementation of an ordered dictionary with Key Insertion Order (KIO: updates of values do not affect the position of the key), Key Value Insertion Order (KVIO, an existing key's position is removed and put at the back).
Sorted dictionaries are also provided. Currently only with Key Sorted Order (KSO, no sorting function can be specified, but you can specify a transform to apply on the key before comparison (e.g. string.lower)
Usage:
from _ordereddict import ordereddict
kio = ordereddict()
kvio = ordereddict(kvio=True)
# without relax unordered initalisation is not allowed
d = ordereddict({'a':1, 'b': 2}, relax=True)
sd = sorteddict({'a':1, 'b': 2}) # sorteddict is always relaxed
please note the underscore which is new since version 0.3
This module has been tested under:
| OS | compiler | Python |
| Ubuntu 8.10 | gcc 4.3.2 | 2.6.1 |
| Ubuntu 8.10 | gcc 4.3.2 | 2.5.4 |
| Ubuntu 8.10 | gcc 4.3.2 | 2.4.6 |
Older versions of this module has been tested under and I expect those to still work:
| OS | compiler | Python |
| Ubuntu 7.04 | gcc 4.1.2 | 2.5.1 |
| Ubuntu 7.04 | gcc 4.1.2 | 2.4.4 |
| Ubuntu 6.06 | gcc | 2.5.1 |
| Windows XP | Visual Studio 2003 | 2.5.1 |
Version 0.4.1 was tested and found working on SuSE Linux Enterprise Server (GCC 4.1.0 and Intel C/C++ 10.1) by Stuart Stock.
http://www.xs4all.nl/~anthon/Python/ordereddict is ordereddict's home on the web.
Downloading:To install the package unzip/untar and run:
python setup.py install
For Windows users without a compiler, the .pyd file for Python 2.5.1 has been included in the .zip download. Just copy that into your site-packages directory.
If you find any problems, please let me know, but also realise that I have a spamfilter that catches over 100 emails a day and yours might get in there unnoticed. So if there is no response within a few days please try again.
ordereddict has all of the functionality of dict() except that there is no keyword based initialisation and that you cannot pass a normal dict to the initialisation of the basic ordereddict (however see the relaxed keyword below). sorteddict cannot be initialised from keywords either, but can be initialised from normal dict (ie. they are always relaxed).
As you probably would expect .keys(), .values(), .items(), .iterkeys(), itervalues(), iteritems() and "for i in some_ordereddict" have elements ordered based on the key insertion order (or key value insertion order if kvio is specified, or sort order for sorteddict).
ordered/sorteddicts can be pickled.
Some methods have been slightly changed:
In addition to that ordereddict and sorteddict have some extra methods:
and ordereddict only also has:
.setkeys(), works like the one in the Larosa/Foord implementation. Argument must be an itereable returning a permutation of the existing keys ( that implies having the same length as the ordereddict)
.reverse() - reverses the keys in place
.insert(position, key, value) - this will put a key at a particular position so that afterwards .index(key) == position, if the key was already there the original position (and value) is lost to the new position. This often means moving keys to new positions!
.rename(oldkey, newkey) renames a key, but keeps the items position and value
testordereddict.py in the test subdirectory has been used to test the module. This is best run with py.test (http://codespeak.net/py/dist/test.html):
py.test testordereddict.py
but if you do not use py.test (yet), you can also use:
python testordereddict
to run a large part of the tests as well.
There is a somewhat patched copy of the python lib/Test dictionary testing routines included as well, it fails on the _update test however. You can run it with:
cd test/unit python test_dict.py
ordereddict is directly derived from Python's own dictobject.c file. The extensions and the representation of ordereddicts() are based on Larosa/Foord's excellent pure Python OrderedDict() module (http://www.voidspace.org.uk/python/odict.html).
The implemenation adds a vector of pointers to elements to the basic dictionary structure and keeps this vector compact (and in order) so indexing is fast. The elements do not know about their position (so nothing needs to be updated there if that position changes, but then finding an item's index is expensive. Insertion/deletion is also relatively expensive in that on average half of the vector of pointers needs to be memmove-d one position. There is also an long for bit info like kvio, relaxed.
The sorteddict structure has an additional 3 pointers of which only one (sd_key) is currently used (the others are sd_cmp and sd_value).
Based on some tests with best of 10 iterations of 10000 iterations of various functions under Ubuntu 7.10 (see test/timeordereddict.py and test/ta.py):
Results in seconds: --------------------------------- dict ordereddict OrderedDict empty 0.017 0.017 0.017 create_empty 0.021 0.022 0.100 create_five_entry 0.028 0.029 0.275 create_26_entry 0.126 0.132 1.021 create_676_entry 3.548 3.738 25.116 get_keys_from_26_entry 0.137 0.146 1.038 pop_5_items_26_entry 0.150 0.164 1.364 pop_26_items_676_entry 5.423 5.791 31.420 popitem_last_26_entry 0.136 0.148 1.131 popitem_last_676_entry 3.575 3.769 25.672 popitem_100_676_entry -------- 3.773 25.433 walk_26_iteritems -------- 0.473 2.622 Results normalised against ordereddict == 1.0 --------------------------------- dict ordereddict OrderedDict empty 1.007 1.000 1.014 create_empty 0.962 1.000 4.530 create_five_entry 0.957 1.000 9.494 create_26_entry 0.955 1.000 7.743 create_676_entry 0.949 1.000 6.720 get_keys_from_26_entry 0.935 1.000 7.098 pop_5_items_26_entry 0.910 1.000 8.299 pop_26_items_676_entry 0.936 1.000 5.425 popitem_last_26_entry 0.922 1.000 7.655 popitem_last_676_entry 0.949 1.000 6.811 popitem_100_676_entry -------- 1.000 6.742 walk_26_iteritems -------- 1.000 5.546
Because I am orderly ;-O, and because I use dictionaries to store key/value information read from some text file quite often. Unfortunately comparing those files with diff when written from normal dictionaries often obfucates changes because of the reordering of lines when key/value pairs are added and then written.
I have special routine for YAML files that takes lines like:
- key1: val1
- key2: val3
- key3:
- val3a
- val3b
(i.e. a list of key-value pairs) directly to a single ordered dictionary and back. (I find it kind of strange to finally have a structured, human readeable format, format that does not try to preserve the order of key-value pairs so that comparing files is difficult with 'standard' text tools).