[pybsddb] len() sometimes slow for bsddb3 backed shelves
Gregg Lind
gregg at renesys.com
Fri Sep 19 17:30:17 CEST 2008
Hello list!
I hope you can help me understand a problem I'm having.
Sometimes getting the number of keys in a bsddb database is very slow.
For example, I open my shelves with this function:
import shelve
import bsddb3

def fastShelfOpen(filename, flag='c', protocol=None, writeback=False,
                  cachesize=100, *args, **kwargs):
    # cachesize is given in megabytes; MbToBytes is my own helper (not shown)
    if cachesize:
        cachesize = int(MbToBytes(cachesize))
    else:
        raise ValueError("cachesize must be defined")
    # handle more optional arguments
    fh = bsddb3.hashopen(filename, flag=flag, cachesize=cachesize)
    fs = shelve.Shelf(fh, protocol=protocol, writeback=writeback, *args,
                      **kwargs)
    return fs
These shelves are reasonably small: about 115 MB, with 30,000 keys. I open
them with 25 MB of cache.
Sometimes len(S) takes upwards of 3 minutes, and sometimes it is
instantaneous, especially if len() has been called recently! Is
the problem in the shelve layer? File I/O?
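One way to narrow this down is to time len() on the shelf and on the mapping it wraps, separately. The following is only a minimal sketch under stated assumptions: time_len and compare_len_layers are hypothetical helper names, and the demo uses a plain dict as a stand-in for the bsddb3 handle so that it runs without a database file. It relies on shelve.Shelf keeping the wrapped mapping in its dict attribute.

```python
import shelve
import time


def time_len(obj):
    """Return (length, elapsed_seconds) for a single len() call on obj."""
    start = time.perf_counter()
    n = len(obj)
    return n, time.perf_counter() - start


def compare_len_layers(shelf):
    """Time len() on the backing mapping first, then on the shelf itself."""
    backing = shelf.dict  # shelve.Shelf stores the wrapped mapping here
    return {"backing": time_len(backing), "shelf": time_len(shelf)}


# Stand-in demo: a dict-backed shelf instead of a bsddb3.hashopen() handle.
demo = shelve.Shelf({})
demo["a"] = 1
demo["b"] = 2
results = compare_len_layers(demo)
for layer, (count, seconds) in results.items():
    print("%s: %d keys in %.6f s" % (layer, count, seconds))
```

Because a recent len() call appears to make later calls fast, whichever layer is timed first will likely absorb any cold-cache cost. To get a fair comparison against a real database, each layer would probably need to be measured on a freshly opened file.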
Any insight would be appreciated. Working with bsddb files seems to be
something of a black art still.
Thanks,
Gregg Lind
Data Engineer
Renesys Corp.