[pybsddb] Len() operation taking a long time

Gregg Lind gregg at renesys.com
Mon Nov 17 13:34:03 CET 2008


That is (sadly) consistent with my experience as well.  If you look back 
at previous messages, bsddb does indeed have to do a full scan to get 
its length.  As a workaround, I keep track of the key count in the 
Python layer.  It's unpleasant and error-prone. 
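
A minimal sketch of that kind of bookkeeping, not the exact code I run.
The CountedStore class and the reserved b"__count__" key are made up for
illustration, and it assumes the handle supports "in", item assignment
and item deletion the way a dict does.  A plain dict stands in for the
database here so the example runs on its own:

```python
class CountedStore:
    """Wrap a dict-like store and keep the key count up to date."""

    # Hypothetical reserved key; it must never collide with a real key.
    COUNT_KEY = b"__count__"

    def __init__(self, store):
        self._store = store
        if self.COUNT_KEY in store:
            # Count was saved by an earlier session: no scan needed.
            self._count = int(store[self.COUNT_KEY])
        else:
            # One-time full scan, then the count is maintained incrementally.
            self._count = sum(1 for _ in store.keys())
            self._save_count()

    def _save_count(self):
        self._store[self.COUNT_KEY] = str(self._count).encode("ascii")

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        is_new = key not in self._store
        self._store[key] = value
        if is_new:  # overwriting an existing key does not change the count
            self._count += 1
            self._save_count()

    def __delitem__(self, key):
        del self._store[key]  # raises KeyError before the count is touched
        self._count -= 1
        self._save_count()

    def __len__(self):
        return self._count  # O(1), no scan


db = CountedStore({})   # a real database handle would go here instead
db[b"a"] = b"1"
db[b"b"] = b"2"
db[b"a"] = b"3"         # overwrite, count unchanged
del db[b"b"]
print(len(db))          # 1
```

The error-prone part is visible here: any write that bypasses the
wrapper leaves the stored count stale, and saving the count on every
insert or delete costs an extra write each time.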

Gregg

andrew wrote:
> Hi All,
>
> I'm working with pybsddb on a database of around 1.5M keys and around
> 1.2GB on disk, and I've noticed that doing a len() on the database is
> taking around 10 minutes. The database is just a hashed store, btw, not
> a btree. I'm a bsddb novice but I would have thought that the number of
> keys in the database would be stored somewhere, but from what I've read
> so far it looks like the database has to be scanned to do this, which
> seems crazy.
>
> Any ideas if the 10 minutes is reasonable for a database of this size
> (on a fast server-grade machine)? I was previously fetching all the
> keys and taking the len of that, but then switched to the built-in len()
> mapping on the database - but it still takes 10 minutes.
>
> Thanks for your help.
>
> Cheers, Andrew.



