[pybsddb] Len() operation taking a long time

Chris Mulligan chris at polimetrix.com
Mon Nov 17 15:03:39 CET 2008


We ran into the same issue and worked around it by wrapping the class.
We store a simple count under a special key, __len__, and return that
value when you call len(). It may not be ideal, but it is definitely
worth the several minutes it saves on every call.
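
For anyone who wants to try the same thing, here is a minimal sketch of
the idea, not our actual code. It assumes the database behaves like a
dict with bytes keys and values; the class name CountedDB and the
constant COUNT_KEY are made up for illustration, and a plain dict stands
in for the real database so the example runs on its own.

```python
from collections.abc import MutableMapping

# Reserved key that holds the record count (key name taken from the post).
COUNT_KEY = b"__len__"


class CountedDB(MutableMapping):
    """Hypothetical wrapper: keeps the record count of a dict-like store
    under COUNT_KEY so that len() does not have to scan the store."""

    def __init__(self, db):
        self._db = db
        if COUNT_KEY not in self._db:
            # One-off full scan for a store that has no counter yet.
            self._db[COUNT_KEY] = str(sum(1 for _ in self._db)).encode()

    def _count(self):
        return int(self._db[COUNT_KEY])

    def __len__(self):
        # A single key lookup instead of a scan over every record.
        return self._count()

    def __getitem__(self, key):
        if key == COUNT_KEY:
            raise KeyError(key)  # hide the counter from callers
        return self._db[key]

    def __setitem__(self, key, value):
        if key == COUNT_KEY:
            raise KeyError("reserved key")
        is_new = key not in self._db
        self._db[key] = value
        if is_new:
            # Only a brand-new key changes the count; overwrites do not.
            self._db[COUNT_KEY] = str(self._count() + 1).encode()

    def __delitem__(self, key):
        if key == COUNT_KEY:
            raise KeyError("reserved key")
        del self._db[key]  # raises KeyError if missing; count stays intact
        self._db[COUNT_KEY] = str(self._count() - 1).encode()

    def __iter__(self):
        # Skip the counter record when iterating over keys.
        return (k for k in self._db if k != COUNT_KEY)


# A plain dict stands in for the real database in this sketch.
store = CountedDB({})
store[b"a"] = b"1"
store[b"b"] = b"2"
store[b"a"] = b"3"   # overwrite, count unchanged
del store[b"b"]
print(len(store))
```

Note that the record and the counter are written in two separate steps,
so without transactions or some external locking a crash or a concurrent
writer could leave the count out of sync. Anything that writes to the
database without going through the wrapper would do the same.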

On Nov 16, 2008, at 11:06 PM, "andrew" <andrew at reurbanise.co.nz> wrote:

> Hi All,
>
> I'm working with pybsddb on a database of around 1.5M keys and around
> 1.2GB on disk, and I've noticed that doing a len() on the database is
> taking around 10 minutes. The database is just a hashed store, btw, not
> a btree. I'm a bsddb novice but I would have thought that the number of
> keys in the database would be stored somewhere, but from what I've read
> so far it looks like the database has to be scanned to do this, which
> seems crazy.
>
> Any ideas if the 10 minutes is reasonable for a database of this size
> (on a fast server-grade machine)? I was previously fetching all the
> keys and taking the len of that, but then switched to the built-in len()
> mapping on the database - but it still takes 10 minutes.
>
> Thanks for your help.
>
> Cheers, Andrew.
>
> _______________________________________________
> pybsddb mailing list
> pybsddb at argo.es
> http://mailman.argo.es/listinfo/pybsddb
> http://www.argo.es/~jcea/programacion/pybsddb.htm