<html><body bgcolor="#FFFFFF"><div>We ran into the same issue and worked around it by wrapping the database class. The wrapper stores a simple count under a special key, __len__, and returns that value when you call len(). It's maybe not ideal, but it's definitely worth it to save several minutes on every len() call. <br><br>On Nov 16, 2008, at 11:06 PM, "andrew" &lt;<a href="mailto:andrew@reurbanise.co.nz">andrew@reurbanise.co.nz</a>&gt; wrote:<br><br></div><div></div><blockquote type="cite"><div>
<p><font size="2">Hi All,<br>
<br>
I'm working with pybsddb on a database of around 1.5M keys and around<br>
1.2GB on disk, and I've noticed that doing a len() on the database is<br>
taking around 10 minutes. The database is just a hashed store, btw, not<br>
a btree. I'm a bsddb novice but I would have thought that the number of<br>
keys in the database would be stored somewhere, but from what I've read<br>
so far it looks like the database has to be scanned to do this, which<br>
seems crazy.<br>
<br>
Any idea whether 10 minutes is reasonable for a database of this size<br>
(on a fast server-grade machine)? I was previously fetching all the<br>
keys and taking the len of that, but then switched to the built-in len()<br>
mapping on the database - but it still takes 10 minutes.<br>
<br>
Thanks for your help.<br>
<br>
Cheers, Andrew.<br>
<br>
_______________________________________________<br>
pybsddb mailing list<br>
<a href="mailto:pybsddb@argo.es">pybsddb@argo.es</a><br>
<a href="http://mailman.argo.es/listinfo/pybsddb">http://mailman.argo.es/listinfo/pybsddb</a><br>
<a href="http://www.argo.es/~jcea/programacion/pybsddb.htm">http://www.argo.es/~jcea/programacion/pybsddb.htm</a><br>
</font>
</p>
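</div></blockquote><div><br>A minimal sketch of the counting wrapper described at the top of this message (illustrative only, not the actual code). It assumes the underlying store behaves like a dict with bytes keys and values and supports "in", item assignment and deletion. CountedStore and _bump are made-up names, and a plain dict stands in for the opened database so the example runs on its own.<br><br>
<pre>
```python
class CountedStore:
    """Wrap a dict-like store and keep the key count under a reserved key."""

    COUNT_KEY = b"__len__"  # the special key mentioned in the message

    def __init__(self, store):
        self._store = store
        if self.COUNT_KEY not in store:
            # One-time full scan for a store that has no cached count yet.
            store[self.COUNT_KEY] = str(len(store)).encode()

    def __len__(self):
        # Read the cached count instead of scanning every key.
        return int(self._store[self.COUNT_KEY])

    def __getitem__(self, key):
        return self._store[key]

    def __contains__(self, key):
        # Hide the bookkeeping key from membership tests.
        return key != self.COUNT_KEY and key in self._store

    def __setitem__(self, key, value):
        if key == self.COUNT_KEY:
            raise KeyError("reserved key")
        is_new = key not in self._store
        self._store[key] = value
        if is_new:  # overwriting an existing key leaves the count alone
            self._bump(+1)

    def __delitem__(self, key):
        if key == self.COUNT_KEY:
            raise KeyError("reserved key")
        del self._store[key]  # raises KeyError before the count is touched
        self._bump(-1)

    def _bump(self, delta):
        self._store[self.COUNT_KEY] = str(len(self) + delta).encode()


db = CountedStore({})   # assumption: a plain dict stands in for the database
db[b"a"] = b"1"
db[b"b"] = b"2"
db[b"a"] = b"3"         # overwrite, count unchanged
del db[b"b"]
count = len(db)
```
</pre>
With a real database you would probably want the count update and the data write inside the same transaction so the two cannot drift apart; that part is left out here.<br></div><blockquote type="cite"><div>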
</div></blockquote></body></html>