[pybsddb] Problem scanning large hashed database
Jesus Cea
jcea at argo.es
Fri Dec 5 01:09:06 CET 2008
andrew wrote:
> On Thu, 2008-12-04 at 21:47 +0100, Jesus Cea wrote:
>> Anyway, fully scanning the database seems a bad thing to do. You don't
>> need BDB for that. Are you sure there is no other way?
>
> I guess the problem is that 99% of the time we're just reading and
> writing single objects via a hash index (I presume this is what you get
> if you're not using btrees). Another possibility is to create a separate
> sortable index for the update time of each object, since that's the
> second most important access method, i.e., give me the last 50K objects
> updated. However, I have no idea what the maintenance overhead of that
> would be and how much it would slow down the 99% of hashed reads /
> writes.
I would recommend storing the indexes you need. If you update them
in the same transaction as the data, and your DB cache is big enough, the
performance hit should be minimal. Anything is better than scanning the
entire database by hand.
If you want to query by range, as in your example, keep that index in
a btree. That btree can live in another database file, or in the same
file as a separate logical database created as a btree.
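
A minimal sketch of the bookkeeping involved, in plain Python rather than
the pybsddb API: a dict stands in for the hash database and a sorted list
stands in for the btree. The names Store, index_key and last_updated are
made up for illustration. The index key packs the update time big-endian
in front of the object key, so that byte-wise ordering (the default btree
comparison in BDB) matches chronological ordering.

```python
import bisect
import struct


def index_key(updated, key):
    # Big-endian timestamp first: byte-wise order == chronological order.
    # Appending the object key keeps index entries unique per object.
    return struct.pack(">Q", updated) + key


class Store:
    def __init__(self):
        self.primary = {}   # stand-in for the hash db: key -> (updated, value)
        self.by_time = []   # stand-in for the btree: sorted list of index keys

    def put(self, key, value, updated):
        # With BDB, both steps would run inside the same transaction.
        old = self.primary.get(key)
        if old is not None:
            # Remove the stale index entry for the previous update time.
            stale = index_key(old[0], key)
            pos = bisect.bisect_left(self.by_time, stale)
            del self.by_time[pos]
        self.primary[key] = (updated, value)
        bisect.insort(self.by_time, index_key(updated, key))

    def get(self, key):
        # The common case: a single lookup by key, index untouched.
        return self.primary[key][1]

    def last_updated(self, n):
        # Walk the index from the newest entry backwards, as a cursor
        # positioned on the last record and stepping to the previous one.
        newest = self.by_time[-n:] if n > 0 else []
        return [k[8:] for k in reversed(newest)]
```

Reads by key never touch the index. Each write costs one extra index
delete (on update) and one index insert, which is the overhead you would
be paying for not having to scan the whole database.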
-- 
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz