[pybsddb] Batch import is slowing down
Amirouche Boubekki
amirouche at hypermove.net
Wed Jun 24 12:09:41 CEST 2015
Hello everybody,
On 2015-06-21 19:37, Jesus Cea wrote:
> I really recommend everybody to read the Oracle Berkeley DB
> documentation. It is really good. Berkeley DB is very flexible but that
> flexibility means that you need to learn about the inner working and
> details of implementation. Skipping this will be frustrating. Invest
> some time reading the docs.
>
> <https://docs.oracle.com/cd/E17076_04/html/programmer_reference/index.html>.
Indeed, the documentation is very good; you did a great job with it.
Still, some things must have slipped past me, or I don't remember
reading about them.
I've set up checkpoints and log file removal. Right now my script only
checkpoints at the end of the batch load of one file. The dataset is
split over several files of 800M each.
With syncless transactions it is quite a bit faster. But I noticed that
loading the data slows down over the course of one file:
- the first set of 10 000 entries took 3 seconds to load;
- the 49th set took 2 minutes.
Is this something that is to be expected?
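For context, here is a minimal sketch of the batch-load pattern described above: one transaction per set of 10 000 entries, committed without sync, and a single checkpoint at the end of each file. The real calls with the bsddb3 bindings would be `env.txn_begin()`, `db.put(key, value, txn=txn)`, `txn.commit()` and `env.txn_checkpoint()`; `FakeEnv` below is just an in-memory stand-in so the sketch runs anywhere, not the actual library.

```python
BATCH_SIZE = 10_000  # one "set" of entries, as in the timings above

class FakeEnv:
    """Stand-in for bsddb3.db.DBEnv: only counts commits and checkpoints."""
    def __init__(self):
        self.commits = 0
        self.checkpoints = 0
    def txn_begin(self):           # real API: env.txn_begin()
        return self
    def commit(self):              # real API: txn.commit()
        self.commits += 1
    def txn_checkpoint(self):      # real API: env.txn_checkpoint()
        self.checkpoints += 1

def load_file(env, store, entries):
    """Load one file's entries, one transaction per BATCH_SIZE entries."""
    txn = env.txn_begin()
    for i, (key, value) in enumerate(entries, start=1):
        store[key] = value         # real API: db.put(key, value, txn=txn)
        if i % BATCH_SIZE == 0:
            txn.commit()           # commit the finished batch
            txn = env.txn_begin()  # start the next one
    txn.commit()                   # commit the final, possibly partial batch
    env.txn_checkpoint()           # checkpoint once per file, as described

env, store = FakeEnv(), {}
entries = ((str(i), b"v") for i in range(25_000))
load_file(env, store, entries)
# 25 000 entries: two full batches plus one partial -> 3 commits, 1 checkpoint
```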
Thanks in advance,
Amirouche