[pybsddb] How to manage logs

Amirouche Boubekki amirouche at hypermove.net
Thu Jun 18 11:58:25 CEST 2015


Hello,


I'm loading a dataset (conceptnet5) into Ajgu Db [1], backed by bsddb3 
6.0.1 and Berkeley DB 5.3.21.

The problem I have is that even when I'm not using transactions (passing 
txn=None), my database fills the disk with log files. There are 2.3 GB of 
actual database files (including the __db.* files) out of the 429 GB of 
total disk space used by the database directory (du -h .); the rest is 
log files.

How can I remove those log files during the import of the database? 
Right now the script can't even finish loading the first file of the 
dataset.

My db environment is configured as follows:

```
         # init bsddb3
         self._env = DBEnv()
         self._env.set_cache_max(*max_cache_size)
         self._env.set_cachesize(*cache_size)
         flags = (
             DB_CREATE
             # | DB_INIT_LOG
             | DB_INIT_TXN
             | DB_INIT_MPOOL
         )
         self._env.set_flags(DB_LOG_AUTO_REMOVE, True)
         self._env.open(
             str(self._path),
             flags,
             0
         )
```
https://git.framasoft.org/python-graphiti-love-story/AjguGraphDB/blob/f8bf004ee132ac21fcbbb1c925889a16f1d5388d/ajgu/storage.py#L62
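
For reference, here is a minimal, self-contained sketch of the two ways I 
have found in the Berkeley DB / bsddb3 documentation to get rid of log 
files: enabling automatic removal with log_set_config(DB_LOG_AUTO_REMOVE, 1) 
(which, as far as I understand, replaced the set_flags() form in recent 
releases), and checkpointing then removing archived logs with 
log_archive(DB_ARCH_REMOVE). The path below is a placeholder, and I am not 
sure either approach is the right one for a bulk import:

```
from bsddb3.db import (
    DBEnv, DB_CREATE, DB_INIT_LOG, DB_INIT_TXN, DB_INIT_MPOOL,
    DB_LOG_AUTO_REMOVE, DB_ARCH_REMOVE,
)

env = DBEnv()
# Ask Berkeley DB to delete log files as soon as they are no longer needed.
# As far as I understand, log_set_config() is the replacement for the
# set_flags(DB_LOG_AUTO_REMOVE, ...) form used in my code above.
env.log_set_config(DB_LOG_AUTO_REMOVE, 1)
env.open(
    '/tmp/ajgu-test',  # placeholder path, not my real environment
    DB_CREATE | DB_INIT_LOG | DB_INIT_TXN | DB_INIT_MPOOL,
    0,
)

# ... bulk load would happen here ...

# Alternative: checkpoint, then remove the log files no longer needed.
env.txn_checkpoint()
env.log_archive(DB_ARCH_REMOVE)
env.close()
```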

Every single store is created with the following function:

```
         # create vertices and edges k/v stores
         def new_store(name, method):
             txn = self._txn()
             flags = DB_CREATE
             elements = DB(self._env)
             elements.open(
                 name,
                 None,
                 method,
                 flags,
                 0,
                 txn=txn._txn
             )
             txn.commit()
             return elements
```
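
For what it's worth, the "not using transactions" case I mentioned at the 
top just means passing txn=None to DB.open(). Stripped of the class 
plumbing, it looks roughly like this (new_store_no_txn is only an 
illustration, not a function from my code):

```
from bsddb3.db import DB, DB_CREATE, DB_HASH

def new_store_no_txn(env, name, method=DB_HASH):
    # Same as new_store() above, but without wrapping the open in a
    # transaction.
    elements = DB(env)
    elements.open(
        name,
        None,      # no sub-database name
        method,    # access method, e.g. DB_HASH or DB_BTREE
        DB_CREATE,
        0,
        txn=None,  # this is what I mean by "not using transactions"
    )
    return elements
```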



[1] https://git.framasoft.org/python-graphiti-love-story/AjguGraphDB


Regards,

-- 
Amirouche ~ amz3 ~ http://www.hyperdev.fr

