Berkeley DB Reference Guide:
Transaction Protected Applications

Building transaction protected applications

When building transactionally protected applications, there are some special issues that must be considered:

  1. Recovery must be single-threaded. Complicating this issue is the fact that the Berkeley DB library cannot determine whether recovery needs to be performed; the application must decide if recovery is necessary.

  2. If any thread of control exits holding Berkeley DB library resources, recovery must be performed to recover the resources. Further, if any thread of control exits holding Berkeley DB library mutexes, recovery must be performed to avoid starvation as the remaining threads of control convoy behind the failed thread's mutexes. Complicating this issue is the fact that the Berkeley DB library cannot determine that a thread of control has died; again, the application must decide if recovery is necessary.

It simplifies matters that recovery may be performed whether or not it is strictly needed; that is, it is not an error to run recovery on a database for which no recovery is necessary.

There are two common ways to build transactionally protected Berkeley DB applications.

The most common way to build Berkeley DB applications is as a single, usually multi-threaded, process. This architecture is simplest because it requires no monitoring of other threads of control. When the application starts, it opens and potentially creates the environment, runs recovery, and then opens its databases. From then on, the application can create new threads of control as it chooses. All threads of control share the open Berkeley DB DB_ENV and DB handles. In this model, databases are rarely opened or closed when more than a single thread of control is running; that is, they are opened when only a single thread is running, and closed after all threads but one have exited. The last thread of control to exit closes the databases and the environment.

A less common way to build Berkeley DB applications is as a set of cooperating processes, which may or may not be multi-threaded. This architecture is more complicated.

First, this architecture requires that the order in which processes are created and subsequently access the Berkeley DB environment be controlled, because recovery must be single-threaded. The first process to access the environment must run recovery, and no other process should attempt to access the environment until recovery is complete. (This ordering requirement does not apply to environment creation without recovery. If multiple processes attempt to create a Berkeley DB environment, only one will perform the creation and the others will join an existing environment.)

Second, this architecture requires that processes be monitored. If any process that acquires Berkeley DB resources exits without first cleanly discarding those resources, recovery is usually necessary. Before running recovery, all processes using the Berkeley DB environment must relinquish all of their Berkeley DB resources (it does not matter if they do so gracefully or because they are forced to exit). Then recovery can be run and the processes continued or restarted.

The safest way to structure groups of cooperating processes is to first create a single process that opens the Berkeley DB environment and runs recovery, and which then creates the processes that will actually perform work. The initial process is given no further responsibilities other than to monitor the processes it has created, to ensure that no process unexpectedly exits holding Berkeley DB resources. If one does, the monitoring process forces all of the processes that are using the Berkeley DB environment to exit, runs recovery, and restarts the working processes.
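
The parent-as-monitor structure can be sketched with POSIX process primitives. This is only an illustration under stated assumptions: run_recovery is a hypothetical stand-in for opening the environment with recovery, the fixed worker count and the worker_fn signature are invented for the example, and any abnormal exit is conservatively treated as a process that may have died holding Berkeley DB resources.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NWORKERS 3

/* A worker receives its slot number and the restart generation. */
typedef int (*worker_fn)(int id, int generation);

/* Hypothetical stand-in: a real monitor would run Berkeley DB recovery here. */
static int run_recovery(void) { return 0; }

/* Force every worker that has not been reaped yet to exit. */
static void kill_workers(const pid_t *pids)
{
    for (int i = 0; i < NWORKERS; i++)
        if (pids[i] > 0)
            kill(pids[i], SIGKILL);
}

/*
 * Run recovery, start the workers, and watch them. If any worker exits
 * abnormally, force the rest to exit, reap them all, run recovery again,
 * and restart. Returns the number of restarts performed, or -1 on error
 * or when max_restarts is exhausted.
 */
static int monitor(worker_fn work, int max_restarts)
{
    pid_t pids[NWORKERS];
    int generation = 0;

    assert(work != NULL && max_restarts >= 0);
    if (run_recovery() != 0)            /* recovery precedes any worker */
        return -1;

    for (;;) {
        int live = 0, failed = 0;

        for (int i = 0; i < NWORKERS; i++) {
            pids[i] = fork();
            if (pids[i] == 0)           /* child: do the work, then exit */
                _exit(work(i, generation));
            if (pids[i] > 0)
                live++;
            else
                failed = 1;             /* fork failed: abandon this round */
        }
        if (failed)
            kill_workers(pids);

        while (live > 0) {
            int status;
            pid_t pid = wait(&status);

            if (pid < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            for (int i = 0; i < NWORKERS; i++)
                if (pids[i] == pid) {
                    pids[i] = -1;       /* reaped: never signal this pid again */
                    live--;
                }
            if (!failed && !(WIFEXITED(status) && WEXITSTATUS(status) == 0)) {
                failed = 1;             /* may have died holding resources */
                kill_workers(pids);
            }
        }
        if (!failed)
            return generation;          /* every worker exited cleanly */
        if (generation >= max_restarts || run_recovery() != 0)
            return -1;
        generation++;
    }
}

/* Example workers: one always succeeds, one fails only in generation 0. */
static int worker_ok(int id, int generation)
{
    (void)id; (void)generation;
    return 0;
}

static int worker_fails_once(int id, int generation)
{
    return (id == 1 && generation == 0) ? 1 : 0;
}
```

For example, monitor(worker_fails_once, 2) would see one abnormal exit, run the recovery stand-in once more, and restart all workers. Keeping the loop this small reflects the advice that the monitor should do nothing except watch its children.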

If it is not practical to have a single parent for the processes sharing a Berkeley DB environment, each process sharing the environment should log its connection to and exit from the environment in some fashion that permits a monitoring process to detect whether a thread of control may have acquired Berkeley DB resources and never released them.
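
One way such a connection log could be checked is sketched below. The record layout (struct conn_event), the in-memory array standing in for the log, and the use of kill(pid, 0) as a liveness probe are all assumptions made for illustration; a real deployment would also have to persist the log and guard against process ID reuse.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical log record: a process joined or cleanly left the environment. */
enum conn_kind { CONN_JOIN, CONN_EXIT };

struct conn_event {
    pid_t pid;
    enum conn_kind kind;
};

/* kill(pid, 0) sends no signal; ESRCH means no such process exists. */
static int process_is_gone(pid_t pid)
{
    return kill(pid, 0) == -1 && errno == ESRCH;
}

/*
 * Scan the log in order. Return 1 if some process logged a join, never
 * logged a matching exit afterwards, and no longer exists; such a process
 * may have died holding Berkeley DB resources. Return 0 otherwise.
 */
static int recovery_needed(const struct conn_event *log, size_t n)
{
    assert(log != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int exited = 0;

        if (log[i].kind != CONN_JOIN)
            continue;
        for (size_t j = i + 1; j < n; j++)
            if (log[j].pid == log[i].pid && log[j].kind == CONN_EXIT) {
                exited = 1;
                break;
            }
        if (!exited && process_is_gone(log[i].pid))
            return 1;
    }
    return 0;
}
```

A monitoring process might call recovery_needed periodically and, when it returns 1, force the remaining processes out of the environment before running recovery.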

Obviously, it is important that the monitoring process in either case be as simple and well-tested as possible, as there is no recourse should it fail.

Copyright Sleepycat Software