
how to kill object servers



Date: Tue, 11 Sep 90 02:44:18 PDT
   From: xanadu!michael (Michael McClary)

   ... Question:  Does the semantics of object servers include database-style
   reliability (i.e. once you tell 'em the bert's hopped, crash-recovery
   can never hop it back)?  

Every object server is different.  It's only slightly easier to make
generic statements about OODBs than about pre-relational databases.
However I would guess that most provide some notion of a declaration
that a set of changes is consistent, for crash recovery purposes.  I
would expect that most of the exceptions to this are of the
persistent-virtual-memory style (NIL & KeyKOS (sort of)) where the disk
is necessarily consistent given that at every instant the state of the
computation (virtual memory + stack + registers) is consistent.  I
don't know which provide some kind of "sync" or "commit" to guarantee
that recovery will not restore a state from before a given time.  However,
I suspect many do.

   Remember: that's a guarantee we DON'T make,
   and don't intend to make until long after first product, if ever.
   (We're a library, not a bank.)  We only guarantee that the database
   is consistent, not that it doesn't revert to the state a couple seconds
   before the CPU fried.

The guarantee we've talked about making (in addition to normal
serialization consistency) is as follows: We provide a FeBe request
(with a meaning similar to fsync) which guarantees that on normal crash
recovery the backend will be in a consistent state no earlier than
when the fsync request was made.  However, any consistent state after
an fsync is allowed.  Note for our concurrent future: the guarantee is
made as of the time of reception of the fsync request, not the time of
response, but the guarantee isn't in force until the response is sent.
This allows the fsync request to be slow to the requestor while
continuing to give other connections quick service.  For example, an
adequate (but not advocated) implementation of fsync would simply hang
the requestor until the normal Urdi-SnarfXcvr-Shepherd logic committed
at or past the time the fsync request was received.
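
As a rough illustration of that "adequate (but not advocated)"
implementation, here is a minimal Python sketch.  Everything in it is
an assumption made for the example: the Backend class, its request,
commit and fsync methods, and the use of a logical clock in place of
real time are all hypothetical, and commit() merely stands in for
whatever the normal commit logic does.  It shows only the shape of the
guarantee: fsync notes the time of reception, then hangs its caller
until some commit covers that time, while other connections carry on.

```python
import threading


class Backend:
    """Toy backend (hypothetical): a commit makes all prior changes durable."""

    def __init__(self):
        self._cond = threading.Condition()
        self._clock = 0           # logical time, bumped on every request
        self._committed_at = -1   # logical time covered by the last commit

    def request(self):
        """An ordinary request: advances logical time and returns it."""
        with self._cond:
            self._clock += 1
            return self._clock

    def commit(self):
        """Stand-in for the normal commit: all changes so far are durable."""
        with self._cond:
            self._committed_at = self._clock
            self._cond.notify_all()

    def fsync(self, timeout=None):
        """Hang the requestor until a commit covers the time of reception.

        Returns the reception time once the guarantee is in force, or
        None if the timeout expired before any such commit happened.
        """
        with self._cond:
            self._clock += 1
            received_at = self._clock
            # wait_for releases the lock while waiting, so other
            # connections keep getting service in the meantime.
            covered = self._cond.wait_for(
                lambda: self._committed_at >= received_at, timeout)
            return received_at if covered else None
```

The point of measuring from reception rather than response is visible
here: the commit that releases the caller may cover later changes as
well, which is allowed, since any consistent state after the fsync
satisfies the guarantee.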

Note that the guarantee may still be violated if we restore the server
from a backup tape.

   (I suspect that only special-purpose object servers make that guarantee.
    It's very costly.)

Is the above scenario consistent with the costs involved, or is this
even more expensive than I know?  Unlike NFS, we're certainly not
contemplating a commit-to-disk on every FeBe request (which is the
substantial performance cost NFS pays for being stateless).