
versioning & cell copying



(I started writing this before I received your mail, Tuomas, on zzdev --
that's why it isn't a reply.)

Hi everyone--

I'll try to clarify the HTML stuff etc. later, but school has
started again so I have less time. So, just a short note about Tuomas'
versioning draft. Basically, I think it's great, but I have some
comments; actually, I think it has helped me find the mathematically
possible implementation of my wishes that I've been looking for.



First, I like the system of d.version and d..cursor-version as the only
dimensions transcending time a LOT. Maybe the latter should be named
d..version-cursor. That sounds better, and it isn't like versioning is
any less a part of its speciality than cursing. It would make sense to
only have version-XXX dimensions transcend time.



Second, timestamps should be real times WHENEVER POSSIBLE. When that is
not possible (because the system time is earlier than or the same as
the last timestamp of the space), append a serial number. This allows
"lifestream" versioning: What did this look like in May? What did I
change in this paragraph the day before yesterday?
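
A minimal sketch of that rule in Python, just to make it concrete. The
class and method names are made up for illustration, and a stamp is
assumed to be a (time, serial) pair; nothing here is from Tuomas' draft.

```python
from datetime import datetime


class TimestampSource:
    """Hands out (time, serial) stamps that always increase (hypothetical)."""

    def __init__(self):
        self.last = None  # last stamp issued for this space

    def next_stamp(self, now):
        # Use the real time whenever it is later than the last stamp's time.
        if self.last is None or now > self.last[0]:
            self.last = (now, 0)
        else:
            # Clock is earlier than or equal to the last stamp: keep the
            # last time and append the next serial number instead.
            self.last = (self.last[0], self.last[1] + 1)
        return self.last
```

Stamps compare as ordinary tuples, so they stay strictly ordered even
when the clock stands still or jumps backwards.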

Problem: If the system clock is wrongly set to some distant future date
even once while a user changes the space, lifestream versioning becomes
impossible for every later change. This needs to be resolved.



Third, I don't think that d.cell-version and d.content-version make too
much sense. We will seldom want to look only at the past versions of
one cell: more often, we will want to look at the past versions of a
whole CLUMP of cells (or maybe a cluster; yeah, they shouldn't have to
be connected at EVERY time, so cluster's more appropriate). If we have
both overall timestamps and cell-specific d.version's, these are of
course not at all hard to build. (Inter-slice, it may not work, but we
can just forbid that. Transient cells, too. They're probably not
versioned at all, are they now?)
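
To show why this is not hard to build, here is a rough Python sketch. It
assumes each cell's rank on d.version can be read as a list of
(timestamp, content) pairs sorted by timestamp; the data layout and the
function names are my own invention for the example.

```python
from bisect import bisect_right

# Hypothetical per-cell histories: cell id -> [(timestamp, content), ...],
# sorted by timestamp. This stands in for each cell's rank on d.version.
histories = {
    "a": [(100, "draft"), (130, "final")],
    "b": [(110, "note")],
    "c": [(140, "late addition")],
}


def cell_at(history, when):
    """Content a cell had at time `when`, or None if it did not exist yet."""
    i = bisect_right([stamp for stamp, _ in history], when)
    return history[i - 1][1] if i else None


def cluster_at(histories, cells, when):
    """Snapshot of a cluster: each cell's latest version not later than `when`."""
    snapshot = {}
    for cell in cells:
        content = cell_at(histories[cell], when)
        if content is not None:
            snapshot[cell] = content
    return snapshot
```

For example, cluster_at(histories, ["a", "b", "c"], 120) gives the
cluster as it looked at timestamp 120: "a" is still the draft, "b" is
there, and "c" does not exist yet.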

How could such a view work? I suggest having a view for the actual
cells, and another view for the different versions of these cells. (We
may use short descriptions of the changes made -- like the connected
dimension name, or the new text in the cell -- to label the different
versions.) Selecting a different cell in the versioning view would
change the version the cell view shows. A-J's generic lazy structures
might be good for implementing the versioning views, or maybe insertion
of a "filter" for what a raster sees. (Easier, probably, but much less
generally useful. Lazy structures would be good for just *so* many
things which are just permutations of persistent structures -- like,
tables of astronomical constellations, or 1x1 (multiplication) tables.) The versioning
view would be bound to the cell view in some as yet unspecified way --
maybe cursor triggers.

Now, if we support "custom versioning" views showing the versions of a
cluster of cells, why should we have additional dimensions generated
from the standard versioning one? d.version might be a good efficiency
hack, because of thousands of changes to zzspace, only one might be
pertinent to one cell; the same is not true for d.cell-version and
d.content-version. There can be custom versioning views for that.



Fourth, I think that we should at this point consider a general COPYING
system for cells -- simple copying is trivial, but we want TRANSCOPYING.
I'm not talking about clones, although they are key-bound to "t" for
"transclude" (if I remember right), I'm talking about classical
Xanadical transcluding, which is very different from cloning. When you
have transcopied something, you can change it without the original
changing -- and you still maintain a connection which can be followed
two ways.
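
A tiny Python sketch of the difference, assuming a made-up Cell class
(not the real ZigZag structures): the transcopy gets its own content,
and the link between the two can be followed from either end.

```python
import itertools

_ids = itertools.count(1)


class Cell:
    """Hypothetical cell: content plus transcopy links in both directions."""

    def __init__(self, content):
        self.id = next(_ids)
        self.content = content
        self.copied_from = None   # the cell this one was transcopied from
        self.copies = []          # cells transcopied from this one


def transcopy(original):
    """Copy a cell's content into a new, independent cell and link the two."""
    copy = Cell(original.content)
    copy.copied_from = original      # link back to the source
    original.copies.append(copy)     # link forward to the copy
    return copy
```

Changing the copy's content leaves the original untouched, which is
exactly what a clone would NOT do.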

The examples for timestamps Tuomas gives are numbers -- like 100. These
are not sufficient for version branching, like in OSMIC -- and this, I
think, is just fine. A DOCUMENT may have different "newest" version
branches; a COMPUTER WORLD may not. When I fire my computer up, I want
to see the ONE CURRENT version. But we also need branching versions for
PARTS of that world -- because a versioning system for documents which
cannot branch is the madness we have now.

You've probably already guessed the connection: I propose branching by
transcluding an older version of a cell cluster into the current space.
And -- you might not have guessed this -- transcopying creates a
connection on d..version-alt. (This implies that all non-headcell
versions on d..version-alt have either the same connections as the
headcell or fewer; we thus do not need dummy cells on d..version-alt in
order to be able to branch a branch before changing it, because a subset
of a subset is still a subset of the original set.)

Solving two problems with one solution is always a good way to have the
user learn less, because there are fewer principles to understand. :)

Transcluding from one present "document" into another then is just a
special form of branching -- and can be viewed with the standard
versioning views. A view showing the versions of a cluster should show
the normal versions on one dimension -- Y? -- and the branches on
another -- X?. X for versions and Y for branches seems to be more
natural, but current views are taller than they are wide, which isn't
good here. Of course, the user can re-configure it the way they like.
When only a part of the cluster was branched, only that part is
followed from the branch onward, of course. The view acts the same as
if the other cells in the cluster had been deleted.

Each cell would have its own connection on d..version-branch, but the
whole branch would happen at the same versioning time(stamp).

Trying to edit a past version of a cluster would branch the whole
cluster. If you don't want to branch, just look at past versions, simply
use a view that doesn't allow editing. (There is no way to edit the past
without branching it.)
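
Roughly, in Python, and only as a sketch under my own assumptions (a
cluster is a dict of cell contents, timestamps are plain numbers, and
the class is invented for the example):

```python
class Cluster:
    """Hypothetical cluster with a linear history plus branches."""

    def __init__(self, cells, stamp, parent=None, parent_stamp=None):
        self.history = [(stamp, dict(cells))]  # (timestamp, cell contents)
        self.parent = parent                   # cluster this was branched from
        self.parent_stamp = parent_stamp       # version it was branched from
        self.branches = []

    def current(self):
        return self.history[-1][1]

    def view(self, index):
        """Read-only look at a version: a copy, never the stored dict."""
        return dict(self.history[index][1])

    def edit(self, index, cell, content, stamp):
        """Editing the newest version adds a version; editing the past branches."""
        base_stamp, base_cells = self.history[index]
        changed = dict(base_cells)
        changed[cell] = content
        if index == len(self.history) - 1:
            self.history.append((stamp, changed))
            return self
        # Past version: the whole cluster is branched at one timestamp.
        branch = Cluster(changed, stamp, parent=self, parent_stamp=base_stamp)
        self.branches.append(branch)
        return branch
```

The past version itself is never modified; the edit only ever lands in
the new branch.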



Fifth, this provides an interesting take on cell exchange (via eMail,
xu://, disks, PDA connections etc.). Sending a cell/a cluster would
mean branching it, preserving the version timestamp in the originating
space, and a unique ID of the originating space, if available. Also, the
version timestamp of the branch headcell is sent, as well as the
timestamps of the next earlier / next later version. If more clusters from
the same space are received later, the connections between the versions
can be reconstructed. This is a very important part of versioning once
we get past a single machine.
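
As a sketch of what such a shipment might carry, here is a hypothetical
Python record. The field names, and my reading of "timestamp of the
branch headcell" as a separate field, are assumptions rather than a
settled format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentCluster:
    """Hypothetical record for a cluster sent to another space."""
    cells: dict                      # cell id -> content
    stamp: int                       # version timestamp in the originating space
    head_stamp: int                  # timestamp of the branch headcell (my reading)
    prev_stamp: Optional[int]        # next earlier version, if any
    next_stamp: Optional[int]        # next later version, if any
    space_id: Optional[str] = None   # unique ID of the originating space, if available


def adjacent(a, b):
    """True if b is known to be the version directly after a in the same space."""
    if a.space_id is None or a.space_id != b.space_id:
        return False
    return a.next_stamp == b.stamp or b.prev_stamp == a.stamp
```

With the neighbouring timestamps included, a receiver can hook two
shipments from the same space together even if they arrive months
apart. Without a space ID, the sketch refuses to guess.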

Reconstructing from the branches is possible because each branch is a
subset of the original, with nothing added. So, you can just put
different branches of the same original together, and get something
closer to the original. There CAN'T be ANY conflicts.
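
In set terms this is just a union. A small Python illustration, treating
a version as a set of (cell, dimension, cell) connections; the cell
names and the choice of d.1 and d.2 are only example data.

```python
# Hypothetical original version: a set of (cell, dimension, cell) connections.
original = {("a", "d.1", "b"), ("b", "d.1", "c"), ("a", "d.2", "d")}

# Two branches received separately; each is a subset of the original.
branch1 = {("a", "d.1", "b")}
branch2 = {("a", "d.1", "b"), ("b", "d.1", "c")}


def merge(*branches):
    """Put branches of the same original together by taking their union."""
    merged = set()
    for branch in branches:
        merged |= branch
    return merged
```

Because every branch only contains connections that were in the
original, the union can never contain anything the original did not
have, and it contains at least as much as each branch alone.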

Note that we may want to include information about connections to cells
we don't send, so that the connection can be shown without the cell it
leads to -- when using xu://, for example. On the other hand, we may not
want to allow this. Anyway, it's a special case we don't need to resolve
right now, I think.



I believe this is the mathematically possible way to fulfill my earlier
versioning wishes. I *could* have gotten the idea myself without the
help of Tuomas' draft -- thing is, I didn't. :) Anyway, what do you think
about it? Anything you want me to clarify? Do you see the necessity
of the proposed systems? (I *do* think that at least 3-5 are VERY, VERY
necessary; that's why I'm so eager to tell you about this now, and
discuss it with you now.)

Thanks for reading,
- Benja