
Re: Why have a multipart document address?



Roger, while we're on addressing considerations, can you shed any light
on the backend <-> backend protocol?  I've come up with an approach I'm
implementing, but I've no idea if it is what the Xanadu team was
considering.  Finding a way (a) to avoid storing a full copy of the
docuverse at each node, and (b) to alert nodes to the availability of
new documents in some manageable fashion is an interesting problem.
Using the Green division of the tumbler space, the topic at hand, seems
to help.

Given a document address, the node portion can be extracted and used
first to check the local cache and, if the document is not found there,
to issue a retrieval request to the home node for that document.  No
problem.
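
Concretely, something like this Python sketch.  The tumbler format and
every name here are my assumptions, not Green's actual API: I'm taking
a tumbler string such as "1.42.0.7.0.13", where the field before the
first ".0." separator names the home node, and fetch_from_node() is a
placeholder for the real wire protocol.

    local_cache = {}          # document address -> document contents

    def node_portion(address):
        # Tumbler fields are separated by ".0."; the first field is
        # the node address.
        return address.split(".0.")[0]

    def fetch_from_node(node, address):
        # Placeholder: issue a retrieval request to the home node.
        raise NotImplementedError

    def retrieve(address):
        if address in local_cache:        # first, the local cache
            return local_cache[address]
        doc = fetch_from_node(node_portion(address), address)
        local_cache[address] = doc        # cache for next time
        return doc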

My concern is how to obtain the document address for new, remotely
created documents, i.e. how does node A become aware that there is new
content at node B, with which it has never communicated.

That really breaks down into how queries for links are handled.  Say I
want to find out who links to my resume, stored on my server.  Those
links may be on my node, or on any of a million other nodes.  If I issue
a query for "links to doc 34 of type 45" to my node, I'll find those
links whose home is on my node, but not those homed on a remote node.
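
In other words, a local index can't see past its own node.  Roughly,
with an index layout that is again my own invention:

    link_index = []    # records of (link_address, link_type, endpoints)

    def links_to(target_doc, link_type):
        # Only links homed here are in link_index, so this silently
        # misses every remotely homed link.
        return [addr
                for (addr, ltype, endpoints) in link_index
                if ltype == link_type and target_doc in endpoints]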

My solution is that each time any node creates a link, it iterates over
the endpoints and sends a notify to each node involved in any way in
that link.  Those nodes can then pull a copy of the link from its home
node into their cache and add it to their internal index.  Now when I
query my own node, I'll find even those links that reside far away.
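
In terms of the sketches above, link creation and the notify handler
would look something like this.  MY_NODE and send_notify() stand in for
the real identity and transport, and I'm assuming retrieve() hands back
the link record itself:

    MY_NODE = "1.42"   # this node's tumbler prefix (an assumption)

    def send_notify(node, link_address):
        # Placeholder for the actual wire transport.
        raise NotImplementedError

    def create_link(link_address, link_type, endpoints):
        link_index.append((link_address, link_type, endpoints))
        # Notify every node that appears in any endpoint, so that its
        # own index can answer queries touching this link.
        for node in {node_portion(ep) for ep in endpoints}:
            if node != MY_NODE:
                send_notify(node, link_address)

    def handle_notify(link_address):
        # Receiving side: pull a copy of the link record from its home
        # node into the cache, and index it locally.
        link_index.append(retrieve(link_address))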

This means that a link is guaranteed to reside on every node that holds
one of its endpoints, but not on unrelated nodes, until referenced by a
reader.

This scheme is subject to scalability problems, whether accidental, in
the case of a commonly used catalog document like a list of all known
users in Xanadu, or intentional, in the case of a cracker.

My (admittedly poor) solution is that each time a node receives a notify
from a remote node, it first counts the number of such links it
currently holds for that remote node, and drops the notify message if
some threshold is exceeded.  This loses information but protects my
local storage from overload.
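
Amending handle_notify() from the sketch above, with the threshold an
arbitrary constant (picking that number well is exactly the part I'm
unsure about):

    from collections import defaultdict

    NOTIFY_LIMIT = 10000                 # arbitrary cap

    links_held_from = defaultdict(int)   # remote node -> cached links

    def handle_notify(sender_node, link_address):
        if links_held_from[sender_node] >= NOTIFY_LIMIT:
            return   # drop: lossy, but bounds local storage
        links_held_from[sender_node] += 1
        link_index.append(retrieve(link_address))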

-Jeff


On Thu, 2005-02-10 at 10:22 -0800, roger gregory wrote:
> Thanks, Andrew, my reply was a little incoherent, but I wanted to point
> to some of the higher level considerations.  The important thing is that
> it's not clear how we would design this differently, even though we
> seem to be in a different universe now.  The considerations that led to
> those design decisions still stand.
> 
> I think it's instructive to notice that Gold didn't have a different
> addressing scheme.  Not that it was intended to use the same scheme, but
> that addressing was considered well enough solved, and modular enough,
> that any design changes could be postponed till closer to shipping.
> Given the kind of redesign that took place in Gold, that's a resounding
> endorsement!  Still, it could use some reexamination for the next quantum
> leap, though I'll continue to use it for my green stuff, and my
> green <=> html stuff.