[Date Prev][Date Next][Thread Prev][Thread Next][Author Index][Date Index][Thread Index]

Re: PrimArray pointers



> From: mark (Mark S. Miller)
> 
> Does anyone find the ability to index into PrimArrays with square
> brackets terribly convenient?  If not, then (after unblocking Michael
> & Dean) I like to get rid of it.  It is dangerous in the case where it
> is convenient: "p[i]" (read the caveat in parrayx.hxx), and confusing
> in the case where it's safe: "(*p)[i]".  Getting rid of this would
> simplify parray a great deal.  Any objections?

I'm all in favor of eliminating anything that causes confusion.  I
agree that the current square-bracket implementation is a magnet for
off-by-one-level-of-indirection errors, and that this magnetism probably
can't be removed without eliminating the operator.  So while I have my
QC hat on, I'm inclined to agree with eliminating it.  But before we go
blowing away operator[], I'd like to examine this more.

First, what is to replace the functionality?  In a glance through the
.hxx file I saw no member function other than operator[] that was an
obvious candidate for extracting or storing values at a particular
offset.  Arrays without any sort of subscription seem somewhat futile
even if steppers ARE available, so if there is no alternate subscription
operation I suspect we'll need to write one.

Second, isn't this a problem ONLY for people writing directly in C++?
Doesn't smalltalk always generate the right thing when you use at: or
at:put:?  Won't you have to rehack the translator to make it generate
the replacement?

Third, let's look at WHY it's a problem:

As I read it, the confusion is the result of thinking of the variable
which is a pointer to the object as the object itself, and naming it
accordingly.  This is the normal smalltalk idiom, but it's one I avoid
like the plague that it is.

In the example given in the caveat, the variable used is named "array",
implying no indirection to reach an array, or one indirection to reach
a value.  But it uses the declaration:

	PtrArray * array = somePtrArray;

Which has one more indirection than the name "array" implies.  (The
type PtrArray itself is the most confusing of the array types to chose
for an example, because its values are, in turn, pointers-to-Heaper.)

The confusion is made total by what I consider the REAL bug:  The existence
of an operator[] in special overridings of SPTR and CKPTR (but not WPTR!).
This operator forwards the subscript to the pointed-to object, eating one
level of indirection.  But it only eats it if you go through an SPTR or
a CKPTR.  By doing so, it imports the smalltalk confusion into X++, but
does so in an incomplete (and thus defective) manner.  (BTW:  This
overloading appears broken under opaque definition of the Array classes,
and when not broken it in turn appears to break having arrays of SPTRs
or CKPTRs to the Array classes.)

The ambiguity does not appear in the naming convention I use.  This is:
append one P (short for Pointer) to the end of the name for each level
of indirectness necessary to reach something of the type implied by the
base name.  Thus you will see names like "urdiP" for things of type
"Urdi *" or "SPTR(Urdi)", but you will not see a P for the level of
indirection/subscription already implied in names like "buff" or "p".

Using this convention, and removing the operator[] forwarding hack from
SPTR, the caveat becomes:

	PtrArray * arrayP = somePtrArray;
		// (But aren't bare "XanaduObject *"s considered poor form,
		//  one of the things XLint checks for, and about to be made
		//  unnecessary by the *PTRs-to-everything trick?  If so,
		//  any code of the sort in this chunk, if it exists, can
		//  soon disappear.  Unfortunately, WPTRs are still typedefed
		//  as "Whatever *"s - but making them full-blown classes
		//  with no virtual member functions should just eliminate
		//  the undesired case while leaving the code generated for
		//  the remaining cases unchanged.)
	Heaper * anError = arrayP[1];
		// Which reads correctly as an attempt to exercise the
		// pointer/array ambiguity on an array of arrays.  It
		// compiles withoug error only because a PtrArray is a
		// subclass of Heaper, and the analogous case in the
		// other Array classes should produce compile-time errors
		// unless the implied cast is valid.
		// (The fact that this one compiles is still very bad.)
	Heaper * anObj = (*arrayP)[1];
		// The pointer must be dereferenced to pass operator[] to the
		// array itself, but this is now obvious from the name.

	SPTR(PtrArray) array2P = somePtrArray;
	Heaper * anObj = array2P[1];
		// This doesn't work because operator[] is gone.  Good.
	Heaper * anObj = (*array2P)[1];
		// Again does just what the names implies.

	SPTR(PtrArray) array3P[2] = {somePtrArray, someOtherPtrArray};
	Heaper * anObj = array3P[1];
		// This shouldn't work (but I'm guessing REAL hard now)
		// because BombSuperclass doesn't descend from Heaper.
	Heaper * anObj = array3P[0][1];
		// This works now, but didn't before.  Good.
	Heaper * anObj = (*array3P)[1];
		// This works, because it's an exercise of the standard
		// pointer/array ambiguity of C, and equivalent to the
		// previous case.

The examples in your letter are a special case that would not change,
because in C idiom the short-names "p" and "q" are special.  They imply
one level of indirection to the thing being manipulated, and explicitly
call attention to the pointer-nature of the variable itself while punting
the question of the pointed-to type.  Typing one of them as "Array *" means
it's a >pointer to< an >Array<.  This means "p[1]" is an "obvious" error,
and "(*p)[1]" is "obviously" correct.  Further, if p is typed "SPTR(Array)"
rather than "Array *", the error-prone case becomes a compile-time error,
just as in the above example.

(I tend to avoid "p" and "q" in favor of "somethingMnemonicP", so the
special cases wouldn't arise, and we'd be back to "fooArrayP[1]" which
actually IS obviously an error, etc.  Also, the use of "p" or "q" is
stylisticly questionable in a code fragment where the entire intent is
to manipulate a value which is more than one level of indirection away.)

Yes, off-by-one-level-of-indirection bugs are the bane of C, and thus of C++.
But they're a STANDARD bane.  B-)  C programmers deal with them all the time,
and I doubt we'll make a lot of friends by deleting square brackets from our
array classes, even if doing so really IS doing them a favor (which I
consider unproven).

Unless someone has a better idea, I'd (moderately) favor:

 - removing the broken []-forwarder from SPTRs and CKPTRs (which can be
   done by stripping out the special overridings, replacing them with
   the normal macros expansion).

 - checking that the smalltalk translator and the Array-using code
   (if any) in xpp/ do the right thing,

 - perhaps renaming some variables (if the variable-name business strikes
   anyone as a brilliant idea rather than compulsive pickyness - which
   I doubt will happen among native speakers of Smalltalk).

and letting it go at that.

Removing the []-forwarder should leave the smalltalk source untuched, while
producing exactly the (abysmal?) behavior that programmers used to C arrays
and C++ array-class hacks expect.  Such programmers are the only customers/
developers likely to be programming directly in X++.  Though all choices
are bad here, I suspect we'll help them more if we give them an environment
with a pitfall they're used to avoiding than if we eliminate the hazard by
substituting something extra for them to learn.

	michael