How did UCSD Pascal exception mechanism work?

Question

In the UCSD PASCAL II.0 Manual, there is a description (document page 140, PDF page 153) of a peculiar language feature: instead of non-local GOTO statements, as in the Wirth's Pascal, there is

        EXIT is a UCSD extension which accepts as its single parameter
the identifier of a procedure to be exited. The use of an EXIT 
statement to exit a FUNCTION can result in the FUNCTION returning
undefined values if no assignment to the FUNCTION identifier is made
prior to the execution of the EXIT statement.

Then there is an example containing

PROCEDURE Q; FORWARD;

PROCEDURE P;
BEGIN
...
  IF some_condition THEN EXIT(Q);
...
END;

PROCEDURE Q;
BEGIN
  P;
...
END;

Regarding special cases, there is only one detail:

        If the procedure identifier passed to EXIT is a recursive
procedure, the most recent invocation of that procedure will be
exited.

However, somewhat surprisingly, there is no mention of the case when a procedure calls EXIT(name) when there is no name on the call stack. For example, if the procedure P was called directly from the main program, and some_condition was true, executing EXIT(Q) would be nonsensical.

Indeed, on the page 263 of the document (PDF page 276), the table of system errors contains

3       Procedure not present at exit time (XNOEXIT)

How was that feature implemented? Was it specific to P-code, or was there a Pascal implementation compiling to executable binaries which had the same feature?

Are you familiar with "activation frames" and how Pascal handled multiple lexical levels of activation? I've not thought about this in a long time, but if I were to spend some time on your question I think it would all come down to what's available at each nested activation level and frame. It looks very much like something that allows a programmer to skip over some of the nested levels and return to an earlier caller. — jonk, Commented Jul 6, 2022 at 5:48
Can't write a full answer now but the assembly implementation of the exit procedure in version I.5 can be found on page 191 of the source code listing pascal.hansotten.com/uploads/ucsd/ucsd/… — texdr.aft, Commented Jul 6, 2022 at 8:10
I don't see why this would be significantly different to similar mechanisms, whether called 'non-local goto' or 'stack unwinding' during exception delivery. Either way, you need to peel layers off the stack of activation records until you arrive at the frame you want. VMS exception handling (which had resumption semantics, not the more usual termination semantics seen today) exposed a similar mechanism. — dave, Commented Jul 6, 2022 at 12:39
Glancing briefly at the code listing from @texdr.aft, it looks like the 'unwind' happens by overwriting the return address in each activation record (as it is encountered) and then executing a procedure return. Repeat until you're in the desired context. — dave, Commented Jul 6, 2022 at 12:51
And indeed the code bears that out: when it reaches the base of the chain of activation records, it branches to: XBOMB: TRAP NOEXIT — dave, Commented Jul 6, 2022 at 17:32

texdr.aft · Accepted Answer · 2022-07-06 21:51:41Z

Here is the part of the compiler that produces code for a call to exit (which is a normal procedure, rather than a kind of statement):

PROCEDURE EXIT;
  VAR LCP: CTP;
BEGIN
  IF SY = IDENT THEN
    BEGIN SEARCHID([PROC,FUNC],LCP); INSYMBOL END
  ELSE
    IF (SY = PROGSY) THEN
      BEGIN LCP := OUTERBLOCK; INSYMBOL END
    ELSE LCP := NIL;
  IF LCP <> NIL THEN
    IF LCP^.PFDECKIND = DECLARED THEN
      BEGIN GENLDC(LCP^.PFSEG); GENLDC(LCP^.PFNAME);
        IF INMODULE THEN
          BEGIN LINKERREF(PROC,LCP^.PFSEG,IC-2);
            IF SEPPROC THEN LINKERREF(PROC,-LCP^.PFNAME,IC-1);
          END
      END
    ELSE ERROR(125)
  ELSE ERROR(125);
  GEN1(30(*CSP*),4(*XIT*))
END (*EXIT*) ;

Thus the generated P-code for exit(f) is something like

LDCI ⟨segment of f⟩
LDCI ⟨procedure number of f⟩
CSP 4

The limited memory of many computers on which UCSD Pascal expected to run necessitated a way to divide a large program into smaller pieces, which USCD Pascal calls “segments”. If a routine declaration is preceded by the word SEGMENT, its code will be loaded into memory only when it is called.

Routines are assigned numbers (the value of pfname) beginning at 1; counting starts over in each segment, so to refer to a routine unambiguously a ⟨segment, pfname⟩ pair must be used.

Here is the assembly code for the exit procedure:*

 XIT:    ; EXIT PROCEDURE
         MOV     JTAB,IPC        ; FIRST SET IPC TO EXIT FROM CURRENT
         ADD     #EXITIC,IPC     ; PROC ... GET INFO FROM CUR JTAB
         SUB     @IPC,IPC        ; NOW IPC IS SET TO EXIT MY CALLER
         CMPB    @JTAB,@SP       ; IS IT THE PROC # TO EXIT ANYWAY?
         BNE     XCHAIN          ; IF NOT THEN CHAIN DYN LINKS TO FIND
         CMPB    @SEG,2(SP)      ; IF PROC OK, HOW ABOUT SEG#?
         BNE     XCHAIN          ; IF WRONG, THEN CHAIN DYN TOO
         CMP     (SP)+,(SP)+     ; ELSE CHUCK STACK STUFF
         MORE                    ; AND DO THE RETURN CODE
 XCHAIN: MOV     MP,R0           ; OK...START EXITING STACKED PROCS
 XLOOP:  CMP     R0,@BASE        ; ARE WE ABOUT TO EXIT SYSTEM BLOCK?
         BEQ     XBOMB           ; IF SO THEN BIG BOOBOO
         MOV     MSJTAB(R0),R1   ; ELSE OK...GRAB JTAB AND FUDGE MS IPC
         ADD     #EXITIC,R1      ; TO EXIT CODE RATHER THAN NORMAL REENTRY
         SUB     @R1,R1          ; R1 NOW HAS EXIT POINT IPC
         MOV     R1,MSIPC(R0)    ; SO PLACE IN STACK FRAME
         CMPB    @MSJTAB(R0),@SP ; IS THIS THE PROC# TO EXIT FROM?
         BNE     1$              ; IF NOT THEN GO TO NEXT CALLED PROC
         CMPB    @MSSEG(R0),2(SP)        ; AND RIGHT SEG#
         BNE     1$
         CMP     (SP)+,(SP)+     ; WELL, FOUND IT...CHUCK PARAMS
         MORE                    ; AND FALL OUT OF PROC
 1$:     MOV     MSDYN(R0),R0    ; CHAIN DOWN DYNAMIC LINKS!
         BR      XLOOP
 XBOMB:  TRAP    NOEXIT

I'll assume that you know how the P-system's “mark stack” works; if not, Pemberton and Daniels explain it in the book Pascal Implementation. The algorithm first sets the P-code instruction pointer to the epilogue of the caller of exit. Then we check whether we're exiting the current routine, to take care of a common case. (Note: MORE is a macro that transfers control to the next P-code instruction.) Otherwise, it searches up all the open activation records to find one corresponding to the routine number and segment that were pushed prior to the exit call. If one is found, the “return address” field of the record is changed to point to the routine's epilogue.

In any case, the epilogue of the caller of exit is executed, which causes control to transfer to the epilogue of the routine being exited from. By transferring to the epilogue, rather than just to the return address, an exited function will return a value.

However, if the search reaches the top of the stack, error trap 3 is signaled.

So that is how exit works. But a bigger mystery remains: Why didn't they just support nonlocal goto statements? The code for doing so would surely be no more complicated than for exit, and the feature is part of the standard language (while exit is not). Furthermore, nonlocal goto has lexical scope, making it easier to reason about than exit, which has indefinite scope.

Maybe exit was considered more structured than goto—but then why not add a return statement and keep it lexically scoped?

Its dynamic nature does make exit more powerful than goto. If UCSD Pascal supported routines as arguments, you could implement a condition system:

type signaltype = (endoffile, overflow, syntaxerror);

var ok: boolean;
    signaled: signaltype;
    handlers: set of signaltype;

procedure signal(s: signaltype);
  begin
    if not (s in handlers) then
      exit(defaulthandler);

    ok := false;
    signaled := s;

    case s of
    endoffile: exit(catchendoffile)
    overflow: exit(catchoverflow)
    syntaxerror: exit(catchsyntaxerror)
    end
  end;

procedure handle(s: signaltype; procedure p; procedure c);
  begin
    handlers := handlers + [s];
    p;
    handlers := handlers - [s];
    if (not ok) and signaled = s then
      c;
    ok := true
  end;

procedure endoffilehandler(procedure p; procedure c);
  procedure catchendoffile;
    begin
      p
    end;
  begin
    handle(endoffile, catchendoffile, c)
  end;

procedure overflowhandler(procedure p; procedure c);
  procedure catchoverflow;
    begin
      p
    end;
  begin
    handle(overflow, catchoverflow, c)
  end;

procedure syntaxerrorhandler(procedure p; procedure c);
  procedure catchsyntaxerror;
    begin
      p
    end;
  begin
    handle(syntaxerror, catchsyntaxerror, c)
  end;

Then you could say, e.g., overflowhandler(addtwonumbers, reportoverflow), where addtwonumbers attempts to add two numbers and does signal(overflow) if the result is too large, and reportoverflow prints a message like “number too big”. Only the most recently established handler will be invoked for any given condition, although you could change this by having the handlers call signal again.

This provides capabilities similar to Common Lisp's handler-case. If signaled held a record, you could convey even more information and support a fairly sophisticated exception handling mechanism.

* Unlike other P-system derivatives, the P-code generated by the compiler is interpreted by an assembly program akin to the Apollo AGC “interpreter”, rather than using the P-machine or generating native code directly. The code is taken from the I.5 sources (pdf, zip), in file USCD Pascal 1.5 ProcOp.txt, because it seems that the PDP-11 interpreter isn't included with the II.0 sources (pdf, zip). The II.0 distribution does, however, include an implementation of the P-code interpreter for the Z80 (the exit source is in the file z80_p-code_source/proc2.text), which uses the same algorithm as is described above.

Using exit instead of goto has its advantages, one of them being indefinite scope, making it closer to the contemporary exceptions, and its disadvantages, one of them having to distinguish between a regular return and an exceptional return manually. USCD Pascal being more of an educational tool, I presume that the main reason was an attempt to instill a more structured approach to programming even for a construct, for which the manual says "the routine use of the EXIT statement is, nevertheless, discouraged". In all fairness, the same is true for the C++ "throw/catch", — Leo B., Commented Jul 6, 2022 at 20:28
"But a bigger mystery remains: Why didn't they just support nonlocal goto statements?" Now that answer is simple: The workings of EXIT-TO are still part of the principles of structured programming. It does not branch to a random location, like (non-local) Goto would. It will not break structure. It always return to a procedure within the calling structure - if this is not possible (i.e. not being called from the noted procedure), the runtime system will throw an error. After all, the main point about pascal is Structured Programming. — Raffzahn, Commented Jul 6, 2022 at 20:30
@LeoB. I've added some Pascal code that uses exit to implement a general exception handling mechanism. — texdr.aft, Commented Jul 6, 2022 at 20:45
Thanks you; now a question is, whether an implementation of Pascal existed which supported both exit and routines as arguments. — Leo B., Commented Jul 6, 2022 at 20:51
@LeoB. Actually, I don't think my code would work, because the “catchers” aren't lexically visible to signal. They'd have to be hoisted to the top level and couldn't “close over” p. I might make the change later, if I remember to. — texdr.aft, Commented Jul 6, 2022 at 22:39

davidbak · Accepted Answer · 2022-07-06 18:37:15Z

PASCAL, like ALGOL before it, and unlike the entire C family of languages, has both static and dynamic scoping of activation records (i.e., frames, i.e., procedure calls).

☞ This is not the same as "dynamic scoping" as used in "dynamically scoped" languages like scripting languages, etc. This is about activation records of function/procedure calls.

Dynamic scoping of activation records is all that anyone is familiar these days since the C language family has long since completed its takeover of the entire procedural programming language space⁽¹⁾. That's where you have a stack of frames for procedure/function/method and each one is linked to its caller ("parent") - either explicitly with a link or more typically just implicitly by being placed adjacent to each other on the stack.

But ALGOL and PASCAL also had static scoping of activation records. Each record contained a link (conceptually) to its static - i.e., lexical - parent. These links had to be maintained at each procedure call.

Thus, this kind of exit in PASCAL required no search up the stack. There was an actual chain of links from the current procedure up through its lexical parents up to the outer scope. All you needed for this exit was the count of how many lexical scopes up the named procedure was - and that was available at compile time simply by looking at the parse tree (or AST).

(Remember, PASCAL doesn't have any new-fangled stuff like destructors that had to run as you exited procedure/function scopes. You could just pop the stack to the right place - you found the address to return to in the usual place in the desired parent's dynamically scoped callee (i.e., the frame below the desired parent).

There were three implementation techniques used, depending on your compiler.

The static parent link could actually be present in each stack frame. Referencing up-level variables required chaining up the required (compile-time known) number of levels and then going to the proper offset in that frame. This was reasonably straightforward to implement, was a small tax on each procedure call, and made up-level variable access somewhat slow (especially as you climbed higher up the lexical scopes).
You could keep a display of static parents. This was a separate area of memory (or actual registers in a Burroughs stack machine!) that was an array of pointers to the current set of static parents of the currently executed procedure. Maintaining this was also a small tax on each procedure call and return but had the advantage that up-level variable access was a direct index through this array plus an offset.
You could hedge by doing both: Each frame, when built, would contain the display of the current parents. This could potentially be a large tax on each procedure call but in fact there's a great optimization: The display for a particular frame only needed those parent frame pointers for which there was an actual reference to one of its variables⁽²⁾. Since most often a procedure didn't reference any of its up-level parent's variables - those procedure didn't need a display in their frame at all, thus didn't pay the tax. And the next most often case was a procedure that referenced up-level variables of only one of its parents: the tax then was no higher than full static linked frames. Only the really rare case of a procedure accessing up-level variables in more than one parent frame was more expensive at procedure call time. Meanwhile, an additional payback was that accessing those up-level variables no longer had to chain up the static frame links.

I don't know which technique the UCSC PASCAL interpreter used for its stack frames. It can probably be found somewhere⁽³⁾.

I found it surprising difficult to find any links to this subject, else I would have provided links. Maybe someone else can supply links. To me, that just indicates this is a dead end of computer language development. Well, with the exception of the LISP community and the entire functional programming community where it lives on in glory and usefulness. And it's also coming back as procedural-style languages begin to support closures ("lambdas") (I'm thinking of C++ and C#, not the cruddy fake "closures" of Java).

⁽¹⁾ But they haven't wiped out the functional programming language space, nor LISP, which still lives! But also see my comment in the last paragraph above.

⁽²⁾ Or when it called a procedure - necessarily a lexically nested procedure, that itself referenced variables in one of it's up-level parents not including this one. It's a bit tricky, and the tradeoffs aren't clear, and it's why it wasn't implemented very often. I do know it was in fact implemented at least once but sadly I can't remember which compiler it was ...

⁽³⁾ Oh, and BTW, you can also find it in the documented ABIs of some older systems. You'll see it as a space reserved in the stack frame/calling sequence - but unused by all languages today! - for the static link. I'm pretty sure I remember this coming up in at least one other answer here at retrocomputing, but in any event you'll probably find evidence for it on bitsavers or somewhere.

Actually, the system does search up the stack. It's more like Common Lisp's catch/throw than like block/return-from; the exit point isn't always statically determinable and is not bound by lexical scope. — texdr.aft, Commented Jul 6, 2022 at 20:01
PASCAL doesn't have any new-fangled stuff like destructors that had to run as you exited procedure/function scopes Strictly speaking, that is not exactly true. There were "destructors" for local files, and they were executed while unwinding the stack, as mentioned in the manual. — Leo B., Commented Jul 6, 2022 at 20:30
"Thus, this kind of exit in PASCAL required no search up the stack." This is not true. Even the example in the question shows this. The procedures P and Q are defined in the same lexical scope so, when P calls EXIT(Q) it will not find Qs stack frame by using the static links. — JeremyP, Commented Jul 11, 2022 at 10:08

Stack Exchange Network

How did UCSD Pascal exception mechanism work?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
pascal
ucsd-p-system
.

Hot Network Questions

How did UCSD Pascal exception mechanism work?

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged pascalucsd-p-system.

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
pascal
ucsd-p-system
.