What's the convention for < > low/high-byte in 8-bit assembler?

Question

It's a little hard to Google for greater-than and less-than symbols in assembler math...

If you saw, let's say, 6502 code like...

lda #>SOME_LABEL

or Z80 code like...

ld a,>SOME_LABEL

...would you expect that we're loading the low byte or the high byte of SOME_LABEL? Is there a consensus among 8-bit assemblers on the convention for which way round the < and > are, or is it pretty unpredictable?

Maybe worth mentioning that I don't know of any Z80 assembler that would handle the "</>" notation. That's sort of 6502-specific — tofro, Commented Aug 16, 2023 at 12:27
On Z80, the asez80 assembler by Alan R. Baldwin seems to handle those too: shop-pdp.net/pub/asxxxx/av5p10.zip — Jean-François Fabre, Commented Aug 16, 2023 at 19:10
@Jean-FrançoisFabre This assembler seems to apply most of the 6502 grammar to Z80 - It also uses "#" for immediates, which is very uncommon on Z80 — tofro, Commented Aug 17, 2023 at 6:30
agreed. I had to adapt some Z80 parsing code just for this source. Normally the fact that it isn't parenthesized should be enough — Jean-François Fabre, Commented Aug 17, 2023 at 8:42

Raffzahn · Accepted Answer · 2023-08-17 00:46:56Z

Today

...would you expect that we're loading the low byte or the high byte of SOME_LABEL?

> is rather universal for the high byte while
< selects the low byte.

This is rather easy to remember if you assume that writing runs left to right and that most 8-bit micros are little-endian.

< thus points to the left, to the beginning, to the lowbyte, while
> points to the right, to the later, to the high byte.

The Day Before Today

Then again, as fadden points out in the comments, Apple's EDASM used it the other way around, as documented on p.196 of the ProDOS Assembler Tools Manual:

So there may have been other assemblers following that notation...
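
To make the two conventions concrete, here is a minimal sketch in Python (used purely for illustration). The address given to `SOME_LABEL` and the helper names `lo`, `hi` and `select_byte` are assumptions for this example and are not taken from any particular assembler. The `edasm_style` flag only models the swapped meaning reported for EDASM above.

```python
# Illustrative model of what an assembler evaluates for '<expr' and '>expr'.
SOME_LABEL = 0xC123  # made-up example address


def lo(value: int) -> int:
    """Low byte (bits 0-7) of a 16-bit value."""
    return value & 0xFF


def hi(value: int) -> int:
    """High byte (bits 8-15) of a 16-bit value."""
    return (value >> 8) & 0xFF


def select_byte(op: str, value: int, edasm_style: bool = False) -> int:
    """Evaluate '<value' or '>value'.

    Common convention: '<' is the low byte, '>' is the high byte.
    edasm_style=True swaps the two meanings (hypothetical flag).
    """
    if op not in ("<", ">"):
        raise ValueError(f"unknown operator: {op!r}")
    wants_high = op == ">"
    if edasm_style:
        wants_high = not wants_high
    return hi(value) if wants_high else lo(value)


if __name__ == "__main__":
    for style in (False, True):
        name = "EDASM-style" if style else "common"
        low = select_byte("<", SOME_LABEL, edasm_style=style)
        high = select_byte(">", SOME_LABEL, edasm_style=style)
        print(f"{name}: #<SOME_LABEL = ${low:02X}, #>SOME_LABEL = ${high:02X}")
```

When porting source between assemblers, checking a single known label against both interpretations is usually enough to tell which convention the original tool used.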

Exit Strategy

To get away from the ambiguity some assemblers also offer text based operators, like

CC65 having .LOBYTE and .HIBYTE
BeepASM offering LO(val) and HI(val)

The x80s

For the 8080/Z80 part it might be noteworthy that the original CP/M ASM did not provide an operator to select either byte. And I can't come up with any genuine (*1) 8080/Z80 assembler that does, which makes sense as the 8080 has basic 16-bit capabilities for address handling. An 8080 can load a 16-bit immediate directly into BC/DE/HL/SP, eliminating the need to load either half in nearly all situations.

In turn x80 and x86 Assemblers could use those symbols as brackets for macro parameters or structures.

6500 World View

In contrast, the 6502 had to load and store each byte separately, making those operators almost mandatory. 'Almost', as the original MOS Cross Assembler (*2) did not feature either. This gets even more interesting as the whole First Book of KIM, dedicated entirely to assembly (*3), doesn't use them either.

Are we doing something wrong today, when using tons of such expressions? (*4)

Going Down the Rat Hole

The assembler used for the Apple II monitor ROM did not use either operator as well. It simply took the low 8 bits of any (16-bit) expression when an 8-bit value was to be inserted, as seen in the reference manual on p.171 (*5):

All of that becomes even more strange as Commodore already announced the Resident Assembler in 1977 - which does mention that syntax:

Same goes for the 1978 AIM65 assembler as seen on page 5-20 of the Users Guide:

Those are all assemblers of the 'official' MOS lineage, but it becomes curious when looking at the PET Resident Assembler of 1978. Its manual seems to be a modification of the KIM manual (*6), except that it leaves out the section about Immediate Address Handling. At the same time it uses those operators in an example on p.18:

Confusing, isn't it?

*1 - That is, assemblers made back then specifically for those CPUs, not any later multi-platform/multi-CPU assembler.

*2 - MOS offered an online package on a GE mainframe as a tool to get 6502 systems going - after all, how else to bootstrap development :))

*3 - It seems as if several different assemblers have been used - which might not have mattered at all, as users would have typed in the hex codes only :))

*4 - That, or we simply have a more dynamic setup for more complex environments.

*5 - The 1977 version used for the original Monitor seems to have no instruction for 16 bit constants, only DFB to define 8 bit constants. The 1978 version used for the Autostart-Monitor added DW, but still used the same way of handling low/high byte in immediate values.

This is especially remarkable as the MOS Cross Assembler already offered .WORD for 16 bit constants.

*6 - Which in turn has the same structure as the Cross Assembler manual. They all look as if they were made by editing the previous one.

Thanks. That was my belief too. Then I happened upon retrocomputing.stackexchange.com/questions/8197/… and an unrelated YouTube video (which I screenshotted but can't find the original of) suggesting the other way around, and realized I needed some perspective. — Luxocrates, Commented Aug 16, 2023 at 2:48
FWIW, the original Apple II EDASM did them the other way around (#>expression is the low 8 bits). I can't think of anything else that works that way. — fadden, Commented Aug 16, 2023 at 4:55
@fadden this starts to be a rathole for early assembler research :)) — Raffzahn, Commented Aug 16, 2023 at 11:46
That < means 'low byte' flies in the face of every programming language that has << as a shift to higher bits. — dave, Commented Aug 16, 2023 at 12:45

TonyM · Accepted Answer · 2023-08-16 11:13:40Z

They've always been used the same in the various different assemblers I've seen using them over the years.

Their function's taken from the symbols' common names. For a 2-byte value:

> is 'greater than' so it returns the larger part of the value i.e. high byte.
< is 'less than' so it returns the smaller part of the value i.e. low byte.

Once you see that naming association then you see that they're unambiguous in their function.

"larger part of the value" is named "most significant byte (MSB)". It may or may not also be the "high byte" depending on endian-ness. – Ben Voigt, Commented Aug 16, 2023 at 16:08
@BenVoigt, yes, I know very well. Adding allllll that stuff into the answer muddies it, losing simplicity and therefore clarity. Saying it in a few clear words is better than in ten times as many. Less is nearly always more :-) – TonyM, Commented Aug 16, 2023 at 20:32

Jean-François Fabre · Accepted Answer · 2023-08-16 20:37:42Z

I stumbled on that notation in the reverse-engineered Galaga source code and had the same question. Then I figured it out myself:

ld   h,#>(m_tile_ram + 0x0300)             ; tile rows 32-35:  $83C0 - 83FF
ld   l,a

The code above loads a memory address in two halves: the base goes into the MSB (H) and the index into the LSB (L). You just need an assembler that understands this notation, like the one recommended by the team who reversed Galaga: the asez80 assembler by Alan R. Baldwin, which qualifies as "modern" as it was developed between 1989 and 2021.

In this game, tables often start on a 0x100 boundary. The code then only has to change the lower byte, which equals the index into the table.

They could have written:

ld   hl,#(m_tile_ram + 0x0300)
ld   l,a

but that would use one more byte in ROM. Note that they probably didn't have this </> notation at the time they coded the game and hardcoded the MSB, but the reversed source code (which assembles into the same binary code as the original!) uses it for more clarity.
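
The page-aligned table trick can be sketched as follows, again in Python for illustration only. The value `0x8000` for `m_tile_ram` is an assumption inferred from the `$83C0 - 83FF` comment in the listing above, and `table_address` is a hypothetical helper that models `ld h,#>base` followed by `ld l,a`.

```python
# Assumption: m_tile_ram = 0x8000, inferred from the "$83C0 - 83FF" comment.
M_TILE_RAM = 0x8000
BASE = M_TILE_RAM + 0x0300  # 0x8300, aligned on a 0x100 boundary


def table_address(base: int, index: int) -> int:
    """Model 'ld h,#>base' then 'ld l,a': H = high byte of base, L = index."""
    if base & 0xFF != 0:
        raise ValueError("the trick only works when base is 0x100-aligned")
    high = (base >> 8) & 0xFF  # what '>base' evaluates to
    low = index & 0xFF         # register A copied into L
    return (high << 8) | low   # the 16-bit value now held in HL


if __name__ == "__main__":
    for index in (0xC0, 0xFF):
        print(f"index ${index:02X} -> ${table_address(BASE, index):04X}")
```

Because the low byte of an aligned base is zero, overwriting L with the index gives the same result as a full 16-bit `base + index` addition, without needing an add instruction.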

Note that ELF-capable PowerPC assemblers (GNU as) also have this half-address notation, which allows the code to load addresses in a relatively simple way. Without it, loading an address is possible, but PowerPC instructions are all 4 bytes long and there are also relocation issues, so they really need this half-address notation. (The XCOFF format doesn't have half-relocations, and the XCOFF assembler doesn't understand half-address notation.)

From CodeWarrior assembler documentation

You also can use the top or bottom half-word of an immediate word value as an immediate operand by using one of the @ modifiers

Example:

lis r6, gTheLong@ha
addi r6, r6, gTheLong@l
lis r7, gTheLong@h
ori r7, r7, gTheLong@l
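
The arithmetic behind the `@h`, `@ha` and `@l` modifiers can be sketched as below. This is a Python model for illustration only: the address assigned to `G_THE_LONG` is made up, and the function names are hypothetical. `@ha` is the "adjusted" high half: it compensates for `addi` sign-extending its 16-bit immediate, so it pairs with an add of `@l`, while plain `@h` pairs with `ori`, which zero-extends.

```python
G_THE_LONG = 0x1234ABCD  # made-up 32-bit address with bit 15 set


def at_l(addr: int) -> int:
    """@l: low 16 bits."""
    return addr & 0xFFFF


def at_h(addr: int) -> int:
    """@h: high 16 bits, unadjusted."""
    return (addr >> 16) & 0xFFFF


def at_ha(addr: int) -> int:
    """@ha: high 16 bits, plus 1 if the low half will be sign-extended negative."""
    return ((addr + 0x8000) >> 16) & 0xFFFF


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit value as signed, as addi does with its immediate."""
    return value - 0x10000 if value & 0x8000 else value


def lis_addi(addr: int) -> int:
    """Model: lis rX, addr@ha ; addi rX, rX, addr@l"""
    reg = at_ha(addr) << 16
    return (reg + sign_extend16(at_l(addr))) & 0xFFFFFFFF


def lis_ori(addr: int) -> int:
    """Model: lis rX, addr@h ; ori rX, rX, addr@l"""
    return (at_h(addr) << 16) | at_l(addr)


if __name__ == "__main__":
    print(f"@h = {at_h(G_THE_LONG):#06x}, @ha = {at_ha(G_THE_LONG):#06x}")
    print(f"lis/addi -> {lis_addi(G_THE_LONG):#010x}")
    print(f"lis/ori  -> {lis_ori(G_THE_LONG):#010x}")
```

Both sequences rebuild the full 32-bit address. The two high-half modifiers differ only when bit 15 of the address is set.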

Interesting data point. When were these two assemblers introduced? — Raffzahn, Commented Aug 16, 2023 at 20:14
