AGAL seq opcode works on hardware, but doesn't on software emulation (float number comparison is different on both?)

Question

From docs: seq set-if-equal destination = source1 == source2 ? 1 : 0, component-wise

I haven't tested this thoroughly yet, but so far my fragment shader works on both desktop PCs where Context3D initialization succeeded with DirectX, and fails on machines where Flash falls back to software rendering.

seq ft2.x, ft0.x, fc0.x

On hardware, ft2.x is set to 1 when the current pixel's red value, stored in ft0.x, equals the constant fc0.x, which holds 50/255. So what I want to happen does happen on hardware for any #32???? colored pixel (50 == 0x32), but not in software.

I have already tested a workaround: I can replace the seq opcodes with a more complex sequence built from slt (set if less than) or sge (set if greater or equal).

So it seems the problem lies in a comparison of a constant I supply to the GPU (50/255) and the actual red value (which is 50 in the texture). If it was anything else (e.g. RGBA values had a different order), slt and sge would fail as well.

Am I doing something wrong here? Should I somehow round the compared values (e.g. multiply by 255, then remove the fractional part) to be sure it works on all devices and in all modes?
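To make the rounding idea concrete, here is a minimal sketch in Python that only models the arithmetic; it is not AGAL and not a confirmed diagnosis. It assumes (the question does not establish this) that the software renderer hands the shader a red value that differs from 50/255 by a tiny amount. The functions `seq`, `sge` and `slt` are stand-ins for the opcodes of the same name, `near_equal` is a hypothetical helper, and the half-step tolerance is an arbitrary choice.

```python
def seq(a, b):
    """Model of AGAL seq: 1 if a == b, else 0."""
    return 1.0 if a == b else 0.0


def sge(a, b):
    """Model of AGAL sge: 1 if a >= b, else 0."""
    return 1.0 if a >= b else 0.0


def slt(a, b):
    """Model of AGAL slt: 1 if a < b, else 0."""
    return 1.0 if a < b else 0.0


def near_equal(value, target, eps=0.5 / 255.0):
    """Tolerant equality: 1 only when target - eps <= value < target + eps.

    Multiplying the two 0/1 results acts as a logical AND, which in a shader
    would be a single mul after one sge and one slt.
    """
    lower = sge(value, target - eps)
    upper = slt(value, target + eps)
    return lower * upper


target = 50.0 / 255.0
sampled = target + 1e-6  # hypothetical tiny error introduced by the renderer

print(seq(sampled, target))         # exact comparison misses the pixel
print(near_equal(sampled, target))  # tolerant comparison still matches
```

With a tolerance of half a color step, neighboring values such as 49/255 and 51/255 are still rejected, so the window stays specific to one 8-bit red value.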

Update: One of the machines with the software rendering fallback was set to 16-bit graphics, but changing it to 32-bit didn't fix the issue. I also blindly tried dividing the color value by 256, 128 and 127 instead of 255, hoping that might help if the float had a different precision (higher and lower numbers would still work as long as they equaled one of the pixels inside a 256px-long gradient), but that didn't pay off.

Then I tried the workaround of storing the constant as an integer and, inside the shader, multiplying the value by 255 and removing the fractional part. To my surprise, while it worked on the GPU, it failed in software rendering:

mul ft0.x, ft0.x, fc0.y // convert ft0.x (red channel) to an integer scale by multiplying it by the constant 255
frc ft4.x, ft0.x // get the fractional part
sub ft0.x, ft0.x, ft4.x // remove the fractional part, leaving the integer
// now do the comparisons, e.g.:
seq ft2.x, ft0.x, fc0.x
add ft0.x, ft0.x, ft4.x // add the fractional part back; this step is probably not necessary
div ft0.x, ft0.x, fc0.y // divide by 255 to convert the value back to a float in the 0..1 range
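One possible weakness of the frc/sub approach, sketched here in Python as a model of the arithmetic only: subtracting the fractional part truncates, so a scaled value that lands just below an integer drops a whole step. Whether that is what happens in the software renderer is an assumption, not something established above. The value 49.9999 is a made-up input, and `frc`, `truncate` and `round_nearest` are hypothetical helper names; adding 0.5 before removing the fraction rounds to the nearest integer instead.

```python
import math


def frc(x):
    """Model of AGAL frc: the fractional part, x - floor(x)."""
    return x - math.floor(x)


def truncate(x):
    """Mirrors the frc + sub pair from the question: drops the fractional part."""
    return x - frc(x)


def round_nearest(x):
    """Adds 0.5 first, so values just below an integer round up to it."""
    shifted = x + 0.5
    return shifted - frc(shifted)


scaled = 49.9999  # hypothetical result of red * 255 with a small error

print(truncate(scaled))       # falls to the integer below
print(round_nearest(scaled))  # recovers the intended integer
```

In shader terms this would cost one extra add with a 0.5 constant before the frc, assuming a spare constant component is available.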

The next thing I'm going to try as a workaround is to simply make a series of less-than comparisons that set a temp register to 1, which is added to another temp register (a counter), so that by checking the counter I can see inside which range is the value.

Markus von Broady · Accepted Answer · 2017-01-31 14:49:44Z

Here's the workaround that finally did the trick for me.

I had 4 control colors in the red channel that told the shader what to do. If the red value was 50, the shader would take the left pixel as a source; if it was 100, it would take the top pixel, and so on. So all I had to do was use 4 seq commands to write 0 or 1 offsets into the 4 components of a register, which I could later add to or subtract from the register holding the sampler position.

Because seq failed to compare the red value of the pixel from the first sampling with the constant supplied, I made a 'ladder' of set-if-greater-or-equal opcodes:

"mov ft3.x, fc0.x \n" + //ft3.x = 49/0xFF
"sge ft2.x, ft0.x, ft3.x \n" + //if red >= 49/0xFF, set ft2.x to 1, otherwise 0

"add ft3.x, ft3.x, fc0.x \n" + //ft3.x = 98/0xFF
"sge ft4.x, ft0.x, ft3.x \n" + //if red >= 98/0xFF, set ft4.x to 1, otherwise 0
"add ft2.x, ft2.x, ft4.x \n" + //if 49 <= red < 98, ft2.x = 1; if red >= 98, ft2.x = 2

"add ft3.x, ft3.x, fc0.x \n" + //ft3.x = 147/0xFF
"sge ft4.x, ft0.x, ft3.x \n" + //if red >= 147/0xFF, set ft4.x to 1, otherwise 0
"add ft2.x, ft2.x, ft4.x \n" + //if 49 <= red < 98, ft2.x = 1; if 98 <= red < 147, ft2.x = 2; if red >= 147, ft2.x = 3

"add ft3.x, ft3.x, fc0.x \n" + //ft3.x = 196/0xFF
"sge ft4.x, ft0.x, ft3.x \n" + //if red >= 196/0xFF, set ft4.x to 1, otherwise 0
"add ft2.x, ft2.x, ft4.x \n" + //ft2.x is between 0 and 4 inclusive, where 0 means no control color

Now I had a register component ft2.x that stored (all of these red values are divided by 255, as in the code comments above):

0 for red below 49

1 for red from 49 up to, but not including, 98

2 for red from 98 up to, but not including, 147

3 for red from 147 up to, but not including, 196

4 for red of 196 or above
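The ladder can be checked outside Flash with a small Python model. This is a sketch of the arithmetic only, assuming fc0.x holds 49/255 as the code comments indicate; `STEP`, `sge` and `ladder` are illustrative names, not part of any AGAL API.

```python
STEP = 49.0 / 255.0  # assumed value of fc0.x, taken from the code comments


def sge(a, b):
    """Model of AGAL sge: 1 if a >= b, else 0."""
    return 1.0 if a >= b else 0.0


def ladder(red):
    """Counts how many of the four thresholds (49, 98, 147, 196)/255 red reaches."""
    threshold = STEP               # mov ft3.x, fc0.x
    counter = sge(red, threshold)  # sge ft2.x, ft0.x, ft3.x
    for _ in range(3):
        threshold += STEP                # add ft3.x, ft3.x, fc0.x
        counter += sge(red, threshold)   # sge ft4.x ... ; add ft2.x, ft2.x, ft4.x
    return counter


for value in (0, 50, 100, 150, 200):
    print(value, ladder(value / 255.0))
```

Because each control color (50, 100, 150, 200) sits one step above its threshold, a small error in the sampled red value does not change the resulting counter, which is presumably why this version behaves the same in both rendering modes.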

Then instead of comparing a pixel color with a constant, I would compare the ft2.x counter state with a constant (and the constants would be 1,2,3,4 instead of 50,100,150,200).

Unfortunately, this means all of the code above is additional overhead. I could spare the GPU this work, but I can't avoid it on the CPU unless I find out why the seq opcode always returns 0 on the CPU when comparing a pixel color with a constant.

Please note: while the AS3 docs say that software emulation of GPU rendering is still faster than standard (old) CPU rendering, that probably applies only to rendering vectors. Operating on the BitmapData manually, pixel by pixel, by looping through getPixel and setPixel is way faster than using an AGAL shader. - Markus von Broady, commented Feb 3, 2017 at 21:02
