images aren't serialized #47

Open
PerBothner opened this issue Mar 9, 2023 · 15 comments

@PerBothner (Author)

I got xterm.js + xterm-addon-image working under DomTerm. However, it doesn't appear that a sixel-produced image is saved by the serialize addon.

Specifically, I printed some images (using lsix), and then did a "detach session". When I then did "attach" on the session, the images were gone.

Serialization is important not only for detach+attach, but also when dragging a window to a new top-level window - this also uses serialization on most platforms.

In "regular" DomTerm (not based on xterm.js), an image is represented as a <img> element with a data: URL. When a buffer is serialized as HTML, this automatically handles such images. (FWIW, I'm actively working on improving the sixel handling in regular DomTerm.)

@jerch (Owner) commented Mar 9, 2023

Unfortunately that's not possible atm and also not easily fixable, as there are several conceptual issues to overcome:

  • serialize addon creates an escape sequence string of the buffer state. Thanks to the reflow logic this data string works on almost any other terminal (given it supports the used sequences), which is not the case for image protocols. For xterm.js --> xterm.js that requirement can be fulfilled though.
  • image serialization back to an escape sequence is a tough problem: the viewport size might differ, thus lines might be shorter with early wrapping, which leads to image breakage if re-encoded as a whole. Furthermore the image handling is cell-aware, meaning cells can be overwritten with FG/BG content in the middle of an image. Both could be partially mitigated by doing it line-wise, but that will split a single image resource into multiple ones. Still, that's the only way I see atm to solve it.
  • Missing protocol with strict cell-pixel adherence. Sixel is definitely not a good candidate for back encoding due to its awkward pixel behavior and a quite slow encoder; IIP is not much better. We still lack a good protocol here with a baked-in guarantee for certain cell coverage, which also must be pixel-perfect for line stitching. (That's the main reason why I have not looked into serialization yet...)
  • serialize addon does not expose any events where I could hook in custom logic. This is also partially an addon interdependency issue (easily solvable though).

To quickly draft a possible solution:

  • serialize addon would need to expose an event onSerialize((bufferLine: number) => [xStart, xEnd, content][]) (well, something like that)
  • image addon gets registered on that event
  • Whenever the serialize addon enters a line, it requests partial line content from the onSerialize listeners; ideally its own FG/BG/attribute content generator is just another listener here.
    • Image addon creates content strings by encoding all image parts found in that line. The protocol used for encoding must ensure proper cell and line stitching.
  • Finally the serialize addon fuses the content strings in x progression for that line (see the sketch below).
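
A minimal TypeScript sketch of that hook idea (hypothetical names, not an existing xterm.js API):

```typescript
// Listeners return partial line contributions as [xStart, xEnd, content]
// tuples; the serializer fuses them per line in x progression.
type LineContribution = [xStart: number, xEnd: number, content: string];
type SerializeListener = (bufferLine: number) => LineContribution[];

class SerializeHub {
  private _listeners: SerializeListener[] = [];

  public onSerialize(listener: SerializeListener): void {
    this._listeners.push(listener);
  }

  // fuse all contributions for one line, sorted by x start position
  public serializeLine(bufferLine: number): string {
    const parts = this._listeners
      .flatMap(l => l(bufferLine))
      .sort((a, b) => a[0] - b[0]);
    return parts.map(([, , content]) => content).join('');
  }
}
```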

I am not very fond of the current state of the serialize addon, and not sure if such an extension is wanted (I argued about it in the past, but kinda gave up). Also the proper image encoding protocol remains unclear atm.

@jerch (Owner) commented Mar 9, 2023

@PerBothner Drafted a quick demo PR to test the idea above (see #48). Of course this cannot be used until the serialize addon sees major changes. Also the image re-encoding is very rough atm (using IIP for every single cell, which is not how it should be done).

@jerch (Owner) commented Mar 9, 2023

There is probably a simpler solution - the image addon currently always draws on top of FG/BG content without erasing it. Thus it should be possible to keep most of the serialize addon as it is, but invoke a full-line image serializer for every line and just append its sequence to the current line. This has several advantages:

  • serialize addon needs only a tiny change
  • no complicated fuse logic needed
  • no stitching artefacts in x progression
  • most of the image render logic can be re-used

Will see if that works better...


Update: I've implemented the line-based serialization with a23a372, which works much better and is also reasonably fast.
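
A rough sketch of the scheme (imageSerializer stands for the image addon's line serializer, and translateToString stands in for the serialize addon's full attribute-aware output; this is not the actual a23a372 code):

```typescript
import { Terminal } from 'xterm';

// Serialize the text content of each line as usual, then append the image
// sequence for that line - images draw on top of FG/BG content, so no
// fuse logic is needed.
function serializeWithImages(
  term: Terminal,
  imageSerializer: { serializeLine(y: number): string }
): string {
  const buf = term.buffer.active;
  const lines: string[] = [];
  for (let y = 0; y < buf.length; y++) {
    const text = buf.getLine(y)?.translateToString(true) ?? '';
    lines.push(text + imageSerializer.serializeLine(y));
  }
  return lines.join('\r\n');
}
```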

@jerch (Owner) commented Mar 10, 2023

Addendum regarding a proper protocol:

It turns out that IIP has all the basics to guarantee a stable cell coverage, if dimensions are set with plain numbers denoting row/col extent, e.g. width=80;height=1 for a single terminal line of 80 cols. It still needs the overprinting patch of 64244d8 to level out float rounding issues; in fact the floor/ceil logic must be applied all over the place to avoid stitching/striping artefacts. Furthermore, early-ending images need a proper cell tiling patch to fully cover a cell, to keep the aspect ratio stable when read back in (commit be72b06). This might be some food for a better protocol; most likely we need to be able to address fractions of a cell, as already suggested by @egmontkob years ago.
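
Spelled out, such a cell-pinned IIP sequence has this shape (a sketch; base64Data is a placeholder for the encoded image bytes):

```typescript
// IIP (OSC 1337) with plain col/row metrics: the image covers exactly
// 80x1 cells, independent of the terminal's pixel geometry.
declare const base64Data: string; // base64-encoded image file bytes
const iip = '\x1b]1337;File=inline=1;width=80;height=1;preserveAspectRatio=0;' +
  `size=${atob(base64Data).length}:${base64Data}\x07`;
```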

Nonetheless IIP with col/row metrics works well enough atm to get a working solution for serialization running. 😸

Edit:
The line serialization is basically the same issue any multiplexer would have when updating screen parts / certain cells - to keep the image behind "intact" and to avoid uploading data over and over, we'll also need the "upload to image ref + draw from image ref" scheme as already discussed in several places.

@jerch (Owner) commented Mar 23, 2023

The QOI image format might be a good choice for image serialization, as it provides very fast encoding/decoding with an ok-ish compression rate.

Some early numbers testing QOI decoding speed with the sixel-bench test data:

```
qoi:   2.72s (48.06 MB/s, 251.76 fps)
png:   7.56s (18.88 MB/s,  90.62 fps)
jpg:   2.25s ( 7.18 MB/s, 304.94 fps)
sixel: 3.03s (39.16 MB/s, 226.2 fps)
```

(Note: The JPEG numbers are flawed, as I did a simple 75% quality conversion, which reduces the data from ~130MB to ~17MB for jpeg.)

I did not test the encoding path yet, but I expect it to be much faster than any PNG/JPEG encoding.
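
For reference, a QOI encoder is small enough to sketch completely. This is a simplified TypeScript version following the public QOI spec - QOI_OP_LUMA and the byte wraparound in QOI_OP_DIFF are omitted here (falling back to QOI_OP_RGB), which still yields a valid stream, just slightly larger output:

```typescript
function qoiEncode(pixels: Uint8Array, width: number, height: number): Uint8Array {
  const out = new Uint8Array(14 + width * height * 5 + 8); // header + worst case + end marker
  let o = 0;
  const w32 = (v: number) => {
    out[o++] = v >>> 24; out[o++] = (v >>> 16) & 255;
    out[o++] = (v >>> 8) & 255; out[o++] = v & 255;
  };
  out.set([0x71, 0x6f, 0x69, 0x66], o); o += 4;  // magic 'qoif'
  w32(width); w32(height);
  out[o++] = 4;  // channels: RGBA
  out[o++] = 0;  // colorspace: sRGB with linear alpha
  const index = new Uint8Array(64 * 4);          // running color index
  let pr = 0, pg = 0, pb = 0, pa = 255, run = 0;
  for (let p = 0; p < pixels.length; p += 4) {
    const r = pixels[p], g = pixels[p + 1], b = pixels[p + 2], a = pixels[p + 3];
    if (r === pr && g === pg && b === pb && a === pa) {
      if (++run === 62) { out[o++] = 0xc0 | (run - 1); run = 0; }  // QOI_OP_RUN
      continue;
    }
    if (run) { out[o++] = 0xc0 | (run - 1); run = 0; }
    const h = ((r * 3 + g * 5 + b * 7 + a * 11) % 64) * 4;
    if (index[h] === r && index[h + 1] === g && index[h + 2] === b && index[h + 3] === a) {
      out[o++] = h >> 2;                                           // QOI_OP_INDEX
    } else {
      index.set([r, g, b, a], h);
      const dr = r - pr + 2, dg = g - pg + 2, db = b - pb + 2;
      if (a === pa && dr >= 0 && dr < 4 && dg >= 0 && dg < 4 && db >= 0 && db < 4) {
        out[o++] = 0x40 | (dr << 4) | (dg << 2) | db;              // QOI_OP_DIFF
      } else if (a === pa) {
        out[o++] = 0xfe; out[o++] = r; out[o++] = g; out[o++] = b; // QOI_OP_RGB
      } else {
        out[o++] = 0xff; out[o++] = r; out[o++] = g;
        out[o++] = b; out[o++] = a;                                // QOI_OP_RGBA
      }
    }
    pr = r; pg = g; pb = b; pa = a;
  }
  if (run) out[o++] = 0xc0 | (run - 1);
  out.set([0, 0, 0, 0, 0, 0, 0, 1], o); o += 8;                    // end marker
  return out.slice(0, o);
}
```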

@PerBothner (Author)

A simple way to serialize a canvas is to use canvas.toDataURL(). That seems simple, portable, and reasonably efficient.
To deserialize, you can create a new Image, set its src attribute, and optionally drawImage into the canvas. There may be a way to avoid creating a temporary Image object.
You still have to decide on an escape sequence to wrap the data URL in.
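
For concreteness, that round trip in plain browser APIs (a sketch; createImageBitmap is one existing way to skip the temporary Image):

```typescript
declare const canvas: HTMLCanvasElement;

const dataUrl = canvas.toDataURL('image/png');   // serialize

// deserialize: decode the data URL back onto a canvas via a temporary Image
const img = new Image();
img.onload = () => canvas.getContext('2d')!.drawImage(img, 0, 0);
img.src = dataUrl;

// alternative without the temporary Image: createImageBitmap on a blob
canvas.toBlob(async blob => {
  const bitmap = await createImageBitmap(blob!);
  canvas.getContext('2d')!.drawImage(bitmap, 0, 0);
});
```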

@jerch (Owner) commented Mar 24, 2023

@PerBothner Well, that's what my playground branch does atm:

```typescript
public serializeLine(num: number): string {
  const canvas = this._storage!.extractLineCanvas(num);
  if (!canvas) return '';
  const w = this._terminal!.cols;
  // strip the 22-char 'data:image/png;base64,' prefix
  const data = canvas.toDataURL('image/png').slice(22);
  // IIP pinned to w cols x 1 row; size is the decoded byte length; \x07 (BEL) terminates the OSC
  const iipSeq = `\x1b]1337;File=inline=1;width=${w};height=1;preserveAspectRatio=0;size=${atob(data).length}:${data}\x07`;
  return '\r' + iipSeq + `\x1b[${w}C`; // CR + IIP sequence + cursor advance by line width
}
```

and the encoding speed is underwhelming - it takes ~70 ms for an image of 420x420 pixels. To me this does not look like a viable option for serializing a longer scrollbuffer with possibly multiple images in it, as it will quickly add up to seconds of blocking time.

That's where QOI's promise comes in - it compresses slightly worse than PNG, but at 30-50x faster speed, which would easily reduce the 70 ms runtime to <<10 ms.

NB: The high runtime is partially an artifact of the line-wise handling; a single toDataURL() on the whole image takes only ~28 ms. To fix both, probably QOI + ref painting is needed. But as said above, I still have to see whether QOI can hold its promise when written in wasm. Furthermore it needs a base64 encoder + string conversion on top, thus I think we might not see the full 30-50x speedup in the end, maybe more like 5-10x.


Update:

A quick test with a hacky, not yet optimized QOI encoder finishes the same image in ~10 ms, thus runs in about a third of the time of toDataURL() with PNG (28-32 ms). The decrease is much less than I expected, and it turns out that the main limiting factor is the transfer of data back from the GPU. This becomes clear when looking at the profiler data:

  • getImageData: ~7.7 ms
  • qoiEncode: ~2.8 ms

Also tried to avoid getImageData and read the pixel values back from the webgl texture instead, which made the readback step somewhat faster:

  • readPixels: ~4.5 ms
  • qoiEncode: ~3.1 ms

Still a way to go...

@jerch (Owner) commented Mar 25, 2023

Haha - stumbled over the next hassle - fast bytearray-to-string conversion. I really don't get why JS has no fast Uint8Array.toByteString method; that's literally a malloc + memcpy away for the engine. But nope, we have to use slow JS looping with String.fromCharCode calls. 🙈
Well, I did not investigate much yet beside testing various concat tricks; this is what I ended up with for now:

```typescript
// chunked conversion: stay below the engine's max call-argument count
function bytesToBinaryString(data: Uint8Array): string {
  let s = '';
  let p = 0;
  while (p < data.length) {
    s += String.fromCharCode.apply(null, data.subarray(p, p += 65536) as any);
  }
  return s;
}
```

Also TextDecoder seems to be on par with this, but I am not sure yet whether it will run into issues for certain charcodes. Needs some more testing...
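
The TextDecoder variant would be a one-liner, but the charcode worry is justified: per the WHATWG encoding spec the 'latin1' label actually selects windows-1252, which remaps bytes 0x80-0x9F to other codepoints, so it is not byte-transparent:

```typescript
// Caution: 'latin1' maps to windows-1252, so e.g. byte 0x80 decodes to
// U+20AC - a later btoa() on such a string will throw for chars > 0xFF.
declare const data: Uint8Array; // uint8 array from the encoder
const s = new TextDecoder('latin1').decode(data);
```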

Some interim results:

  • Overall QOI encoding is 2.5 to 5 times faster than toDataURL/PNG, depending on the image at hand. That's a lesser gain than I was hoping for, but still remarkable.
  • The implementation overhead of custom encoders is quite big, but it adds only ~10kB bundled, with ~3kB in wasm.
  • getImageData gets much faster with a software canvas (willReadFrequently setting), and is much more reliable and stable in its timings than any webgl hacks. It is also not hardware-dependent. Clearly the better choice.
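
The setting in question is a plain 2D context attribute; a minimal example (not the addon's actual code):

```typescript
declare const canvas: HTMLCanvasElement;
// request a software-backed 2D context: pixel data stays CPU-side,
// which makes repeated getImageData calls much cheaper
const ctx = canvas.getContext('2d', { willReadFrequently: true })!;
const pixels = ctx.getImageData(0, 0, canvas.width, canvas.height).data;
```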

@jerch (Owner) commented Mar 26, 2023

@PerBothner Imho QOI is the right decision, as it is able to serialize things 2.5-5x faster than the browser methods (see full roundtrip numbers and explanation here: #48 (comment)).

@jerch (Owner) commented Mar 27, 2023

@PerBothner
Hmm, currently the serialize addon is pretty slow; it takes ~170 ms to serialize a 10k scrollbuffer of 87 cols (tested with ls -lR /usr/ generating ~500kB of string data). That's ~8 times slower than a wasm version with almost the same functionality (~20 ms).
Therefore I might ditch the idea of extending the serialize addon; instead a rewrite seems more appropriate to me. That also makes it easier to rethink the serialization problem and come up with a proper interface where custom extensions can hook in.

@PerBothner (Author)

Does it really matter if serialization is a bit slow? How often do you need to do it? Can you think of any use case for a terminal where even half a second would be a problem? Of course an extensible interface is valuable even if performance isn't the main motivator.

Where I think speed might be more valuable is in detecting grapheme clusters, which can be combined with detecting wide characters - see my library. Though detecting clusters while rendering seems sub-optimal - perhaps cluster (and wide character) detection should be done lazily, the first time a line is rendered. Anyway, this is a whole different set of issues, which I have given some thought to and have experience with.

@jerch (Owner) commented Mar 27, 2023

Well the issue with the serialize addon is simple - it is convoluted for no good reason, so I don't like to alter things there and introduce an interface on top. I would basically end up rewriting it anyway, so a clean start makes things a lot easier. The perf bonus is just the extra push to get me doing it.

Re grapheme clusters:
I had a first attempt at those in xterm.js back in 2019, but it did not make it into a release due to code bloat (from storing codepoint attributes) and big runtime issues (it was many times slower). Imho it is not yet worth adding to xterm.js with its current input data route.
In the long term I plan to replace critical hot paths with wasm implementations, but I don't know yet when, and if those ever will make it back into xterm.js. Thus I am more focused on doing things on the addon side.

@PerBothner (Author)

For grapheme clusters, why do you need to store codepoint attributes, assuming you're using either the dom or canvas renderer? The browser takes care of that. You may need to know cluster boundaries, but you can detect those at the same time you're detecting double-width characters, using an efficient trie lookup (which would probably be a good candidate for wasm).
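
Such a lookup is typically a two-stage table; a hypothetical sketch (the stage1/stage2 data would be generated from the Unicode tables - these are not DomTerm's actual arrays):

```typescript
// stage1 maps a codepoint block to a deduplicated stage2 block holding
// the per-codepoint class (width / grapheme break property).
const BLOCK_SHIFT = 7;                 // 128 codepoints per block
declare const stage1: Uint16Array;     // 0x110000 >> BLOCK_SHIFT entries
declare const stage2: Uint8Array;      // concatenated unique blocks

function charClass(codepoint: number): number {
  const block = stage1[codepoint >> BLOCK_SHIFT];
  return stage2[(block << BLOCK_SHIFT) | (codepoint & 127)];
}
```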

"If those ever will make it back into xterm.js" If I get to the point of switching DomTerm to use the xterm.js engine by default, I'm assuming it will have to be a fork, since the xterm.js maintainers don't seem to be interested in rich data or variable-width text.

@jerch (Owner) commented Mar 28, 2023

> For grapheme clusters, why do you need to store codepoint attributes, assuming you're using either the dom or canvas renderer? The browser takes care of that. You may need to know cluster boundaries, but you can detect those at the same time you're detecting double-width characters, using an efficient trie lookup (which would probably be a good candidate for wasm).

Yes, I meant the lookup data for the unicode codepoint range to evaluate incoming chars. I don't quite remember what I used back then, but it was quite big (I think it was a condensed bitmap for fast lookup). So no, I was not storing additional attributes in the buffer; it was just the segmentation logic that needed additional lookup data.

Btw I think other TEs implementing grapheme support put the segmentation burden early, right after the utf-8 decoding and before doing wcwidth, thus they keep things/graphemes cell-aligned. Technically that's a grapheme_wcwidth(Grapheme*) in the end, and imho the only way to keep things within that nasty grid metaphor.
The impl impact is quite remarkable though - unicode does not limit graphemes in the x direction, so it is theoretically possible to end up with cells >2 half widths. That's what most TEs will have issues with, and so does xterm.js in its renderer logic. This gets really horrible for heavily stacking language systems with tons of subscripted perceivable characters (Hebrew, Arabic, Hindi etc.); widths may even be fractions of a cell here. But unicode did not bother formalizing that at all; they left it on purpose to font developers to get the best visual output. That's really a nightmare for proper grapheme handling on a grid system, and I have not seen anyone fully address this. So all TEs are currently broken in their unicode support. (And last but not least - there is still the RTL issue on top...)
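
A sketch of that segment-first pipeline, using the standard Intl.Segmenter API (wcwidth is an assumed helper; the max-of-constituents rule is just one heuristic and cannot express the fractional widths mentioned above):

```typescript
declare function wcwidth(codepoint: number): number; // assumed helper

// a grapheme_wcwidth(Grapheme*) equivalent: measure after segmentation
function graphemeWcwidth(grapheme: string): number {
  let w = 1;
  for (const ch of grapheme) w = Math.max(w, wcwidth(ch.codePointAt(0)!));
  return w;
}

const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
for (const { segment } of segmenter.segment('some terminal input')) {
  const cells = graphemeWcwidth(segment); // cells to advance for this cluster
}
```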

Anyway, it's kinda off-topic here.

@PerBothner (Author)

Off-topic - about grapheme clusters: The table for codepoint attributes used in DomTerm is a Uint32Array of length 12764, initialized from a 4032-character atob string. I consider that acceptable.

Basically DomTerm treats a cluster as wide if any of the constituents is wide, or it sees an emoji_presentation_selector or a regional_indicator pair.
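
A sketch of that rule (wcwidth is an assumed helper; this is not DomTerm's actual code):

```typescript
declare function wcwidth(cp: number): number;
const EMOJI_PRESENTATION_SELECTOR = 0xfe0f;  // VARIATION SELECTOR-16
const isRegionalIndicator = (cp: number) => cp >= 0x1f1e6 && cp <= 0x1f1ff;

// wide if any constituent is wide, an emoji presentation selector is
// present, or the cluster contains a regional indicator pair (a flag)
function clusterIsWide(cluster: string): boolean {
  const cps = [...cluster].map(ch => ch.codePointAt(0)!);
  return cps.some(cp => wcwidth(cp) === 2)
    || cps.includes(EMOJI_PRESENTATION_SELECTOR)
    || cps.filter(isRegionalIndicator).length >= 2;
}
```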

> This gets really horrible for heavily stacking language systems with tons of subscripted perceivable characters

That is why terminals need to figure out how to handle variable-width fonts. On my roadmap is a wrapper for input editors (such as GNU readline) where the terminal uses a variable-width font (layout handled by the browser engine), but pretends to the application that it's all regular fixed-width columns. This would be enabled by some extra shell integration flags, without applications having to be otherwise modified. Applications that know they are running under DomTerm can request that a variable-width font be used for output.
