You Can Work on the Web Platform!
GOSIM
WORKSHOP
About Me
● Partner at Igalia and member of the Web Platform team
● Long-time contributor to WebKit, Chrome, Firefox and other browsers
● Now working on Servo
The goal of this talk
What is a browser engine?
The browser engine is the “web view” part of the browser, separated from the rest of the browser by an API boundary.
● Most browsers have a separate browser engine:
○ Chrome: Blink
○ Safari: WebKit
○ Firefox: Gecko
● Examples of things not included:
○ Address bar
○ Bookmarks
○ Tabs
Wait.
I don’t know if I can work on a browser engine. Aren’t they… big & complicated?
● Modular
● Well-tested
● Accessible to different experience levels
With a bit of dedication and willingness to learn you can work on the web platform!
Let’s start with some basic definitions
Visually
[Figure: a browser window with the address bar showing https://www.webpage.com/index.html above placeholder page text. Labels: “Engine / Web View” and “Browser”.]
Inside the web view
[Figure: a page of placeholder text containing an embedded frame. Labels: “iframe / subframe” and “main frame”.]
Typical Browser Engine
● Written in C++
● Around 20 years old
● Many different components
● Must handle a huge variety of content:
○ Backwards compatibility with over 30 years of content
○ Millions of content authors (many malicious)
● Everything is security critical
● Multiprocess / multi-threaded
● But each page’s DOM + layout is synchronous
Servo is different
The embeddable, independent, memory-safe, modular, parallel web rendering engine
Servo
● Written in Rust
● “Only” 10 years old
● Memory safety without garbage collection
● More concurrency than other browsers (layout included)
● Multi-process and single-process modes
● Modular and reusable components
● Stylo & WebRender shared with Firefox
● Much easier to hack on!
Major Components
● Network
● HTML parser
● CSS parser & selector matching
● JavaScript engine & DOM
● Style engine
● Layout engine
● Paint & composite
Network
● Opens sockets to web servers
● Uses TLS for secure connections
● Knows how to parse and produce HTTP headers
● Typically deals with Request and Response objects
● Handles MIME types and HTTP authentication
● Not only main frame navigation:
○ <iframe> loading
○ Images and media
○ Fetch / XMLHttpRequest APIs
○ WebSockets
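To make the "parse and produce HTTP headers" bullet concrete, here is a minimal Python sketch. The function names (parse_response_head, build_request_head) are made up for this illustration and are not taken from Servo or any other engine. A real network stack also handles chunked bodies, header folding, limits and much more.

```python
def parse_response_head(raw: bytes):
    """Split a raw HTTP/1.1 response head into (version, status, reason, headers)."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The reason phrase is optional, so pad the status line to three parts.
    version, status, reason = (lines[0].split(" ", 2) + [""])[:3]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        # A header may repeat (Set-Cookie), so keep a list of values per name.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return version, int(status), reason, headers


def build_request_head(method: str, path: str, host: str, extra=None) -> bytes:
    """Produce the head of a minimal HTTP/1.1 request."""
    headers = {"Host": host, **(extra or {})}
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

The same two operations serve main frame navigation, <iframe> loads, images and Fetch alike. Only the caller differs.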
HTML Parser
<html>
  <body>
    <div>
      <span>Hello</span>
      <span>GOSIM</span>
    </div>
  </body>
</html>
Document
  Body
    <div>
      <span>
        text content “Hello”
      <span>
        text content “GOSIM”
HTML Parser
● HTML5 parsing specification
○ Backward compatible
○ Specifies how to handle bad markup
● Alternate parsers for XML / nested XML
● Parsing iteratively produces a DOM Document
○ Script and style tags may pause parsing
● Handled in Servo by a crate called html5ever
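The idea that parsing iteratively produces a document tree can be tried out with Python's standard html.parser module. This is only a sketch: html.parser is not an implementation of the HTML5 parsing algorithm and has none of its error recovery, and the Node and TreeBuilder classes are invented for this example. In Servo the real work is done by html5ever.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, name, text=None):
        self.name = name        # tag name, "#document", or "#text"
        self.text = text        # only set for text nodes
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a tree incrementally as start tags, end tags and text arrive."""

    def __init__(self):
        super().__init__()
        self.document = Node("#document")
        self.open_elements = [self.document]   # stack of currently open nodes

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.open_elements[-1].children.append(node)
        self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Toy behavior only: close the current element if the names match.
        if len(self.open_elements) > 1 and self.open_elements[-1].name == tag:
            self.open_elements.pop()

    def handle_data(self, data):
        if data.strip():    # skip whitespace-only text between tags
            self.open_elements[-1].children.append(Node("#text", data))


def dump(node, depth=0):
    """Return the tree as a list of indented lines."""
    label = repr(node.text) if node.name == "#text" else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines += dump(child, depth + 1)
    return lines


builder = TreeBuilder()
builder.feed("<html><body><div><span>Hello</span>")   # input can arrive in chunks
builder.feed("<span>GOSIM</span></div></body></html>")
builder.close()
print("\n".join(dump(builder.document)))
```

Feeding the markup in two chunks mirrors how a browser parses while the network is still delivering bytes.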
JavaScript Engine
● Responsible for executing JavaScript
● Usually has multiple tiers
○ An interpreter
○ Hot code is compiled to machine code via JIT
● Examples:
○ WebKit: JavaScriptCore
○ Chrome: V8
○ Gecko & Servo: SpiderMonkey
● Servo has Rust bindings for SpiderMonkey called mozjs
DOM
● DOM objects are native objects exposed to script
● Wrapped with engine-specific glue code
● Wrappers generated from WebIDL
● Wrappers expose native objects and properties to JavaScript
● In Servo these are in components/script/dom
DOM
[Diagram: WebIDL files → Code Generator (Python) → Generated Glue Code (Rust), which connects JavaScript (myElement.style) to the DOM Code (Rust)]
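Generating wrappers from WebIDL can be sketched as a toy generator. Everything here is an assumption made for illustration: the two-attribute interface, the regular expression standing in for a real WebIDL parser, and the get_/set_ naming convention. It also emits Python so the result can be run directly, whereas an engine's generator emits code in the engine's own language.

```python
import re

IDL = """
interface Element {
  readonly attribute DOMString tagName;
  attribute DOMString id;
};
"""

# Matches only simple attribute declarations, not the full WebIDL grammar.
ATTRIBUTE = re.compile(r"(readonly\s+)?attribute\s+(\w+)\s+(\w+);")


def generate_wrapper(idl: str) -> str:
    """Emit Python source for a wrapper class that forwards to a native object."""
    name = re.search(r"interface\s+(\w+)", idl).group(1)
    lines = [f"class {name}Wrapper:",
             "    def __init__(self, native):",
             "        self._native = native"]
    for readonly, _idl_type, attr in ATTRIBUTE.findall(idl):
        lines += ["    @property",
                  f"    def {attr}(self):",
                  f"        return self._native.get_{attr}()"]
        if not readonly:    # only writable attributes get a setter
            lines += [f"    @{attr}.setter",
                      f"    def {attr}(self, value):",
                      f"        self._native.set_{attr}(value)"]
    return "\n".join(lines) + "\n"


class NativeElement:
    """Stand-in for the engine's own object behind the wrapper."""

    def __init__(self):
        self._id = ""

    def get_tagName(self):
        return "DIV"

    def get_id(self):
        return self._id

    def set_id(self, value):
        self._id = value


namespace = {}
exec(generate_wrapper(IDL), namespace)      # "compile" the generated glue code
element = namespace["ElementWrapper"](NativeElement())
element.id = "main"
print(element.tagName, element.id)
```

Because tagName is declared readonly, the generated wrapper has no setter for it and assigning to it raises AttributeError.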
CSS Parsing, Selector Matching, and Style
● Parsing
○ Takes CSS source and turns it into selectors and rules
○ Like <script>, <style> can also block page load
● Selector matching:
○ Given all of the CSS selectors like .class > li:first-child,
determine which DOM elements are targeted
○ Performance critical
● Styling
○ Calculate the style of all DOM elements, using cascading rules
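Selector matching for a selector like .class > li:first-child can be sketched on a toy tree. The Element class and the dictionary encoding of compound selectors are assumptions made for this example. Real engines parse selectors into their own structures and add heavy optimizations, but matching right to left, from the candidate element up through its ancestors, is a common strategy.

```python
class Element:
    def __init__(self, tag, classes=()):
        self.tag = tag
        self.classes = set(classes)
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child


def matches_compound(el, compound):
    """compound is a dict such as {"tag": "li", "first_child": True}."""
    if "tag" in compound and el.tag != compound["tag"]:
        return False
    if "class" in compound and compound["class"] not in el.classes:
        return False
    if compound.get("first_child"):
        if el.parent is not None and el.parent.children[0] is not el:
            return False
    return True


def matches_child_chain(el, compounds):
    """Match `a > b > c` right to left: el against c, its parent against b, ..."""
    for compound in reversed(compounds):
        if el is None or not matches_compound(el, compound):
            return False
        el = el.parent
    return True


def descendants(root):
    for child in root.children:
        yield child
        yield from descendants(child)


# .class > li:first-child
selector = [{"class": "class"}, {"tag": "li", "first_child": True}]

body = Element("body")
ul = body.append(Element("ul", ["class"]))
first = ul.append(Element("li"))
second = ul.append(Element("li"))
ol = body.append(Element("ol"))
other = ol.append(Element("li"))

targeted = [el for el in descendants(body) if matches_child_chain(el, selector)]
```

Only the first li under the ul is targeted: the second li fails :first-child, and the li under the ol fails the parent check.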
Stylo: A parallel style engine written in Rust.
Layout
● Basic idea is to traverse the DOM and create a render tree
● Modern approach
○ Create a box tree based on style of nodes
○ Perform fragmentation (lines, columns, pages)
○ Final output is a fragment tree
● Considerations
○ Incremental layout
○ Parallelism
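The box tree to fragment tree step can be illustrated with the simplest kind of fragmentation, breaking text into lines. The class names and the character-count width are assumptions for this sketch. A real engine measures shaped glyphs and follows the CSS line breaking rules.

```python
from dataclasses import dataclass


@dataclass
class TextBox:          # box tree node: one run of inline text
    text: str


@dataclass
class BlockBox:         # box tree node: a block containing inline text runs
    children: list


@dataclass
class LineFragment:     # fragment tree node: one line after fragmentation
    text: str
    width: int


def fragment_block(block: BlockBox, available_width: int) -> list:
    """Greedily break the block's words into lines no wider than available_width.

    Widths are measured in characters, a stand-in for real text measurement.
    A single word wider than available_width still gets a line of its own.
    """
    words = [w for child in block.children for w in child.text.split()]
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if current and len(candidate) > available_width:
            lines.append(LineFragment(current, len(current)))
            current = word
        else:
            current = candidate
    if current:
        lines.append(LineFragment(current, len(current)))
    return lines


block = BlockBox([TextBox("Lorem ipsum dolor"), TextBox("sit amet")])
fragments = fragment_block(block, available_width=11)
```

One box produced three fragments here, which is why the output of layout is a separate tree rather than an annotated box tree.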
Servo’s Layout
● components/layout_2020
● components/layout_2013 is legacy layout
● Top down traversal, fragments collected bottom up
● Parallelism
○ Using thread pools via rayon to execute things in parallel
○ Can switch between serial and parallel layout in same traversal
○ Can be configured at runtime
● Still no support for incremental layout
Painting
● Three phases (differs between browsers)
○ Display list collection: fragments into display list items
○ Rasterization: turns items into bitmaps
○ Compositing: different groups of content composited together
● Some browser engines use layers to group composited content
● This involves the GPU
○ GPUs are good at compositing, blending, and filtering
○ 3D CSS transforms, WebGL, WebGPU are already textures!
○ Arbitrary 2D vector rasterization is done on the GPU only conditionally
Servo + WebRender
● Fragment tree is converted into stacking context tree
○ Respects CSS 2 Appendix E painting order rules
○ Gathers elements together that have transforms, filters, etc
● Stacking context tree is converted into a display list
● WebRender
○ Takes display list and produces OpenGL calls
○ Only supports CSS (no SVG, no Canvas)
○ Handles grouping content based on filters automatically
○ Shared with Firefox (but is written in Rust)
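Converting a stacking context tree into a display list can be sketched as a back-to-front traversal. This is a heavily simplified reading of the CSS 2 Appendix E order (own background, negative z-index children, own content, then zero and positive z-index children). It leaves out floats, inline content and the other steps, and the StackingContext class is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StackingContext:
    name: str
    z_index: int = 0
    content: list = field(default_factory=list)     # in-flow items, in tree order
    children: list = field(default_factory=list)    # nested stacking contexts


def build_display_list(ctx: StackingContext) -> list:
    """Flatten a stacking context tree into back-to-front paint order."""
    # sorted() is stable, so contexts with equal z-index keep tree order.
    ordered = sorted(ctx.children, key=lambda child: child.z_index)
    items = [f"{ctx.name}: background"]
    for child in ordered:
        if child.z_index < 0:
            items += build_display_list(child)
    items += [f"{ctx.name}: {item}" for item in ctx.content]
    for child in ordered:
        if child.z_index >= 0:
            items += build_display_list(child)
    return items


root = StackingContext("root", content=["text"], children=[
    StackingContext("popup", 2, ["menu"]),
    StackingContext("shadow", -1, ["blur"]),
])
display_list = build_display_list(root)
```

The shadow context paints before the root's text even though it comes later in the tree, because its z-index is negative.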
Layout Flow
[Diagram: HTML → DOM; CSS → CSS Rules; DOM + CSS Rules → Box Tree → Fragment Tree → Stacking Context Tree → Display List]
How do we test such a complex piece of software?
Testing the Web Platform
● Web Platform Tests
○ Project shared between all browsers
○ 2 million tests
○ Tests are typically written as part of browser changes
○ Automatic bidirectional sync
● Browser-specific Tests
○ Cover functionality that is browser-specific or not exposed to Web Platform Tests
○ Legacy tests written before Web Platform Tests existed or before the test driver gained features
Testing the Web Platform
● Performance Tests
○ Benchmarks or regression tests for performance
○ Always browser-specific
● Fuzzing
○ Testing method where a “fuzzer” generates many source documents
○ Used to catch crashes (security vulnerabilities)
○ Usually managed outside the context of browsers
● Manual Tests
○ Last resort, used for behavior that can’t be tested in an automated way
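The fuzzing idea above fits in a few lines. This sketch generates sloppy HTML from a made-up token list and feeds it to Python's standard html.parser as a stand-in target. Browser fuzzers are far more sophisticated (grammar-aware or coverage-guided) and watch for crashes and hangs in the engine itself. The function names are hypothetical.

```python
import random
from html.parser import HTMLParser

TAGS = ["div", "span", "p", "table", "td", "iframe", "style", "script"]


def random_document(rng: random.Random, max_tokens: int = 40) -> str:
    """Generate a deliberately sloppy document: unbalanced tags, stray brackets."""
    tokens = []
    for _ in range(rng.randint(1, max_tokens)):
        tag = rng.choice(TAGS)
        tokens.append(rng.choice([
            f"<{tag}>", f"</{tag}>", f"<{tag} class='x'", "text", "<", "&amp", "<!--",
        ]))
    return "".join(tokens)


def fuzz(target, iterations: int, seed: int = 0) -> list:
    """Feed generated documents to target and collect the inputs that raise."""
    rng = random.Random(seed)       # seeded so that failures are reproducible
    failures = []
    for _ in range(iterations):
        document = random_document(rng)
        try:
            target(document)
        except Exception as error:  # a real fuzzer also detects crashes and hangs
            failures.append((document, error))
    return failures


def parse_with_stdlib(document: str) -> None:
    """Stand-in target: run the document through the standard library parser."""
    parser = HTMLParser()
    parser.feed(document)
    parser.close()


failures = fuzz(parse_with_stdlib, iterations=200)
print(f"{len(failures)} failing inputs out of 200")
```

Each failing input is kept next to its error so it can be reduced by hand and turned into a regression test.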
Now we know a bit about how a web engine works. How do we start working on it?
Fetch & Build Servo
$ git clone git@github.com:servo/servo.git
Follow build preparation instructions at:
https://github.com/servo/servo
$ ./mach build
Quick Tour of Servo
● python/ — Build and support scripts
● components/ — Individual Rust components
○ components/layout_2020/ — Layout
○ components/layout_thread_2020/ — Layout driver
○ components/compositor/ — Drawing and input handling
○ components/script/ — DOM and Document lifecycle
○ components/style/ — CSS Style
○ components/selectors/ - CSS Selectors
● ports/ — Embedding APIs and applications
○ ports/winit/ — Application run with ./mach run
○ ports/libsimpleservo/ — Simple Servo embedding API (Android)
Servo’s Build System
● Most Rust projects use cargo and Servo does too
● cargo doesn’t support all of the behavior Servo needs
● mach is a Python build tool that:
○ Sets up the environment properly for building (important on Windows)
○ Makes running tests more consistent
● mach can cause issues if used with rust-analyzer
Getting Started
● Requirements:
○ Interest in contributing
○ Communication skills
○ Curiosity
○ A little bit of grit
● Helpful:
○ Knowledge of git
○ Experience with HTML / CSS
QA or Frontend Developers
● The Web Platform Tests are a great place to start
● Can work on any major browser engine
● Checked in to the browser repository
● tests/wpt/tests in Servo
○ Bidirectional sync with upstream repository
● Two million tests, but
○ Many features not tested
○ Always room for improvement
● Break your browser for fun!
Python Developers
● Python is used extensively for making browser engines:
○ mach and most other browser build tools
○ DOM bindings generators
○ All support scripts and test servers for the WPT
● If you know Python, you can look for relevant issues on any of these repositories
Rust Developers
● Servo is a Rust project
● Good first contributions:
○ Upgrade a dependency
○ Fix a lint warning
○ Other types of technical debt
● More ambitious
○ Run WPT tests and find failures
○ Hunt down code which implements that feature
○ Write a fix
Example: Porting a legacy test
Example: Adding a new DOM API
General Advice
● Make clean commits
● Make pull requests that are small functional units
● Write full and clear commit messages that give the whole story
● Carefully read contribution guidelines
● Use the project-provided formatting tool and lint
● Run as many tests locally as you can
● Make sure your changes don’t fail on CI
Contact
● Try an experimental build of Servo
● ↝ Mastodon: https://floss.social/@servo
● ↝ GitHub: https://github.com/servo
● ↝ Chat: servo.zulipchat.com
● ↝ Email: join@servo.org
Questions?
THANK YOU

You Can Work on the Web Platform! (GOSIM 2023)
