
This is an example of a more general question about the relationship between the HTML5 parser and the DOM API. Some structures that HTML disallows are apparently not restricted by the DOM API, so you can create the "disallowed" HTML structure via the DOM.

E.g. according to the HTML5 spec, the p element has a content model of only "phrasing content". The "content model" is "A normative description of what content must be included as children and descendants of the element", and "phrasing content" is basically text and "intra-paragraph" markup like links and spans, NOT div elements.
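
To make the content-model rule concrete, here is a minimal sketch of a checker. The tag list is a small, incomplete sample of phrasing content picked for illustration, and `PHRASING_SAMPLE` and `allowedInParagraph` are hypothetical names, not part of any spec or browser API.

```javascript
// Illustrative subset only: the HTML spec's phrasing-content list is much longer.
// (An a element counts as phrasing content when it contains only phrasing content.)
const PHRASING_SAMPLE = new Set(["a", "abbr", "b", "code", "em", "i", "span", "strong"]);

// Returns true if a child element with this tag name would satisfy the
// p element's "phrasing content" content model, judged by the sample list above.
function allowedInParagraph(tagName) {
  return PHRASING_SAMPLE.has(tagName.toLowerCase());
}

console.log(allowedInParagraph("span")); // true
console.log(allowedInParagraph("DIV"));  // false
```

A check like this describes what a conformance checker would flag; it does not describe what the DOM API enforces.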

Indeed, if I make an HTML document or cause an HTML snippet to be parsed like this, the div gets forcibly "unnested":

var containerEl = document.createElement('body');
containerEl.innerHTML = "<p><div></div></p>";
console.log(containerEl.innerHTML); // -> "<p></p><div></div><p></p>"

Seemingly during parsing, the "original" paragraph gets split into two, with the div between.

However this code lets me insert a div into a p without issue:

var pEl = document.createElement('p'),
    divEl = document.createElement('div');
pEl.appendChild(divEl);
console.log(pEl.outerHTML); // -> "<p><div></div></p>"

Now the DOM Level 3 spec says that the .appendChild method can raise a DOMException if the wrong "type" of node is inserted:

HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the newChild node

I suspect that "type" here refers to the node type rather than the element name, e.g. you can't append an Element node as a child of a Text node.
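
That suspicion can be checked with a small sketch. `tryAppend` is a hypothetical helper, and the demonstration at the bottom assumes a browser environment where `document` exists. In the DOM Standard, a Text node cannot be a parent at all, so the first call is expected to report the error, while the p/div case is not rejected.

```javascript
// Attempts parent.appendChild(child) and returns the name of the thrown
// exception, or null if the insertion succeeded.
function tryAppend(parent, child) {
  try {
    parent.appendChild(child);
    return null;
  } catch (err) {
    return err.name;
  }
}

// Browser-only demonstration; skipped where no DOM is available.
if (typeof document !== "undefined") {
  const textNode = document.createTextNode("hello");
  const divEl = document.createElement("div");
  const pEl = document.createElement("p");

  console.log(tryAppend(textNode, divEl)); // "HierarchyRequestError"
  console.log(tryAppend(pEl, divEl));      // null
}
```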

Is there anything in the standard that clarifies the behavior here, acknowledging the discrepancy? What are the consequences of making a DOM hierarchy via JavaScript that's not allowed when parsing HTML?

  • Related: stackoverflow.com/questions/33613970/…
    – Ry-
    Commented Jul 28, 2017 at 21:34
  • Is the behaviour consistent across all browsers? I would take a pragmatic view and not construct unusual/incorrect element structure.
    – Andy G
    Commented Jul 28, 2017 at 21:39
  • @AndyG - Yes it's completely consistent across all modern browsers, and the HTML5 standard and the DOM Standard require them to work that way.
    – Alohci
    Commented Jul 28, 2017 at 22:19

1 Answer


Is there anything in the standard that clarifies the behavior here, acknowledging the discrepancy?

Yes, the HTML5 standard notes that the DOM, the HTML syntax, and the XML syntax are not equivalent (DOM != HTML != XHTML):

1.8 HTML vs XML syntax

The DOM, the HTML syntax, and the XML syntax cannot all represent the same content. For example, namespaces cannot be represented using the HTML syntax, but they are supported in the DOM and in the XML syntax. Similarly, documents that use the noscript feature can be represented using the HTML syntax, but cannot be represented with the DOM or in the XML syntax. Comments that contain the string "-->" can only be represented in the DOM, not in the HTML and XML syntaxes.


What are the consequences of making a DOM hierarchy via JavaScript that's not allowed when parsing HTML?

It depends on what you're doing. It could lead to inconsistent behavior across browsers, to surprising styling, or to content not being rendered at all. E.g. a <p> inserted into a <select> will simply not be rendered.

Direct node manipulation APIs (e.g. appendChild) behave differently from the fragment parsing algorithms (e.g. insertAdjacentHTML and innerHTML). The latter essentially run the text through the document parser, which applies HTML-specific adjustments while building the DOM tree from text. The node manipulation APIs are more generic and not aware of such adjustments.
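
One practical consequence is that such a tree does not survive being serialized and parsed again. The following is a minimal sketch, assuming a browser environment; `survivesRoundTrip` is a hypothetical helper, not a DOM API.

```javascript
// Returns true if the element's markup is unchanged after being serialized
// and fed back through the HTML fragment parser.
function survivesRoundTrip(el) {
  const serialized = el.outerHTML;
  const container = el.ownerDocument.createElement("div");
  container.innerHTML = serialized; // runs the HTML fragment parser
  return container.innerHTML === serialized;
}

// Browser-only demonstration; skipped where no DOM is available.
if (typeof document !== "undefined") {
  const badEl = document.createElement("p");
  badEl.appendChild(document.createElement("div"));
  console.log(survivesRoundTrip(badEl)); // false: the parser splits the paragraph

  const okEl = document.createElement("p");
  okEl.appendChild(document.createElement("span"));
  console.log(survivesRoundTrip(okEl)); // true
}
```

This matters for any code that moves markup around as a string, for example copying one element's outerHTML into another element's innerHTML.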
