I’m Ham and I’m a developer on the Teams team here at Stack Overflow. Over the past few months, I’ve been heads down working on the way we turn Markdown into HTML when writing and editing posts across the network. I’d love to share what I’ve come up with.

In a nutshell: We're planning to use CommonMark for all posts across the network moving forward. To do so, we're switching to CommonMark-compliant Markdown renderers on the client and the server side. We have to make sure that all existing posts work with the new renderers, so we will run a big migration across the network that converts existing posts to the new CommonMark format. Writing, editing and reading posts should look and feel mostly the same after the change.

As of June 20, 2020, all sites are on CommonMark now. For individual sites, see the migration schedule here.


We’re using Markdown throughout the Stack Exchange network. Markdown was one of the early technology bets when Jeff and Joel started out building Stack Overflow. If you write a question, an answer or a comment anywhere on the Stack Exchange network, you’re going to write it in Markdown.

Over the years, Markdown has become a common way of writing content in online communities. It has become a wild success and even got a formalized specification with CommonMark.

Stack Exchange’s way of handling user-created Markdown today is largely the same as it was when we started. We’re using our own, home-grown Markdown parsers and renderers on the client and server side. Both of these implementations have proven to be a solid foundation and received a lot of tweaks over the years.

However, they come with their own quirks. Having been created before there was a CommonMark spec, they show some non-spec-compliant behavior. They use regular expressions to transform Markdown into HTML (I’ll leave it to your imagination how much sweat and tears this has cost us over the years), which is perfectly doable but makes maintaining our Markdown parsers and adding new features to them extra hard.

The idea

We think it’s time to move forward. A few years back, you asked if we were ever going to adopt CommonMark on the Stack Exchange network. balpha ran the numbers, and while he found that it wasn’t impossible, it didn’t seem to be easy peasy, either. With some of the past and upcoming changes, we think that now’s a great time to tackle this challenge and to migrate all network posts over to CommonMark. This includes:

  • Changing the Markdown renderer on the client side
  • Changing the Markdown renderer on the server side
  • Automatically editing and re-rendering all posts across the network that are not CommonMark-compliant

To give you a better feeling for the changes under the hood: When you write a post on the Stack Exchange network, you write it in Markdown. On the client side, you see a preview of your post as you’re writing it. This preview is created by our client-side Markdown renderer. It takes the Markdown you write, transforms it into HTML and shows you a preview of what your post will look like.

Once you save your post, we send your Markdown over to our servers where the same Markdown-to-HTML conversion takes place, again (you can’t trust user input, so we don’t blindly accept the HTML generated on the client side).

Our plan

We will migrate sites across the network to CommonMark site by site over the next couple of weeks. We plan to start with Meta Stack Exchange and Meta Stack Overflow on Wednesday, June 3rd 2020.

I’ve prepared a feature that will swap out our current, home-grown Markdown renderers with well-tested open-source implementations that adhere to the CommonMark specification. For the curious: this means we’re replacing PageDown with markdown-it on the client side and MarkdownSharp with markdig on the server side.

Once we enable that feature, new and edited posts will automatically be rendered with those new renderers. Most likely, you won’t even notice a difference when looking at posts.

With the new renderers in place, we’re going to move all existing posts across the network over to CommonMark. For the vast majority of the posts across the network (80% and up), this means nothing will change. Most of the posts on our network have been written in a way that is already completely compliant with the CommonMark specification, yay! If we convert this Markdown to HTML using a new renderer, the results will be exactly the same.

Then there are those posts that are written in a Markdown flavor that was cool for our current renderers but isn’t what CommonMark would expect. Balpha’s analysis gives you more details on that. We’re talking about ##headlines without spaces after the hashes and other minor oversights. For these posts, we’ve built a tool that automatically fixes these well-known issues by changing a post’s Markdown source directly and re-rendering the HTML of the post in question. When we change a post’s Markdown automatically, this will end up looking like a regular edit, but we’re making sure that this won’t bump posts to the top.
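To illustrate what such an automatic fix can look like, here is a minimal Python sketch for the missing-space headline case. It is not the actual migration tool: the function name `fix_headings`, the regular expressions and the simplified code-fence detection are assumptions made for this example, and indented code blocks are skipped only because the pattern requires the `#` to sit at the very start of the line.

```python
import re

# Hypothetical rule: 1-6 leading '#' characters directly followed by text.
_HEADING_WITHOUT_SPACE = re.compile(r"^(#{1,6})(?=[^#\s])")
# Simplified fence detection: any line opening with ``` or ~~~ toggles the state.
_FENCE = re.compile(r"^ {0,3}(```|~~~)")


def fix_headings(markdown: str) -> str:
    """Insert the space CommonMark requires after the leading '#' characters."""
    fixed = []
    in_fence = False
    for line in markdown.split("\n"):
        if _FENCE.match(line):
            in_fence = not in_fence  # entering or leaving a fenced code block
        elif not in_fence:
            line = _HEADING_WITHOUT_SPACE.sub(r"\1 ", line)
        fixed.append(line)
    return "\n".join(fixed)
```

With this sketch, `fix_headings("##headline")` returns `## headline`, a line such as `# fine` is left untouched, and lines inside a fenced code block are not modified.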

So now we’ve got about 80% of our posts that are already good to go. With the auto-fixing utility we estimate that we’re going to land at over 96% of all network posts being rendered completely identically after migrating to CommonMark and using the new renderers. This leaves us with a few percent of posts that end up looking different when rendered with the new CommonMark renderers.

What you can expect

We avoid breaking existing posts by erring on the side of safety. If a post looks different using the new renderer (even if it’s just one whitespace character off), we won’t automatically re-render it; we’ll put it up for investigation first. This way we can be sure that all changes are safe.

I’ve played around with our data to get a feeling for the posts that will be rendered slightly differently after using the new renderers. I found out that the differences fall into three buckets:

  1. False positives: the HTML markup changed slightly but doesn’t change semantics or presentation of the post
  2. Improvements: things where the CommonMark specification fixes some oversights in our current Markdown flavor
  3. Actual issues: things that we didn’t anticipate and need to fix

The "actual issues" category should be a tiny fraction but I won’t naively assume that they won’t happen. There will be some changes caused by the new Markdown renderer that we need to investigate because they will cause posts to look different than before in one way or another. We can’t foresee all edge cases that this change will introduce so we will surface all posts that look different when rendered with the new Markdown renderer, review them and if necessary fix them.
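The "render twice and compare" check can be sketched in a few lines of Python. This is only an illustration of the idea (render with both engines, scrub both HTML strings with regular expressions, compare the results), not the code used for the migration. The names `scrub` and `needs_review` and the single scrubbing rule are assumptions; the real rules for what counts as a harmless markup difference are not spelled out in this post.

```python
import re


def scrub(html: str) -> str:
    """Remove differences assumed to be cosmetic (hypothetical rule set)."""
    html = re.sub(r">\s+<", "><", html)  # whitespace between adjacent tags
    return html.strip()                  # leading and trailing whitespace


def needs_review(old_html: str, new_html: str) -> bool:
    """True if the two renderings still differ after scrubbing."""
    return scrub(old_html) != scrub(new_html)
```

A post whose two renderings only differ by a newline between tags would pass (`needs_review` returns `False`), while any difference inside the text itself would surface the post for investigation.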

Ultimately, we hope to make this transition as smooth and frictionless as possible. We don’t want to break (and manually fix!) thousands of posts. We don’t want to mess with your writing experience. At the same time, I know that we won’t get this feature perfect from the get-go, so I need to ask for your patience and understanding.

As you write new posts after we’ve made the switch to the new CommonMark renderers, you will have the exact same writing experience as before. The preview will show you what your post will look like and once you save your post, it should appear just as you saw it in the preview. If you notice any differences between preview and saved post, please let us know!

Things might get funky when you're editing a post that renders differently with the new CommonMark renderer. Again, if we detected during the migration that a post would look different when rendered with the new CommonMark renderer, we wouldn't save a new version of this post as part of the migration. This way, all posts continue to look the same when being viewed. However, once someone comes in and edits it, it will be rendered using the new CommonMark renderer and this might cause the post to look slightly different than what we had before. This will only be a small fraction of all of our posts, and of that small fraction a smaller fraction will actually be edited moving forward. However, it's important to keep in mind that when editing old posts, there's a slight chance that you'll run into differences between our old and our new Markdown renderers.


Frequently Asked Questions

When is this going to happen?

The new CommonMark renderers are being merged into master within the next few days. They’re hidden behind a feature flag, so they won’t do any harm until we flip the switch.

We will migrate sites across the network site-by-site over the course of the next couple of weeks. We will start with Meta Stack Exchange and Meta Stack Overflow on Wednesday, June 3rd, 2020 (assuming everything goes well and we don't discover a major blocker before then). Since we can't exactly predict what kind of dragons we'll encounter along the way, plans might change slightly. I'll post a plan for sites and their switchover dates as an answer to this question and will keep it updated as we go.

Every site is different and we need to learn as we go. Most sites can be migrated within a few hours. For our biggest network sites, changing all posts to CommonMark will probably take up to 4 days. Keep an eye on the schedule I'll post to see how we're doing.

Why are we migrating to CommonMark?

In the past, changes to our Markdown renderers have been rather risky and high-effort. We needed to carefully evaluate if a change breaks anything for the millions of existing posts we have in place. By sticking to a well-defined specification like CommonMark, we can make sure that implementations that stick to this specification will work for us. If the specification gets extended, adopting changes will be easy and safe.

Another reason is that this reduces some of the maintenance burdens of our development teams. Instead of maintaining two distinct Markdown renderers, we can now pick something off the shelf and use that instead. With markdig and markdown-it we’ve found two reputable libraries that are beating our own implementations when it comes to performance and functionality. Both are great pieces of software that we're more than happy to use in our product.

Are there some changes to the way I can write Markdown in the future?

Yes, there will be a few changes to the set of supported Markdown on Stack Exchange. For the vast majority of your writing, you won’t see a difference at all. We’re doing our best to continue to allow most of the syntax you can use right now. We’re adopting the CommonMark standard, so everything that’s valid CommonMark will work on Stack Exchange moving forward (here’s a short cheat sheet for the curious).

At the same time, we want to take this opportunity to remove some quirks. Some features in Stack Exchange’s current Markdown flavor were built at a time when there was no such thing as a CommonMark standard and no standardized way of doing things. Now that we’re adopting CommonMark, we want to replace some of those homegrown features with standardized notation, a notation that you know from other places all around the web as well.

The most noticeable changes will be around lists, nested lists, headlines and blockquotes.

Lists: When creating nested lists, you’ll need to indent your nested list items or paragraphs with the right amount of spaces. While one space was enough before, you’ll need to add a few more now, depending on your type of list.

To make a paragraph part of a list item, it used to be enough to add one space in front of the paragraph:

* this is a list item

 that goes on here

With CommonMark, the paragraph has to line up with the text of the parent, so we need a few more spaces here:

* this is a list item

  that goes on here
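The "line up with the text of the parent" rule boils down to a small calculation: the width of the list marker plus the spaces that follow it. Here is a rough Python sketch. The function name `continuation_indent` is made up for this example, and the sketch only covers list items whose marker is followed by one to four spaces and some text.

```python
import re

# Optional indent, a bullet ("-", "+", "*") or ordered marker ("1." / "1)"),
# then one to four spaces before the item's text starts.
_LIST_MARKER = re.compile(r"^ {0,3}(?:[-+*]|\d{1,9}[.)]) {1,4}(?=\S)")


def continuation_indent(list_item_line: str) -> int:
    """Return how many spaces a follow-up paragraph needs to stay in the item."""
    match = _LIST_MARKER.match(list_item_line)
    if match is None:
        raise ValueError("not a list item line covered by this sketch")
    return len(match.group(0))
```

For `* this is a list item` the result is 2, which matches the two spaces in the example above. For an ordered item such as `10. tenth item` it is 4, which is why the required amount depends on your type of list.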

Headlines: Moving forward, you have to add a space after the leading # characters.

#this was cool before
# this is what's cool now

Blockquotes: Previously, empty lines between two lines marked as blockquotes would make one big blockquote. Moving forward, you'd get two distinct blockquotes this way, unless you start the empty line with a > character, too:

> old blockquotes  

> with multiple lines
> new blockquotes
> 
> with multiple lines
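If you want to keep an old-style blockquote in one piece, the fix is mechanical: put a > on the empty line that sits between two quoted lines. A minimal Python sketch of that idea follows. The function name `join_blockquotes` is hypothetical, the sketch only handles a single empty line between two quoted lines, and it does not look at code blocks.

```python
def join_blockquotes(markdown: str) -> str:
    """Turn an empty line between two '>' lines into a '>' line."""
    lines = markdown.split("\n")
    for i in range(1, len(lines) - 1):
        is_blank = lines[i].strip() == ""
        between_quotes = lines[i - 1].startswith(">") and lines[i + 1].startswith(">")
        if is_blank and between_quotes:
            lines[i] = ">"  # keeps both parts inside one blockquote
    return "\n".join(lines)
```

An empty line that is followed by regular text is left alone, so a blockquote followed by a normal paragraph stays as it is.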

Indented code block highlighting: Our biggest breaking change will be around indented code blocks and the possibility to declare the language to be used for syntax highlighting.

In a nutshell: If you want to declare the language for syntax highlighting in your code block, use the code-fence notation and not indented code blocks. You can still use indented code blocks, but declaring the preferred language explicitly for them is no longer supported moving forward.

Until now, you could do this to declare the language for an indented code block:

<!-- language: python -->

    def hello():
        print("Hello, World");

Moving forward, this style is considered deprecated. Ever since we introduced code fences, you have been able to explicitly declare the language of a code block using the code fence notation:

``` python
def hello():
    print("Hello, World");
```

This is the notation the CommonMark standard proposes, and it is what other websites are doing, too. We know that you might have gotten used to using the old syntax featuring a <!-- language: lang --> comment. As we’re adopting new Markdown parsers, we want to avoid patching quirky behavior into those parsers when there’s an official, standards-compliant way of achieving the same goal that we can adopt with no extra effort. The old style will continue to work for the time being, but it is subject to removal in the future, at which point the comment will no longer be recognized in posts that use it.
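If you have old drafts lying around that use the language comment, converting them by hand is straightforward, and it can also be scripted. The following Python sketch shows one possible way. It is an assumption-laden illustration rather than anything we run: the function name `to_code_fence` is made up, only the plain `<!-- language: lang -->` form is recognized, and code that itself contains a line of three backticks is not handled.

```python
import re

_LANG_COMMENT = re.compile(r"^<!-- language: (\S+) -->$")


def to_code_fence(markdown: str) -> str:
    """Rewrite a language comment plus indented code block as a code fence."""
    lines = markdown.split("\n")
    out = []
    i = 0
    while i < len(lines):
        hint = _LANG_COMMENT.match(lines[i].strip())
        if hint is None:
            out.append(lines[i])
            i += 1
            continue
        j = i + 1
        while j < len(lines) and lines[j].strip() == "":
            j += 1  # skip empty lines between the comment and the code
        code = []
        while j < len(lines) and (lines[j].startswith("    ") or lines[j].strip() == ""):
            code.append(lines[j][4:])  # drop the four-space indentation
            j += 1
        trailing = 0
        while code and code[-1].strip() == "":
            code.pop()  # empty lines at the end belong after the fence
            trailing += 1
        if not code:
            out.append(lines[i])  # comment without a code block: keep as is
            i += 1
            continue
        out.append("```" + hint.group(1))
        out.extend(code)
        out.append("```")
        out.extend([""] * trailing)
        i = j
    return "\n".join(out)
```

The sketch writes the language directly after the opening backticks; a space between the two, as in the example above, works just as well.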

Note that setting the syntax highlighting language based on the tags you’ve associated with your post will continue to work. Here’s a full overview of the current behavior of our syntax highlighting if you need a refresh. We’re going to update that post as we move forward.

What happens to SE-specific syntax elements?

On the Stack Exchange network we support some syntax elements that are not part of the CommonMark standard. Things like spoilers, MathJax, circuit diagrams, stack snippets, etc. are used on several network sites. We're going to continue to support all of those custom syntax elements even if they're not part of the official CommonMark specification.

Will this finally enable table support?

Maybe! Support for tables has been discussed intensely in the past. There are many creative workarounds out there but never any official support for rendering tables. If other sites can pull this off, why can’t we?

One of the main reasons, the fact that our Markdown parsers and renderers didn’t support tables, is no longer valid now that we’ve switched to markdig and markdown-it. Both support parsing and rendering tables out of the box. Still, introducing table support is a change we don’t want to blindly stuff into this big migration.

Let’s get everything to work nicely with the official CommonMark specification first – and just to be clear, tables are not part of that spec. This change is massive; we need to see how this plays out and make sure this doesn’t introduce more than a few acceptable cosmetic issues across all our communities.

Once the dust has settled and we’re all comfortable with the new Markdown renderers under the hood, we can re-evaluate if the time is right to bring table support back to the (drumroll) table!

68
  • 52
    Does this also mean we're getting header IDs? Commented Jun 1, 2020 at 11:56
  • 18
    Spoiler syntax is going to remain the same - although it's not part of the CommonMark specification.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 11:59
  • 3
    @ZoeTheLockdownPrincess "You can still use indented code blocks but can’t declare the preferred language explicitly moving forward." Commented Jun 1, 2020 at 12:00
  • 39
    This migration won't enable header IDs. This migration is already a big thing so we don't want to conflate adding new features with running the migration itself. Both markdown-it and markdig support header IDs via plugins, so implementing this feature will now be easier than before - but it remains a different discussion.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 12:05
  • 4
    @Laurel the apps likely don't render the HTML themselves, but rather it's done in the API level. If that's the case, apps won't need any change. But if the render is done in the app itself.... this is essentially the final straw and they'll have to shut them down. Waiting for official response. Commented Jun 1, 2020 at 12:34
  • 11
    I run the markdown through the old renderer, run the markdown through the new renderer, scrub both HTML versions with good ol' regular expressions and compare the two HTML strings. It's not sophisticated but gets the job done just fine and is fast enough to handle millions of posts in my lifetime. I hope to get a blog post out soon where I can share more insights.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 18:17
  • 11
    What about RTL direction, currently unsupported by CommonMark?
    – Zev Spitz
    Commented Jun 1, 2020 at 18:39
  • 13
    Oh boy, here we go again... Commented Jun 1, 2020 at 18:55
  • 9
    @Mast Huh? If the displayed text of a post would get altered by the new Markdown engines, then its Markdown will not be updated, even if the difference is a single whitespace. It will continue to be displayed via its current HTML, which was created by the old Markdown engine. So its appearance will be unaltered. However, when someone attempts to edit such a post they will have to comply with the new Markdown rules. This may cause problems. Eg, someone edits a post to fix some minor thing but then discovers that they need to make major changes so that the post renders correctly.
    – PM 2Ring
    Commented Jun 2, 2020 at 10:44
  • 77
    Tables ASAP, PLEASE!
    – Ian Kemp
    Commented Jun 2, 2020 at 11:49
  • 4
    What about placeholders of the form [...], such as tag ([tag:discussion] for discussion) or site reference ([scifi.se] for Science Fiction & Fantasy)? Are those just modelled as links whose definition is invisible at the time of writing? Commented Jun 2, 2020 at 11:52
  • 32
    Because we can't reasonably support two different active markdown renderers without tripping eventually. There are good reasons to move forward outlined in the post - compatibility, user experience, ease of maintenance, simpler future feature development being some of them.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 15:31
  • 4
    This is good, in my opinion. It is absolutely true that Markdown is an incomplete spec, and some solid flavor needs to be used instead. My personal favorite happens to be kramdown, but it's not a good choice to substitute for standard Markdown in most contexts. CommonMark sounds like a good selection.
    – matt
    Commented Jun 2, 2020 at 21:05
  • 4
    @Sean You can nest blockquotes pretty much the same way you nested them before. Instead of using > you'd start each line of a nested blockquote with >> (or more characters if you want to nest even deeper).
    – Ham Vocke StaffMod
    Commented Jun 4, 2020 at 6:35
  • 11
    Yay! Fantastic! I've updated the commonmark.org website to reflect this change! Commented Jun 28, 2020 at 17:29

55 Answers

132

For these posts, we’ve built a tool that automatically fixes these well-known issues by changing a post’s Markdown source directly and re-rendering the HTML of the post in question. When we change a post’s Markdown automatically, this will end up looking like a regular edit but we’re making sure that this won’t bump posts to the top.

What will this do for posts which are currently licensed under CC BY-SA 3.0 (or 2.5)? I see that previous edits of a similar kind (e.g. replacing HTTP links with HTTPS ones) trigger a license notification in the timeline (example). I don't think edits like this should, especially not if the rendered content doesn't change.

@Yaakov says he's working on a fix, which is good news, but that fix needs to be applied retroactively, as can be seen e.g. here:

enter image description here

8
  • 5
    You could read that version history as saying the edits made by Community are licensed under CC BY-SA 3.0... But the text on top, "Current License: CC BY-SA 3.0" is obviously wrong. And worse, if the last edit changes the license shown for the whole post, it'll change not only for automated modifications, but even if some other human user than the author makes some minor edit. It's not that rare for a user to write a substantial answer, which is then copy-edited/clarified/expanded by another. Usually close to the original posting, sure, but sometimes much later.
    – ilkkachu
    Commented Jun 1, 2020 at 14:11
  • 14
    This is key. I think it's already problematic that all edits, including minor edits and the automatic HTTP->HTTPS edits changed the license. This would affect even more posts than those, I would suspect. Handling the license needs to be thought through very carefully. Commented Jun 1, 2020 at 14:52
  • 102
    These edits will not cause a license change (I am working on a bunch of follow-up items for licensing, this will be included) Commented Jun 1, 2020 at 15:04
  • 1
    The edits cannot really cause a license change because they may be inserted in the timeline, and you don't want to jump to a higher CC version and then down again. It could become tricky if someone inserted a space after a leading # by him/herself and changed other things in an edit. Not sure what the community bot wants to do in that case to enable earlier revisions to be in CommonMark. Commented Jun 1, 2020 at 20:57
  • 3
    From what I understand, the fix encompasses things like the change to HTTPS, so it's going to be retroactive by default since most of the changes are years-old.
    – Catija
    Commented Jun 3, 2020 at 14:09
  • 12
    For the initial test sites, these edits will show up as being licensed. A fix is being worked on right now, and will be retroactive when it goes up. See here for more information. Commented Jun 3, 2020 at 14:38
  • 2
    @YaakovEllis Retroactive fixes are not enough for license changes. This essentially means you can't trust the license displayed on Stack Exchange sites since you can't assume that the correct license is shown. Personally, I don't care because I don't reuse content in a commercial setting but others do that ...
    – Roland
    Commented Jun 11, 2020 at 14:37
  • This appears to be fixed. Commented Jun 16, 2020 at 7:57
107
+500

Migration Schedule

Here's an overview of the sites we're going to migrate, when we're planning to run the migration and the current status of that site. I'll keep it updated as we go. We might run into some issues along the way, so please understand that predicting an exact timeline is hard and we're going to adapt as we go.

Current Status

All sites have been migrated. CommonMark is used in our editor on all sites now. Thanks for looking out for and letting us know about issues you've found. This was a fun ride.

Done

CommonMark is active, posts have been migrated for these sites

  1. 2020-06-03: Meta Stack Exchange ✔
  2. 2020-06-03: Meta Stack Overflow ✔
  3. 2020-06-04: Physics (Meta + Main) initial run passed, another pass on 2020-06-11
  4. 2020-06-04: Movies & TV (Meta + Main) ✔
  5. 2020-06-10: TeX - LaTeX Stack Exchange ✔
  6. 2020-06-10: Blender Stack Exchange ✔
  7. 2020-06-10: Code Review Stack Exchange ✔
  8. 2020-06-10: Android Enthusiasts Stack Exchange ✔
  9. 2020-06-10: Chemistry Stack Exchange ✔
  10. 2020-06-10: Academia Stack Exchange ✔
  11. 2020-06-11: Server Fault ✔
  12. 2020-06-11: Stack Overflow en español ✔
  13. 2020-06-11: Unix & Linux Stack Exchange ✔
  14. 2020-06-11: Cross Validated ✔
  15. 2020-06-11: Stack Overflow em Português ✔
  16. 2020-06-11: Electrical Engineering Stack Exchange ✔
  17. 2020-06-11: Geographic Information Systems Stack Exchange ✔
  18. 2020-06-12: Mathematics ✔
  19. 2020-06-12: Stack Overflow на русском ✔
  20. 2020-06-12: Super User ✔
  21. 2020-06-12: Ask Ubuntu ✔
  22. 2020-06-15: MathOverflow ✔
  23. 2020-06-15: English Language & Usage Stack Exchange ✔
  24. 2020-06-15: Ask Different ✔
  25. 2020-06-15: Salesforce Stack Exchange ✔
  26. 2020-06-15: WordPress Development Stack Exchange ✔
  27. 2020-06-15: Magento Stack Exchange ✔
  28. 2020-06-15: SharePoint Stack Exchange ✔
  29. 2020-06-15: Arqade ✔
  30. 2020-06-15: Database Administrators Stack Exchange ✔
  31. 2020-06-15: Drupal Answers ✔
  32. 2020-06-16: English Language Learners Stack Exchange ✔
  33. 2020-06-16: Mathematica Stack Exchange ✔
  34. 2020-06-16: Science Fiction & Fantasy Stack Exchange ✔
  35. 2020-06-16: Information Security Stack Exchange ✔
  36. 2020-06-16: Software Engineering Stack Exchange ✔
  37. 2020-06-16: Home Improvement Stack Exchange ✔
  38. 2020-06-16: Game Development Stack Exchange ✔
  39. 2020-06-16: Travel Stack Exchange ✔
  40. 2020-06-16: Role-playing Games Stack Exchange ✔
  41. 2020-06-16: Computer Science Stack Exchange ✔
  42. 2020-06-16: Webmasters Stack Exchange ✔
  43. 2020-06-16: Mi Yodeya ✔
  44. 2020-06-16: Graphic Design Stack Exchange ✔
  45. 2020-06-16: Web Applications Stack Exchange ✔
  46. 2020-06-16: Raspberry Pi Stack Exchange ✔
  47. 2020-06-16: Personal Finance & Money Stack Exchange ✔
  48. 2020-06-16: User Experience Stack Exchange ✔
  49. 2020-06-16: Ethereum Stack Exchange ✔
  50. 2020-06-16: The Workplace Stack Exchange ✔
  51. 2020-06-16: Worldbuilding Stack Exchange ✔
  52. 2020-06-16: Data Science Stack Exchange ✔
  53. 2020-06-16: Biology Stack Exchange ✔
  54. 2020-06-16: Bitcoin Stack Exchange ✔
  55. 2020-06-16: Photography Stack Exchange ✔
  56. 2020-06-16: Seasoned Advice ✔
  57. 2020-06-17: スタック・オーバーフロー ✔
  58. 2020-06-17: Motor Vehicle Maintenance & Repair Stack Exchange ✔
  59. 2020-06-17: Cryptography Stack Exchange ✔
  60. 2020-06-17: Japanese Language Stack Exchange ✔
  61. 2020-06-17: Software Recommendations Stack Exchange ✔
  62. 2020-06-17: Arduino Stack Exchange ✔
  63. 2020-06-17: Puzzling Stack Exchange ✔
  64. 2020-06-17: Signal Processing Stack Exchange ✔
  65. 2020-06-17: Music: Practice & Theory Stack Exchange ✔
  66. 2020-06-17: Emacs Stack Exchange ✔
  67. 2020-06-17: Aviation Stack Exchange ✔
  68. 2020-06-17: Русский язык ✔
  69. 2020-06-17: Law Stack Exchange ✔
  70. 2020-06-17: Quantitative Finance Stack Exchange ✔
  71. 2020-06-17: Bicycles Stack Exchange ✔
  72. 2020-06-17: Philosophy Stack Exchange ✔
  73. 2020-06-17: Gardening & Landscaping Stack Exchange ✔
  74. 2020-06-17: Network Engineering Stack Exchange ✔
  75. 2020-06-17: German Language Stack Exchange ✔
  76. 2020-06-17: Space Exploration Stack Exchange ✔
  77. 2020-06-17: ExpressionEngine® Answers ✔
  78. 2020-06-17: Craft CMS Stack Exchange ✔
  79. 2020-06-17: Christianity Stack Exchange ✔
  80. 2020-06-17: Hinduism Stack Exchange ✔
  81. 2020-06-17: CiviCRM Stack Exchange ✔
  82. 2020-06-17: Board & Card Games Stack Exchange ✔
  83. 2020-06-17: History Stack Exchange ✔
  84. 2020-06-17: Code Golf Stack Exchange ✔
  85. 2020-06-17: Anime & Manga Stack Exchange ✔
  86. 2020-06-17: Islam Stack Exchange ✔
  87. 2020-06-17: Politics Stack Exchange ✔
  88. 2020-06-17: Theoretical Computer Science Stack Exchange ✔
  89. 2020-06-17: French Language Stack Exchange ✔
  90. 2020-06-17: Software Quality Assurance & Testing Stack Exchange ✔
  91. 2020-06-17: Economics Stack Exchange ✔
  92. 2020-06-17: Skeptics Stack Exchange ✔
  93. 2020-06-17: Writing Stack Exchange ✔
  94. 2020-06-17: Engineering Stack Exchange ✔
  95. 2020-06-17: Sound Design Stack Exchange ✔
  96. 2020-06-17: Vi and Vim Stack Exchange ✔
  97. 2020-06-17: Sitecore Stack Exchange ✔
  98. 2020-06-17: Astronomy Stack Exchange ✔
  99. 2020-06-17: Computational Science Stack Exchange ✔
  100. 2020-06-17: Physical Fitness Stack Exchange ✔
  101. 2020-06-17: Linguistics Stack Exchange ✔
  102. 2020-06-17: Chinese Language Stack Exchange ✔
  103. 2020-06-17: Biblical Hermeneutics Stack Exchange ✔
  104. 2020-06-17: elementary OS Stack Exchange ✔
  105. 2020-06-17: Video Production Stack Exchange ✔
  106. 2020-06-17: Spanish Language Stack Exchange ✔
  107. 2020-06-17: Reverse Engineering Stack Exchange ✔
  108. 2020-06-17: Tridion Stack Exchange ✔
  109. 2020-06-17: Psychology & Neuroscience Stack Exchange ✔
  110. 2020-06-17: Buddhism Stack Exchange ✔
  111. 2020-06-17: Artificial Intelligence Stack Exchange ✔
  112. 2020-06-17: Pets Stack Exchange ✔
  113. 2020-06-17: Medical Sciences Stack Exchange ✔
  114. 2020-06-17: Joomla Stack Exchange ✔
  115. 2020-06-17: Parenting Stack Exchange ✔
  116. 2020-06-17: Expatriates Stack Exchange ✔
  117. 2020-06-17: Chess Stack Exchange ✔
  118. 2020-06-18: Homebrewing Stack Exchange ✔
  119. 2020-06-18: Project Management Stack Exchange ✔
  120. 2020-06-18: The Great Outdoors Stack Exchange ✔
  121. 2020-06-18: Robotics Stack Exchange ✔
  122. 2020-06-18: Open Data Stack Exchange ✔
  123. 2020-06-18: Tor Stack Exchange ✔
  124. 2020-06-18: Earth Science Stack Exchange ✔
  125. 2020-06-18: Sports Stack Exchange ✔
  126. 2020-06-18: Russian Language Stack Exchange ✔
  127. 2020-06-18: Ask Patents ✔
  128. 2020-06-18: Monero Stack Exchange ✔
  129. 2020-06-18: Latin Language Stack Exchange ✔
  130. 2020-06-18: Interpersonal Skills Stack Exchange ✔
  131. 2020-06-18: DevOps Stack Exchange ✔
  132. 2020-06-18: Windows Phone Stack Exchange ✔
  133. 2020-06-18: Literature Stack Exchange ✔
  134. 2020-06-18: Bricks ✔
  135. 2020-06-18: Hardware Recommendations Stack Exchange ✔
  136. 2020-06-18: Amateur Radio Stack Exchange ✔
  137. 2020-06-18: 3D Printing Stack Exchange ✔
  138. 2020-06-18: Retrocomputing Stack Exchange ✔
  139. 2020-06-18: Italian Language Stack Exchange ✔
  140. 2020-06-18: Bioinformatics Stack Exchange ✔
  141. 2020-06-18: Genealogy & Family History Stack Exchange ✔
  142. 2020-06-18: Quantum Computing Stack Exchange ✔
  143. 2020-06-18: Open Source Stack Exchange ✔
  144. 2020-06-18: Woodworking Stack Exchange ✔
  145. 2020-06-18: Computer Graphics Stack Exchange ✔
  146. 2020-06-18: History of Science and Mathematics Stack Exchange ✔
  147. 2020-06-18: Mathematics Educators Stack Exchange ✔
  148. 2020-06-18: Lifehacks Stack Exchange ✔
  149. 2020-06-18: Music Fans Stack Exchange ✔
  150. 2020-06-18: Stack Apps ✔
  151. 2020-06-18: EOS.IO Stack Exchange ✔
  152. 2020-06-18: Ukrainian Language Stack Exchange ✔
  153. 2020-06-18: Portuguese Language Stack Exchange ✔
  154. 2020-06-18: Poker Stack Exchange ✔
  155. 2020-06-18: Freelancing Stack Exchange ✔
  156. 2020-06-18: Martial Arts Stack Exchange ✔
  157. 2020-06-18: Sustainable Living Stack Exchange ✔
  158. 2020-06-18: Mythology & Folklore Stack Exchange ✔
  159. 2020-06-18: Internet of Things Stack Exchange ✔
  160. 2020-06-18: Arts & Crafts Stack Exchange ✔
  161. 2020-06-18: Esperanto Language Stack Exchange ✔
  162. 2020-06-18: Ebooks Stack Exchange ✔
  163. 2020-06-18: Korean Language Stack Exchange ✔
  164. 2020-06-18: Stellar Stack Exchange ✔
  165. 2020-06-18: Coffee Stack Exchange ✔
  166. 2020-06-18: Tezos Stack Exchange ✔
  167. 2020-06-18: Language Learning Stack Exchange ✔
  168. 2020-06-18: Beer, Wine & Spirits Stack Exchange ✔
  169. 2020-06-18: Operations Research Stack Exchange ✔
  170. 2020-06-18: Iota Stack Exchange ✔
  171. 2020-06-18: Computer Science Educators Stack Exchange ✔
  172. 2020-06-18: Veganism & Vegetarianism Stack Exchange ✔
  173. 2020-06-18: Community Building Stack Exchange ✔
  174. 2020-06-18: Constructed Languages Stack Exchange ✔
  175. 2020-06-18: Drones and Model Aircraft Stack Exchange ✔
  176. 2020-06-18: Materials Modeling Stack Exchange ✔
  177. 2020-06-18: CS50
  178. 2020-06-18: Stack Overflow Teams ✔
  179. 2020-06-20: Stack Overflow ✔
23
  • 16
    I'm going to add more dates to this schedule as we learn more and grow more confident over time. For starters I'm keeping this small because I don't want to over-promise without having a real idea for the first few batches we're going to run. I know that more clarity would be better but it's pretty hard to pull off. I expect us to be able to post a way larger schedule once we've migrated the first few sites.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 12:38
  • 2
    @HamVocke It seems to me the least generally risky method would be to apply this change to smaller sites first, so any issues found would usually affect fewer posts, and can then be taken care of before applying the changes to larger sites later. Are there any particular reasons you can tell us about why the particular sites you show above have been chosen, and in the particular order you show them in? Commented Jun 1, 2020 at 13:21
  • 13
    @JohnOmielan Not trying to answer for Ham - but a note, both M&TV and Physics have integrations that may prove ... complicating factors - the YouTube embedding and MathJax.
    – Catija
    Commented Jun 1, 2020 at 13:35
  • 22
    We're trying to strike a good balance between size, risk, and impact. We want to learn on small sites first that include some unique features (MathJax, spoilers, video embedding). Then we want to move to the bigger sites in the network soon to get these changes into the hands of many users. I've picked the first few sites somewhat arbitrarily based on the idea of being small enough to allow fast feedback while using some of our special features. Hope this gives you a rough idea of the general reasoning.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 13:52
  • 20
    We've built an automated rollback if things go horribly wrong. This will require undoing the entire migration post by post so it will be yet another intense calculation. If we can avoid a rollback, we'd love to avoid a rollback. I think we have to live with some level of confusion for the time being and make sure we get all sites migrated soon :)
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 14:56
  • 16
    (Physics mod here) @HamVocke In case you're not already planning it, would you consider making a post on our meta site a day or two in advance to describe the change and let people know to look out for possible errors in post rendering? Sometimes it takes a little while for people to notice things on our meta site.
    – David Z
    Commented Jun 1, 2020 at 19:51
  • 46
    May we respectfully request that the "old" contents of a site don't get deleted until the denizens of that site have confirmed that nothing essential has changed? When the tex.stackexchange site was last mass-changed, a significant proportion of our example code was trashed, with double-backslashes reduced to singles, rendering the code invalid, and many of us spent hundreds of hours repairing the damage manually. There are still occasional instances of this carnage found from time to time. We are shivering in our shoes. Commented Jun 1, 2020 at 19:53
  • 6
    The SE.Physics migration on 2020-06-04 seems notable since SE.Physics has MathJax (for TeX) enabled.
    – Nat
    Commented Jun 1, 2020 at 21:00
  • 15
    @barbarabeeton we won't delete anything. When changing a post to CommonMark we're creating a new revision of the existing post that can be rolled back manually or as a batch operation (if we really have to). Also, we're only creating those new revisions when we're confident that it's safe to do so.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 7:22
  • 5
    @DavidZ good call, let me get something out to Physics and Movies & TV. I don't think we'll announce this for every single site but for the first few it's certainly reasonable to do so explicitly.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 7:24
  • 4
    I would consider scheduling Puzzling prior to Movies because this would allow testing for spoilers isolated from video embedding
    – gnat
    Commented Jun 2, 2020 at 10:33
  • 4
    @HamVocke -- The last time it wasn't intentional, and we learned about the problem only when we started getting complaints that answers weren't working, in fact were causing compilations to crash. Please let us know when this is scheduled (at least our moderators) so that results can be reviewed by TeX-knowledgeable users before activation. Commented Jun 2, 2020 at 12:04
  • 4
    @HamVocke as there is some disturbance around the licensing shown in the timeline, would you consider holding off on switching over SO (and other technical sites with lots of code snippets) until Yaakov has deployed the fix for the licensing issue mentioned in Glorfindel's answer? Just to avoid creating too much confusion.
    – Luuklag
    Commented Jun 3, 2020 at 19:45
  • 3
    I can see User Experience is a long way down the list, but don't forget BML. It might be good to annotate the list with particular issues you foresee (like MathJax and other oddities). Commented Jun 3, 2020 at 22:40
  • 3
    @JosephSible-ReinstateMonica I imagine that means someone will be in on the weekend to check it. Friday deployments are only bad if no one is in to use it/test it Commented Jun 16, 2020 at 9:07
53

If you deprecate the use of <!-- language: lang-html --> in favor of specifying the prettifier at the start of the code fence, will you still support the overall syntax highlight hint for all code blocks?

<!-- language-all: lang-none -->

I have used that feature very occasionally so I doubt it has much impact if it can't be used anymore.

For putting things in perspective, this feature was used 2254 times in posts on Stack Overflow during the first 5 months of 2020. (Yes, I did try to run it for all posts but doing a full table scan over the body field isn't going to complete within 2 minutes. I'm sure SE staff can run the query on the internal SEDE instance when needed).

Across all other sites (excluding Stack Overflow) this is the usage since 2017:

[Image: stats showing 65 sites with a max count of over 17,000 posts; the image links to the query]

13
  • 3
    I like being able to do this, but I’m not a fan of the syntax. If we could just mark the first codefence as “all-none” (or generally: “all-X”) and have it work the same, that would be a win. (Maybe give a warning for using the old syntax too.)
    – Laurel
    Commented Jun 1, 2020 at 13:05
  • 41
    Yes, the <!-- language-all: lang-something --> syntax will continue to work. Same goes for inferring the syntax highlighting language from the tags attached to a question.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 14:52
  • 8
    FWIW, SO is not necessarily representative WRT the use of this feature, as tag-based inference tends to work pretty well. Sites where tags are not (or cannot be) configured for this purpose may tend to make more use of this as a convenient way to turn on highlighting.
    – Shog9
    Commented Jun 1, 2020 at 16:16
  • 2
    @Shog9 I've run it across the other main-site databases for posts since 2017 and numbers are still low. But as this is confirmed to continue to work the whole point is moot.
    – rene
    Commented Jun 1, 2020 at 17:02
  • 3
    Does ``` none work now? Commented Jun 2, 2020 at 0:46
  • 2
    @Michael-Where'sClayShirky it works for me, yes. At least on MSO and MSE. If it doesn't work for you, please share the site where you are trying that.
    – rene
    Commented Jun 2, 2020 at 5:42
  • 2
    @HamVocke Yet in your post you mention "Moving forward, this won’t work anymore.". How should we interpret this?
    – Mast
    Commented Jun 2, 2020 at 7:49
  • 6
    @Mast declaring the language for a specific indented code block is going away. With CommonMark's fenced code block notation we've got a perfect, standards-compliant replacement. Setting the overall language either by inferring it from tags or by declaring a <!-- language-all --> is going to stay for now. A new syntax highlighter might require us to revisit this in the future.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 8:00
  • 2
    @HamVocke Ah, so setting it for the entire post the old style is still supported while for separate code-blocks only the new style is. We can work with that.
    – Mast
    Commented Jun 2, 2020 at 8:05
  • FWIW, I think your numbers are a little off, I've personally added that tag to multiple questions/answers on Code-Review 1,2,3. Is the query missing that site?
    – Greedo
    Commented Jun 2, 2020 at 15:25
  • 2
    @Greedo now fixed, edited, new screenshots and links to new queries, thanks for catching that!
    – rene
    Commented Jun 2, 2020 at 16:55
  • Tables: Feature Preview: Table Support Commented Nov 27, 2020 at 8:00
  • @P.Mort.-forgotClayShirky_q are you challenging me? ;)
    – rene
    Commented Nov 27, 2020 at 10:16
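
To make the two hint styles from this thread concrete, here is a minimal Python sketch that finds them in a post's Markdown source. It is only an illustration: the regular expression and the name `find_language_hints` are assumptions made for this example, not how Stack Exchange's renderer or the SEDE query above detects hints.

```python
import re

# Matches hints such as <!-- language: lang-html --> (one code block)
# and <!-- language-all: lang-none --> (whole post).
# Group 1 is "-all" or None, group 2 is the language name.
# Assumption: language names consist of word characters plus . # + -
HINT_RE = re.compile(r"<!--\s*language(-all)?\s*:\s*([\w.#+-]+)\s*-->")


def find_language_hints(markdown):
    """Return (scope, language) pairs; scope is 'all' or 'block'."""
    hints = []
    for match in HINT_RE.finditer(markdown):
        scope = "all" if match.group(1) else "block"
        hints.append((scope, match.group(2)))
    return hints
```

Per the comments above, a hint with scope `all` keeps working for now, while a `block` hint in front of an indented code block is the one to replace with a fenced code block that names its language.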
35

Things might get funky when you're editing a post that renders differently with the new CommonMark renderer.

If someone starts editing one of these posts that cannot be automatically updated, will there be some kind of notification that the editor should pay extra attention to the render preview because the edit may alter the appearance of the post? This can be particularly important when making small edits to large posts.

6
  • 12
    No, for now we won't show notifications. You're raising a valid point and we've discussed this idea before. We'd have to do an extra database lookup for each post we're going to show just to find out if there have been any issues while the vast majority of posts will be perfectly fine. We might have to reconsider this decision if the first migration runs show that there's a need for notifications. Until then we sure hope we get to a place where the differences are so trivial that we can just ignore showing a heads-up warning altogether.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 17:45
  • 1
    Presumably if someone starts editing and runs into difficulties, they're responsible to either get it working or bail out, so we should not see mangled posts? You could set yourself a notification or special review queue for these special cases?
    – Mr. Boy
    Commented Jun 1, 2020 at 21:28
  • 2
    @Ham How about putting some kind of indicator in the Timeline / edit history that states that the post couldn't be automatically updated? OTOH, I guess that may not be very useful, since it'd be annoying to have to check the timeline before making an edit.
    – PM 2Ring
    Commented Jun 2, 2020 at 0:47
  • 15
    @Mr.Boy: The risk here is that you're reading a large post and see one paragraph or code block you want to improve. You make that edit in the markdown, find that paragraph in the preview, and hit save. You might not notice some subtle or even major breakage elsewhere in the post, if it's large and you weren't expecting any effect on areas you didn't change. Not all users follow meta and will know to be aware of this. Having a quote block split into two would be easy to miss; although not significant, it's still somewhat worse. (Probably that specific thing would be auto-fixed, though.) Commented Jun 2, 2020 at 4:43
  • 3
    @PM2Ring It might be useful for searching, if only via SEDE, in order to find troublesome posts to fix manually (as time, effort and inclination allow). Commented Jun 3, 2020 at 22:35
  • This just happened to me. Commented Sep 14, 2020 at 16:20
34

Is this going to apply to chat as well? That has its own quirks in its implementation that are different from the main site (such as having to do > quote for a quote when >quote works on the main site). Is that going to change in any way?

6
  • 11
    No, this migration is not going to touch chat. It's only going to touch posts (questions and answers) across the network. Personally I think we should keep things consistent (and that includes chat and other places) but it's an entirely different can of worms.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 11:58
  • 3
    @HamVocke - so it doesn't affect comments either?
    – Mithical
    Commented Jun 1, 2020 at 11:59
  • 8
    Correct! Comments won't be affected.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 12:09
  • 89
    Adding my voice here - now we're running 3 different, distinct dialects of markdown (Main, comments and chat) , and it would be really nice to have subsets of one running everywhere. Commented Jun 1, 2020 at 12:30
  • 2
    @Ham talking about can of worms, what will happen to Area 51? Will it be affected? Commented Jun 1, 2020 at 12:36
  • 3
    @ShadowKeepsSocialDistance I don't know of anywhere on Area 51 that even uses post Markdown. All the boxes there would use the watered-down comments Markdown system. The Area 51 meta site already runs off the same code as any other meta site.
    – animuson StaffMod
    Commented Jun 1, 2020 at 12:56
34

How will old revisions, in cases where they would trigger edits if they were current, be displayed when viewed?

To clarify my motivation for this question: As I understand it, each post that is currently not valid CommonMark will be updated by one non-bumping edit (which I presume will be shown as having been performed by the Community bot), translating the latest revision of the post from Stack Exchange's current Markdown dialect into CommonMark.

When an old (i.e., already non-current) version of a post, accessible through the post's revision history, contains Markdown that is incompatible with CommonMark, how will that version be rendered when a user accesses it? Will it still show the same HTML it once had?

And when diffs are viewed in the revision history--in the "inline" and "side-by-side" views--how will they appear? Will existing diffs (i.e., those between two successive revisions that already exist now) still render the same, no matter how old?

5
  • 17
    Your understanding is perfectly correct. In the revision history we calculate and diff a post's HTML on the fly based on the revision's markdown source. That means that after switching over to CommonMark, even revisions that predate the CommonMark migration will be rendered with the new CommonMark renderer. I know, that's less than stellar but it's all we can do if we don't want to keep the old renderer around forever.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 8:40
  • 1
    @HamVocke - perhaps, once the switcheroo has been performed on all the sites, and everything that can go wrong has gone wrong (and subsequently been fixed) - consider running the updated migrator script on the earlier post revisions?
    – Robotnik
    Commented Jun 3, 2020 at 5:52
  • 8
    @Robotnik I think that would make it impossible to audit the changes. While the current way of updating simply makes a new revision, what you suggest will change the revisions themselves, leaving nothing to compare them with. That'd be awful IMO.
    – Ruslan
    Commented Jun 3, 2020 at 18:20
  • 2
    @HamVocke "That means that after switching over to CommonMark, even revisions that predate the CommonMark migration will be rendered with the new CommonMark renderer." In that case, please consider open-sourcing the client-side renderer so that it can be turned into an add-on/userscript for viewing old revisions.
    – kelvin
    Commented Jun 4, 2020 at 11:39
  • 3
    @kelvin you can find the source code for our old client-side markdown renderer on GitHub. Note that the published version is missing the latest significant feature addition, which is the ability to handle nested code blocks. I hope this can be helpful nevertheless.
    – Ham Vocke StaffMod
    Commented Jun 4, 2020 at 12:26
28

Block Quote migration

I received strange "CommonMark migration" edits here:

  1. https://meta.stackexchange.com/posts/344867/revisions
  2. https://meta.stackexchange.com/posts/345953/revisions
  3. https://stackoverflow.com/posts/37844312/revisions

diff

Both quote whitespace formats appear to be valid CommonMark syntax, so I don't know why they were migrated in the first place.

The post-migration format is clearly worse as the quote marks no longer line up in plaintext.

https://spec.commonmark.org/0.12/#block-quote-marker

7
  • 12
    Wow, awesome find! Turns out there's a bug in the markdown autofixer: When a line preceding a blockquote contains a hyphen, a subsequent blockquote will be indented. I've got a repro and will fix this now so we don't run into this for upcoming migrations. In those two cases, feel free to edit the markdown manually.
    – Ham Vocke StaffMod
    Commented Jun 4, 2020 at 8:07
  • 3
    @HamVocke It's great that you've fixed this bug, but it is a bit concerning that the autofixer let that edit go through. That doesn't appear to be consistent with: "If a post looks different using the new renderer (and if it’s just one whitespace off) we won’t automatically re-render the post and put it up for investigation first."
    – PM 2Ring
    Commented Jun 4, 2020 at 9:37
  • 14
    @PM2Ring I understand where your concern is coming from but that statement still holds true. "Looks different" refers to the rendered HTML, not the Markdown version of a post. In the cases outlined here the Markdown was changed but still resulted in equal HTML (because CommonMark allows some ambiguity).
    – Ham Vocke StaffMod
    Commented Jun 4, 2020 at 9:39
  • 8
    @HamVocke Ah, I see. The new version looks the same as the previous version, and the HTML is the same. It's only the Markdown that has extra indentation.
    – PM 2Ring
    Commented Jun 4, 2020 at 9:48
  • 4
    It might be a good idea to also run a CommonMark linter on the transformed results. A linter might be able to warn in cases where the generated CommonMark is legal, but weird.
    – tkruse
    Commented Jun 4, 2020 at 15:43
  • @HamVocke FYI: added a bad/weird block quote edit that I found on Stack Overflow.
    – pkamb
    Commented Sep 1, 2020 at 17:57
  • another: meta.stackoverflow.com/posts/254472/revisions
    – pkamb
    Commented Sep 24, 2021 at 0:44
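
The safeguard described in this thread (compare the rendered HTML, not the Markdown) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: `old_render`, `new_render` and `autofix` are hypothetical callables standing in for the old renderer, the CommonMark renderer and the Markdown autofixer, and the three status names are made up for this example. The real migration is not public.

```python
def plan_migration(markdown, old_render, new_render, autofix):
    """Decide what to do with one post.

    Returns (status, markdown_to_store). The comparison is a strict
    string comparison of the HTML, so even one differing whitespace
    character counts as "looks different".
    """
    expected_html = old_render(markdown)

    # The post already renders identically: nothing to rewrite.
    if new_render(markdown) == expected_html:
        return ("unchanged", markdown)

    # Try the automatic Markdown fix and accept it only if the HTML
    # produced by the new renderer equals the old HTML.
    fixed = autofix(markdown)
    if new_render(fixed) == expected_html:
        return ("new_revision", fixed)

    # Otherwise leave the post alone and flag it for investigation.
    return ("needs_investigation", markdown)
```

Note that this check would not have caught the block quote case above: the autofixed Markdown gained extra indentation, but the HTML was equal, so the edit counted as safe.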
27
  1. Do you have a list of all the 'well-known issues' that will be automatically converted? For example, I make heavy use of the <!-- language: python --> syntax. Will that be converted to code fences?
  2. Will we be notified if one of our own posts can't be converted, so that we can edit them ourselves? Or will it go to a dedicated queue?
  3. Should we try to preemptively correct the Markdown content of our own posts if we suspect it might fail, or would it be preferable to wait until the automated migration?
11
  • 1
    I want to see exact list of what has changed.
    – Sinatr
    Commented Jun 2, 2020 at 8:33
  • 7
    1. Language hints will not be converted to code fences. For a while our renderer will understand the old language hint syntax. This is going to change in the future so don't rely on it and start using code fences instead. 2. No notifications, no review queue for now. 3. I'd let the migration do the heavy lifting. No harm in correcting posts ahead of time but also no need to do that.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 8:57
  • @HamVocke Language hints are processed by the migration, though, right? This is just about the transition period with the CMark renderer for new content and edits, yes? Commented Jun 2, 2020 at 13:23
  • 5
    Yes, strictly speaking processing language hints is living outside of the renderer itself so they will be picked up still for a while. However, we'd like to take this as an opportunity to deprecate a hand-rolled solution that's been superseded by an official one and encourage everyone to move away from using old-fashioned language hints for new content because we'll get rid of them eventually.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 15:34
  • 3
    @HamVocke does it mean that some long-forgotten posts using this feature will break after this "a while", when the renderer understands these comments, ends?
    – Ruslan
    Commented Jun 3, 2020 at 18:27
  • 3
    @Ruslan old posts are going to be just fine as we render the HTML of a post once and then serve that until someone changes a post again. Once language hints become unsupported, editing old posts that still use them means that you'd have to switch over to the new syntax.
    – Ham Vocke StaffMod
    Commented Jun 3, 2020 at 18:37
  • 1
    "Language hints will not be converted to code fences." @HamVocke why not? Generally I've gone with the auto-detected language; if I override it, it's for a good reason.
    – miken32
    Commented Jun 5, 2020 at 17:08
  • 2
    @miken32 because it's tricky to get right. It's again a matter of trade-offs: How many users are using that feature vs. how long would it take to build an automated conversion vs. how big is the risk of introducing side-effects. And then there's the question whether everyone would be okay if we change their indented code blocks to fenced code blocks. To me risk and effort clearly outweighed the benefits.
    – Ham Vocke StaffMod
    Commented Jun 6, 2020 at 10:19
  • 1
    @HamVocke "Language hints will not be converted to code fences" -- please reconsider that decision. As I understand it, that would mean that all code which has been highlighted using the old language hints will (a) cease to be highlighted unless manually edited subsequently to add the new syntax; and (b) such manual edits will be changing every single line, which is (i) arduous, (ii) error-prone, and (iii) creates a diff for every single line of the code, making it ever harder to notice errors. To me, this is a clear case for automated translation.
    – phils
    Commented Jun 15, 2020 at 22:06
  • Further to the point about diffs -- I regularly edit other people's questions and answers to add language hints to existing code. If they used indented code, I would add the SGML comment hints; if they used triple backticks, I would use that. Either way, the edit diff I created was minimal, so everyone reviewing the edit could see at a glance that it was trivially correct. The proposed change will mean that no one can add language hints to indented code blocks without changing every line of the code, which requires far more effort to do in the first place, and far more effort to review.
    – phils
    Commented Jun 15, 2020 at 22:13
  • 3
    Not automatically migrating the indented code will likely ensure that few people will ever bother improving indented code by adding (or correcting) the syntax highlighting; and when they do make such improvements, it requires more effort from not only the person performing the edit, but from everyone who checks it.
    – phils
    Commented Jun 15, 2020 at 22:18
23

Will you also change the syntax highlighter on this occasion?

If I remember correctly, SE still uses Google Prettify, which has been discontinued. Support for more languages and new language versions would be great!

2
  • 16
    No, not at this time. Let's get this big and hairy migration out of the way first, then we'll have a good foundation for those kinds of changes :)
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 17:48
  • 2
    I vote for Pygments (Python) Commented Jun 4, 2020 at 16:41
20

Mathjax

Things like spoilers, MathJax, circuit diagrams, stack snippets, etc. are used on several network sites. We're going to continue to support all of those custom syntax elements even if they're not part of the official CommonMark specification.

Just to be clear, Physics SE and Mathematics SE would be severely crippled if MathJax support was damaged. It is essential for many sites. Worldbuilding SE and Chemistry SE also use it and plenty of posts would be broken if the migration fails to support MathJax properly.

Is there a backup plan to undo the changes if the move to the new system (for obviously unforeseen reasons) should make using the new system not practical on sites that depend on the extras? Or is going back not an option at all?

At the risk of insulting your IT department, is the existing site data being permanently backed up somewhere at some freeze date prior to the change? If you have to translate existing questions to the new format there is (presumably) a higher risk this won't work well for sites with "extras" like MathJax and in the event changes (for who knows what reason) have to be undone, knowing the data was safe in its original form would be good.

9
  • 4
    The rendered HTML is stored in the database, and so is the original text. Also, the migration would be incremental rather than all of it at the same time.
    – Braiam
    Commented Jun 1, 2020 at 22:16
  • 4
    We're only going to re-render a post if we've figured out that it will look the same as it looked before. Re-rendering is a matter of adding a new revision to the post history that includes the changes we're applying to the post's markdown. If this fails, we can always go and undo the latest revision. We've built a tool that does exactly that as a batch operation - if stuff's really broken we're going to run this automatic rollback and restore what was there before.
    – Ham Vocke StaffMod
    Commented Jun 2, 2020 at 7:02
  • 1
    Worldbuilding ?
    – TylerH
    Commented Jun 2, 2020 at 20:51
  • 1
    @TylerH Physics SE, Chemistry SE, Astronomy SE and Worldbuilding SE are all sites I use and which use MathJax. Some Worldbuilding questions require science-based answers and equations are not uncommon there - a fair proportion of WB SE posts would be "broken" if Mathjax support went away. Commented Jun 2, 2020 at 21:08
  • @StephenG Sounds like there's a swathe of bad-fit questions for Worldbuilding then, but that's a subject for Worldbuilding Meta...
    – TylerH
    Commented Jun 2, 2020 at 21:14
  • 12
    @TylerH I assure you that questions requiring real science are common on WB SE and not at all "bad fit". Because they almost all have a fictional component/framework to them, those types of questions would be off-topic on the "real" physical science sites. It is, for example, not unusual for Physics SE posts to be closed and recommended for WB SE (and the reverse also happens). Commented Jun 2, 2020 at 21:18
  • 5
    Add stats.SE to sites that use MathJax heavily...
    – Glen_b
    Commented Jun 3, 2020 at 6:26
  • RPG.SE does also see some significant amount of MathJax use, albeit mostly for formatting tables (though there is some use of it for statistics and the like relating to dice rolls/RPGs).
    – V2Blast
    Commented Jun 7, 2020 at 23:05
  • Crypto.SE uses MathJax a lot too, but the migration happened without a hitch in that respect (it seems). Commented Jun 23, 2020 at 9:41
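
The "new revision plus batch rollback" idea from the comments above can be modelled with a small Python sketch. Everything here is hypothetical: the `Post` class, `apply_migration` and `rollback_batch` are invented for illustration, and the choice to roll back by appending a restoring revision (so nothing is deleted) is an assumption, not a description of the actual tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Post:
    revisions: List[str]                   # Markdown source, oldest first
    migration_index: Optional[int] = None  # index of the migration revision


def apply_migration(post, fixed_markdown):
    """Store the converted Markdown as a new revision; keep the old ones."""
    post.revisions.append(fixed_markdown)
    post.migration_index = len(post.revisions) - 1


def rollback_batch(posts):
    """Restore pre-migration content for every migrated post.

    Posts that were edited after the migration are skipped so that a
    rollback never overwrites later human edits. Returns the number of
    posts restored.
    """
    restored = 0
    for post in posts:
        index = post.migration_index
        if index is None or index != len(post.revisions) - 1:
            continue
        post.revisions.append(post.revisions[index - 1])
        post.migration_index = None
        restored += 1
    return restored
```

Because the original Markdown stays in the revision history in this model, a failed migration on a MathJax-heavy site could be undone without relying on a separate backup.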
19

I seem to remember that CommonMark includes additional syntax beyond what SE supported so far, specifically bracketed enumeration, i.e. 1), in addition to dotted numbers for creating enumeration lists.

Is this true and would this mean bracketed numbers will now be automatically turned into enumeration lists (ordered lists, or <ol>)? This would be an amazing step forward with regards to Markdown's aspirations towards making formatting as intuitive as possible, since every second user who isn't aware of Markdown writes their numbered lists that way and it would be great if their posts suddenly worked automatically without requiring manual revision.

Previously requested here: Add parenthesis as an accepted Markdown ordered list delimiter

7
  • 7
    You're right, this is one of those places where CommonMark will bring extended functionality to our markdown syntax. After switching over, 1)-style enumerations will automatically be transformed into ordered lists
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 12:33
  • 2
    @HamVocke: But only for new and edited posts, as you won’t automatically update the cache for those posts where there is a positive change – right?
    – Wrzlprmft
    Commented Jun 1, 2020 at 12:59
  • 17
    Correct! Even though I'd consider this to be an improvement, our safeguards prevent us from automatically updating posts that would apply here. New posts and edits, however, get all the new commonmark goodness.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 13:04
  • 3
    I have always thought it was a way to avoid it being formatted as a regular Markdown list (that is, some users go out of their way to control how the rendered result looks). Commented Jun 1, 2020 at 18:56
  • 3
    @P.Mort.-forgotClayShirky_q Another advantage of this finally getting fixed and this loophole for eschewing markdown getting closed. Commented Jun 1, 2020 at 19:15
  • @HamVocke There's probably a sizable corpus of old posts with 1)-style lists. If I understand correctly, these won't be migrated automatically, and would need to be checked manually?
    – E.P.
    Commented Jun 4, 2020 at 10:21
  • @E.P.: Correct. Unless the post includes one of the "changes to the way [you] can write markdown" listed in the original post (e.g. the stuff mentioned for lists, blockquotes, code block highlighting), it won't be re-rendered if it's an existing post. The change would only be applied if the post was edited or a new post made.
    – V2Blast
    Commented Jun 4, 2020 at 22:29
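
For reference, CommonMark defines an ordered list marker as one to nine digits followed by either `.` or `)`. The Python sketch below checks a single line against that rule. It is a simplification: the name `ordered_list_marker` is made up for this example, and a real parser also handles tabs, nesting and the extra rules for lists that interrupt a paragraph.

```python
import re

# Up to three spaces of indentation, 1-9 ASCII digits, then "." or ")",
# then at least one space or the end of the line.
ORDERED_MARKER_RE = re.compile(r"^ {0,3}([0-9]{1,9})([.)])(?: +|$)")


def ordered_list_marker(line):
    """Return (start_number, delimiter) if the line opens an ordered
    list item, otherwise None."""
    match = ORDERED_MARKER_RE.match(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```

With this rule both `1. first` and `1) first` start a list item, while `1)first` (no space after the marker) stays ordinary text.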
17

Abbr.SE shortcuts are parsed as domains in preview

When I write an abbreviated site name, such as rpg.se or meta.se, it's now being auto-parsed as a link in the post preview only. It points to that exact domain and not one of our domains, i.e. http://rpg.se/ or http://meta.se/.

The same happens with meta.so, meta.rpg.se, etc.

Hit the "edit" button on this post to repro.

3
  • 4
    Looks like the preview renderer detects them as country-code TLDs. See this formatting sandbox post for more tests. Commented Jun 6, 2020 at 14:21
  • 8
    Confirmed, this is our new markdown renderer being too eager on auto-linking. I've got a fix out that will remove this behaviour and make auto-linking stricter. Going out later today.
    – Ham Vocke StaffMod
    Commented Jun 9, 2020 at 11:56
  • 3
    @HamVocke This looks like it's been fixed to me! Meta Andrew T.'s formatting sandbox post is handled correctly too. Thank you very much. :) Commented Jun 10, 2020 at 11:18
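
One way to picture "stricter auto-linking" is a rule that only treats text as a URL when it carries an explicit scheme or a `www.` prefix. The Python sketch below does that. It is purely an assumption for illustration: the actual fix that was deployed is not described in this thread, and `STRICT_URL_RE` and `autolink_candidates` are hypothetical names.

```python
import re

# Only text that starts with http://, https:// or www. is a candidate.
# Bare names such as "rpg.se" or "meta.se" are left alone, even though
# ".se" happens to be a valid country-code TLD.
STRICT_URL_RE = re.compile(r"\b(?:https?://|www\.)[^\s<>]+", re.IGNORECASE)


def autolink_candidates(text):
    """Return the substrings that this stricter rule would auto-link."""
    return STRICT_URL_RE.findall(text)
```

Under this rule the abbreviations from the bug report produce no links, while a full URL such as `https://rpg.stackexchange.com/` still does.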
15

What was broken that needed fixing?

I apologize if I come off as ungracious, as you spent a lot of time working on this, but what was it that required fixing?

Seriously. I've had very few problems using this interface. What pressing need does this serve?

This question is based on a lot of years of experience of being immersed in "change for the sake of change" where in the end no value was accrued.

How will I see value added from this change?


I am adding in the comment response that answers my concern, since comments are ephemeral and I'd like to ensure that the value added explanation remains:

(From @HamVocke, thank you)

With this switch, we get: A consistent user experience that aligns with what users know from other websites, predictable formatting, reduced maintenance burden on our software engineers, reduced risk when changing markdown formatting in the future, a stable foundation to build future feature enhancements around formatting and editing. There's value to our end users and there's a lot to win for our engineering teams in the form of massively reduced tech debt.

11
  • 38
    With this switch, we get: A consistent user experience that aligns with what users know from other websites, predictable formatting, reduced maintenance burden on our software engineers, reduced risk when changing markdown formatting in the future, a stable foundation to build future feature enhancements around formatting and editing. There's value to our end users and there's a lot to win for our engineering teams in the form of massively reduced tech debt.
    – Ham Vocke StaffMod
    Commented Jun 4, 2020 at 6:31
  • 9
    I upvoted because the question was the same one I had in my head but was too afraid to ask for not wanting to appear foolish or incredibly naive. It led to a good comment, too. A comment that was easier to wrap my non-nerd head around. Commented Jun 4, 2020 at 7:56
  • 3
    @HamVocke Thank you for that comment. Commented Jun 4, 2020 at 15:04
  • 2
    @ham In other words: There is no value for end users whatsoever. Indeed, you are investing effort in making sure that there is no observable change at all. You can expect your readers to be developers, who are perfectly capable of understanding that a change sometimes doesn't deliver any value, but rather enables future change that can ultimately deliver value. This is not the change that does, so there's no need to sell it as such.
    – Tim
    Commented Jun 8, 2020 at 7:40
  • 3
    @Tim Switching to a common standard most certainly does have value for end users, in that they do not have to learn so many similar but confusingly incompatible formatting rules for every site on the internet that uses Markdown. Commented Jun 12, 2020 at 10:55
  • @emi There's no value in one site switching to CommonMark. If everyone did, you might have a point, but even then, the switch needs to be as invisible as possible to be successful. There is no value for users in a change that's not observable (modulo some corner cases). Contrast that with Stack Exchange switching to, say, GitHub Flavored Markdown. Now that would deliver actual value. I mean, we've been asking for tables for almost as long as Stack Overflow existed. GFM has a formal specification, too.
    – Tim
    Commented Jun 12, 2020 at 12:44
  • 2
    @Tim Since GFM is in fact an extension of CommonMark, you just proved my point. Commented Jun 12, 2020 at 13:39
  • @emi I'm not sure which point you were trying to prove. If I have supported that, consider it an accident. Exchanging one Markdown renderer for another functionally equivalent Markdown renderer doesn't deliver value to its users. It's a change that's purely internal to the site, and - if all goes well - without any observable change. The fact that this enables delivering value at a later point in time doesn't magically turn this change into one that does. If you weren't made aware of this change, you wouldn't have noticed a thing. Does "no change" feel like "added value" to you?
    – Tim
    Commented Jun 12, 2020 at 16:38
  • @Tim ????? Where did you get the idea that it’s “without any observable change”? The details of the syntax needed to write answers changed; that’s most certainly an observable change, and if I were not made aware of it, I would probably find out soon the hard way, and become confused. Commented Jun 12, 2020 at 16:56
  • @emi It's written all over the post, e.g.: "For the vast majority of your writing, you won’t see a difference at all." The changes you might observe are corner cases. In those rare cases, the switch to CommonMark is a bug fix. No matter how much you want this to be different, it really is just this mundane: The switch is an implementation detail, that doesn't itself deliver any value to users. Yet I concede that I was wrong about one thing: You cannot expect your readers to be developers after all.
    – Tim
    Commented Jun 12, 2020 at 17:52
  • 2
    Switching to a common and formalized standard (from a custom and probably proprietary mishmash of rules) will likely help long-term preservation of our content. Commented Jun 14, 2020 at 9:08
14

You can still use indented code blocks but can’t declare the preferred language explicitly moving forward.

The Help Center is still mentioning this old method:

To manually specify the language of an indented code block, insert an HTML comment like this before the block:

<!-- language: lang-js -->

     setTimeout(function () { alert("JavaScript"); }, 1000);

It's probably hard to adjust this only for 'migrated' sites, but perhaps it's a good idea to remove it already for all sites, since with the code-fence notation (```c#) we have a decent alternative?
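
For anyone who wants to convert old posts by hand or by script, here is a rough Python sketch of turning the hint-comment form into a fenced block. The function name `hint_to_fence` is made up for illustration, this is not the migration tool's code, and it assumes the code block is indented by exactly four spaces and contains no blank lines:

```python
import re

FENCE = "`" * 3  # built indirectly so the sketch can itself live inside a fence
HINT = re.compile(r"^<!-- language: (\S+) -->$")


def hint_to_fence(lines):
    """Rewrite '<!-- language: X -->' + indented code as a fenced block (sketch)."""
    out, i = [], 0
    while i < len(lines):
        match = HINT.match(lines[i].strip())
        j = i + 1
        # Skip blank lines between the hint comment and the code.
        while match and j < len(lines) and not lines[j].strip():
            j += 1
        code = []
        # Collect the four-space indented lines and drop the indentation.
        while match and j < len(lines) and lines[j].startswith("    "):
            code.append(lines[j][4:])
            j += 1
        if code:
            out += [FENCE + match.group(1), *code, FENCE]
            i = j
        else:
            # No hint, or a hint without code after it: keep the line as-is.
            out.append(lines[i])
            i += 1
    return out
```

The hint (`lang-js` in the example above) is copied verbatim after the opening fence; whether a given site expects `lang-js` or plain `js` there is something to check in the preview.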

8
  • 3
    Great point. This is somewhere on my long list of things to do but this might just be a good point in time to change our help center. I'll take a look :)
    – Ham Vocke StaffMod
    Commented Jun 3, 2020 at 14:04
  • @HamVocke: Note tkruse's answer pointing out an issue with the description of blockquotes on that help page too.
    – V2Blast
    Commented Jun 4, 2020 at 22:35
  • 2
    @V2Blast yup, I've got a fix in review for both occasions. Will get both changes out soon.
    – Ham Vocke StaffMod
    Commented Jun 5, 2020 at 6:23
  • @HamVocke It appears only the editing-help pages were done; will the same be done for the help centre page (or is that a site moderator editable page)? Also, do the statements after the example still work, about specifying a language for all code blocks/removing the language, or do they need to be updated as well? Commented Jul 3, 2020 at 16:57
  • @Nick great catch. The help centre pages are mod editable. While I have permissions to edit I'd rather not mess with our mods' privileges and have them do it instead. The only thing I see that should be changed is mentioning the <!-- language: lang-* --> examples and replacing them with code fence notation.
    – Ham Vocke StaffMod
    Commented Jul 6, 2020 at 6:36
  • 1
    @HamVocke mods can only edit one page and a few snippets in the Help Center itself and the Tour.
    – Glorfindel Mod
    Commented Jul 6, 2020 at 6:52
  • @Glorfindel dang, I had no idea. Alright, I went ahead and fixed the help center myself to make it consistent with our editing-help pages :)
    – Ham Vocke StaffMod
    Commented Jul 6, 2020 at 10:08
  • 2
    We mods are happy to have privileges to edit more pages, if you want to fix that little bug right up while you're at it, @Ham ;-) Commented Jul 6, 2020 at 10:24
14

Since the update to CommonMark, it's much harder to link to URLs with a ) in them. Consider this link to the Stack Exchange API documentation:

https://api.stackexchange.com/docs/questions-by-ids#order=desc&sort=activity&ids=349185&filter=!)rTkraPYPefwELKox66q&site=meta&run=true

If I try to [link][1] to it as I used to, with a reference at the end of the post, this doesn't work anymore. (This answer already proves it.)

There is a workaround, a good old HTML anchor element:

<a href="https://api.stackexchange.com/docs/questions-by-ids#order=desc&sort=activity&ids=349185&filter=!)rTkraPYPefwELKox66q&site=meta&run=true">this link</a>

produces this link. I guess using %29 would work too.
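
To make the %29 idea concrete, here is a minimal Python sketch that percent-encodes parentheses in reference-style link definitions. The function name and the regular expression are my own for illustration; it only looks at plain `[label]: url` lines, and I have not checked it against angle-bracket destinations:

```python
import re

# "[label]: destination optional-title" with up to three leading spaces.
REF_DEF = re.compile(r"^( {0,3}\[[^\]]+\]:\s*)(\S+)(.*)$")


def encode_parens_in_reference(line: str) -> str:
    """Percent-encode ( and ) in the destination of a link reference definition."""
    match = REF_DEF.match(line)
    if not match:
        return line  # not a reference definition, leave untouched
    prefix, url, rest = match.groups()
    url = url.replace("(", "%28").replace(")", "%29")
    return prefix + url + rest
```

Lines that are not reference definitions come back unchanged, so it should be safe to run over every line of a post, but inline `[text](url)` links are deliberately not handled.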

[1]: https://api.stackexchange.com/docs/questions-by-ids#order=desc&sort=activity&ids=349185&filter=!)rTkraPYPefwELKox66q&site=meta&run=true

5
  • It looks like the editing script doesn't touch these posts, e.g. meta.stackexchange.com/a/306515/295232; that's a good sign.
    – Glorfindel Mod
    Commented Jun 10, 2020 at 6:38
  • I'd argue status-bydesign due to the CommonMark's spec: "A link destination consists of either [...] and includes parentheses only if (a) they are backslash-escaped or (b) they are part of a balanced pair of unescaped parentheses. [...]" and some more examples for inline link (489, 492-496). Commented Jun 10, 2020 at 6:57
  • 5
    @MetaAndrewT. perhaps, but then it would be nice if the Community user would fix all those instances. Otherwise a lot of links are guaranteed to break...
    – Glorfindel Mod
    Commented Jun 10, 2020 at 7:00
  • 5
    This looks like a trivial fix when looking at link references (using [link][1] syntax), just find brackets and percent-encode them. When using the plain link syntax ([link](reference)) it's much harder as you can't reliably tell where a closing bracket is part of the URL and where it's supposed to close the link reference. Auto-fixing link references as part of the migration looks doable, auto-fixing plain links sounds like a huge rabbit hole to me. I can offer to try and fix the former but I'm pessimistic about the latter.
    – Ham Vocke StaffMod
    Commented Jun 10, 2020 at 14:41
  • 1
    @HamVocke thanks! Such links have (AFAIK) never worked with plain link syntax, I think you can ignore those cases.
    – Glorfindel Mod
    Commented Jun 10, 2020 at 14:45
12

For these posts, we’ve built a tool that automatically fixes these well-known issues by changing a post’s Markdown source directly and re-rendering the HTML of the post in question. When we change a post’s Markdown automatically, this will end up looking like a regular edit but we’re making sure that this won’t bump posts to the top.

For users who are curious about what these edits look like: visit the profile of the Community user (ID -1) on the site, and navigate to 'all actions' → 'revisions'. E.g. here on Meta Stack Exchange:

enter image description here

1
  • Unfortunately, this no longer works, as access to the Community user's activity pages was removed sometime after 2022-12-07, but prior to me posting this comment on 2023-12-28. IIRC, the removal of this access was announced by SE in another MSE question, but my moderate attempt to find the announcement wasn't successful.
    – Makyen
    Commented Dec 28, 2023 at 17:46
12

The Community edits trigger post quality evaluation

Code Golf is currently seeing a flood of pending reviews. This is probably because many (good) answers on this site look like low quality, but had been approved previously or predate the current rules for automatic judging of quality.

Now that Community is editing the posts, their questionable quality is needlessly brought forward, and drowns new posts that actually require review.

3
  • 4
    That's a valid concern and I didn't anticipate this one at all. The review queue logic is quite convoluted and I assumed I had taken all necessary precautions. Sorry for the mess it caused. We're discussing possible mitigations for the next migrations and are also discussing if we can fix this for migrations that have already run. If you want to take action in the meantime to get unblocked, basically all edits done by "Community" around 2020-06-16 9:00 UTC are the ones triggered by the migration and could be declared "ok". Again, sorry for the mess.
    – Ham Vocke StaffMod
    Commented Jun 17, 2020 at 14:01
  • 3
    @HamVocke I'd be happy to do the OK click-fest if you allow me more than 20 reviews per day. A dozen of our users have already hit the cap trying to clean up the mess.
    – Adám
    Commented Jun 17, 2020 at 15:38
  • 4
    The Code Golf community has now finally cleared its low quality post review queue. On the bright side, this pointed out many old answers that should have been deleted (for reasons unrelated to their quality score) but slipped through the cracks.
    – pppery
    Commented Jun 18, 2020 at 2:22
12

What about mobile support?

I understand that the existing apps are no longer maintained, but it seems that plenty of people are still using them, whether or not that is a good idea.

My assumption: when the client-side renderer changes, all existing (no longer supported) mobile apps will be broken and unusable after this change?

(Not a complaint, just a request for clarification.)

6
  • 3
    Are you asking about when you create or edit a post on the mobile apps or when you view them? For viewing posts - we store rendered markdown in the DB which allows us to skip rendering markdown on every request. Mobile uses the rendered HTML; it doesn't render the markdown itself. For editing/creation - we're discussing internally the best way forward here
    – Dean Ward
    Commented Jun 2, 2020 at 11:21
  • I was mainly thinking about viewing, but sure, people who are still using the app(s) might be using them for creating/editing content, too.
    – GhostCat
    Commented Jun 2, 2020 at 11:28
  • 13
    I've just done a quick analysis on the number of create or edit actions on Stack Overflow over the last 7 days. Actions from our mobile apps account for 0.3% of creation events and 0.6% of edit events. We need to determine how many of those would have had markdown that resulted in different HTML but I think it's likely we won't be doing anything to address differences here - as you mention the mobile apps are no longer maintained and we're not in a position to ship a new build. As noted before rendering will continue to function w/o issue!
    – Dean Ward
    Commented Jun 2, 2020 at 12:11
  • 2
    Thank you, that is really excellent work from your side. I like that "data based" approach!
    – GhostCat
    Commented Jun 2, 2020 at 12:25
  • 5
    There are three parts to this: 1) viewing posts on the site (ones made before and after the change), 2) using the in-app markdown editor, and 3) previewing posts/edits. With regards to 1, I don't expect breakage to occur, but it could happen if the CommonMark generated HTML differs enough to break the CSS. In this case, the page would just look ugly. With regards to 2, I don't expect any major issues since all the toolbar buttons should produce valid CommonMark. There might be a point where a button transforms existing CommonMark incorrectly, but that risk has existed since we added code fences. Commented Jun 2, 2020 at 20:01
  • 5
    With regards to 3, the iPhone and Android phone apps should be fine, as previews are done through API calls and server rendering. Any post you preview in that way should come out exactly as it looked in the preview. The iPad and (I believe) Android tablet apps use PageDown to render Markdown previews in realtime. These previews now risk being inaccurate when a user hits a Markdown/CommonMark difference. Again, this is an issue that exists today with code fences. If you go to the app today and preview a question with code fences and it looks fine, you shouldn't see a regression. Commented Jun 2, 2020 at 20:07
11

It looks like the Help Center articles (they're written in Markdown) need some love from the editing script as well. Example (this one was just edited, and edited again to correct for the migration, but it's logical to assume other pages will be affected too):

enter image description here

11

When editing a post, clicking the "Code Sample" icon in the toolbar still inserts the traditional indentation.

I think that less experienced users will mostly use the toolbar, so I would like the button to insert a code fence (```) instead.

enter image description here

4
  • 1
    I get where you're coming from but strictly speaking this button remains correct. We introduced fenced code blocks more than a year ago and while I personally favor them over indentation, indented or fenced code blocks are a matter of taste and both equally valid. There's no obvious "right" or "wrong" here so I'd argue we leave it as is for now.
    – Ham Vocke StaffMod
    Commented Jun 18, 2020 at 17:31
  • 6
    @HamVocke but indented is less functional now because there is no way to specify syntax highlighting. To me that would be a strong reason to prefer codefence via the button. Commented Jun 20, 2020 at 14:50
  • 1
    @StephenOstermiller that is a fair point indeed.
    – Ham Vocke StaffMod
    Commented Jun 20, 2020 at 15:37
  • 1
    I agree, as new users will simply use the button or use indentation, regardless of the language that it is supposed to be. Then it would be up to a reviewer to convert the code to the fencing before adding a language hint to the highlighter. Not nice. I've created my own post (overlooked this one, sorry) with some additional remarks about the language hint & the HTML tags displayed in the buttons. Commented Jun 23, 2020 at 11:03
10

As noticed by user musefan in this post:

Using two tilde signs no longer renders as strike-through text, but does render like that in post preview.

~~Click edit to see this text in strikethrough in the edit preview~~

1
  • 7
    Ah yeah. That's due to a mismatch of the currently activated feature set between our client- and server-side renderers. We didn't support a ~~ syntax for strikethrough before so we shouldn't blindly switch right now. Strikethrough can still be achieved using <del> HTML tags for the time being. We can revisit if we make ~~ (and potentially others) part of the supported markdown formatting options once the dust of this migration has settled. I'll get a fix out to align client-side and server-side renderers' behavior.
    – Ham Vocke StaffMod
    Commented Jun 17, 2020 at 8:59
9

Bit of a minor issue, but as I was updating this post, I noticed that the character sequence \$ was rendering as "$", instead of as "\$" as it was before the migration to CommonMark. It seems that in order to render the backslash prior to the dollar sign, one has to now escape the backslash by typing it twice (i.e. as \\), whereas this wasn't necessary in the prior renderer.

It seems that any symbol after a backslash will result in the backslash no longer rendering, e.g. \., \@, or \= all render as ".", "@", and "=", whereas they used to render as "\.", "\@", and "\=" respectively. (The same doesn't happen to numbers or letters.)

Can posts using these also please be fixed automatically by the migration script, if possible? (Note that those sequences within code markup will still render as before, and don't need to be fixed.)
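
As far as I can tell from the CommonMark spec, any ASCII punctuation character can be backslash-escaped, which matches what I'm seeing. Here is a small Python model of that rule, just to illustrate it. The function name is invented, and it ignores code spans, code blocks and every other context where escapes don't apply:

```python
import re
import string

# string.punctuation is the set of ASCII punctuation characters.
_ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def render_backslash_escapes(text: str) -> str:
    """Drop a backslash that precedes ASCII punctuation; keep all others."""
    return _ESCAPE.sub(lambda match: match.group(1), text)
```

Under this model `\$` comes out as `$`, `\\$` comes out as `\$`, and a backslash before a letter or digit is left alone.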

9

As I was typing another post today, I discovered another difference between the CommonMark renderer and the previous renderer: certain symbols after a URL (e.g. colons) used to be treated as not being part of the URL, but now they are treated as if they are. This seems to only affect the preview, and not the actual post.

Example:

As per our FAQ https://meta.stackexchange.com/questions/58587/what-are-the-reputation-requirements-for-privileges-on-sites-and-how-do-they-di:

...used to render as:

As per our FAQ What are the reputation requirements for privileges on sites, and how do they differ per site?:

...but now renders in the preview as:

As per our FAQ What are the reputation requirements for privileges on sites, and how do they differ per site?:

...with the colon being part of the actual link (as you can tell from the tooltip, or from clicking or copying the link).

This only happens in the post editor preview; on the actual post, both are rendered the same, with the colon not being part of the link.

Can this issue with the preview please be fixed?
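
For reference, the behavior I'd expect from the preview is roughly "trim trailing punctuation before linking". The Python sketch below is loosely modeled on the trailing-punctuation rule of the GitHub Flavored Markdown autolink extension; the function name and the exact character set are assumptions on my part, not what either Stack Exchange renderer actually does:

```python
# Characters that are not treated as part of a bare URL when they trail it
# (assumed set, borrowed from the GFM autolink extension).
TRAILING_PUNCTUATION = "?!.,:*_~"


def split_trailing_punctuation(url: str):
    """Return (link, trailing), where trailing punctuation is split off the URL."""
    link = url.rstrip(TRAILING_PUNCTUATION)
    return link, url[len(link):]
```

Applied to the FAQ example above, the colon would end up in the second element and stay outside the link.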

4
  • Seems to be fixed now for your : example. ( in URLs still breaks things in general, though, e.g. uops.info/… consumes a ) as part of the URL for [foo](https://example.com/foo?bar=xyz(r64 ), and the control-L link-at-the-bottom thing just plain doesn't work for it. Commented Jul 23, 2020 at 4:56
  • @PeterCordes Not fixed. This issue is about the editing preview; it looks fine in the actual post. Commented Jul 23, 2020 at 5:36
  • Ahh, I see I didn't read carefully enough when looking for existing reports of a problem I found: Editor preview shows links clickable, final rendered post doesn't, when surrounded by [] might be semi-related. Commented Jul 23, 2020 at 5:44
  • The link now renders in the preview without the colon (rather than the colon being a part of the actual link). Still a bug, though. Commented Apr 11, 2023 at 7:27
8

Code highlighting in the preview during post editing does not seem to work anymore. Upon publishing, it still works fine. To reproduce, just hit edit on this post and have a look at the preview.

from __future__ import braces

This is especially irritating because at least I can never remember on which sites I need to write ```python, on which I need ```lang-python and on which both work (or do I need a space or ...). Without a live preview I have to guess and maybe re-edit (as I did in this question).

4
  • 3
    Good catch. I've reproduced the bug and got a fix out for review. I could've sworn that I had tested this behavior but it looks like I missed it.
    – Ham Vocke StaffMod
    Commented Jun 11, 2020 at 8:12
  • 1
    A fix is going out right now, revision 2020.6.12.37041
    – Ham Vocke StaffMod
    Commented Jun 12, 2020 at 9:15
  • @HamVocke I'm not sure what "right now" means, but this bug persists. Where can I check the status of that revision?
    – schtandard
    Commented Jun 14, 2020 at 11:27
  • 2
    you can see the currently deployed revision in the footer at the bottom right of a page (where it says rev XYZ). Syntax highlighting in preview works for me but I acknowledge that sometimes it takes a while and some more typing to kick in (as was before switching to CommonMark). Looks like something we could tweak on another day.
    – Ham Vocke StaffMod
    Commented Jun 14, 2020 at 14:25
8

It's possible to post empty posts by using HTML comments. Example. Note that this has been repro'd on sites with and without CommonMark (https://puzzling.meta.stackexchange.com/posts/6925/revisions https://meta.stackoverflow.com/posts/398084/revisions - both links require 10k. Same basic idea though).

This is probably a regression - these types of posts used to be blocked before they were posted.

2
  • 11
    This sure does look like an issue. However, since we reproduced the issue on puzzling.SE, a site that's not yet migrated to CommonMark, we can be certain that the CommonMark migration won't be the cause of this issue.
    – Ham Vocke StaffMod
    Commented Jun 8, 2020 at 13:39
  • Related (MSE): Scan posts for nonsensical HTML comments Commented Jun 9, 2020 at 10:25
8

The Stack Overflow editing help page links to the original John Gruber markdown specs in the "Need more detail?" section.

Shouldn't it point to the CommonMark specs?

7

Couple of questions:

  1. Are you going to update the HTML (since it is cached) or just the raw Markdown code?
  2. Are you also going to add an entry in the edit history, presumably saying that Community made the change?
3
  • 16
    We will update the raw Markdown whenever it's safe to do so (i.e. the resulting HTML is equivalent to the old HTML). Technically this will rebake the HTML as well. We're storing the update as a new edit, triggered by the community user and without bumping the post itself.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 13:12
  • @HamVocke: Will you do some sort of automated checking that the old HTML is equivalent to the new HTML? (And maybe communicate whatever cannot be fixed to moderators?) As a MathOverflow and math.stackexchange user, I'm somewhat worried about posts suddenly becoming unreadable and staying so for years because no one with edit rights takes notice. Commented Jun 14, 2020 at 9:10
  • @darij meta.stackexchange.com/questions/348746/…
    – DavidG
    Commented Jun 14, 2020 at 11:36
7

Did I understand correctly that you will be fixing the compatibility issues listed automatically, like more indentation necessary for list paragraphs, quote markup before empty lines, missing spaces before headlines,...? It might very well be that the question covered that already under the general migration explanations, but I just want to make super sure that you've covered this. I would not want a load of misaligned paragraphs or multi-block quotes appearing all of a sudden in 10,000 existing posts.

4
  • 3
    Yes, the migration will fix compatibility issues in a post's Markdown automatically as well as we can. Chances are, there are some fixes that we haven't covered yet but most trivial issues will be fixed automatically. To reiterate: if we can't fix a post automatically and we discover that the new HTML would look different, we will not touch this post at all.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 14:04
  • 3
    @HamVocke, what does "not touch this post at all" mean here, exactly? that the source and rendered content would stay the same as they were, and it will blow up then if anyone edits it, when the new parser processes the source? :)
    – ilkkachu
    Commented Jun 1, 2020 at 14:13
  • 12
    @ilkkachu that's pretty spot on, yes. Whenever we find one of these posts, we'll save it so we can investigate what's going on. I expect that most of these occurrences will be trivial. The next person editing won't even notice that something's changed. For those cases where the post looks fundamentally different, we'll add more automatic fixes and re-run the migration. It's a matter of trade-offs. We could put in the work and fix every single issue carefully or focus on the big ones, ignore the trivial ones and cause mild inconvenience if someone edits a post moving forward.
    – Ham Vocke StaffMod
    Commented Jun 1, 2020 at 14:42
  • @HamVocke: Thanks for the explanation! That makes sense.
    – V2Blast
    Commented Jun 2, 2020 at 12:42
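
The rule described in the comments above (apply automatic fixes, keep the result only if the rendered HTML is unchanged, otherwise leave the post alone) could be sketched in Python like this. All names here are hypothetical, the renderers and fixes are passed in as plain callables, and strict string equality stands in for whatever "equivalent HTML" check the real migration uses:

```python
from typing import Callable, Iterable, Optional

Renderer = Callable[[str], str]
Fix = Callable[[str], str]


def migrate_post(markdown: str, old_render: Renderer,
                 new_render: Renderer, fixes: Iterable[Fix]) -> Optional[str]:
    """Return fixed Markdown, or None when the post should not be touched."""
    target_html = old_render(markdown)  # what readers see today
    candidate = markdown
    for fix in fixes:
        candidate = fix(candidate)  # e.g. add a missing space after '#'
    if new_render(candidate) == target_html:
        return candidate  # safe: the new renderer reproduces the old output
    return None  # leave the post as-is and log it for investigation
```

A `None` result corresponds to the "we will not touch this post at all" case: the stored HTML stays as it was until someone edits the post.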
7

A small discrepancy between preview and post that I noticed today on SO:

Something like http://localhost:3000 gets previewed as a link, but in the post it's regular text.

For example right now when writing this post: enter image description here

EDIT: In comments they do render as link by the way.

1
  • 2
    I reported a similar problem with links here. Commented Jul 1, 2020 at 12:41
6

Tabs are not handled properly anymore, which makes it difficult to format CommonMark source code properly.

Example 1

Using a tab to indent contents of a list does not work. It appears to be treated like one space. This is in conflict with the CommonMark specification. For example,

*——⇥test
———⇥
———⇥test

renders as:

  • test

    test

while it should render as:

  • test

    test

Example 2

Tabs in code environments are bluntly substituted by four spaces instead of making a four-space indentation. For example,

———⇥*——⇥test
———⇥———⇥test

renders as:

  • test test

while it should render as:

  • test test
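
To spell out what I mean by "four-space indentation": as I read the CommonMark spec, a tab in leading whitespace advances to the next multiple of four columns instead of counting as a fixed number of characters. A minimal Python sketch of that column arithmetic (the function name is mine, and this is not the renderer's code):

```python
def indent_width(line: str, tab_stop: int = 4) -> int:
    """Width in columns of the leading whitespace, with tabs as tab stops."""
    column = 0
    for char in line:
        if char == " ":
            column += 1
        elif char == "\t":
            # Advance to the next multiple of tab_stop.
            column += tab_stop - (column % tab_stop)
        else:
            break
    return column
```

So a lone tab and two spaces followed by a tab both indent to column 4, which is different from replacing every tab with four spaces (that would give 4 and 6).
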
3
  • 1
    I agree, it looks like we're missing something here and the new behavior is slightly different than the old one. For clarity: We've always replaced tabs with spaces when converting Markdown to HTML, with CommonMark in place, we continue to do so. Let me see what's happening here that makes indentation different from the old renderer.
    – Ham Vocke StaffMod
    Commented Jun 12, 2020 at 6:45
  • 2
    Got a fix out with the latest revision (2020.6.12.37042) that should address the issues you describe. If you edit your post, you should be able to see the new behavior.
    – Ham Vocke StaffMod
    Commented Jun 12, 2020 at 9:55
  • @HamVocke But there are often tabs in SO code blocks, how did they get there?
    – philipxy
    Commented Jun 22, 2020 at 3:46
