HTML tags in blockquotes are not stripped #6

Closed
tdemin opened this issue Aug 12, 2021 · 3 comments · Fixed by #33
Labels
bug (Something isn't working), gomarkdown (Issue in upstream gomarkdown)

Comments

@tdemin
Owner

tdemin commented Aug 12, 2021

Initially discovered in #5.

Although (Renderer).paragraph() uses mostly the same logic as (Renderer).blockquote(), raw HTML is stripped from text paragraphs but not from blockquotes. This appears to be a gomarkdown issue.

Blockquote with an HTML line break

tdemin added the bug and gomarkdown labels Aug 12, 2021
@mntn-xyz
Contributor

mntn-xyz commented Sep 12, 2021

I think the problem is that blockquotes contain nested nodes. I fixed the issue here, but I feel like there's a cleaner way to do this, so I didn't want to make a PR yet: https://github.com/mntn-xyz/gmnhg/tree/blockquote

To fix it, I rendered the children recursively, and whenever a child was an HTMLBlock or HTMLSpan, I just rendered that node as a plain leaf, replacing the text with the Markdown content. There's definitely a better way to do this; I was just messing around to see if it could be fixed...
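
For readers following along, the recursive approach described above might look roughly like the sketch below. It is not the code from the linked branch: `Node`, `NodeKind`, `renderText`, `renderBlockquote`, and the `replaceHTML` callback are all hypothetical stand-ins for gomarkdown's AST types and the renderer's helpers, and the sketch uses only the standard library. It walks a container's children recursively, treats raw HTML nodes as plain leaves, and lets the caller decide what text replaces them.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeKind is a hypothetical stand-in for gomarkdown's AST node types.
type NodeKind int

const (
	KindText NodeKind = iota
	KindHTMLSpan
	KindHTMLBlock
	KindParagraph
	KindBlockquote
)

// Node is a toy AST node: containers carry Children, leaves carry Literal.
type Node struct {
	Kind     NodeKind
	Literal  string
	Children []*Node
}

// renderText collects the text of a node recursively. Raw HTML nodes are
// treated as leaves whose literal is passed through replaceHTML, so the
// caller decides what (if anything) ends up in the output.
func renderText(n *Node, replaceHTML func(raw string) string) string {
	switch n.Kind {
	case KindText:
		return n.Literal
	case KindHTMLSpan, KindHTMLBlock:
		return replaceHTML(n.Literal)
	}
	var sb strings.Builder
	for _, child := range n.Children {
		sb.WriteString(renderText(child, replaceHTML))
	}
	return sb.String()
}

// renderBlockquote prefixes every rendered line with "> ".
func renderBlockquote(n *Node, replaceHTML func(raw string) string) string {
	lines := strings.Split(renderText(n, replaceHTML), "\n")
	for i, line := range lines {
		lines[i] = "> " + line
	}
	return strings.Join(lines, "\n")
}

func main() {
	quote := &Node{Kind: KindBlockquote, Children: []*Node{
		{Kind: KindParagraph, Children: []*Node{
			{Kind: KindText, Literal: "first line"},
			{Kind: KindHTMLSpan, Literal: "<br>"},
			{Kind: KindText, Literal: "second line"},
		}},
	}}
	// Assumption for this example: <br> becomes a newline, other tags vanish.
	replace := func(raw string) string {
		if raw == "<br>" {
			return "\n"
		}
		return ""
	}
	fmt.Println(renderBlockquote(quote, replace))
}
```

Passing the replacement policy as a callback keeps the tree walk separate from the decision about which tags to drop, which is one way the paragraph and blockquote paths could share logic.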

@tdemin
Owner Author

tdemin commented Sep 13, 2021

@mntn-xyz this looks like it could be unified with the container branch of textWithNewlineReplacement, which would also define the future behavior of general text containing HTML tags. The weird behavior still holds, though: *ast.HTMLSpan nodes and HTML blocks are trimmed from general text, yet they still show up among the blockquote's children in the AST.

If anything, this is probably good: it makes up for possible future fixes landing in gomarkdown.

@mntn-xyz
Contributor

Makes sense to me, I'll put together a patch sometime this week.

tdemin closed this as completed in #33 Oct 2, 2021
tdemin pushed a commit that referenced this issue Oct 2, 2021
This makes the renderer print the content of informational
HTML tags while stripping the tags themselves.

Tags like script, iframe, style, etc, which are unlikely to
ever hold presentable content, are exempt from this, and
their content is skipped from rendering as well as the tags
themselves.

<br>, a hard-break tag, is supported as a Markdown
hard-break replacement (the two spaces before newline).

This also adds tests for this behavior inside general_text.md.

Fixes #6, a longstanding issue with inline HTML in
blockquotes.
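
The three rules in the commit message above can be illustrated with a small standard-library sketch. This is not the code merged in #33, which presumably works on AST nodes rather than raw strings. The function name `stripHTML`, the regular expressions, and the exact list of skipped tags are assumptions made for illustration (the commit message only says "script, iframe, style, etc"), and regex-based tag handling is a deliberate simplification.

```go
package main

import (
	"fmt"
	"regexp"
)

// skippedTags lists elements whose content is dropped along with the tags.
// Assumption: the real list is longer; these three come from the commit text.
var skippedTags = []string{"script", "iframe", "style"}

var (
	skipped   []*regexp.Regexp
	hardBreak = regexp.MustCompile(`(?i)<br\s*/?>`)
	anyTag    = regexp.MustCompile(`</?[a-zA-Z][^>]*>`)
)

func init() {
	// One pattern per tag, since Go's regexp package has no backreferences.
	for _, tag := range skippedTags {
		pattern := `(?is)<` + tag + `\b[^>]*>.*?</` + tag + `\s*>`
		skipped = append(skipped, regexp.MustCompile(pattern))
	}
}

// stripHTML applies the three rules in order:
//  1. remove skipped elements together with their content,
//  2. turn <br> into a Markdown hard break (two spaces before the newline),
//  3. drop all remaining tags while keeping their content.
func stripHTML(s string) string {
	for _, re := range skipped {
		s = re.ReplaceAllString(s, "")
	}
	s = hardBreak.ReplaceAllString(s, "  \n")
	return anyTag.ReplaceAllString(s, "")
}

func main() {
	fmt.Printf("%q\n", stripHTML("a <b>bold</b> word<br>next<script>alert(1)</script>!"))
}
```

The order matters in this sketch: skipped elements are removed first so that their content never reaches the generic tag-stripping step.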