New post add-in strips helpful comments from YAML #560

Open
apreshill opened this issue Jan 5, 2021 · 16 comments
Labels
bug an unexpected problem or unintended behavior next to consider for next release

Comments

@apreshill
Contributor

In the academic theme, the archetype files are helpfully commented like:

https://github.com/rbind/apreshill/blob/master/archetypes/post/index.md

---
# Documentation: https://sourcethemes.com/academic/docs/managing-content/

title: "{{ replace .Name "-" " " | title }}"
subtitle: ""
summary: ""
authors: []
tags: []
categories: []
date: {{ .Date }}
lastmod: {{ .Date }}
featured: false
draft: false
disable_jquery: false

# Featured image
# To use, add an image named `featured.jpg/png` to your page's folder.
# Focal points: Smart, Center, TopLeft, Top, TopRight, Left, Right, BottomLeft, Bottom, BottomRight.
image:
  caption: ""
  focal_point: ""
  preview_only: false

# Projects (optional).
#   Associate this post with one or more of your projects.
#   Simply enter your project's folder or file name without extension.
#   E.g. `projects = ["internal-project"]` references `content/project/deep-learning/index.md`.
#   Otherwise, set `projects = []`.
projects: []
---

But when I use the new-post addin I see:

---
title: Test
author: Alison Hill
date: '2021-01-05'
slug: []
categories: []
tags: []
subtitle: ''
summary: ''
authors: []
lastmod: '2021-01-05T07:35:43-05:00'
featured: no
image:
  caption: ''
  focal_point: ''
  preview_only: no
projects: []
---

Can blogdown preserve comments in the YAML here? Oftentimes the comments are the 'documentation' for the theme.

apreshill added the bug label (an unexpected problem or unintended behavior) on Jan 5, 2021
@yihui
Member

yihui commented Jan 5, 2021

Unfortunately, no. The New Post add-in calls new_post(), which creates a new post with the hugo command hugo new, then modifies its YAML (e.g., adding title, author, tags, slug, etc.). To modify YAML, the YAML data needs to be read, modified, and written back. This process can't preserve comments (yaml::yaml.load() then yaml::as.yaml()).

You can avoid the modification by calling blogdown::new_content() instead, but then you won't be able to specify any extra information like the author or tags.

I could use a hack to try to preserve comments if necessary, but it won't be robust. For a robust solution, we'll need the yaml package to support preserving comments.
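The round trip described above can be reproduced in a few lines. This is only a minimal sketch: it assumes the yaml package is installed, and the front matter string is a shortened, made-up variant of the archetype at the top of this issue.

```r
# Assumption: the yaml package is installed; `x` is illustrative front matter.
x = '# Featured image\nimage:\n  caption: ""\n\n# Projects (optional).\nprojects: []\n'

data = yaml::yaml.load(x)  # parse to an R list; the parser discards comments
data$title = 'Test'        # modify a field, similar to what new_post() does
out = yaml::as.yaml(data)  # serialize the list back to YAML text

cat(out)
# Check whether any comment marker survived the round trip
grepl('#', out, fixed = TRUE)
```

Because the comments never reach the R list, nothing in the writing step can put them back without extra bookkeeping.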

@cderv
Collaborator

cderv commented Jan 6, 2021

we'll need the yaml package to support preserving comments.

Seems like a limitation of upstream libyaml

@apreshill
Contributor Author

The same problem occurs with new_site() with the config.yaml file. My theme's actual config.toml file has lots of helpful comments, but they all get stripped out in the config.yaml file that blogdown delivers.

@cderv
Collaborator

cderv commented Jan 13, 2021

I think this will happen with any TOML-to-YAML conversion. Avoiding it would require keeping TOML for your website (format = "toml" in new_site()), or using a Hugo theme that already uses YAML, so that no conversion is needed.

@apreshill
Contributor Author

Yes, it is just unfortunate as it is one of those instances where you need to know a lot about Hugo and your theme to know that the comments are missing, and to know to not do the conversion. I cannot imagine starting with blogdown + a new theme, and not seeing any comments in any of the YAMLs 😱 You would be pretty lost.

@cderv
Collaborator

cderv commented Jan 13, 2021

That is not an easy one, but as @yihui said earlier, we may be able to use a hack to insert the comments back... 🤔

@apreshill
Contributor Author

apreshill commented Jan 13, 2021

If not, I would lean towards sticking with the original format the theme author used. A theme author has no way to document the metadata settings other than comments, and most of the work in using a Hugo theme is populating those YAML keys.

In addition, with academic, for example, you still have TOML in config/_default to contend with.

@apreshill
Contributor Author

Related: gohugoio/hugo#4520

@cderv
Collaborator

cderv commented Jan 13, 2021

@yihui was the conversion to YAML done in the first place because the format is closer to what a user would know?

If we stick to YAML as the default for new_site(), we could add an option to prevent the conversion, which a user could set for all their R sessions through a global R option in ~/.Rprofile. Would that be useful, @apreshill, or not ideal either?

@apreshill
Contributor Author

I think the problem is that this particular pain point will happen too early - you have to be pretty advanced to know there is a problem, and then to figure out a workaround. Unless we can preserve comments, I don't think the files should be automatically converted from TOML to YAML. We should keep whatever format the theme author has in place.

@yihui
Member

yihui commented Jan 13, 2021

I don't have a strong opinion on whether the conversion to YAML should be done by default. I'm fine with leaving config.toml as is. Personally I wish all theme authors could stick with YAML, but apparently that is just a dream. As I have said a few times before, YAML is bad enough in my eyes, but users have painfully learned it and become familiar (maybe?) with it through the years. Things only become worse with the introduction of "yet another markup language", i.e., TOML (remember that YAML originally stood for "yet another markup language", ironically; and equally ironically, Tom's "obvious" language may not be so "obvious"). I'd rather stay with one bad choice than let another, not so obviously better, choice coexist with it, since the latter would be even more confusing and difficult. Unfortunately, Netlify has decided to abandon netlify.yaml, so we have to use netlify.toml anyway. Other than this file, I think we can use YAML in all other places.

From the technical point of view, it's easier to deal with YAML in R because the yaml package has existed for a long time. RcppTOML is relatively new (because TOML is new), and more importantly, it doesn't have a writer but only a reader. I have write_toml() in blogdown based on a hack, though. Even if RcppTOML did have a writer, I'd still be hesitant about introducing a new R package dependency.

BTW, interestingly enough, bep's hack gohugoio/hugo#4520 is actually what I used in the formatR package to preserve comments. However, I doubt if that would work for TOML/YAML comments, because keeping the content of comments is easy, but finding the right location in the fields to restore the comments will be tricky.

@apreshill
Contributor Author

I don't disagree, but I do think a YAML minus comments is more confusing than TOML, if I must pick 👼

@yihui
Member

yihui commented Jan 13, 2021

Sure! I definitely agree.

@yihui
Member

yihui commented Jan 14, 2021

@cderv I'd like to leave this task to you if you are interested. Here is the hack I have in mind:

  1. Read the YAML/TOML data as text, and identify comments (only consider block comments of the pattern ^# .+ for now, and may consider inline comments in the future).
  2. Obtain the name of the next field (for YAML, match ^([[:alpha:]]+): ; for TOML, match = instead of the colon), and associate the comment with the name, e.g., list(image = "# Featured image").
  3. After the conversion or processing of YAML/TOML data, insert the comments back if possible: look for the field name, and add the comments before that field.

This hack assumes that comments come before a field. If this assumption doesn't hold, we'll either lose comments, or insert comments in the wrong place.

There is a special case to consider. If the first comment is followed by an empty line, we just keep this comment in the beginning, and do not associate it with a certain field, such as the example at the top of this issue (# Documentation ...).

I guess this would largely solve the problem. We could improve it in the future as edge cases show up.

The list of comments could be stored in an attribute of the data. Then before converting or processing the data, we extract the attribute. Then we try to restore the comments using this attribute. Here is a sketch:

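# extract_comments() and insert_comments() are helpers still to be written
read_yaml = function(x, ...) {
  res = yaml::yaml.load(x)
  com = extract_comments(x)  # collect comments from the raw text
  attr(res, 'comments') = com
  res
}

write_yaml = function(x, ...) {
  com = attr(x, 'comments')  # read the attribute from the input data
  res = yaml::as.yaml(x)
  res = insert_comments(res, com)
  res
}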
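The helpers extract_comments() and insert_comments() are named in the sketch but not defined anywhere in this thread. One possible base R version of the three steps above is shown here, purely as an illustration and not as blogdown's implementation. It makes several assumptions: only top-level YAML fields are considered, the field pattern is widened to allow digits, underscores, and hyphens (e.g., disable_jquery), the input has no --- delimiters, and a leading comment followed by an empty line is stored under the made-up name .header.

```r
# Hypothetical helper: map each block of "# ..." comments to the next field.
extract_comments = function(x) {
  lines = unlist(strsplit(x, '\n', fixed = TRUE))
  res = list(); buf = character(); first = TRUE
  for (line in lines) {
    if (grepl('^# .+', line)) {
      buf = c(buf, line)  # accumulate consecutive comment lines
    } else if (grepl('^[[:alnum:]_-]+:', line)) {
      key = sub(':.*$', '', line)  # top-level field name
      if (length(buf) > 0) res[[key]] = buf
      buf = character(); first = FALSE
    } else if (grepl('^\\s*$', line) && first && length(buf) > 0) {
      # special case: a leading comment followed by an empty line
      res[['.header']] = buf
      buf = character(); first = FALSE
    }
  }
  res
}

# Hypothetical helper: put the comments back before their fields.
insert_comments = function(x, com) {
  if (length(com) == 0) return(x)
  lines = unlist(strsplit(x, '\n', fixed = TRUE))
  out = com[['.header']]
  if (!is.null(out)) out = c(out, '')
  for (line in lines) {
    key = if (grepl('^[[:alnum:]_-]+:', line)) sub(':.*$', '', line) else ''
    out = c(out, if (nzchar(key)) com[[key]], line)
  }
  paste(out, collapse = '\n')
}

# Usage: `src` mimics the archetype; `processed` mimics rewritten YAML.
src = paste(
  '# Documentation: https://sourcethemes.com/academic/docs/managing-content/',
  '', 'title: Test', '',
  '# Featured image', '# To use, add an image to the folder.',
  'image:', '  caption: ""', 'projects: []', sep = '\n'
)
com = extract_comments(src)
processed = "title: Test\nimage:\n  caption: ''\nprojects: []\n"
restored = insert_comments(processed, com)
cat(restored, '\n')
```

As noted above, this relies on comments preceding a field. A comment whose field disappears during processing is silently dropped, and inline comments or comments on nested fields are ignored.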

@cderv
Collaborator

cderv commented Jan 14, 2021

Sure! I'll handle it.

cderv self-assigned this on Jan 14, 2021
yihui added the next label (to consider for next release) on Apr 15, 2021
@cderv
Collaborator

cderv commented Feb 7, 2022

Another related issue where YAML is not correctly rewritten: #684
