Pandoc --version reports commitBuffer: invalid argument (cannot encode character '\248') on Windows if home folder includes non-ascii characters #9686

Open
llob opened this issue Apr 23, 2024 · 3 comments

llob commented Apr 23, 2024

Using the latest version of Pandoc on Windows 10, executing pandoc --version results in the following output:

pandoc 3.1.13
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: C:\Users\Spandoc: : commitBuffer: invalid argument (cannot encode character '\248')

The problem appears to be that my Windows home folder name contains the Danish character 'ø' (right after the 'S').
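
For reference, the '\248' in the error message is a decimal character escape: code point 248 is U+00F8, which is 'ø'. The short Python sketch below checks that mapping and reproduces the same class of failure with an encoder that cannot represent the character. Using ASCII as the target encoding is an assumption for illustration only; the thread does not establish which encoding pandoc's stdout handle actually had. The helper name `can_encode` is hypothetical.

```python
import unicodedata

# '\248' in the error is a decimal escape: code point 248 == U+00F8.
char = chr(248)
print(hex(ord(char)), unicodedata.name(char))


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Path prefix as it appears in the report, up to and including the 'ø'.
path_prefix = "C:\\Users\\S" + char

# ASCII is an illustrative stand-in for an output encoding that lacks 'ø'.
print(can_encode(path_prefix, "ascii"))
print(can_encode(path_prefix, "utf-8"))
```

The output in the report stops exactly at the character that fails this check, which is consistent with the writer aborting on the first character it cannot encode.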

This would be a minor issue, except that the Python library "pandoc" calls pandoc --version on startup to determine which version is installed, so the library is effectively unusable on such a system.

This occurs with version 3.1.13 of Pandoc on Windows 10.
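
As a possible stopgap on the calling side, the version number is still present on the first line of the truncated output shown above. The sketch below is a hypothetical helper (`parse_pandoc_version` is not part of the Python "pandoc" library, and this is not how that library works internally). It assumes the caller has already captured whatever pandoc wrote to stdout before the error.

```python
import re
from typing import Optional


def parse_pandoc_version(output: str) -> Optional[str]:
    """Extract the version from the first line of `pandoc --version` output.

    Only the first line is inspected, so output that is cut off later
    (as in this report) can still be parsed.
    """
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"pandoc(?:\.exe)?\s+(\d+(?:\.\d+)*)", first_line)
    return match.group(1) if match else None


# The truncated output from this report, up to the point where it breaks off.
truncated = (
    "pandoc 3.1.13\n"
    "Features: +server +lua\n"
    "Scripting engine: Lua 5.4\n"
    "User data directory: C:\\Users\\S"
)
print(parse_pandoc_version(truncated))
```

This only sidesteps the symptom for callers that need the version; it does nothing about the encoding error itself.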

llob added the bug label Apr 23, 2024

jgm (Owner) commented Apr 23, 2024

What is the most recent version where this does not happen?

llob (Author) commented Apr 24, 2024

I have tested a few versions, and it seems that the problem first occurred in version 3.0.
Here are the outputs from the latest 2.x version and the first 3.x version:

pandoc.exe 3.0
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: C:\Users\Spandoc.exe: <stdout>: commitBuffer: invalid argument (invalid character)

pandoc.exe 2.19.2
Compiled with pandoc-types 1.22.2.1, texmath 0.12.5.2, skylighting 0.13,
citeproc 0.8.0.1, ipynb 0.2, hslua 2.2.1
Scripting engine: Lua 5.4
User data directory: C:\Users\S├╕renBollOvergaard\AppData\Roaming\pandoc
Copyright (C) 2006-2022 John MacFarlane. Web:  https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

jgm (Owner) commented Apr 24, 2024

Thanks for helping to pin that down.

I note that in 2.19.2 the user data directory doesn't appear correctly: the ø has been garbled.
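
The garbled form is consistent with the UTF-8 bytes of 'ø' being displayed through an OEM code page. The Python sketch below reproduces the exact two characters seen in the 2.19.2 output under the assumption that the console was using code page 437; that code page is a guess based on the characters shown, not something confirmed in this thread.

```python
# 'ø' (U+00F8) written as an escape so the source itself stays ASCII.
original = "S\u00f8ren"

# UTF-8 encodes 'ø' as the two bytes 0xC3 0xB8.
utf8_bytes = original.encode("utf-8")

# Interpreting those bytes as code page 437 (assumed console code page)
# turns each byte into a separate box-drawing character.
garbled = utf8_bytes.decode("cp437")

print(utf8_bytes)
print(ascii(garbled))
```

Under that assumption, 2.19.2 was writing UTF-8 bytes that the console rendered with the wrong code page, while 3.x instead fails when it tries to encode the character.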

This looks like an encoding issue. Unfortunately, I don't know much about how encodings work on Windows. Do you have a working Haskell setup, by any chance, so that you could compile revised code and tell me whether it helps?

I suspect the issue has to do with the putStr at
https://github.com/jgm/pandoc/blob/main/pandoc-cli/src/pandoc.hs#L97-L103
and might go away if we replace this with UTF8.putStr (import qualified Text.Pandoc.UTF8 as UTF8).
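
The idea behind that suggestion can be illustrated by analogy in Python: writing text through a handle with a fixed UTF-8 encoding succeeds where a handle with a narrower encoding raises. This is only an analogy, not the proposed Haskell change, and whether switching to UTF8.putStr actually resolves the issue on Windows is untested here. The helper name `write_text` and the use of ASCII as the narrow encoding are assumptions for illustration.

```python
import io


def write_text(text: str, encoding: str) -> bytes:
    """Write `text` through a text handle with the given encoding.

    Returns the bytes that reached the underlying stream, or raises
    UnicodeEncodeError if the handle's encoding cannot represent the text.
    """
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding=encoding, errors="strict")
    writer.write(text)
    writer.flush()
    data = raw.getvalue()
    writer.detach()  # leave `raw` alone once the bytes are copied out
    return data


line = "User data directory: C:\\Users\\S\u00f8ren"

# A handle fixed to UTF-8 accepts the character.
print(write_text(line, "utf-8"))

# A handle with a narrower encoding (ASCII as a stand-in) rejects it.
try:
    write_text(line, "ascii")
except UnicodeEncodeError as err:
    print("encode failed:", err.reason)
```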
