Conversion fails when a font name is in Chinese (e.g. “宋体”) #286

Open
hlhtddx opened this issue May 1, 2024 · 1 comment

hlhtddx commented May 1, 2024

As the title says, conversion fails with the following error when a font name is in Chinese (e.g. “宋体”):
bytes must be in range[0 to 255]
The error is raised at
https://github.com/ArtifexSoftware/pdf2docx/blame/master/pdf2docx/common/share.py#L128
When the font name is Chinese, ord(c) exceeds 255, so building the bytes object fails.

def decode(s:str):
    '''Try to decode a unicode string.'''
    b = bytes(ord(c) for c in s)  # error raised here
    for encoding in ['utf-8', 'gbk', 'gb2312', 'iso-8859-1']:
        try:
            res = b.decode(encoding)
            break
        except:
            continue
    return res
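
One possible way to avoid the exception is sketched below. It assumes that a name containing any code point above 255 is already correctly decoded text and can be returned unchanged. `decode_safe` is a hypothetical name used for illustration only; it is not part of pdf2docx and not necessarily how the maintainers would fix this.

```python
def decode_safe(s: str) -> str:
    '''Hypothetical variant of decode(): never raises on non-Latin-1 input.'''
    try:
        # latin-1 maps code points 0..255 one-to-one onto byte values,
        # which matches bytes(ord(c) for c in s) for such strings.
        raw = s.encode('latin-1')
    except UnicodeEncodeError:
        # Assumption: a code point above 255 (e.g. in '宋体') means the
        # string is already proper text, so return it as-is.
        return s
    for encoding in ('utf-8', 'gbk', 'gb2312', 'iso-8859-1'):
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return s
```

With this sketch, `decode_safe('宋体')` returns the name unchanged, while a name whose GBK bytes were read one byte per character, such as `'宋体'.encode('gbk').decode('latin-1')`, is still recovered through the existing encoding loop.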

hlhtddx commented May 1, 2024

One thing I left out: the problem only occurs with multiprocessing=True; single-process mode is not affected.
