Question about the line-height allocation logic #291

Open
heweisheng opened this issue May 17, 2024 · 0 comments

Comments

@heweisheng

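I have recently been working on reconstructing scanned documents with OCR (PaddlePaddle's layout recognition, plus ReportLab to generate the reconstructed PDF). Laying out a PDF is fairly convenient right now, so I plan to produce the PDF first and then convert it with pdf2docx. (Writing a full OCR-driven layout step feels like something that could be built as a direct extension of this project, but I don't have time for that yet.)
Looking at the PDF parsing, a single paragraph can contain multiple lines, but in that case the line height should be divided evenly among the lines.
A concrete case where the problem shows up: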
test_7.pdf

image
image
Converted using this logic:
image
With the line height divided evenly:
image
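
To make the even split concrete, here is a minimal sketch in plain Python. It does not use pdf2docx's internal classes. `even_line_heights` and its parameters are hypothetical names for illustration only, and it assumes the paragraph's top and bottom coordinates are already known and that all lines use a similar font size.

```python
from __future__ import annotations


def even_line_heights(block_top: float, block_bottom: float, line_count: int) -> list[float]:
    """Split a paragraph's total height evenly across its lines.

    Hypothetical helper, not part of pdf2docx. Coordinates are in PDF
    points, with block_bottom >= block_top.
    """
    if line_count <= 0:
        raise ValueError("line_count must be positive")
    total_height = block_bottom - block_top
    if total_height < 0:
        raise ValueError("block_bottom must not be above block_top")
    # Every line gets the same share of the paragraph height.
    per_line = total_height / line_count
    return [per_line] * line_count


# A 60 pt tall paragraph with 3 lines: each line gets 20 pt.
heights = even_line_heights(100.0, 160.0, 3)
```

The per-line value could then be used as an exact line spacing for the paragraph, instead of assigning the whole block height to one line.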
Also, would it be possible to insert blank lines in between, so that the layout stays as close as possible to the original?
