Reduce the file size after embedding a CJKV watermark with a custom font. #724

Open
bettershare opened this issue Oct 17, 2023 · 2 comments

bettershare commented Oct 17, 2023

Thank you very much for this great project, which was very helpful.

The problem I am encountering is that after adding a Chinese watermark with a custom font (KaiTi_GB2312), the file size increases significantly. By contrast, when Adobe Acrobat adds watermarks with the same settings, the file size barely increases.

Files as attached.

KaiTi_GB2312.zip : The Font File, size: 3.94MB

sample.pdf : The original PDF document, size: 62.3KB
sample-adobe-1.pdf : 1 watermark added with Acrobat, size: 66.8KB
sample-adobe-3.pdf : 3 watermarks added with Acrobat, size: 68.7KB

  1. Install the user font before adding the watermark:

pdfcpu font install .\KaiTi_GB2312.ttf

  2. Add one Chinese watermark with the pdfcpu CLI:

pdfcpu watermark add -m text -- "测试中文字体水印增加的文件大小\n2023-10-16" "font: KaiTi_GB2312, points: 36, scale: 1 abs, color: #ff0000, op: 0.3, ro: 30" .\sample.pdf .\sample_cpu.pdf
sample_cpu.pdf, size: 1.11MB

  3. Add a second watermark with the pdfcpu CLI:

.\pdfcpu.exe watermark add -m text -- "测试中文字体水印增加的文件大小\n2023-10-16" "font: KaiTi_GB2312, points: 36, scale: 1 abs, color: #ff0000, op: 0.3, ro: 30, offset: 0 220" .\sample_cpu.pdf .\sample_cpu2.pdf
sample_cpu2.pdf, size: 2.16MB

  4. Add the last watermark with the pdfcpu CLI:

.\pdfcpu.exe watermark add -m text -- "测试中文字体水印增加的文件大小\n2023-10-16" "font: KaiTi_GB2312, points: 36, scale: 1 abs, color: #ff0000, op: 0.3, ro: 30, offset: 0 -220" .\sample_cpu2.pdf .\sample_cpu3.pdf
sample_cpu3.pdf, size: 3.22MB
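
To make the growth explicit, here is a small sketch that computes the size increase per watermark from the figures reported above. It assumes 1 MB = 1024 KB; the function name `growth_per_step` is just for illustration.

```python
# File sizes reported in this issue, in KB (assumption: 1 MB = 1024 KB).
pdfcpu_sizes_kb = {
    "sample.pdf": 62.3,
    "sample_cpu.pdf": 1.11 * 1024,
    "sample_cpu2.pdf": 2.16 * 1024,
    "sample_cpu3.pdf": 3.22 * 1024,
}
acrobat_sizes_kb = {
    "sample.pdf": 62.3,
    "sample-adobe-1.pdf": 66.8,
    "sample-adobe-3.pdf": 68.7,
}


def growth_per_step(sizes):
    """Return the size increase in KB between consecutive files."""
    values = list(sizes.values())
    return [round(after - before, 1) for before, after in zip(values, values[1:])]


pdfcpu_deltas = growth_per_step(pdfcpu_sizes_kb)
acrobat_deltas = growth_per_step(acrobat_sizes_kb)

for name, delta in zip(list(pdfcpu_sizes_kb)[1:], pdfcpu_deltas):
    print(f"{name}: +{delta} KB")
for name, delta in zip(list(acrobat_sizes_kb)[1:], acrobat_deltas):
    print(f"{name}: +{delta} KB")
```

Each pdfcpu watermark adds roughly 1.05 MB, which is consistent with a font subset being embedded again on every run, while the Acrobat files grow by only a few KB.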

In the [Font] tab of [document property] in Adobe Acrobat, the file [-adobe-3.pdf] lists only one KaiTi_GB2312 entry, without the [subset embedded] label, while the file [_cpu3.pdf] lists three KaiTi_GB2312 entries, each with the [subset embedded] label.

Suggestion

  1. Would you consider adding a font embedding option with three values: none, only used characters, and full font? These would mean no embedding, embedding only the characters used, and embedding the complete font file, respectively. This follows Adobe Acrobat's strategy and would reduce the size of the final document, since many fonts are already available on most devices.
  2. In the file [_cpu3.pdf], the watermark contains both Chinese characters and Latin letters. The custom KaiTi_GB2312 font is applied to the Chinese characters, but it does not seem to be applied to the Latin letters. How about adding an option to choose fonts per script, with one font for Chinese characters and another for Latin letters?

Thank you very much!

Document properties of sample-adobe-3.pdf, not embedded, size is small

Document properties of sample_cpu3.pdf, embedded 3 times, size increased significantly

@hhrutter
Collaborator

Reusing embedded user fonts is missing during stamping/watermarking.
Thanks for the reminder 👍🏻

@hhrutter hhrutter reopened this Feb 6, 2024
@hhrutter
Collaborator

hhrutter commented Feb 6, 2024

With the latest commit you can enforce that the stamp/watermark command will NOT embed your user font.

On the CLI you can now set scriptname to one of the supported ISO-15924 font script codes:
HANS, HANT, HIRA, KANA, JPAN, HANG, KORE

You can do:
pdfcpu watermark add -m text -- "测试中文字体水印增加的文件大小\n2023-10-16" "font: KaiTi_GB2312, script:hans, points: 36, scale: 1 abs, color: #ff0000, op: 0.3, ro: 30" .\sample.pdf .\sample_cpu.pdf

Any user font name with the suffix GB2312 will be recognized as using script HANS and therefore not embedded,
so your original command will also work:
pdfcpu watermark add -m text -- "测试中文字体水印增加的文件大小\n2023-10-16" "font: KaiTi_GB2312, points: 36, scale: 1 abs, color: #ff0000, op: 0.3, ro: 30" .\sample.pdf .\sample_cpu.pdf
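
The rule described above can be sketched as follows. This is only an illustration of the described behavior, not pdfcpu's actual code: the names `resolve_script` and `should_embed` are hypothetical, and mapping the GB2312 suffix to HANS is an assumption based on the list of supported codes.

```python
# Supported ISO-15924 script codes, as listed in the comment above.
SUPPORTED_SCRIPTS = {"HANS", "HANT", "HIRA", "KANA", "JPAN", "HANG", "KORE"}

# Assumption: GB2312 is a simplified Chinese charset, so it maps to HANS.
SUFFIX_TO_SCRIPT = {"GB2312": "HANS"}


def resolve_script(font_name, script=None):
    """Return the script code for a user font, or None if none applies."""
    if script:
        code = script.upper()
        if code not in SUPPORTED_SCRIPTS:
            raise ValueError(f"unsupported script code: {script}")
        return code
    # No explicit script given: fall back to the font name suffix.
    for suffix, code in SUFFIX_TO_SCRIPT.items():
        if font_name.upper().endswith(suffix):
            return code
    return None


def should_embed(font_name, script=None):
    """Hypothetical rule: embed the font only when no script is resolved."""
    return resolve_script(font_name, script) is None


print(should_embed("KaiTi_GB2312"))          # script inferred from the suffix
print(should_embed("KaiTi_GB2312", "hans"))  # script set explicitly
```

Under these assumptions both calls report that the font is not embedded, matching the two commands shown above.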

As for 2): KaiTi_GB2312.ttf supports Latin numbers, so that's a non-issue.

Reusing a user font for stamping/watermarking remains open.
