This repository has been archived by the owner on Nov 25, 2023. It is now read-only.

UTF-8 normalization #11

Open
d47081 opened this issue Sep 6, 2023 · 0 comments

Comments

@d47081
Collaborator

d47081 commented Sep 6, 2023

Some pages use an unsupported character set, which causes a DB error in the crawler.

This has been temporarily fixed, but it needs a proper solution (one that does not change the DB character set).

I found a useful library that could solve this problem:
https://github.com/neitanod/forceutf8
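
To illustrate the general idea (convert page content to valid UTF-8 before it reaches the DB, instead of changing the DB character set), here is a minimal sketch in Python. It is not the forceutf8 API and not the crawler's actual code: the function name `to_utf8_text` and the fallback encoding list are assumptions made up for this example.

```python
from typing import Sequence


def to_utf8_text(raw: bytes, fallbacks: Sequence[str] = ("cp1252", "latin-1")) -> str:
    """Decode bytes of unknown encoding into text that can be stored as UTF-8.

    Hypothetical helper for illustration only; the fallback list is an
    assumption, not something taken from the crawler or from forceutf8.
    """
    # Most pages are already valid UTF-8, so try that first.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # Otherwise try each legacy single-byte encoding in order.
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    # Last resort: keep what is decodable and mark the rest with U+FFFD.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    page_bytes = "café".encode("cp1252")  # simulated non-UTF-8 page content
    text = to_utf8_text(page_bytes)
    # The result always re-encodes cleanly, so the DB insert gets valid UTF-8.
    print(text.encode("utf-8"))
```

The crawler would call something like this on the fetched title, description, and body before the insert, so the DB only ever sees valid UTF-8.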

d47081 pushed a commit that referenced this issue Sep 6, 2023