explosion spaCy Language-support Discussions
🌍 Language Support Discussions
Discuss the language data and training models for new languages
Pinned to Language Support
- 🌍 Adding models for new languages master thread
  Labels: enhancement (Feature requests and improvements); lang / all (Global language data); new language (Adding support for new languages to spaCy)
Discussions
- 🌍 Arabic language support
  Labels: lang / ar (Arabic language data and models)
- 🌍 Why does the German sentence tokenizer consider a semicolon a sentence ending?
  Labels: lang / de (German language data and models); feat / tokenizer (Feature: Tokenizer)
- 🌍 Other Languages Support
  Labels: models (Issues related to the statistical models)
- 🌍 Portuguese words starting with a capital letter are not correctly lemmatized
  Labels: lang / pt (Portuguese language data and models); feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
- 🌍 Adding support for Tibetan in spaCy
  Labels: new language (Adding support for new languages to spaCy)
- 🌍 Feedback on alpha Finnish, Korean and Swedish trained pipelines
  Labels: enhancement (Feature requests and improvements); lang / ko (Korean language data and models); lang / sv (Swedish language data and models); lang / fi (Finnish language data and models); v3.3 (Related to v3.3)
- 🌍 English models' Accuracy Evaluation values
  Labels: lang / en (English language data and models)
- 🌍 Update Russian library
  Labels: lang / ru (Russian language data and models); third-party (Third-party packages and services); feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
- 🌍 Floret vectors for Italian
  Labels: training (Training and updating models); feat / vectors (Feature: Word vectors and similarity)
- 🌍 Improving Bengali Stopwords collection and Exception
  Labels: lang / bn (Bengali language data and models); new language (Adding support for new languages to spaCy)
- 🌍 Connected words in Portuguese
  Labels: lang / pt (Portuguese language data and models)
- 🌍 Training coreference resolver on Italian Ontonotes produces low scores
  Labels: training (Training and updating models); feat / coref (Feature: Coreference resolution)
- 🌍 Chinese word segmentation model for spaCy
  Labels: models (Issues related to the statistical models); lang / zh (Chinese language data and models); third-party (Third-party packages and services)
- 🌍 Italian lemmatizer low performance on agglutinated verbs
  Labels: lang / it (Italian language data and models); feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
- 🌍 How do I use url_match with Japanese?
  Labels: lang / ja (Japanese language data and models); feat / tokenizer (Feature: Tokenizer)
- 🌍 Thai Language not working properly
  Labels: lang / th (Thai language data and models)
- 🌍 Losing POS Tagging & Other Token Attributes when Segmenting with Jieba or Pkuseg
  Labels: usage (General spaCy usage); feat / tokenizer (Feature: Tokenizer)
- 🌍 Chinese tokenization is bad
  Labels: lang / zh (Chinese language data and models)
- 🌍 Support for Balochi
  Labels: meta (Meta topics, e.g. repo organisation and issue management); new language (Adding support for new languages to spaCy)
- 🌍 Sentiment analysis for all standard spaCy language models
  Labels: feat / textcat (Feature: Text Classifier)
- 🌍 xx_sent_ud_sm bad sentence split
  Labels: models (Issues related to the statistical models); lang / zh (Chinese language data and models); lang / xx (Multi-language data and models); feat / senter (Feature: Sentence Recognizer)
- 🌍 Support for Hebrew
  Labels: models (Issues related to the statistical models); lang / he (Hebrew language data and models)
- 🌍 Slovenian: Feedback on alpha trained pipelines for upcoming spaCy v3.6
  Labels: models (Issues related to the statistical models); lang / sl (Slovenian language data and models); v3.6 (Related to v3.6)