ArtificialOSS/WebCrawl

WebCrawl

WebCrawl is a free and open-source tool that crawls websites and generates a large dataset you can use to train AI models.

It is inspired by CommonCrawl.
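
To illustrate what "crawling" means here, the following is a minimal sketch of a breadth-first crawl that stays on one host and collects page HTML. It is not taken from main.py: the names `LinkParser`, `crawl`, `fetch`, and `max_pages` are hypothetical, and it uses only the Python standard library so it runs without any installed packages.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl from start_url, restricted to the same host.

    `fetch` is a hypothetical callable (an assumption of this sketch):
    it takes a URL and returns the page HTML as a string, or None on failure.
    Returns a dict mapping each visited URL to its HTML.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

In real use, `fetch` could wrap an HTTP client, for example a small function that calls `requests.get(url, timeout=10)` and returns its `.text`. Passing the fetcher in as a parameter keeps the crawl logic testable without network access.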

Setup

You need the following Python packages:

requests
BeautifulSoup
tqdm
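
Before running the app, it can help to confirm that the three packages are importable. The sketch below is an optional check, not part of the project; the helper name `missing_packages` and the `REQUIRED` mapping are hypothetical. It relies on the fact that BeautifulSoup is imported as `bs4` and published on PyPI as `beautifulsoup4`.

```python
import importlib.util

# Maps each import name to the name used with pip.
# BeautifulSoup is imported as "bs4" and installed as "beautifulsoup4".
REQUIRED = {
    "requests": "requests",
    "bs4": "beautifulsoup4",
    "tqdm": "tqdm",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module, pip_name in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; install with: pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```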

Then run the app with:

python main.py