Free and Open Repository for the Hack for the Sea data. Sponsored by https://microshare.io
Python package for data warehouse, data lake and storage related tasks
Board Game Dataset Model, aggregated from two major board game data sources: BoardGameGeek and Board Game Atlas
My resume (feature branches may be more up-to-date than main).
A data lake hosted on an AWS EMR cluster, with S3 buckets used as source and output storage. The analysis was done using AWS Athena.
A deployable reference implementation that addresses pain points around conceptualizing data lake architectures. It automatically configures the core AWS services needed to tag, search, share, and govern specific subsets of data across a business or with external businesses.
This hands-on tutorial shows how to set up an Azure data lake in a few clicks and run queries on it.
Data Engineer (Udacity): Project 4, Data Lakes with Spark on Amazon Web Services (AWS)
A project using AWS to combine and transform streaming data for analytics.
Streamdata.io Stack Exchange Questions Streaming to Amazon S3 Data Lake Using Lambda
User activity insights | PySpark, AWS S3 & EMR