Qubole Delta Lake Spark Streaming ingestion end-to-end demo
Example of how to use Kafka and Spark to handle streaming submissions of URLs.
Data science Docker environment: a Spark cluster with a JupyterLab interface
A transformation pipeline for Delta Lake using AWS SDK for Pandas
Distributed Systems - Principles and Paradigms
Spark Structured Streaming application transferring Avro data from Kafka with Schema Registry to Delta Lake
Type annotations for delta-spark
Data pipeline that processes Formula 1 data with Azure Databricks, Delta Lake, and Azure Data Factory
Data Streaming with Debezium, Kafka, Spark Streaming, Delta Lake, and MinIO
Example of a local PySpark setup, including Delta Lake, for unit testing
🛸 This project showcases an Extract, Load, Transform (ELT) pipeline built with Python, Apache Spark, Delta Lake, and Docker. The objective of the project is to scrape UFO sighting data from NUFORC and process it through the Medallion architecture to create a star schema in the Gold layer that is ready for analysis.
Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations should be performed.
Introducing Delta-Buddy: Your ultimate Delta Lake companion! 🚀 Streamline your data journey with an AI-powered chatbot. Ask Delta-Buddy anything about your Delta Lake.