Udacity Data Engineering Capstone Project
Project Summary The project follows these steps: Step 1: Scope the Project and Gather Data. Step 2: Explore and Assess the Data. Step 3: Define the Data Model. Step 4: Run ETL…
Data is often reduced to what can fit into a mathematical model. Yet, taken out of context, data may lose its meaning. Ethics, privacy, and data protection issues are often…
Introduction Cassandra is a distributed, continuously available, and scalable NoSQL database with no single point of failure. It manages large amounts of data across many data centres and cloud servers. It…
Web Server Log Analysis with Spark This lab will demonstrate how easy it is to perform web server log analysis with Apache Spark. Server log analysis is an ideal use…
These are my solutions for Apache Spark. Building a word count application in Spark This lab will build on the techniques covered in the Spark tutorial to develop a simple…
Spark Tutorial: Learning Apache Spark includes my solution for the edX course. This tutorial will teach you how to use Apache Spark, a framework for large-scale data processing, within a…