Categories
Data Analysis Resources Spark

Solutions to Support Real-Time Data Analytics

Organisations embracing big data use non-traditional strategies and technologies to collect, organize, process and extract insights from large datasets. However, these solutions do not support real-time analytics. Real-time analytics requires technology that can handle data generated at high velocity and sent simultaneously by its sources in small pieces. Data is required to be processed sequentially […]

Web Server Log Analysis with Spark

This lab will demonstrate how easy it is to perform web server log analysis with Apache Spark. Server log analysis is an ideal use case for Spark. Server logs are a very large, common data source and contain a rich set of information. Spark allows you to store your logs in […]

Building a word count application in Spark

These are my solutions for the Apache Spark lab. This lab will build on the techniques covered in the Spark tutorial to develop a simple word count application. The volume of unstructured text in existence is growing dramatically, and Spark is an excellent tool for analyzing this type of data. […]

Spark Tutorial: Learning Apache Spark

This post includes my solution for the edX course. This tutorial will teach you how to use Apache Spark, a framework for large-scale data processing, within a notebook. Many traditional frameworks were designed to be run on a single computer. However, many datasets today are too large to be stored on a […]