Categories
Data Analysis Resources Machine Learning Predictive Analysis

Predictive Analysis, Binary Classification (Cookbook) – 1

This notebook contains my notes on predictive analysis for binary classification. It acts as a cookbook. Importing and Sizing Up a New Data Set: The file is comma delimited, with the data for one experiment occupying one line of text. This makes it a simple matter to read a line, split it on the comma […]
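The read-and-split step described in the excerpt can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: the sample text, the `size_up` helper, and the assumption that the label sits in the last column are all hypothetical.

```python
import io

# Hypothetical stand-in for the data file: one experiment per line,
# comma delimited, with the class label assumed to be the last field.
SAMPLE = "0.02,0.37,R\n0.45,0.11,M\n0.26,0.58,R\n"


def size_up(lines):
    """Return (row count, column count, label counts) for comma-delimited lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(line.split(","))  # split each line on the comma

    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0

    # Count how often each label (last field) appears.
    label_counts = {}
    for row in rows:
        label_counts[row[-1]] = label_counts.get(row[-1], 0) + 1
    return n_rows, n_cols, label_counts


# io.StringIO behaves like an open text file, so the same helper
# would also accept the result of open(path).
n_rows, n_cols, label_counts = size_up(io.StringIO(SAMPLE))
print(n_rows, n_cols, label_counts)
```

Knowing the row and column counts and the class balance up front is usually enough to decide how to split the data for a binary classifier.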

Categories
Data Analysis Resources Kaggle Machine Learning

Running Your First Notebook – Apache Spark

This notebook will show you how to install the course libraries, create your first Spark cluster, and test basic notebook functionality. To move through the notebook, just run each of the cells. You will not need to solve any problems to complete this lab. You can run a cell by pressing “Shift-Enter”, which will compute […]

Categories
Data Analysis Resources Kaggle Machine Learning

Facebook Data Analysis

In [20]:
    import pandas as pd
    import numpy as np
In [ ]:
    # Take a few samples for the visualization
    sample_fbcheckin_train_tbl = fbcheckin_train_tbl[:10000].copy()
In [21]:
    df = pd.read_csv('train.csv', index_col='row_id')
In [22]:
    df.head()
Out[22]:
    row_id       x       y  accuracy    time    place_id
    0       0.7941  9.0809        54  470702  8523065625
    1       5.9567  4.7968        13  186555  1757726713
    2       8.3078  7.0407        74  322648  1137537235
    3       7.3665  2.5165  […]

Categories
Data Analysis Resources Kaggle

Time Series Forecast using Kobe Bryant Dataset

This script is my attempt at time series analysis. Pandas has dedicated support for handling time series objects, particularly the datetime64[ns] dtype, which stores time information and allows us to perform some operations really fast.
In [40]:
    import pandas as pd
    import numpy as np
    #import matplotlib.pylab as plt
    #%matplotlib inline
    import seaborn as sns
    #from matplotlib.pylab […]
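As a rough illustration of why a datetime index is convenient, here is a small sketch using made-up data. The column names `game_date` and `shots` and the values are hypothetical, not taken from the Kobe Bryant dataset.

```python
import pandas as pd

# Hypothetical data: dates arrive as plain strings, as they would from a CSV.
raw = pd.DataFrame(
    {
        "game_date": ["2016-01-03", "2016-01-05", "2016-02-02"],
        "shots": [10, 14, 9],
    }
)

# Convert the strings to a datetime64 column and use it as the index.
raw["game_date"] = pd.to_datetime(raw["game_date"])
ts = raw.set_index("game_date")["shots"]

# With a DatetimeIndex, selecting by a partial date string and
# resampling by calendar period are one-liners.
january = ts.loc["2016-01"]          # all rows in January 2016
monthly = ts.resample("MS").sum()    # totals per month (month-start labels)

print(january.sum())
print(monthly.tolist())
```

Parsing the dates once up front, rather than comparing strings, is what makes the later slicing and resampling steps cheap.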

Categories
Data Analysis Resources

Coursera “The Data Scientist’s Toolbox”

Hi! I signed up for a Coursera course called The Data Scientist’s Toolbox by Johns Hopkins University. I liked the course for providing an easy and comprehensive list of the tools a data scientist requires. It covers various topics that you need to know as a data scientist. It is simple and concise with […]