Data Analysis Resources

Exploratory Data Analysis with pandas – 1

This post is part 1 of a series on exploratory data analysis with pandas. Clear plots that show the relationships between variables can lead to new and better features with more predictive power than the existing ones. Exploratory data analysis is effective if it has the following characteristics: • It should be fast, allowing […]
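As a rough illustration of the idea in this excerpt, the sketch below runs a quick first pass over a small table with pandas. The data is a hypothetical toy frame (the column names `x`, `y`, and `noise` are invented for this example and do not come from the post); it only shows how summary statistics and a correlation matrix can hint at which variables are related before any plotting.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (an assumption for illustration):
# "y" depends on "x", while "noise" is unrelated to both.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
df = pd.DataFrame({
    "x": x,
    "y": 2.0 * x + rng.normal(0, 1, size=200),
    "noise": rng.normal(0, 1, size=200),
})

# Fast first look: per-column summary statistics.
summary = df.describe()

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()

print(summary)
print(corr.round(2))
```

A strong correlation between two columns is a cue to plot them together and, possibly, to derive a new feature from the pair.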

Data Analysis Resources, Kaggle, Machine Learning

Facebook Data Analysis

In [20]:
import pandas as pd
import numpy as np

In [ ]:
# Take a few samples for the visualization
sample_fbcheckin_train_tbl = fbcheckin_train_tbl[:10000].copy()

In [21]:
df = pd.read_csv('train.csv', index_col='row_id')

In [22]:
df.head()

Out[22]:
row_id  x       y       accuracy  time    place_id
0       0.7941  9.0809  54        470702  8523065625
1       5.9567  4.7968  13        186555  1757726713
2       8.3078  7.0407  74        322648  1137537235
3       7.3665  2.5165  […]
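The excerpt takes its visualization sample with a slice of the first 10,000 rows. A minimal sketch of an alternative, assuming a random sample is wanted instead, is shown below. It uses a hypothetical toy frame rather than `train.csv`, and the sorted `time` column is an assumption made only to show how a head slice can differ from a random draw when rows happen to be ordered.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the check-in table (not the real train.csv).
rng = np.random.default_rng(42)
n = 50_000
toy = pd.DataFrame({
    "x": rng.uniform(0, 10, n),
    "y": rng.uniform(0, 10, n),
    "accuracy": rng.integers(1, 200, n),
    # Assumption for illustration: rows are ordered by time.
    "time": np.sort(rng.integers(0, 800_000, n)),
})
toy.index.name = "row_id"

# Slice of the first 10,000 rows, as in the excerpt.
head_sample = toy[:10000].copy()

# Random sample of the same size; random_state makes it reproducible.
random_sample = toy.sample(n=10000, random_state=0)

# With ordered rows, the head slice covers only the earliest times.
print(head_sample["time"].max(), random_sample["time"].max())
```

If the row order of the real file carries no meaning, the two approaches should give similar pictures, and the slice is the cheaper option.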

Machine Learning

Evaluating Machine Learning Algorithms

This post contains my notes on how to evaluate machine learning algorithms. I want to see how models compare with each other. It is based on the following web page: Your First Machine Learning Project in Python Step-By-Step. I am evaluating 6 different algorithms in this post: Logistic Regression (LR) […]
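One common way to compare several classifiers is to score each with the same cross-validation folds. The sketch below assumes scikit-learn and its bundled iris dataset; the dataset, the 10-fold setup, and the models other than logistic regression are illustrative choices, not necessarily the ones used in the post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Assumption: the iris dataset bundled with scikit-learn stands in for the data.
X, y = load_iris(return_X_y=True)

# Illustrative model set; only "LR" is named in the excerpt.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=0),
}

# Reuse one fold definition so every model sees identical splits.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.3f} ({scores.std():.3f})")
```

Reporting the standard deviation next to the mean helps show whether a gap between two models is larger than the fold-to-fold variation.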


Tutorial using Kobe Bryant Dataset – Part 4

This is Part 4 of a tutorial using the Kobe Bryant dataset. You can get the data from . What excited me is that this dataset is excellent for practicing classification basics, feature engineering, and time series analysis. This is continued from here.

Exploring the data

In [215]:
# Shot accuracy
sns.countplot(x='shot_made_flag', data=data)

Out[215]:
<matplotlib.axes._subplots.AxesSubplot […]