Categories
Data Analysis Resources

Visualise Categorical Variables in Python

It is crucial to learn how to deal with categorical variables, as they are known to hide and mask lots of interesting information in a data set. A categorical variable identifies the group to which an observation belongs. You could categorise persons according to their race or ethnicity, cities according to their geographic […]
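As a rough illustration of what "identifying a group" looks like in practice, here is a minimal sketch using pandas. The names, cities and column labels are made up for this example and do not come from the post itself.

```python
import pandas as pd

# Hypothetical data: made-up people and cities, purely for illustration.
people = pd.DataFrame({
    "name": ["Ana", "Ben", "Chloe", "Dev", "Ella", "Finn"],
    "city": ["Dublin", "Cork", "Dublin", "Galway", "Dublin", "Cork"],
})

# Tell pandas the column is categorical rather than free text.
people["city"] = people["city"].astype("category")

# Count how many rows fall into each category, most frequent first.
city_counts = people["city"].value_counts()
print(city_counts)
```

If matplotlib is installed, `city_counts.plot(kind="bar")` is one simple way to turn these counts into a bar chart.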

Categories
Competition Notes Data Analysis Resources Personal Stories

Exploratory Data Analysis for Hostelworld Challenge

I recently took part in a challenge by Hostelworld. The challenge proposed by Hostelworld is to build a recommendation engine for users. Recommendations can save travellers valuable time, improve their hostel experience, and increase user retention. This challenge will use user information, reviews, and hostel details. This is a link for Exploratory […]

Categories
Data Analysis Resources Kaggle

Visualisation of House Prices

Visualisation is the presentation of data in a pictorial or graphical format. It enables decision makers to see analytics presented visually, so they can grasp difficult concepts or identify new patterns. This visualisation of house prices is for the Kaggle dataset. With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, […]

Categories
Data Analysis Resources Kaggle

4 different ways to predict survival on Titanic – part 4

Continued from part 3. 4. Way to predict survival on Titanic. These notes are taken from this link. In [2]: import matplotlib.pyplot as plt %matplotlib inline import numpy as np import pandas as pd import statsmodels.api as sm from statsmodels.nonparametric.kde import KDEUnivariate from statsmodels.nonparametric import smoothers_lowess from pandas import Series, DataFrame from patsy import dmatrices from […]

Categories
Data Analysis Resources Kaggle

4 different ways to predict survival on Titanic – part 3

These are my notes from various blogs to find different ways to predict survival on Titanic using the Python stack. This is continued from part 2. 3. Way to predict survival on Titanic. These notes are from this link. I – Exploratory data analysis. We tweak the style of this notebook a little bit to have centered plots. […]

Categories
Data Analysis Resources Kaggle

4 different ways to predict survival on Titanic – part 1

These are my notes from various blogs to find different ways to predict survival on Titanic using the Python stack. I am interested in comparing how different people have attempted the Kaggle competition. I am going to compare and contrast different analyses to find similarities and differences in approaches to predicting survival on Titanic. This Notebook will […]

Categories
Data Analysis Resources

Exploratory Data Analysis with Pandas – 2

This post is exploratory data analysis with pandas – 2. To be effective, exploratory data analysis with pandas should be fast and graphic. This is continued from part 1. In [10]: densityplot = iris_df.plot(kind='density') In [11]: single_distribution = iris_df['petal width (cm)'].plot(kind='hist', alpha=0.5) Scatterplots. Scatterplots can be used to effectively understand whether the variables are in a nonlinear […]
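The plotting calls in this excerpt can be tried out with a short, self-contained sketch along the following lines. It assumes pandas, NumPy and matplotlib are installed, and it uses a randomly generated stand-in frame (`iris_like_df`, a hypothetical name) rather than the real iris data, so the shapes of the plots will not match the post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import numpy as np
import pandas as pd

# Stand-in data: random numbers with iris-style column names.
# These are NOT the real iris measurements.
rng = np.random.default_rng(0)
iris_like_df = pd.DataFrame({
    "petal length (cm)": rng.normal(3.8, 1.7, 150),
    "petal width (cm)": rng.normal(1.2, 0.7, 150),
})

# Histogram of a single column, as in the excerpt's In [11].
single_distribution = iris_like_df["petal width (cm)"].plot(kind="hist", alpha=0.5)

# Scatterplot of two columns, to eyeball how they relate.
scatterplot = iris_like_df.plot(
    kind="scatter", x="petal length (cm)", y="petal width (cm)"
)
```

Both calls return matplotlib Axes objects, so something like `scatterplot.figure.savefig("scatter.png")` should write the figure to disk. The density plot from In [10] is left out here because it additionally needs SciPy.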

Categories
Data Analysis Resources

Exploratory Data Analysis with pandas – 1

This post is exploratory data analysis with pandas – 1. Clear data plots that explicate the relationship between variables can lead to the creation of newer and better features that can predict more than the existing ones. Exploratory data analysis can be effective if it has the following characteristics: • It should be fast, allowing […]