My computer has the following specification: Windows 10 (64-bit), Python 3.5, and Anaconda3. I tried many times to install XGBoost, but it never worked for me. Today I decided to make it happen, and I am sharing this post to help anyone else who is struggling to install XGBoost on Windows. XGBoost is short for […]

# Tag: data analysis

This post covers exploratory data analysis with pandas – 1. Clear data plots that explicate the relationship between variables can lead to the creation of newer and better features that can predict more than the existing ones. Exploratory data analysis can be effective if it has the following characteristics: • It should be fast, allowing […]

In [20]:
import pandas as pd
import numpy as np

In [ ]:
# Take a few samples for the visualization
sample_fbcheckin_train_tbl = fbcheckin_train_tbl[:10000].copy()

In [21]:
df = pd.read_csv('train.csv', index_col='row_id')

In [22]:
df.head()

Out[22]:
row_id  x       y       accuracy  time    place_id
0       0.7941  9.0809  54        470702  8523065625
1       5.9567  4.7968  13        186555  1757726713
2       8.3078  7.0407  74        322648  1137537235
3       7.3665  2.5165  […]
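The loading and sampling steps in this excerpt can be tried without the competition file. The following is a minimal sketch, assuming a tiny in-memory CSV built from the three complete rows shown in the `df.head()` output; the names `csv_text` and `sample` are hypothetical and are not from the original notebook.

```python
import io

import pandas as pd

# Three complete rows copied from the df.head() output in the excerpt.
# The real train.csv is much larger and is not reproduced here.
csv_text = """row_id,x,y,accuracy,time,place_id
0,0.7941,9.0809,54,470702,8523065625
1,5.9567,4.7968,13,186555,1757726713
2,8.3078,7.0407,74,322648,1137537235
"""

# index_col='row_id' turns the row_id column into the index,
# so it no longer appears among the regular columns.
df = pd.read_csv(io.StringIO(csv_text), index_col="row_id")

# Slice the first rows and copy them, as the excerpt does with [:10000].
# The copy makes the sample independent of the original frame.
sample = df[:2].copy()
sample["accuracy"] = 0  # modifies the sample only, not df

print(df)
print(sample)
```

Taking a `.copy()` of the slice means later edits to the sample do not touch the full table and do not trigger pandas' chained-assignment warning.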

This blog contains my notes on how to evaluate machine learning algorithms. I want to see how models compare and contrast with each other. It is based on the following web page: "Your First Machine Learning Project in Python Step-By-Step". I am evaluating 6 different algorithms in this blog: Logistic Regression (LR) […]
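One common way to compare models side by side is k-fold cross-validation with scikit-learn. The sketch below is not the post's own code. It assumes the iris dataset bundled with scikit-learn, and it pairs Logistic Regression with a k-nearest-neighbours classifier purely for illustration, since the excerpt is cut off before the remaining algorithms are listed. The fold count and random seed are also assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the bundled iris dataset stands in for the post's data.
X, y = load_iris(return_X_y=True)

# "LR" comes from the excerpt; "KNN" is an illustrative second model only.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
}

# Every model is scored on the same stratified folds so the numbers are comparable.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

Reusing one `cv` object with a fixed `random_state` keeps the splits identical across models, so differences in the scores reflect the models rather than the partitioning.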

This script is my attempt at time series analysis. Pandas has dedicated libraries for handling time series (TS) objects, particularly the datetime64[ns] class, which stores time information and allows us to perform some operations really fast.

In [40]:
import pandas as pd
import numpy as np
#import matplotlib.pylab as plt
#%matplotlib inline
import seaborn as sns
#from matplotlib.pylab […]
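To illustrate the kind of operations a datetime index makes convenient, here is a minimal sketch on made-up data. The dates and values are invented for the example and are not from the post's dataset.

```python
import numpy as np
import pandas as pd

# Invented example data: three observations on a datetime index.
dates = pd.to_datetime(["2016-01-01", "2016-01-02", "2016-02-01"])
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# The index holds NumPy datetime64 values rather than plain strings.
print(ts.index.dtype)
is_datetime = np.issubdtype(ts.index.dtype, np.datetime64)

# Partial string indexing: select every observation in January 2016.
january = ts.loc["2016-01"]

# Resample to month-start bins and sum the values inside each month.
monthly = ts.resample("MS").sum()

print(january)
print(monthly)
```

Both the slicing and the resampling operate on the index directly, so no manual date parsing or looping is needed.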

This part is a tutorial using the Kobe Bryant Dataset – Part 4. You can get the data from https://www.kaggle.com/c/kobe-bryant-shot-selection . What excited me was that this dataset is excellent for practicing classification basics, feature engineering, and time series analysis. This is continued from here.

Exploring the data

In [215]:
#Shot accuracy
sns.countplot(x='shot_made_flag', data=data)

Out[215]:
<matplotlib.axes._subplots.AxesSubplot […]

This part is a Kaggle tutorial using the Kobe Bryant Dataset – Part 3. You can get the data from https://www.kaggle.com/c/kobe-bryant-shot-selection . What excited me was that this dataset is excellent for practicing classification basics, feature engineering, and time series analysis. This is continued from here.

#columns not needed
notNeeded = []

In [183]:
#Action type column […]

This part is a Kaggle tutorial using the Kobe Bryant Dataset – Part 1. You can get the data from https://www.kaggle.com/c/kobe-bryant-shot-selection . What excited me was that this dataset is excellent for practicing classification basics, feature engineering, and time series analysis.

Importing Data

Let us start with importing the basic libraries we need and the data […]