Categories
Competition Notes Machine Learning

Lessons Learnt from AIB Data Hack

Yesterday I attended AIB Data Hack. It was my first one-day data hackathon. This notebook contains some of the lessons learnt from AIB Data Hack while working on a complicated, large dataset and little time. I have taken part in few Kaggle Competitions. However, I have to say the experience in a single day or […]

Categories
Data Analysis Resources Predictive Analysis

Time-Series Predictive Analysis of DAX 30

In this blog post we’ll examine some common techniques used in time-series analysis of DAX 30 by applying them to a data set containing daily closing values from 1990 up to present day. The DAX (Deutscher Aktienindex (German stock index)) is a blue chip stock market index consisting of the 30 major German companies trading […]

Categories
Competition Notes Machine Learning scikit-learn

Modeling Women’s Health Risk Assessment

Women’s Health Risk Assessment is a multi-class classification competition for finding an optimized machine learning a solution that allows a young woman (age 15-30 years old) to be accurately categorized for their particular health risk. Based on the category a patient falls within, healthcare providers can offer appropriate education and training programs to help reduce […]

Categories
Data Analysis Resources Machine Learning

The Comprehensive Guide for Feature Engineering

Feature Engineering is the art/science of representing data is the best way possible. This is the comprehensive guide for Feature Engineering for myself  but I figured that they might be of interest to some of the blog readers too. Comments on what is written below are most welcome! Good Feature Engineering involves an elegant blend […]

Categories
Data Analysis Resources Machine Learning scikit-learn

Installing XGBoost for Windows – walk-through

I have the following specification on my computer: Windows10, 64 bit,Python 3.5 and Anaconda3.I tried many times to install XGBoost but somehow it never worked for me. Today I decided to make it happen and am sharing this post to help anyone else who is struggling with installing XGBoost for Windows. XGBoost is short for […]

Categories
Kaggle Machine Learning scikit-learn

Evaluating Algorithms using MNIST

This post is evaluating Aagorithms using MNIST In [1]: import numpy as np import pandas as pd import seaborn as sns import matplotlib.pyplot as plt %matplotlib inline import warnings warnings.filterwarnings(‘ignore’) In [2]: # importing the train dataset train = pd.read_csv(r’C:\Users\piush\Desktop\Dataset\DigitRecognizer\train.csv’) train.head(10) Out[2]: label pixel0 pixel1 pixel2 pixel3 pixel4 pixel5 pixel6 pixel7 pixel8 … pixel774 pixel775 pixel776 pixel777 […]

Categories
Kaggle Machine Learning Predictive Analysis

Submission for Kaggle’s Titanic Competition

Following is my submission for Kaggle’s Titanic Competition In [361]: import pandas as pd import numpy as np In [362]: df_train = pd.read_csv(r’C:\Users\piush\Desktop\Dataset\Titanic\train.csv’) In [363]: df_train.head(2) Out[363]: PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked 0 1 0 3 Braund, Mr. Owen Harris male 22.0 1 0 A/5 21171 7.2500 NaN S 1 2 […]

Categories
Machine Learning

Coding FP-growth algorithm in Python 3

FP-growth algorithm Have you ever gone to a search engine, typed in a word or part of a word, and the search engine automatically completed the search term for you? Perhaps it recommended something you didn’t even know existed, and you searched for that instead. This requires a way to find frequent itemsets efficiently. FP-growth […]