Deploy Tableau Server on AWS

This guide provides step-by-step instructions for deploying Tableau Server in a standalone architecture on AWS. Tableau Server is an online solution for sharing, distributing, and collaborating on business intelligence content created in Tableau. Users create workbooks, views, dashboards, and data sources in Tableau Desktop, and then publish this content to the server. Tableau […]

Full-stack Deep Learning Application Using AWS Fargate Serverless Infrastructure

This post showcases a full-stack deep learning application built on AWS Fargate serverless infrastructure. In the project, we optimise a model of hospital bed utilisation during COVID-19. It is motivated by the overcrowding of hospital beds due to COVID-19, and by the need to avoid a repeat of the events at the beginning of the pandemic. […]

AI for Healthcare

This post provides a link to my GitHub repository with my submissions for Udacity’s AI for Healthcare Nanodegree Program. I learnt to build, evaluate, and integrate predictive models that have the power to transform patient outcomes. I started by classifying and segmenting 2D and 3D medical images to augment diagnosis, then moved on to modelling patient outcomes with electronic health records to optimize clinical trial testing decisions. Finally, I built an algorithm that uses data collected from wearable devices to estimate the wearer’s pulse rate in the presence of motion.

Applying AI to 2D Medical Imaging Data: I learnt the fundamental skills needed to work with 2D medical imaging data and how to use AI to derive clinically relevant insights from data gathered via different types of 2D medical imaging, such as X-ray, mammography, and digital pathology. In this project, I analyzed data from the NIH Chest X-ray dataset and trained a CNN to classify a given chest X-ray for the presence or absence of pneumonia. First, I curated training and testing sets appropriate for the clinical question at hand from a large collection of medical images. Then, I created a pipeline to extract images from DICOM files so that they can be fed into a CNN for model training. Lastly, I wrote an FDA 510(k) validation plan that formally describes my model, the data it was trained on, and a validation plan that meets FDA criteria for obtaining clearance of the software as a medical device.

Applying AI to 3D Medical Imaging Data: I learnt the fundamental skills needed to work with 3D medical imaging datasets and to frame insights derived from the data in a clinically relevant context. In this project, I went through the steps to create an algorithm that helps clinicians assess hippocampal volume in an automated way, and integrated this algorithm into a clinician’s working environment. The hippocampus is one of the major structures of the human brain, with functions primarily connected to learning and memory. Its volume may change over time, with age, or as a result of disease. Measuring hippocampal volume requires a 3D imaging technique with good soft-tissue contrast. MRI provides such imaging characteristics, but manual volume measurement still requires careful and time-consuming delineation of the hippocampal boundary.

Applying AI to EHR Data: I learnt the fundamental skills needed to work with EHR data and to build and evaluate compliant, interpretable models. In this project, I worked with real, de-identified EHR data to build a regression model that predicts the estimated hospitalization time for a patient and selects or filters patients for the study. I analyzed an EHR dataset, transformed it to the right level, built powerful features with TensorFlow, and modelled the uncertainty and bias with TensorFlow Probability and Aequitas.

Applying AI to Wearable Device Data: I learnt how to build algorithms that process the data collected by wearable devices and surface insights about the wearer’s health. […]
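
For the 3D imaging project described above, one way to picture the automated volume measurement is as counting the voxels that a segmentation step labels as hippocampus and multiplying by the physical size of one voxel. The sketch below is an illustration only, not the project's actual code: the function name `mask_volume_mm3`, the toy mask, and the voxel spacing are all made up for the example.

```python
def mask_volume_mm3(mask, spacing_mm):
    """Return the volume in cubic millimetres of the non-zero voxels in a 3D mask.

    `mask` is a nested list indexed as mask[plane][row][column];
    `spacing_mm` holds the physical voxel size along each of the three axes.
    """
    size_a, size_b, size_c = spacing_mm
    voxel_volume = size_a * size_b * size_c
    foreground = sum(
        1 for plane in mask for row in plane for voxel in row if voxel
    )
    return foreground * voxel_volume


# Hypothetical 2 x 2 x 3 binary mask; a real mask would come from a
# segmentation model applied to an MRI volume.
toy_mask = [
    [[0, 1, 1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0]],
]

# Assumed voxel spacing of 1.0 x 1.0 x 2.0 mm, chosen only for illustration.
toy_volume = mask_volume_mm3(toy_mask, (1.0, 1.0, 2.0))
print(f"Estimated volume: {toy_volume} mm^3")
```

In practice the voxel spacing would be read from the image metadata rather than hard-coded.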

Udacity Data Engineering Capstone Project

Project Summary: The project follows these steps:

Step 1: Scope the Project and Gather Data
Step 2: Explore and Assess the Data
Step 3: Define the Data Model
Step 4: Run ETL to Model the Data
Step 5: Complete Project Write Up

Step 1: Scope the Project and Gather Data. Scope: The project is […]

Classical & Statistical Time Series Modelling of United Health Group’s Stock Price

A time series problem differs from a regular regression problem because the observations are time dependent. The basic linear regression assumption that observations are independent does not hold in this case. Along with an increasing or decreasing trend, most time series have some form of seasonality, i.e. variations specific to a particular time frame. […]
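
The time dependence described above can be checked with the sample autocorrelation: if observations were independent, the correlation between the series and a lagged copy of itself would be close to zero. The following is a minimal sketch in plain Python, assuming a synthetic series (a made-up linear trend plus a 12-step seasonal cycle) rather than the stock data from the post; the function name `autocorrelation` is chosen for the example.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Synthetic data for illustration only: upward trend plus a 12-step cycle.
synthetic = [0.5 * t + 10 * math.sin(2 * math.pi * t / 12) for t in range(120)]

lag1 = autocorrelation(synthetic, 1)
lag12 = autocorrelation(synthetic, 12)
print(f"lag 1: {lag1:.3f}, lag 12: {lag12:.3f}")
```

Values far from zero at these lags indicate that neighbouring observations carry information about each other, which is why methods that assume independence are a poor fit.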

Novelty, Anomaly and Segmentation Discovery using Matrix Profile

This notebook covers novelty, anomaly, and segmentation discovery using the Matrix Profile. We use STUMPY for time series data mining tasks. We’ll examine a data set containing daily opening values for UnitedHealth Group from 2016 to the present day. UnitedHealth Group Incorporated is an American for-profit managed health care company based in Minnetonka, […]
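
To make the idea of a matrix profile concrete: for every window of length m in a series, it records the z-normalized Euclidean distance to that window's nearest non-overlapping neighbour. Windows with an unusually large value have no close match elsewhere and are anomaly candidates. The sketch below is a naive, quadratic-time illustration in plain Python, not STUMPY's implementation, which computes the same quantity far more efficiently. The synthetic series, the injected anomaly, and the exclusion zone of ceil(m / 4) are assumptions made for the example.

```python
import math


def z_normalize(window):
    """Rescale a window to zero mean and unit standard deviation."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)  # flat window: treat as all zeros
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """Distance from each length-m window to its nearest non-trivial match."""
    n_windows = len(series) - m + 1
    if m < 2 or n_windows < 2:
        raise ValueError("need m >= 2 and at least two windows")
    exclusion = math.ceil(m / 4)  # assumed zone that skips near-self matches
    windows = [z_normalize(series[i:i + m]) for i in range(n_windows)]
    profile = [math.inf] * n_windows
    for i in range(n_windows):
        for j in range(n_windows):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            profile[i] = min(profile[i], dist)
    return profile


# Synthetic periodic series with a made-up anomaly injected at t = 50..54.
series = [math.sin(2 * math.pi * t / 10) for t in range(100)]
for t in range(50, 55):
    series[t] = 3.0

profile = naive_matrix_profile(series, m=10)
discord_index = max(range(len(profile)), key=profile.__getitem__)
print(f"Most anomalous window starts at index {discord_index}")
```

The window with the largest profile value is often called a discord; the smallest values point to repeated patterns.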

Principles of Good Visualization

The following are the principles of good visualization by Andy Kirk: Good data visualization is trustworthy: Truthfulness and accuracy should be an obligation. Trustworthiness is about being transparent, giving readers all the information they need to feel confident about what they are reading and which interpretations are legitimate. Good data visualization is accessible: […]

Data Science Libraries in Docker Container and AWS

In this post, we will install data science libraries in a Docker container and on AWS. It shows how to access the data science libraries from anywhere in the world without downloading and installing Python, runtime libraries, and the Jupyter package. The walk-through is for the Windows platform. AWS: We need an AWS account […]

Difference Between Cross-selling and Up-selling

This post explains the difference between cross-selling and up-selling. Cross-selling identifies products that satisfy additional, complementary needs that are unfulfilled by the original item. It points users to products they would have purchased anyway by sharing them at the right time to ensure a sale. For example, selling a comb to a customer buying a blow […]

Ethical, Privacy and Data Protection Issues

Data is often reduced to what can fit into a mathematical model. Yet, taken out of context, data may lose its meaning. Ethics, privacy, and data protection issues are often treated as an afterthought or a regulatory hurdle to be cleared. Ethical issues include non-objective analysis, incomplete reporting, misleading reporting, and lack of consideration. Moral agency is the […]