We aim to find correlations between US Presidents and their home states. We pull data from multiple sources and combine it into a comprehensive dataset of Presidents, their home states, and additional facts about each state. We then find the most common home state and ultimately plot the results on a live interactive map.
We start by preparing the data: combining the datasets from the different sources and performing some basic preprocessing on them.
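A minimal sketch of that workflow might look like the following; the file names, column names, and the us-states.json boundary file are assumptions, and folium stands in for the interactive mapping step:

```python
import pandas as pd
import folium

# Hypothetical source files standing in for the different data sources
presidents = pd.read_csv('presidents.csv')    # assumed columns: name, home_state
state_facts = pd.read_csv('state_facts.csv')  # assumed columns: state, population, capital

# Combine the president list with the per-state facts
df = presidents.merge(state_facts, left_on='home_state', right_on='state', how='left')

# Most common home state among the presidents
print('Most common home state:', df['home_state'].value_counts().idxmax())

# Plot presidents-per-state counts on an interactive map centered on the US
counts = df['home_state'].value_counts().rename_axis('state').reset_index(name='presidents')
m = folium.Map(location=[39.8, -98.6], zoom_start=4)
folium.Choropleth(
    geo_data='us-states.json',                # assumed local GeoJSON of US state boundaries
    data=counts,
    columns=['state', 'presidents'],
    key_on='feature.properties.name',
    fill_color='YlGn',
    legend_name='Presidents per state',
).add_to(m)
m.save('presidents_map.html')
```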
I believe Data Science allows me to express my curiosity in ways I'd never have imagined. The coolest thing about Data Science is that I see data not as numbers but as an opportunity (a business problem), insights (predictive modeling, stats, and data wrangling), and improvement (metrics). With this thought in mind, I decided to analyze the YouTube comments on the VP and Presidential debates.
After getting mixed results from the news sources, I decided to analyze the Vice Presidential and Presidential debates using Data Science. The idea is to use YouTube comments as a medium to gauge sentiment about the debates and…
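One rough way to score sentiment on such comments is NLTK's VADER analyzer; the comments below are made up, and the actual collection step (e.g. the YouTube Data API) is not shown:

```python
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

nltk.download('vader_lexicon')

# Made-up stand-ins for comments scraped from a debate video
comments = [
    "That answer was incredibly strong.",
    "Neither candidate addressed the actual question.",
]

sia = SentimentIntensityAnalyzer()
for comment in comments:
    scores = sia.polarity_scores(comment)  # neg/neu/pos plus a compound score in [-1, 1]
    print(f"{scores['compound']:+.2f}  {comment}")
```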
As election time approaches, we will see how and why our Facebook data is so valuable to advertisers and politicians. Facebook is the world's largest social network, with over 2.5 billion active users. It processes data at a scale never seen before; its highly sophisticated A.I. algorithms curate, categorize, and predict associations between data in an almost human way.
Why: Given this influx of data and processing power, we will explore how human traits can be predicted from just a collection of Facebook Likes. To achieve our results, we will try…
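As a purely illustrative sketch of the Likes-to-traits idea, with random toy data in place of real Facebook Likes and a simple SVD-plus-logistic-regression pipeline as one possible modeling choice:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Toy user-by-page "Likes" matrix: rows are users, columns are pages,
# 1 means the user liked that page. Real data would be far larger and sparser.
likes = csr_matrix(rng.binomial(1, 0.05, size=(200, 500)))
trait = rng.integers(0, 2, size=200)  # hypothetical binary trait label per user

# Reduce the sparse Likes matrix to dense components, then fit a simple classifier
model = make_pipeline(TruncatedSVD(n_components=50), LogisticRegression(max_iter=1000))
model.fit(likes, trait)
print('Training accuracy:', model.score(likes, trait))
```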
My grandfather was an expert in handwriting analysis. He spent his whole life analyzing documents for the CBI (Central Bureau of Investigation) and other organizations. His method of analyzing documents with a magnifying glass and other tools required huge amounts of time and patience for a single document. This was back when computers were not fast enough. I vividly remember him photocopying the same document multiple times and arranging the copies on the table to get a closer look at the handwriting style.
Handwriting analysis involves a comprehensive comparative analysis between a questioned document and the known handwriting…
Ever wondered how to calculate text similarity using Deep Learning? We aim to develop a model that detects the similarity between pairs of texts. We will be using the Quora Question Pairs dataset.
Let us first explore the dataset. It consists of:
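Before any deep model, one rough baseline is to score each pair with TF-IDF cosine similarity; the train.csv filename and the question1/question2/is_duplicate columns are assumptions about the Kaggle release:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Assumed local copy of the Quora Question Pairs training file
pairs = pd.read_csv('train.csv').dropna(subset=['question1', 'question2']).head(10000)

# Fit a single TF-IDF vocabulary over both question columns
vectorizer = TfidfVectorizer(stop_words='english')
vectorizer.fit(pd.concat([pairs['question1'], pairs['question2']]))
q1 = vectorizer.transform(pairs['question1'])
q2 = vectorizer.transform(pairs['question2'])

# Cosine similarity of each pair; higher scores should correlate with is_duplicate
pairs['tfidf_sim'] = [cosine_similarity(q1[i], q2[i])[0, 0] for i in range(q1.shape[0])]
print(pairs[['is_duplicate', 'tfidf_sim']].head())
```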
Our task is to apply a supervised/semi-supervised technique like ULMFiT (Howard & Ruder, 2018) to the Twitter US Airlines sentiment analysis dataset.
The approach is semi-supervised because the language model is first trained on the corpus in an unsupervised way, and the network is then fine-tuned for classification by adding a classifier head on top of it.
We use the Twitter US Airlines dataset (https://www.kaggle.com/crowdflower/twitter-airline-sentiment)
We will start by:
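A minimal sketch of the two ULMFiT stages with fastai might look like this; the Tweets.csv filename and the text/airline_sentiment column names are assumptions about the Kaggle file, and the epoch counts are placeholders:

```python
import pandas as pd
from fastai.text.all import *

df = pd.read_csv('Tweets.csv')  # assumed local copy of the Kaggle CSV

# Stage 1: fine-tune the pretrained language model on the tweet corpus (unsupervised)
dls_lm = TextDataLoaders.from_df(df, text_col='text', is_lm=True, valid_pct=0.1)
lm_learn = language_model_learner(dls_lm, AWD_LSTM)
lm_learn.fine_tune(1)
lm_learn.save_encoder('ft_encoder')

# Stage 2: add a classifier head on top of the fine-tuned encoder (supervised)
dls_clf = TextDataLoaders.from_df(df, text_col='text', label_col='airline_sentiment',
                                  valid_pct=0.2, text_vocab=dls_lm.vocab)
clf_learn = text_classifier_learner(dls_clf, AWD_LSTM, metrics=accuracy)
clf_learn.load_encoder('ft_encoder')
clf_learn.fine_tune(3)
```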
I am a big fan of the Ben 10 series, and I have always wondered why Ben's Omnitrix sometimes fails to change into the alien Ben chooses to be (this is largely due to the weak A.I. system already built into the watch). To help Ben, we will devise “OmniNet”, a neural network capable of predicting an appropriate alien for a given situation.
As discussed on the show, the Omnitrix is basically a server that connects to the Planet Primus to harness the DNA of around 10,000 aliens! …
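As a toy, purely illustrative sketch of what OmniNet could look like (the 16-dimensional situation vector and the 10 aliens are arbitrary assumptions):

```python
import torch
import torch.nn as nn

class OmniNet(nn.Module):
    """Maps a vector describing the situation to a score for each alien."""
    def __init__(self, n_features=16, n_aliens=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, n_aliens),
        )

    def forward(self, situation):
        return self.net(situation)  # raw scores (logits), one per alien

model = OmniNet()
situation = torch.randn(1, 16)             # one made-up situation vector
alien_id = model(situation).argmax(dim=1)  # index of the predicted alien
print('Predicted alien index:', alien_id.item())
```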
It is the 21st century: technology is on the rise, the internet has superseded paper texts, and we live in an interconnected world. In this fast-paced, growing world, data is created rapidly every second. Algorithms and statistical measures allow us to capture each movement in a form suitable for predictive modeling.
Big data refers to the huge amounts of data accumulated over time through the use of internet services. Traditional econometric methods fail when analyzing data at this scale, so we require a host of new algorithms that can crunch this data and…
My biggest mistake over the last two years has been ignoring the fact that all great things start from the fundamentals. In this fast-paced world, we tend to forget the most basic, minute details about a technology or whatever we are creating. I have also encountered the strange behavior of “over-innovating”: building a product just for the sake of shipping something. In my humble opinion, need drives innovation, which in turn turns an idea into a product. Another mistake I have made in these years is following trends…
Introduction: An LRU cache is a data structure that evicts the least recently used item when it runs out of space. A cache is a mechanism software uses to handle data access and reads efficiently: it improves performance by keeping recently or frequently used items in memory, so we avoid the overhead of fetching an item that is already there. LRU (Least Recently Used) is one caching policy in which we evict the item that has gone unused the longest to make room for new items. …
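A minimal sketch of an LRU cache in Python, using collections.OrderedDict to track recency (a hypothetical LRUCache class, not a reference implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used item

cache = LRUCache(2)
cache.put(1, 'a')
cache.put(2, 'b')
cache.get(1)       # touching key 1 makes it most recently used
cache.put(3, 'c')  # capacity exceeded, so key 2 (least recently used) is evicted
```

Both get and put run in constant time here, because OrderedDict maintains recency order with an internal doubly linked list.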
Imagination is more important than knowledge — Einstein