Data Science With Wine

My wife is an expert wine-buyer, and every good bottle she brings home has a little story attached to it. Even though, occasionally, I (secretly) don’t enjoy the taste of some of those wines, I know they are all considered to be of “high quality” and “very popular”. Some of them simply don’t match my taste.

However, plenty of wines I’ve tried in the past were just plain bad. This made me think about the wine manufacturers: why do they even sell a particular (bad) wine? Can’t they predict a customer’s response by tasting their own wine, or by objectively measuring a few of its chemical and physical properties?

Wine-making is a big industry, and quite a few studies and papers in this field have already tried to answer this question. In fact, after a quick search online, I found a few data science studies addressing it, and many of them referred to the following paper:

  • P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

This particular publication came with a couple of datasets available for everyone to play around with, so I decided to use them for my next project. As an advisor to BigML, which is a leading analytics company, I wanted to analyze this wine quality data using their online platform and see 1) whether I could answer my question about predicting wine quality from some objective measurements and 2) how quickly this could be accomplished using the BigML online solution.

First, I downloaded the wine composition and quality assessment data. There are two datasets available, with 1599 entries for red wine and 4898 entries for white. Even though I prefer red wine, I decided to go with the larger dataset for my study.

The dataset included 11 wine features such as residual sugar, density, pH, alcohol, and a few others (check the dataset if interested), plus one numerical value for the quality of each wine, expressed as a number between 0 (very bad) and 10 (excellent).

I felt that a regression analysis (predicting a numerical output) would be too noisy and inaccurate, so I decided to simply split the dataset into two classes: bad wine (0-6) and good wine (7-10).
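For anyone who would rather script this split than do it by hand, here is a minimal Python sketch of the same idea. It is only an illustration, not the workflow used in this post: the column name `quality`, the semicolon delimiter, the helper names, and the sample rows are all assumptions made up for the example.

```python
import csv
import io

# Threshold taken from the split described above: 7-10 is "good", 0-6 is "bad".
GOOD_THRESHOLD = 7


def label_quality(score: int) -> str:
    """Map a 0-10 quality score to one of the two classes."""
    if not 0 <= score <= 10:
        raise ValueError(f"quality score out of range: {score}")
    return "good" if score >= GOOD_THRESHOLD else "bad"


def add_quality_class(rows):
    """Return copies of the rows with a 'quality_class' column added."""
    labeled = []
    for row in rows:
        new_row = dict(row)
        new_row["quality_class"] = label_quality(int(row["quality"]))
        labeled.append(new_row)
    return labeled


# Made-up sample rows for illustration only (not taken from the real dataset).
# The semicolon delimiter and column names are assumptions about the file layout.
SAMPLE_CSV = """alcohol;pH;quality
9.5;3.1;5
11.2;3.3;7
12.8;3.2;8
10.1;3.0;6
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV), delimiter=";"))
labeled_rows = add_quality_class(rows)
for row in labeled_rows:
    print(row["quality"], row["quality_class"])
```

To run this on a real file, the in-memory sample would be swapped for an opened CSV file; the threshold stays in one place so the class boundary is easy to change.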

I used Excel® to do these initial data manipulations and then imported the dataset into the BigML online portal (a simple drag and drop). Notice in the picture below how convenient it is to see all the distributions for each data column and their descriptive statistics.


Artificial Intelligence (AI) vs. Machine Learning (ML) vs. Deep Learning

I already had a post on this subject, but I want to summarize it again (with an added timeline):

Deep learning is a sub-area of Machine Learning, which is a sub-area of Artificial Intelligence.

Also, here you can find the main definitions of Big Data Analytics, Machine Learning, and other terms.


Amazing Supercomputer Art – Part 2

The first post on this subject focused on Cray supercomputers, which feature beautiful images on the front to add an artistic touch to these technically impressive machines.

In this (second) post, I will mostly address the “beauty through design” approach taken by Cray and a few other supercomputer makers.

Let’s start with the Thinking Machines Corporation. Founded in 1983, it delivered some of the most advanced (for their time) and best-looking computers ever. A brief promotional video for its first models is available on YouTube.

Thinking Machines’ CM-5 supercomputer, also known as FROSTBURG, was installed at the US National Security Agency (NSA) in 1991 for code-breaking tasks, and was operational until 1997:

NSA's Thinking Machines supercomputer - Blackboxparadox.com

No decorations, no frills. Even so, this remains one of the most futuristic-looking supercomputers ever. Its flashing, constantly changing red light panels showed processing node usage and were also used for diagnostics. In fact, this old supercomputer looks so good that it ended up in a Jurassic Park movie:

CM-5 supercomputer in Jurassic Park - blackboxparadox.com

To me, the CM-5 design actually looks inspired by the WOPR computer from WarGames (1983), which wasn’t a real computer, of course, but a realistic-looking movie prop.


Computer Humor


Introduction to OptiML: Automatic Model Optimization

Very useful feature… Automated AI/machine learning is the future.

The Official Blog of BigML.com

BigML’s upcoming release on Wednesday, May 16, 2018, will be presenting a new resource to the platform: OptiML. In this post, we’ll do a quick introduction to OptiML before we move on to the remainder of our series of 6 blog posts (including this one) to give you a detailed perspective of what’s behind the model optimization part of the release. Today’s post explains the basic concepts that will be followed by an example use case. Then, there will be three more blog posts focused on how to use OptiML through the BigML Dashboard, API, and WhizzML for automation. Finally, we will complete this series of posts with a technical view of how OptiML works behind the scenes.

Understanding OptiML

At BigML, we are believers in human-in-the-loop Machine Learning and in the importance of feature engineering, which is driven by subject matter expertise in real-life situations. As such…
