My wife is an expert wine buyer, and every good bottle she brings home has a little story attached to it. Even though I occasionally (and secretly) don’t enjoy the taste of some of those wines, I know they are all considered “high quality” and “very popular”. Some simply don’t match my taste.
However, there have been plenty of wines I’ve tried in the past that were just plain bad. This made me think about the wine makers: why do they even sell a bad wine? Can’t they predict a customer’s response by tasting the wine themselves, or by objectively measuring a few things about its chemistry and physics?
Wine-making is a big industry, and quite a few studies and papers have already tried to answer this question. In fact, after a quick search online, I found several data science studies addressing it, many of which referred to the following paper:
- P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
This particular publication came with a couple of datasets available for everyone to play with, so I decided to use them for my next project. As an advisor to BigML, a leading analytics company, I wanted to analyze this wine quality data on their online platform and see 1) whether I could answer my question about predicting wine quality from objective measurements, and 2) how quickly this could be accomplished using the BigML online solution.
First, I downloaded the wine composition and quality assessment data from here: there are two datasets available, with 1599 entries for red wine and 4898 for white. Even though I prefer red wine, I decided to go with the larger dataset for my study.
The dataset includes 11 wine features, such as residual sugar, density, pH, alcohol, and a few others (check the dataset if interested), plus one numerical quality score per wine, expressed as a number between 0 (very bad) and 10 (excellent).
Feeling that a regression analysis (predicting the numerical score directly) would be too noisy and inaccurate, I decided to simply split the dataset into two classes: bad wine (0-6) and good wine (7-10).
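The relabeling step is trivial to express in code. A minimal sketch (I actually did this in Excel; the function name and threshold logic below just mirror the 0-6/7-10 split described above):

```python
def label_quality(score: int) -> str:
    """Map the 0-10 quality score to a binary class: 0-6 -> bad, 7-10 -> good."""
    return "good" if score >= 7 else "bad"

# Scores 0-6 become "bad", 7-10 become "good".
assert label_quality(6) == "bad"
assert label_quality(7) == "good"
```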
I used Excel® to do these initial data manipulations and then imported the dataset into the BigML online portal (a simple drag and drop). Notice in the picture below how convenient it is to see all the distributions for each data column and their descriptive statistics.
Then, I used the 1-Click function to split the dataset randomly 80/20 into training and test data; BigML applied an automatic population bias correction during this process. Next, I chose Random Forest and trained it on the training data with 500 trees (sampled with replacement). Random Forest is a popular technique that combines an ensemble of many “weak” predictions into a more accurate, higher-confidence one, and it can be used for both regression and classification tasks.
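BigML did all of this with a few clicks. For reference, an analogous setup in scikit-learn (this is not what BigML runs internally, and the data here is a random stand-in with the same shape as the white wine dataset) would look roughly like this:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data shaped like the white wine dataset:
# 4898 rows, 11 physicochemical features, binary bad/good labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(4898, 11))
y = rng.choice(["bad", "good"], size=4898, p=[0.8, 0.2])

# 80/20 train/test split, then a 500-tree forest with bootstrap
# sampling (with replacement), mirroring the settings in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
model = RandomForestClassifier(n_estimators=500, bootstrap=True,
                               random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # held-out accuracy
```

On random noise like this the accuracy is meaningless; the point is only the shape of the workflow: split once, train the ensemble, evaluate on the held-out 20%.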
At first I used manual settings, but then re-ran the model using an “auto” setting, which allowed me to completely avoid thinking about model configuration.
When the model was finally completed (which took a couple of minutes), I ran an “Evaluation” of the results, which showed all the outputs I wanted to see: the confusion matrix, the ROC curve, etc.
This was not a fantastic result. While the model correctly predicted the majority of bad (736 out of 825) and good (120 out of 155) wines, the rate of false predictions seemed too high to me. For example, the model labeled about 12% of wines “good” when they were actually “bad”. You wouldn’t want that when deciding what to buy (even if, because of how I classified the data, many of the “bad” wines could be very close in quality to the “good” ones).
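The standard metrics can be recomputed directly from the confusion-matrix counts quoted above (736/825 bad wines and 120/155 good wines predicted correctly):

```python
# Counts taken from the evaluation described in the text.
tn, fp = 736, 825 - 736   # bad wines: predicted bad / misread as good
tp, fn = 120, 155 - 120   # good wines: predicted good / misread as bad

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_good = tp / (tp + fn)       # share of good wines the model catches
precision_good = tp / (tp + fp)    # how much to trust a "good" verdict

print(round(accuracy, 3), round(recall_good, 3), round(precision_good, 3))
# 0.873 0.774 0.574
```

Overall accuracy looks respectable, but only about 57% of the wines the model calls “good” actually are, which is exactly the kind of false prediction a buyer would care about.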
As a wine-maker, would I use this particular model to predict wine quality from objective measurements? Maybe, as a complement to other predictive techniques. But it seems to me that some important wine properties are still missing from the data, making it hard to achieve higher accuracy and fewer false predictions. For example, elements such as Zn and Mg appear to affect a wine’s taste, yet they were not present in this dataset. We might simply need more data to make this model really good.
Reading the paper further, I also better understand how the dataset labels were generated: “Regarding the preferences, each sample was evaluated by a minimum of three sensory assessors (using blind tastes), which graded the wine in a scale that ranges from 0 (very bad) to 10 (excellent). The final sensory score is given by the median of these evaluations.” These assessors may be trained well beyond the average sommelier, but taste is still unique to each individual. While the general aspects that make a wine “very bad” may be easy to distinguish even for a novice enjoying a fermented drink, the aspects that separate a 6-7 from a 9-10 may go unnoticed by the average consumer.
For reference, if I still decide to use this model, all I need is to click “Predict” and an interactive prediction model will become available to me:
I can now move the sliders to change values and see the answer at the top of the screen (the answer is “bad” in this particular case).
Obviously, I didn’t do this quick project to advance the science and business of wine-making: that requires full-time effort and a lot of background research. I did it to check how quickly I could complete this ML study WITHOUT any programming. The answer: with a basic understanding of 1) ML principles and algorithms and 2) the BigML online solution itself, a wannabe data scientist can have an answer in probably less than 30 minutes.
Why is this important? Because programming takes time and requires people with special skills. Plus, “exploratory projects” are usually short-term and benefit from a quick turnaround. The above approach is easy, scalable, and transferable: the online solution looks the same to everyone and offers a standardized set of tools and techniques, which only get better with time. It also offers many conveniences, such as automatic dataset balancing (up/down-sampling for highly unbalanced datasets) and automated descriptive statistics on your data; everything is done the same way for all users and saves you time.
By the way, if you want to productize this or other models, you can download the final predictive model as Python code (or in other formats) and make it a part of your production code.
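BigML’s Python export is generated code specific to your trained model. Purely as an illustration of the general shape of such an export, here is a hypothetical miniature: the feature names, thresholds, and structure below are made up, not BigML’s actual output:

```python
def predict_quality(alcohol: float, volatile_acidity: float) -> str:
    """Hypothetical miniature of an exported decision-tree predictor.

    Features and split thresholds are illustrative only; a real export
    would encode the full tree (or ensemble) learned from the data.
    """
    if alcohol > 10.8:
        if volatile_acidity <= 0.25:
            return "good"
        return "bad"
    return "bad"
```

Because the export is plain code with no service dependency, it can be dropped into a production pipeline and called like any other function, e.g. `predict_quality(11.5, 0.2)`.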
Let me finish this article with a few more words about wine quality. The following is a list of suggestions for wine quality judges from the AWS Journal (American Wine Society, not to be confused with Amazon Web Services) that could probably explain the modeling difficulties I reported above.
TIPS FOR WINE JUDGING
- Good lighting is essential for good evaluation.
- Avoid mouthwash prior to an evaluation.
- Do not wear cologne or perfume which can interfere with the sense of smell.
- No smoking.
- Wine glasses for each wine should be the same size and shape.
- Pour the same amount of wine in each glass.
- Condition your mouth with wine before tasting the first wine.
- Discourage talking during evaluations. Concentrate on the wine.
- Do not discuss the wines during evaluation—evaluate in silence. No comments, no grunts, no sighs, no facial expressions. Do not influence other tasters. Rely on your own senses.
- Discussion of wines occurs after every person has completed the evaluation form.
Maybe the lighting was inconsistent for the dataset I used?
Enjoy good wine, if you know how to choose it.