The purpose of building a machine-learning model was to predict how likely an offer is to be wasted. If an offer is really hard (level 20), a customer is much less likely to work towards it. After the investigation, though, it seems it was wrong to ask: who were the customers that used our offers without viewing them? This shows that Starbucks is able to make $18.1 in sales for every $1 of inventory it holds, an increase from the prior financial year, though not a significant one. In summary, I have walked you through how I processed the data to merge the 3 datasets so that I could do data analysis. The other interesting column is channels, which contains the list of advertisement channels used to promote the offers. For BOGO and discount we have a reasonable accuracy. value (category/numeric): when event = transaction, value is numeric; otherwise it is categorical, with offer ids as the categories. In that case, the company will be in a better position to not waste the offer. Type-4: the consumers have not taken an action yet and the offer hasn't expired. Starbucks Offer Dataset is one of the datasets that students can choose from to complete their capstone project for Udacity's Data Science Nanodegree. Starbucks sells its coffee and other beverage items in company-operated as well as licensed stores.
The aim is to observe the purchase decisions of people based on different promotional offers. Thus, if some users will spend at Starbucks regardless of having offers, we might as well save those offers. We also look at customer attributes (age, income, gender and tenure) to see which are the major factors driving success. I also highlighted the most difficult part of handling the data and how I approached the problem. I have some suggestions for avoiding, or at least improving, the situation of offers being used without being viewed. Another suggestion is that I believe there is a lot of potential in the discount offer. We perform k-means with 2-10 clusters and plot the results.

The variables in the data files are:
age: (numeric) missing value encoded as 118
reward: (numeric) money awarded for the amount spent
channels: (list) web, email, mobile, social
difficulty: (numeric) money required to be spent to receive a reward
duration: (numeric) time for the offer to be open, in days
offer_type: (string) BOGO, discount, informational
event: (string) offer received, offer viewed, transaction, offer completed
value: (dictionary) different values depending on event type
offer id: (string/hash) not associated with any transaction
amount: (numeric) money spent in transaction
reward: (numeric) money gained from offer completed
time: (numeric) hours after the start of the test

Using Polynomial Features: To see if the model improves, I implemented a polynomial features pipeline with StandardScaler(). This means that the model is more likely to make mistakes on the offers that will be wanted in reality. I summarize the results below. We see that there is not a significant improvement in any of the models.
In other words, one logic was to identify the loss while the other was to measure the increase. Q3: Do people generally view and then use the offer? Data visualization is an important part of the whole data analysis process; here, along with seaborn, we will also discuss the Plotly library. The data is collected via the Starbucks rewards mobile app, and the offers were sent out once every few days to the users of the mobile app. Submission for the Udacity Capstone challenge. This is a slight improvement on the previous attempts. Snapshot of original profile dataset. How do transactions vary with gender, age, and income? The best model achieved 71% for its cross-validation accuracy and 75% for the precision score. 57.2% of customers are men, 41.4% are women and 1.4% are in the other category. Americans rank 25th for coffee consumption per capita, with an average consumption of 4.2 kg per person per year. There are only 4 demographic attributes that we can work with: age, income, gender and membership start date. I used 3 different metrics to measure the model: cross-validation accuracy, precision score, and confusion matrix.
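To make the three metrics concrete, here is a minimal sketch of how a confusion matrix, precision and accuracy relate to each other. It is not the code from the original project: the function names and the label lists are made up for illustration, and it assumes a binary label where 1 means "offer wasted".

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = offer wasted."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1  # predicted wasted, but the offer was actually wanted
        elif predicted == 0 and actual == 1:
            fn += 1  # predicted wanted, but the offer was actually wasted
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    """Share of predicted-wasted offers that really were wasted."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


# Made-up labels for illustration only (1 = wasted, 0 = not wasted).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("confusion counts:", tp, fp, fn, tn)
print("precision:", precision(tp, fp))
print("accuracy:", accuracy(tp, fp, fn, tn))
```

Precision is the natural metric when the costly mistake is withdrawing an offer that the customer actually wanted, since it only looks at the offers the model flags as wasted.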
Can we categorize whether a user will take up the offer? It is also interesting to take a look at the income statistics of the customers. Getting BOGO and discount offers is also not a very difficult task. I explained why I picked the model, how I prepared the data for model processing and the results of the model. Some people like the F1 score. Discount: in this offer, a user needs to spend a certain amount to get a discount. The data sets for this project are provided by Starbucks & Udacity in three files. To gain insights from these data sets, we would want to combine them and then apply data analysis and modeling techniques to them. To do so, I separated the offer data from the transaction data (event = transaction). The first Starbucks opened in Russia in 2007. Thus I wrote a function for categorical variables whose order does not need to be considered. I preprocessed the data to ensure it was appropriate for the predictive algorithms.
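The function for unordered categorical variables mentioned above is not shown in the text. One common way to handle such variables is one-hot encoding, sketched below in plain Python. This is an assumption about the approach, not the author's implementation; the `one_hot` helper and the sample records are hypothetical.

```python
def one_hot(records, column):
    """Replace `column` in each record with one 0/1 indicator per category.

    Suitable for variables such as gender, where the numeric order of an
    integer encoding would carry no meaning.
    """
    categories = sorted({row[column] for row in records})
    encoded = []
    for row in records:
        # Copy every field except the categorical one being expanded.
        new_row = {key: val for key, val in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Hypothetical profile rows for illustration only.
profile = [
    {"age": 55, "gender": "F"},
    {"age": 40, "gender": "M"},
    {"age": 33, "gender": "O"},
]

encoded = one_hot(profile, "gender")
for row in encoded:
    print(row)
```

An ordered variable such as the membership year would be left as a number instead, because there the order does carry information.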
Here are the five business questions I would like to address by the end of the analysis. Another reason is linked to the first one: it is about the scope. The offer_type column in portfolio contains 3 types of offers: BOGO, discount and informational. In addition, we can set a rule that only if there is a 70%+ chance that a customer will waste an offer do we consider withdrawing it. Here is the information about the offers, sorted by how many times they were used without being noticed. This shows that there are more men than women in the customer base. It will be interesting to see how customers react to informational offers and whether the advertisement or the informational offer also helps the performance of BOGO and discount. Starbucks also analyzes data captured by its mobile app, which customers use to pay for drinks and accrue loyalty points. I wonder if this skews results towards a certain demographic. Information: for the information type we get a significant drift from what we had with BOGO and discount type offers. From the "Average offer received by gender" plot, we see that the average number of offers received per person is nearly the same across genders. The company also logged 5% global comparable-store sales growth.
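The 70% rule described above can be sketched as a simple threshold on the predicted probability that an offer is wasted. The probabilities, customer names and the `should_withdraw` helper below are hypothetical; in practice the probabilities would come from whatever classifier is being used.

```python
WITHDRAW_THRESHOLD = 0.70  # the "70%+ chance" mentioned in the text


def should_withdraw(p_wasted, threshold=WITHDRAW_THRESHOLD):
    """Flag an offer for possible withdrawal when waste is likely enough."""
    return p_wasted >= threshold


# Hypothetical predicted probabilities that each customer wastes the offer.
predictions = {
    "customer_a": 0.82,
    "customer_b": 0.55,
    "customer_c": 0.70,
}

to_review = [name for name, p in predictions.items() if should_withdraw(p)]
print(to_review)
```

Raising the threshold trades fewer wrongly withdrawn offers for more wasted ones, so the cut-off is a business decision as much as a modeling one.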
Answer: The discount offer is more popular because not only does it have a slightly higher number of completed offers in absolute terms, it also has a higher overall completed/received rate (~7%). The data was created to get an overview of the following things: rewards program users (17000 users x 5 fields) and offers sent during the 30-day test period (10 offers x 6 fields). We see that PC0 is significant. Starbucks Offer Dataset: Udacity Capstone, by Linda Chen, Towards Data Science. portfolio.json contains offer ids and meta data about each offer (duration, type, etc.). First I started with hand-tuning an RF classifier and achieved reasonable results. The information accuracy is very low. The downside is that the accuracy of a larger dataset may be higher than for smaller ones. An offer can be merely an advertisement for a drink or an actual offer such as a discount or BOGO (buy one, get one free). When it reported fiscal 2023 first-quarter financial results on Feb. 2, Starbucks (NASDAQ: SBUX) disappointed Wall Street. Let's first take a look at the data. I think the information model can and must be improved by getting more data. Type-1: These are the ideal consumers.
Being able to answer those questions means I could clearly identify the group of users who show such behavior and make some educated guesses about why. Comparing the 2 offers, women use BOGO slightly more while men use discount more. time (int): time in hours since the start of the test. Starbucks, one of the world's most popular coffee chains, frequently provides offers to its customers through its rewards app to drive more sales. The following figure summarizes the different events in the event column. Income also doesn't play as big a role, so it might be an indicator that people of both higher and lower income utilize this type of offer. Performed an exploratory data analysis on the datasets. PC1 to PC4 also account for the variance in the data, whereas PC5 is negligible. The profile.json data is the information of 17000 unique people. To be explicit, the key success metric is whether I had a clear answer to all the questions that I listed above. So, in conclusion, to answer: what is the spending pattern based on offer type and demographics?
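Because the `value` field means different things depending on `event`, a first processing step is usually to split the transcript into offer events and transactions. The sketch below shows one way to do that on plain dictionaries. It rests on several assumptions that are not stated in the text: that each record carries a `person` field, that the dictionary key is `offer id` (with `offer_id` accepted as a fallback spelling), and the sample records themselves are invented.

```python
def split_events(transcript):
    """Split transcript records into offer events and transactions."""
    offers, transactions = [], []
    for row in transcript:
        value = row["value"]
        if row["event"] == "transaction":
            # For transactions, `value` holds the amount spent.
            transactions.append(
                {"person": row["person"], "time": row["time"],
                 "amount": value["amount"]}
            )
        else:
            # For offer events, `value` holds the offer id.
            # Assumption: the key may be spelled "offer id" or "offer_id".
            offer_id = value.get("offer id", value.get("offer_id"))
            offers.append(
                {"person": row["person"], "time": row["time"],
                 "event": row["event"], "offer_id": offer_id}
            )
    return offers, transactions


# Invented records for illustration only.
transcript = [
    {"person": "p1", "event": "offer received", "time": 0,
     "value": {"offer id": "o1"}},
    {"person": "p1", "event": "offer viewed", "time": 6,
     "value": {"offer id": "o1"}},
    {"person": "p1", "event": "transaction", "time": 12,
     "value": {"amount": 9.5}},
    {"person": "p1", "event": "offer completed", "time": 12,
     "value": {"offer_id": "o1", "reward": 2}},
]

offers, transactions = split_events(transcript)
print(len(offers), "offer events;", len(transactions), "transactions")
```

Keeping `time` on both outputs makes it possible to later check whether a view happened before a completion for the same person and offer.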
2017 seems to be the year when folks of both genders participated heavily in the campaign. Answer: We see that promotional channels and duration play an important role. Also, since the campaign is set up so that there is no correlation between sending out offers to individuals and the type of offers they receive, we benefit from this separation, and hopefully our ML models do too. I thought this was an interesting problem. Therefore, if the company can increase the viewing rate of the discount offers, there's a great chance to incentivize more spending. Similarly, we merge the portfolio dataset as well. A sneak peek of the final data after being cleaned and analyzed: the data contains information about 8 offers sent to 14,825 customers who made 26,226 transactions while completing at least one offer. However, for other variables, like gender and event, the order of the numbers does not matter. As we can see, in general, female customers earn more than male customers. Second Attempt: But it may improve through GridSearchCV(). In this capstone project, I was free to analyze the data in my own way. Starbucks Rewards loyalty program 90-day active members in the U.S. increased to 24.8 million, up 28% year-over-year. Full-year fiscal 2021 highlights: global comparable store sales increased 20%, primarily driven by a 10% increase in average ticket and a 9% increase in comparable transactions. We merge transcript and profile data over the offer_id column so we get individuals (anonymized) in our transcript dataframe. So, we have failed to significantly improve the information model. One important step before modeling was to get the label right.
Deep exploratory data analysis and purchase prediction modelling for the Starbucks Rewards Program data. Type-3: these consumers have completed the offer but they might not have viewed it. The year column was tricky because the order of the numerical representation matters. Here is the schema and explanation of each variable in the files. We start with portfolio.json and observe what it looks like. Age and income seem to be significant factors. So, in this blog, I will try to explain what I did.
The Reward Program is available on mobile devices as the Starbucks app, and has seen impressive membership and growth since 2008, with multiple iterations on its original form. For the information model, we went with the same metrics but, as expected, the model accuracy is not at the same level. U.S. same-store sales increased by 22% in the quarter, and rose 11% on a two-year basis. Therefore, I did not analyze the information offer type. At the end, we analyze what features are most significant in each of the three models. The completion rate is 78% among those who viewed the offer. Let's look at the next question. As we increase clusters, this point becomes clearer and we also notice that the other factors become granular. As we can see, the age data is nearly a Gaussian distribution (slightly right-skewed) with 118 as an outlier, whereas the income data is right-skewed. Of the two dummy models, one guessed randomly and the other always chose the majority class; one had a 51% accuracy score and the other had a 57% accuracy score. We see that not many older people are responsive in this campaign. Age also seems to be similarly distributed, and membership tenure doesn't seem to be too different either. We combine and move around datasets to give us insights into the data and make it useful for the analyses we want to do afterwards. Finally, I wanted to see how the offers influence a particular group of people. I found a data set on Starbucks coffee, and got really excited. We also do a brief k-means analysis beforehand.
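The two dummy baselines mentioned above can be sketched in a few lines of plain Python: one guesses a label at random, the other always predicts the majority class seen in training. The helper names and the tiny label lists are hypothetical and are not the data behind the 51% and 57% figures.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for a, p in zip(y_true, y_pred) if a == p)
    return correct / len(y_true)


def majority_baseline(y_train, y_test):
    """Always predict the most frequent label in the training set."""
    majority = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority] * len(y_test))


def random_baseline(y_test, seed=0):
    """Predict 0 or 1 uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return accuracy(y_test, [rng.choice([0, 1]) for _ in y_test])


# Made-up labels for illustration only.
y_train = [1, 1, 1, 0, 0]
y_test = [1, 0, 1, 1]

majority_acc = majority_baseline(y_train, y_test)
random_acc = random_baseline(y_test)
print("majority baseline accuracy:", majority_acc)
print("random baseline accuracy:", random_acc)
```

A real model is only worth keeping if it clearly beats both of these numbers on the same test split.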
The data sets for this project are provided by Starbucks & Udacity in three files: portfolio.json, containing offer ids and meta data about each offer (duration, type, etc.). I decided to investigate this. Starbucks attributes 40% of its total sales to the Rewards Program and has seen same-store sales rise by 7%. On average, Starbucks has opened two new stores every day since 1987. Its top competitor, Dunkin, has 10,132 stores in the US as of April 2020. In 2019, the market for the US coffee shop industry reached $47.5 billion. The industry grew by 3.3% year-on-year. Please do not hesitate to contact me. In both graphs, red (N) represents offers that were not completed (only viewed or received) and green (Yes) represents completed offers. In particular, higher-than-average age and lower-than-average income. However, for information-type offers, we need to take into account the offer validity. Answer: For both offers, men have a significantly lower chance of completing it. Unbeknown to many, Starbucks has invested significantly in big data and analytics capabilities in order to determine the potential success of its stores and products, and grow sales. We will also try to segment the dataset into these individual groups. The USA ranks 11th among the countries with the highest caffeine consumption, with a rate of 200 mg per person per day. I used the default l2 for the penalty. We will discuss this at the end of this blog. Let us look at the provided data.
For the confusion matrix, the number of False Positives (~15%) was higher than the number of False Negatives (~14%), meaning that the model is more likely to make mistakes on the offers that will not be wasted in reality. At present, the CEO of Starbucks is Kevin Johnson, and there are approximately 23,768 locations globally. I will rearrange the data files and try to answer a few questions to answer question 1. profile.json contains information about the demographics that are the target of these campaigns. The goal of this project was not defined by Udacity.