# VisualizationforKFC.py
import streamlit as st
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
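# Load the three worksheets of the KFC workbook.
file = 'KFC.xlsx'
sheet = 'Clean_data'  # main cleaned dataset
df = pd.read_excel(file, sheet_name=sheet)
sheet = 'Addon'  # add-on items and their counts
df2 = pd.read_excel(file, sheet_name=sheet)
sheet = 'Promotion'  # promotion data
df3 = pd.read_excel(file, sheet_name=sheet)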
# st.subheader('Max Budget Distribution by occupation and budget')
# st.bar_chart(df,x='occupation',y="budget",color="budget")
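st.subheader('Max Budget Distribution by Occupation')
# Count the rows for each occupation / budget-range combination.
crosstab = pd.crosstab(df['occupation'], df['budget'])
# Grouped bars: one cluster per occupation, one bar per budget range.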
fig, ax = plt.subplots()
crosstab.plot(kind='bar', stacked=False, ax=ax)
ax.set_xlabel("Occupation")
ax.set_ylabel("Count")
st.pyplot(fig)
st.write("""
This chart shows how maximum spending budgets are distributed across occupation groups: students, staff, teaching assistants (TA), and others.
Students have the highest counts in every budget range. Most students fall in the 100–199 range, followed by 200–299, with fewer in 300+ and below 100, which suggests that students generally prefer moderate spending levels.
Staff counts are much smaller but also concentrate in the 100–199 range, with a few individuals in the higher and lower budget categories.
The TA group has very low counts in every budget range, with slightly more in the 300+ range.
The others category has very few individuals, most of them in the 200–299 range.
In summary, the mid-range budget (100–199) is the most common overall, especially among students, while higher and lower budget ranges are less frequent.
""")
st.divider()
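# A minimal sketch, not part of the original app: the chart above plots raw
# counts, so small groups (TA, others) are hard to compare with students.
# Row-normalising the same crosstab gives each group's share per budget range.
# `budget_share_by_group` is a hypothetical helper name introduced here for
# illustration only.
import pandas as pd


def budget_share_by_group(frame, group_col, budget_col):
    """Return the percentage of each group's rows that fall in each budget range."""
    # normalize='index' divides every row of the crosstab by its row total.
    shares = pd.crosstab(frame[group_col], frame[budget_col], normalize='index')
    return (shares * 100).round(1)


# Possible usage (assumption: the same columns as the chart above):
# st.dataframe(budget_share_by_group(df, 'occupation', 'budget'))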
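st.subheader('Demographic Spending by Age')
# Count the rows for each age group / budget-range combination.
crosstab = pd.crosstab(df['age'], df['budget'])
fig, ax = plt.subplots()
crosstab.plot(kind='bar', stacked=False, ax=ax)
ax.set_xlabel("Age group")
ax.set_ylabel("Count")
st.pyplot(fig)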
st.write("""
This chart shows how spending budgets are distributed across age groups.
The 18–22 age group has the highest counts in most budget ranges, especially 100–199, which is the highest among all groups. They are also well represented in the 200–299 range, while fewer choose budgets above 300 or below 100.
The 23–27 group follows a similar pattern at lower counts. Most of them are also in the 100–199 range, with moderate counts in the lower and mid ranges.
For the 28–35 group, the distribution is more balanced. Unlike the younger groups, they have a slightly higher share in the 300+ range, suggesting that some individuals in this group tend to spend more.
The above 35 group has only small counts in each budget range.
The under 18 group has very few individuals, mainly in the mid-range budgets.
Overall, younger groups, especially 18–22, account for most of the counts and prefer mid-range budgets, while older groups appear in smaller numbers.
""")
st.divider()
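st.subheader('Demographic Spending by Nationality')
# Count the rows for each nationality / budget-range combination.
crosstab = pd.crosstab(df['nationality'], df['budget'])
fig, ax = plt.subplots()
# Stacked bars: one bar per nationality, split by budget range.
crosstab.plot(kind='bar', stacked=True, ax=ax)
ax.set_xlabel("Nationality")
ax.set_ylabel("Count")
st.pyplot(fig)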
st.write("""
This chart shows how spending budgets are distributed across nationalities.
Myanmar has the highest number of people in every budget range, most of them in the 100–199 range, followed by 200–299 and some in 300+.
Thailand is also noticeable, mainly in the 100–199 range, with smaller numbers in the other ranges.
India, China, and Sri Lanka have fewer individuals, mostly in the mid-range budgets.
Most other nationalities have only a few individuals, spread across different budget levels.
Overall, the counts are concentrated in Myanmar, while other nationalities contribute smaller numbers, mainly in mid-range budgets.
""")
st.divider()
st.subheader('Most Popular Main Menu')
# A histogram on a categorical y-axis counts the rows per menu category (horizontal bars).
fig_histogram = px.histogram(df, y="menuCategory")
st.plotly_chart(fig_histogram)
st.write("""
This horizontal bar chart compares the popularity of menu categories by order count.
Chicken is by far the most popular main menu choice, with a count of 100,
well ahead of every other category. Burgers are the second most preferred option,
while rice bowls, sides/drinks, and snack/sweets show much lower demand. The chart
identifies chicken as the primary driver of sales among the available menu categories.
""")
st.divider()
st.subheader('Add-on Popularity')  # Same with pieChart
# With both x and y given, px.histogram sums Count for each Addon (horizontal bars).
fig_histogram = px.histogram(df2, x="Count", y="Addon")
st.plotly_chart(fig_histogram)
st.write("""
This horizontal bar chart shows the total count for each add-on item, allowing a direct comparison
of their popularity. French fries are the most frequently selected add-on by a significant
margin, followed by beverages and egg_tarts. Mashed potatoes and chick_n_roll
show moderate demand, while tuna_corn_salad and shrimp_donut are the least popular options.
""")
st.divider()
st.subheader('Order Type Distribution')
ordertype_count = df['orderType'].value_counts().reset_index()
ordertype_count.columns = ['orderType', 'count']
# Pie chart
fig = px.pie(ordertype_count, names='orderType', values='count')
st.plotly_chart(fig)
st.write("""
This pie chart breaks down orders by type. Individual orders
are the most frequent at 40.5%, closely followed by promotion-based orders at 38.1%.
Together, these two categories dominate the distribution. Group orders account for 19%
of the total, while snack_sharing is the smallest slice at only 2.38%.
This suggests that most customers either prefer ordering individually or are
strongly influenced by promotional offers.
""")
st.divider()
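# A minimal sketch, not part of the original app: the percentages quoted for
# the pie chart are each order type's share of all rows. The same figures can
# be computed directly, which is useful for checking the text against the data.
# `order_type_shares` is a hypothetical helper name introduced here for
# illustration only.
import pandas as pd


def order_type_shares(order_types):
    """Return each order type's share of all rows, in percent, largest first."""
    # value_counts(normalize=True) returns fractions sorted in descending order.
    return order_types.value_counts(normalize=True).mul(100).round(2)


# Possible usage (assumption: df['orderType'] as loaded at the top of the file):
# st.dataframe(order_type_shares(df['orderType']))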
st.subheader('Order Method by Age Group')
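# Count the rows for each age group / order-method combination.
crosstab = pd.crosstab(df['age'], df['orderMethod'])
fig, ax = plt.subplots()
crosstab.plot(kind='bar', stacked=False, ax=ax)
ax.set_xlabel("Age group")
ax.set_ylabel("Count")
st.pyplot(fig)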
st.write("""
This chart shows the number of orders by age group and ordering method. Customers aged 18–22 place the most orders through every method, especially kiosks, followed by apps and counters. The 23–27 group also prefers kiosks, with fewer app and counter orders. For ages 28–35, orders are fewer, with apps used slightly more than kiosks and counters.
Customers above 35 mainly use the counter, with little to no use of apps or kiosks. The under 18 group places the fewest orders, with a small preference for apps, followed by counters and very few kiosk orders.
Overall, younger customers tend to prefer kiosks and apps, while older customers rely more on the counter.
""")
st.divider()