Live schema:
Businesses:
| Field | Value | Important or not |
|---|---|---|
| id | 1 | Yes |
| network_id | 1 | Yes |
| client_id | 1 | Yes |
| business_code | 1397 | Yes |
| timezone | America/Los_Angeles | Conditional |
| name | Growthzilla Inc. | Yes |
| phone | 5623623456 | No |
| support@growthzilla.com | No | |
| address | 4842 Wiggin St. | No |
| city | Oak Park | Yes |
| state | CA | Yes |
| zipcode | 91377 | Yes |
| latitude | Empty | Conditional |
| longitude | Empty | Conditional |
| place_id | Empty | No |
| office_hours | {“enabled”:false,“hours”:[{“day”:“sunday”,“start”:“08:00”,“end”:“20:00”,“closed”:false},{“day”:“monday”,“start”:“08:00”,“end”:“20:00”,“closed”:false},…]} | No |
| logo | https://growthzilla.com/img/growthzilla-logo.png | No |
| description | Empty | No |
| mod_waitlist | 1 | No |
| mod_timesheet | 1 | No |
| mod_payment | 1 | Yes |
| active | 2 | Yes |
| password | Empty | No |
| salt | Empty | No |
| status | ACTIVE | Yes |
| loginattempts | 0 | No |
| invoice_items | Empty | Conditional |
| reportsettings | Empty | No |
| stripe_id | Empty | Conditional |
| subscription_id | Empty | Conditional |
| created_at | November 21, 2014, 6:15 AM | Yes |
| updated_at | August 17, 2022, 3:49 PM | Yes |
| websites | [“https://growthzilla.com”] | No |
| suspended_after | Empty | No |
| country | Empty | Yes |
Checkins:
| Field | Value | Important or Not |
|---|---|---|
| id | 8 | Yes |
| subscriber_id | 3 | Yes |
| business_id | 3 | Yes |
| network_id | 3 | Yes |
| employee_id | 12 | Yes |
| queue_id | Empty | Conditional |
| billamount | 1,000 | Yes |
| paidamount | 0 | Yes |
| cashamount | 0 | Yes |
| creditamount | 0 | Yes |
| giftamount | 0 | Conditional |
| tipamount | 0 | Yes |
| cashtipamount | 0 | Yes |
| cashbackamount | 0 | Conditional |
| guestamount | 0 | Conditional |
| taxamount | 0 | Yes |
| feesamount | 0 | Yes |
| discountamount | 0 | Yes |
| services | Empty | Conditional |
| paidforothers | Empty | Conditional |
| payee | Empty | No |
| split | 0 | Conditional |
| points_earned | 1 | Yes |
| manual_points | 0 | Conditional |
| points_redeemed | 0 | Yes |
| withcoupon | Empty | Conditional |
| withmultiplecoupons | Empty | No |
| withfees | Empty | No |
| refund_id | Empty | Conditional |
| checkout_day | November 25, 2014 | Yes |
| orig_checkout_day | Empty | No |
| prepaid | 0 | Yes |
| isnew | 1 | Yes |
| created_at | November 26, 2014, 5:51 AM | Yes |
| updated_at | Empty | Yes |
| deleted_at | Empty | Yes |
Saleslogs:
| Field | Value | Important or Not |
|---|---|---|
| id | 1 | Yes |
| business_id | 68 | Yes |
| checkin_id | 458,824 | Yes |
| product_id | Empty | Conditional |
| service_id | 2,153 | Yes |
| price | 15,600 | Yes |
| commissionprice | Empty | Conditional |
| discountprice | Empty | Conditional |
| discount | 0 | Yes |
| adjustment | 0 | Conditional |
| quantity | 1 | Yes |
| itemproperties | Empty | No |
| employees | [358] | Yes |
| created_at | March 31, 2016, 5:56 AM | Yes |
| updated_at | March 31, 2016, 5:56 AM | Yes |
| prepaids | Empty | Conditional |
| campaign_id | Empty | Conditional |
| cashprice | Empty | Conditional |
| cashdiscount | Empty | Conditional |
Note: Both tables handle financial aspects but from different angles. Checkins focus on the overall transaction (total bill, discounts, payments), wile Saleslogs dive into specifics of each sale (service or product sold, price, discounts applied)
There is a possibility that a customer checks in but no sales are recorded with that checkin. This situation can occur for several reasons, reflecting the diverse purposes of check-ins across different business models.
Appointments:
| Field | Value | Important or Not |
|---|---|---|
| id | 00003912-1bc2-4b41-8923-103356cd4386 | Yes |
| appointment_bid | 491d5d34-63fd-4e63-a120-98d7ec195cbc | No |
| business_id | 150 | Yes |
| queue_id | Empty | Conditional |
| appointment_day | August 19, 2020 | Yes |
| profile_id | Empty | Conditional |
| phone | 18185552462 | No |
| sbashyalcfg@example.com | No | |
| name | Shishir Bashyal | No |
| self | 0 | No |
| order_id | Empty | No |
| appointmenttype | NORMAL | Yes |
| status | CONFIRMED | Yes |
| dueamount | 899 | Yes |
| discount | 0 | Yes |
| coupon | Empty | Conditional |
| textok | 0 | No |
| repeat | 0 | No |
| repeatsettings | Empty | No |
| starttime | August 19, 2020, 9:15 PM | Yes |
| endtime | August 19, 2020, 9:30 PM | Yes |
| taskendtime | August 19, 2020, 9:30 PM | No |
| duration | 15 | Yes |
| sourceips | Empty | No |
| confirmationtype | No | |
| sendconfirmation | 0 | No |
| app | Empty | No |
| desktopapp | 0 | No |
| kb | 0 | No |
| domain | Empty | No |
| source | Empty | Conditional |
| scheduledby | self | No |
| remindertime | Empty | No |
| reminderstatus | Empty | No |
| nopreference | 1 | No |
| price_adjustment | 0 | Conditional |
| time_adjustment | 0 | No |
| invitation_id | Empty | No |
| created_day | August 18, 2020 | Yes |
| created_at | August 18, 2020, 7:04 PM | Yes |
| updated_at | August 18, 2020, 7:04 PM | Yes |
| deleted_at | Empty | Yes |
| | | |
Fact transactions:
| Column Name | Source Table | Source column name | |
|---|---|---|---|
| transaction_id | checkins | id | done |
| business_id | Businesses | done | |
| network_id | Businesses | done | |
| subscriber_id | checkins | subscriber_id | done |
| cashier_id | checkins | employee_id | done |
| employee_id | saleslogs | done | |
| appointment_id | appointments | ||
| saleslog_id | saleslogs | done | |
| service_id | saleslogs | done | |
| product_id | saleslogs | done | |
| checkout_day | checkins | checkout_day | done |
| booked_at | appointments | ||
| queue_id | checkins | queue_id | |
| prepaid_id | saleslogs | does not exist | |
| coupon_id | saleslogs | does not exist | |
| paidamount | checkins | paidamount | done |
| cashamount | checkins | cashamount | done |
| creditamount | checkins | creditamount | done |
| giftamount | checkins | giftamount | done |
| guestamount | checkins | guestamount | done |
| cashbackamount | checkins | cashbackamount | done |
| cashtipamount | checkins | cashtipamount | done |
| taxamount | checkins | taxamount | done |
| feesamount | checkins | feesamount | done |
| discountamount | checkins | discountamount | done |
| cashprice | saleslogs | done | |
| cashdiscount | saleslogs | done | |
| quantity | saleslogs | done | |
| tipamount | checkins | tipamount | done |
| points_earned | checkins | points_earned | done |
| points_redeemed | checkins | points_redeemed | done |
| manual_points | checkins | manual_oints | |
| payee_id | checkins | payee | |
| guest_subscriber_id | checkins | paidforothers | |
| is_prepaid | checkins | prepaid | done |
| is_new | checkins | isnew | done |
| refund_id | checkins | refund_id | |
| created_at | (multiple) | ||
| updated_at | (multiple) |
Facts Table
FACT1: Transactions
-
transaction_id ⇒ checkins.id
-
checkout_day ⇒ chcekins.checkout_day
-
queue_id
-
subscriber_id
-
business_id ⇒ b.business_code
-
network_id ⇒ b.network_id
-
saleslog_id ⇒ saleslogs.id
-
service_id ⇒ saleslogs.service_id
-
product_id ⇒ saleslogs.product_id
-
employe_id ⇒ saleslogs.employees
-
cashier_id ⇒ checkins.employee_id
-
coupon_id ⇒ salelogs.campaing_id (/coupon_id)
-
prepaid_id ⇒ sales.log
-
paidamount (all these from checkins)
-
cashamount
-
creditamount
-
giftamount
-
guestamount
-
cashbackamount
-
cashtipamount
-
taxamount
-
feesamount
-
discountamount
-
cashprice
-
cashdiscount
-
quantity
-
tipamount
-
points_earned
-
points_redeemed
-
manual_points
-
payee_id
-
guest_subscriber_id ⇒ checkins.paidforothers
-
is_prepaid
-
is_new
-
refund_id
-
appointment_id ⇒ appointments.id
-
booked_at ⇒ appointsments.booked_date
-
booked_source ⇒ appointmens…
-
appointment ??
-
created_at
-
updated_at
select
-- -- --
c.id as transaction_id,
-- -- --
b.business_code as business_id,
b.network_id,
-- -- --
c.subscriber_id,
-- -- --
c.employee_id as cashier_id,
sl.employees,
-- -- --
sl.id AS saleslog_id,
sl.service_id,
c.services,
sl.product_id,
sl.price,
sl.quantity,
sl.discount,
sl.discountprice,
sl.cashprice,
sl.cashdiscount,
sl.commissionprice,
sl.prepaids,
sl.created_at,
sl.updated_at,
-- -- --
c.billamount,
c.paidamount,
c.cashamount,
c.creditamount,
c.tipamount,
c.cashtipamount,
c.giftamount,
c.guestamount,
c.feesamount,
c.discountamount,
c.cashbackamount,
c.points_earned,
c.points_redeemed,
c.taxamount,
c.prepaid,
c.isnew,
c.checkout_day
-- -- --
from
checkins c
join
businesses b ON b.id = c.business_id
left join
saleslogs sl ON sl.checkin_id = c.id
left join
appointments ap ON ap.queue_id = c.queue_id
order by
c.id desc
limit
100;Dimensions Tier
- Business
- Network
- Region
- Management
- User
- Customer
- Customer ? (Parent/Family)
- Day
- Periods
-
Service - Category
-
Product - Category - Brand
-
Prepaid
-
Coupon/Campaign
-
Staff
FACT2: Appointments
…appointments
FACT3: Staff
…employees + timesheets
FACT4: Operation
…queues + appointments
FACT5: Membership+Prepaid
…giftcards…membership
(check schema in dbdiagram)
The retail schema tells us in exquisite detail what was purchased at each store and under what conditions. However, the schema doesn’t allow us to very easily analyze which products were sold in the same market basket together. This notion of analyzing the combination of products that sell together is known by data miners as affinity grouping but more popularly is called market basket analysis. Market basket analysis gives the retailer insights about how to merchandise various combinations of items. If frozen pasta dinners sell well with cola products, then these two products perhaps should be located near each other or marketed with complementary pricing. The concept of market basket analysis can be extended easily to other situations. In the manufacturing environment, it is useful to see what products are ordered together because we may want to offer product bundles with package pricing.
The retail sales fact table cannot be used easily to perform market basket analyses because SQL was never designed to constrain and group across line item fact rows. Data mining tools and some OLAP products can assist with market basket analysis, but in the absence of these tools, we’ll describe a more direct approach below. Be forewarned that this is a rather advanced technique; if you are not doing market basket analysis today, simply skim this section to get a general sense of the techniques involved.
We illustrate a market basket fact table that was derived from retail sales transactions. The market basket fact table is a periodic snapshot representing the pairs of products purchased together during a specified time period. The facts include the total number of baskets (customer tickets) that included products A and B, the total number of product A dollars and units in this subset of purchases, and the total number of product B dollars and units purchased.
The basket count is a semiadditive fact. For example, if a customer ticket contains line items for pasta, soft drinks, and peanut butter in the market basket fact table, this single order is counted once on the pasta-soft drinks fact row, once on the row for the pasta-peanut butter combination, and so on. Obviously, care must be taken to avoid summarizing purchase counts for more than one product.
WITH transactions AS (
SELECT *
FROM saleslogs
ORDER BY checkin_id DESC
LIMIT 1000
), basket_fact AS (
SELECT
t1.service_id AS service_A,
t2.service_id AS service_B
FROM
transactions t1
INNER JOIN transactions t2 ON t1.checkin_id = t2.checkin_id
WHERE
t1.service_id IS NOT NULL AND t2.service_id IS NOT NULL
), service_pairs AS (
SELECT
service_A,
service_B,
COUNT(*) AS pair_count
FROM
basket_fact
GROUP BY
service_A, service_B
), total_service_A AS (
SELECT
service_id AS service_A,
COUNT(DISTINCT checkin_id) AS total_service_A
FROM
transactions
GROUP BY
service_id
)
SELECT
sp.service_A,
sp.service_B,
sp.pair_count,
sp.pair_count * 100.0 / (SELECT COUNT(DISTINCT checkin_id) FROM transactions) AS support,
sp.pair_count * 100.0 / tsa.total_service_A AS confidence
FROM
service_pairs sp
JOIN
total_service_A tsa ON sp.service_A = tsa.service_A;