Leetcode 1174: Immediate Food Delivery II

Description

+-----------------------------+---------+
| Column Name                 | Type    |
+-----------------------------+---------+
| delivery_id                 | int     |
| customer_id                 | int     |
| order_date                  | date    |
| customer_pref_delivery_date | date    |
+-----------------------------+---------+
delivery_id is the primary key of this table.
The table holds information about food delivery to customers that make orders at some date and specify a preferred delivery date (on the same order date or after it).

If the customer’s preferred delivery date is the same as the order date, then the order is called immediate; otherwise, it is called scheduled.

The first order of a customer is the order with the earliest order date that the customer made. It is guaranteed that a customer has precisely one first order.

Write an SQL query to find the percentage of immediate orders in the first orders of all customers, rounded to 2 decimal places.

The query result format is in the following example.

Example 1:
Input: 
Delivery table:
+-------------+-------------+------------+-----------------------------+
| delivery_id | customer_id | order_date | customer_pref_delivery_date |
+-------------+-------------+------------+-----------------------------+
| 1           | 1           | 2019-08-01 | 2019-08-02                  |
| 2           | 2           | 2019-08-02 | 2019-08-02                  |
| 3           | 1           | 2019-08-11 | 2019-08-12                  |
| 4           | 3           | 2019-08-24 | 2019-08-24                  |
| 5           | 3           | 2019-08-21 | 2019-08-22                  |
| 6           | 2           | 2019-08-11 | 2019-08-13                  |
| 7           | 4           | 2019-08-09 | 2019-08-09                  |
+-------------+-------------+------------+-----------------------------+
Output: 
+----------------------+
| immediate_percentage |
+----------------------+
| 50.00                |
+----------------------+
Explanation: 
The customer id 1 has a first order with delivery id 1 and it is scheduled.
The customer id 2 has a first order with delivery id 2 and it is immediate.
The customer id 3 has a first order with delivery id 5 and it is scheduled.
The customer id 4 has a first order with delivery id 7 and it is immediate.
Hence, half the customers have immediate first orders.

Solution 1 (with CTE and window function)

Explanation

The following is the initial solution I came up with, using CTE and window function. The choice of CTE is obvious, and the reason for the window function is that I want to aggregate by the order_date column to find the first orders, but I still need to the customer_pref_delivery_date column which is not aggregated.

The immediate_percentage being asked is essentially just the average of the logical statement customer_pref_delivery_date = order_date (aliased to immed). In SQL, true is 1 and false is 0.

Originally I had many other columns in the first_orders CTE, but I cleaned up by removing all the unused columns to have the following:

WITH first_orders AS (
SELECT 
    RANK() OVER(PARTITION BY customer_id ORDER BY order_date) AS order_rank,
    customer_pref_delivery_date = order_date AS immed
FROM delivery
)
SELECT ROUND(AVG(first_orders.immed) * 100,2) AS immediate_percentage
FROM first_orders
WHERE order_rank = 1;

Solution 2 (with subquery in FROM statement)

Explanation

I thought to myself that I should be able to easily “convert” the CTE into a subquery in the FROM statement. So I did. But I consider the query with CTE a bit better because it is a little more readable.

SELECT
  ROUND(
    AVG(
      order_date=customer_pref_delivery_date
    ) * 100, 2) AS immediate_percentage
FROM (
  SELECT *,
  RANK() OVER(PARTITION BY customer_id ORDER BY order_date) AS order_rank
  FROM delivery
) AS d_ranked
WHERE order_rank = 1;

Solution 3 (with subquery in WHERE statement)

Explanation

I also tried to think about whether there is another way to check whether an order is the first order. So I came up with writing a subquery in the WHERE statement using the IN function of SQL. The subquery returns a table of customer_id and the order_date of the first order. So the WHERE statement filters the delivery table so that it will only have all the first orders.

SELECT 
  ROUND(
  AVG(order_date = customer_pref_delivery_date) *
  100, 2) AS immediate_percentage
FROM delivery
WHERE (customer_id, order_date) IN (
  SELECT 
    customer_id,
    MIN(order_date) AS first_order_date
  FROM delivery
  GROUP BY customer_id
);

Solution (by YouTuber Ms Emma Ding)

Explanation

She uses CTE and a subquery in the SELECT statement, but her approach does not require any window function. She also calculates the immediate_percentage differently.

WITH first_order AS (
  SELECT
    customer_id
  FROM Delivery
  GROUP BY customer_id
  HAVING MIN(order_date) = 
    MIN(customer_pref_delivery_date)
)

SELECT
  ROUND(
    CAST(COUNT(customer_id) AS DECIMAL) /
    (
      SELECT COUNT(DISTINCT customer_id)
     FROM DELIVERY
    ) * 100,
    2) AS immediate_percentage
FROM first_order;

You can see her video explaining the solution here.