Shop Time and the Area of a Square

How Shipt Sets the Floor

Venchei Sanders
8 min read · Jul 6, 2020

How long does it take to buy just one item in a grocery store? Go ahead and take a guess, I’ll wait… What did you guess — ten, fifteen minutes? When conducting personal shopping, such a question may come to mind when you’re under a time crunch. Imagine it’s lunch time and you only have 40 minutes to run into the grocery store to grab dinner and snacks before you have to pick up the kids. So how long does it take to find one item in the grocery store? Is that item in the front or rear of the store? Do you need help from a store associate to find that one item?

I too wrestled with this question. Shipt is a grocery delivery company, and recently I was tasked with determining which data is useful for a machine learning model that estimates what we call “Shop Time”.

What is Shop Time?

Shop Time is the total time, from start to finish, needed to find the items in a store and check out. At Shipt, we deeply value our relationship with our members and shoppers. Surfacing shop time estimates allows a Shipt Shopper to plan better, work more efficiently, and provide an amazing experience for our members.

When developing a model, deciding which data to value is paramount. Data selected to train a model is important as our model predictions will be consumed by our shoppers via what we call the offer card.

The image below is an order offer card from the Shipt Shopper App. The 1 hour 15 minute estimate includes not only the shop time but also the drive time.

Order Offer Card

Identifying Outliers

If we choose data that was poorly captured, then our model is trained and partially influenced by false signals, resulting in less accurate shop time expectations for our shoppers. Let’s start the data selection process by determining reasonable lower bounds for shop time.

In physics, there’s a well-known mental model called First Principles Reasoning. Thinking via First Principles means starting from scratch, questioning every known assumption about a specific domain, and then building a new (hopefully innovative) solution from fresh thinking about those assumptions.

A strong analyst should have many tools in the toolbox, and reasoning from First Principles is one of them. To find a reasonable floor, we consider the most basic order type: the smallest possible order, with only one item. We’ll call the associated shop time for a single-item order the Minimum Shop Time.

So how long does it take to find only one item in a grocery store? If you guessed 10 minutes, then we need you to explain why. A data scientist once said to me, “In God we trust. All others must provide data.” The beauty of learning with data is that we can leverage data to prove such assertions.

In a perfect world, all data collected would correspond perfectly to the timestamps of the user’s interactions within the app. However, weather, buildings, and the associated natural interference lead to suspicious data being captured in the system.

We’ve observed outliers such as shop times that are negative, null, less than one minute, or — on the extreme right — 12 hours. To develop a model that reasonably represents the actual task of shopping, we had to consider the minimum time it takes and remove recorded data that may be incorrectly captured by the system.
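As a rough illustration, a first-pass sanity check on recorded shop times might look like the sketch below. The function name `is_suspicious` and the cutoffs (one minute on the low end, 12 hours on the high end) are assumptions chosen to mirror the outliers listed above; they are not Shipt's actual rules.

```python
from typing import Optional

# Assumption: treat anything at or beyond 12 hours as the extreme right tail.
MAX_PLAUSIBLE_MINUTES = 12 * 60


def is_suspicious(shop_time_minutes: Optional[float]) -> bool:
    """Return True for null, negative, sub-minute, or extremely long shop times."""
    if shop_time_minutes is None:
        return True
    if shop_time_minutes < 1.0:  # covers negative, zero, and under one minute
        return True
    return shop_time_minutes >= MAX_PLAUSIBLE_MINUTES


# Made-up recorded shop times, in minutes.
recorded = [-3.0, None, 0.4, 8.5, 22.0, 720.0]
flags = [is_suspicious(t) for t in recorded]
print(flags)  # [True, True, True, False, False, True]
```

A fixed one-minute cutoff is arbitrary, which is why a floor grounded in store size is more defensible.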

Below is the distribution of shop times at Shipt for single-item orders:

Note: Hundreds of zero or negative shop times were captured by our database.

Grocery Shopping Axioms and Theory

If you studied mathematics, you have probably heard of Euclid’s Elements, a set of books attributed to Euclid of Alexandria and dated to around 300 BC. In that era, the Elements was considered a bible of sorts for mathematics, and its treatment of number theory and geometry is still considered fundamental to a proper mathematics education. In the mathematical language of Euclid, understood truths are called Axioms and Postulates. Here we present our Axioms and Postulates for Grocery Shopping:

  1. An order for delivery has a minimum of one product and one item (Shipt refers to this as an order line — similar to a line item on a receipt)
  2. Shop times are positive real numbers
  3. As the number of products and the total items requested increase, so does the time needed to shop

Note: This article attempts to tighten axiom two by finding a reasonable lower bound on shop time.

We have data about the area (or size) of stores that we service. Moreover, we can leverage that information along with member feedback to develop a model that reasonably estimates the time it takes to shop a single-item order.

In order to meet or exceed delivery expectations, an experienced shopper will maximize their time by:

  1. Arriving at the store early
  2. When appropriate, communicating order issues quickly
  3. Delivering all requested and available products on time

Let’s suppose you’re a shopper who received our Minimum Shop Time order. Ok, you have an easy order. It’s only one item. What could go wrong? Well, what if the item is in the back of the store? How long might it take you to get there? How could you minimize this time?

Let’s assume the store’s footprint is a rectangle, specifically a square, and that the size (area) of the grocery store is known.

If you remember from grade school, the area of a square with side length s is:

$$A = s^2 \tag{1}$$

Furthermore, via the Pythagorean Theorem, the length of the diagonal is:

$$d = \sqrt{s^2 + s^2} = s\sqrt{2} \tag{2}$$

By taking the square root of (Equation 1),

$$s = \sqrt{A}, \qquad d = \sqrt{2A} \tag{3}$$

we now have a way to determine the length of the side of the store and, more importantly, the shortest possible distance through the store: a straight line!

Obviously, this violates the laws of reality: a shopper likely cannot traverse the diagonal of a grocery store. Still, it provides an important insight that we will leverage when we select the data for our model.
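The geometry can be sketched in a few lines of Python. The function names and the 10,000 square foot store are illustrative assumptions, and the square footprint is the simplification described above.

```python
import math


def store_side_ft(area_sq_ft: float) -> float:
    """Side length of a square store with the given floor area."""
    if area_sq_ft <= 0:
        raise ValueError("area must be positive")
    return math.sqrt(area_sq_ft)


def store_diagonal_ft(area_sq_ft: float) -> float:
    """Corner-to-corner distance of a square store: side * sqrt(2) = sqrt(2 * area)."""
    return store_side_ft(area_sq_ft) * math.sqrt(2.0)


# Hypothetical 10,000 sq ft store.
print(round(store_side_ft(10_000), 1))      # 100.0
print(round(store_diagonal_ft(10_000), 1))  # 141.4
```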

Finally, if there were a way to relate this distance to time, then we could evaluate our earlier guess.

Physics: Insert {Equations of Motion}…

In physics, equations of motion are equations that describe the behavior of a physical system in terms of its motion as a function of time (i.e., how quickly a Shipt Shopper can cross a given distance). Let’s leverage the relationship between rate r, time t, and distance d:

$$d = r \times t \quad\Longrightarrow\quad t = \frac{d}{r}$$

A human walks at roughly 3.1 miles per hour. Converting to feet per minute (3.1 × 5,280 feet per mile ÷ 60 minutes per hour), we get 272.8 feet per minute.
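The unit conversion is easy to verify. The helper name below is an assumption made for illustration.

```python
FEET_PER_MILE = 5280
MINUTES_PER_HOUR = 60


def mph_to_feet_per_minute(mph: float) -> float:
    """Convert a speed in miles per hour to feet per minute."""
    return mph * FEET_PER_MILE / MINUTES_PER_HOUR


pace = mph_to_feet_per_minute(3.1)
print(round(pace, 1))  # 272.8
```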

So what’s your point?

Since we know the distance to the rear of the store and our walking pace, we can now determine whether our shop time guess was reasonable.

We can now flag shop times in our data that are below the computed Minimum Shop Time threshold.
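A minimal sketch of that flagging step, assuming each order record carries a store area and a recorded shop time. The dictionary keys, function names, and sample orders are hypothetical, and the floor is taken to be a one-way walk along the store's diagonal at the pace above.

```python
import math

WALKING_PACE_FT_PER_MIN = 272.8  # 3.1 mph, as computed in the article


def minimum_shop_time_minutes(store_area_sq_ft: float) -> float:
    """Assumed floor: time to walk the diagonal of a square store once."""
    diagonal_ft = math.sqrt(2.0 * store_area_sq_ft)
    return diagonal_ft / WALKING_PACE_FT_PER_MIN


def flag_below_floor(orders):
    """Return the ids of orders whose recorded shop time is below the floor."""
    flagged = []
    for order in orders:
        floor = minimum_shop_time_minutes(order["store_area_sq_ft"])
        if order["shop_time_minutes"] < floor:
            flagged.append(order["order_id"])
    return flagged


# Made-up orders for illustration only.
orders = [
    {"order_id": "a", "store_area_sq_ft": 50_000, "shop_time_minutes": 0.5},
    {"order_id": "b", "store_area_sq_ft": 50_000, "shop_time_minutes": 6.0},
    {"order_id": "c", "store_area_sq_ft": 90_000, "shop_time_minutes": 1.5},
]
flagged_ids = flag_below_floor(orders)
print(flagged_ids)  # ['a', 'c']
```

Because the floor scales with store size, the same 1.5-minute shop time can be plausible in a small store and suspicious in a large one.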

Framework/Model Assumptions and Limitations

Our model is not perfect. It makes assumptions about a shopper’s ability to walk, locate an item, and check out. These assumptions are as follows:

  1. Our model assumes the worst possible case for an item’s location — an item is located in the rear corner of the store opposite the shopper’s entrance.
  2. The shortest possible path to the rear of the store can be traversed via the diagonal. A shopper’s ability to enter the store and walk to the rear via the diagonal (a straight line) represents the optimal path to the rear of the store.
  3. We assume there are minimal impediments to the shopper’s ability to walk at the average speed (an average human walking pace is assumed).
  4. Finally, we assume a shopper can locate an item that is stocked, pay the cashier, and walk out of the store unencumbered. At this point, we are leveraging the model only to gain insight about shop time.

Furthermore, despite violating the physical properties of a store (i.e. a shopper cannot walk through the aisles of a store via the diagonal), this model provides an insight into reasonable shop time expectations for a shopper who is fulfilling a single-item order. For all larger orders, we expect larger shop times. Thus, shop times recorded below the threshold for all order sizes will be marked as potential outliers.

Reflections and a Real Example

When attempting to model a complex world, sometimes it makes sense to keep your thinking simple. Here, we developed a simple model to approximate the time to shop a single-item order. We are now able to use a simple equation to flag, for removal from our training set, all orders with shop times below a reasonable threshold. Let’s end with an example:

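A typical small-format Target is about 40,000 square feet (a full-size store is about 130,000 sq ft). Dividing the store’s diagonal by the walking pace, we should expect the Minimum Shop Time at a small-format Target to be roughly

$$t = \frac{\sqrt{2 \times 40{,}000}}{272.8} \approx 1.04 \text{ minutes}$$

or about 62 seconds.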

We can now remove all data below these computed thresholds. Amazing how a simple model can have a profound impact on your thinking!
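The arithmetic for both Target formats can be reproduced with a short script. It assumes the floor is a single one-way walk along the diagonal of a square store at 3.1 mph; the names are illustrative.

```python
import math

WALKING_PACE_FT_PER_MIN = 3.1 * 5280 / 60  # 3.1 mph, about 272.8 ft/min

# Store sizes quoted in the article, in square feet.
STORE_AREAS_SQ_FT = {"small-format": 40_000, "full-size": 130_000}


def minimum_shop_time_minutes(area_sq_ft: float) -> float:
    """Assumed floor: one-way walk along the diagonal of a square store."""
    return math.sqrt(2.0 * area_sq_ft) / WALKING_PACE_FT_PER_MIN


thresholds = {
    name: minimum_shop_time_minutes(area)
    for name, area in STORE_AREAS_SQ_FT.items()
}

for name, minutes in thresholds.items():
    print(f"{name}: {minutes:.2f} min ({minutes * 60:.0f} s)")
# small-format: 1.04 min (62 s)
# full-size: 1.87 min (112 s)
```

Both floors sit far below the ten-to-fifteen-minute guess from the introduction, so they work as conservative filters rather than shop time estimates.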

If data modeling and predictions interest you, come join our team. We’re hiring at Shipt!

