At an airport there is a company that offers a specific service for travellers (there are no competitors offering the same service at the location). They performed an experiment - they temporarily reduced the price of the service (quite significantly) hoping they might achieve more sales. They have records about all sales for about 3 months before and 3 months after the reduction of the price and in addition they know how many travellers visited the airport every day within those 6 months.
The task is to compare two samples:
Sample 1: total sales $(x_{1,1}, x_{1,2}, ..., x_{1,m})$ per day for $m$ consecutive days before the price was reduced ($m$ is approximately 90, i.e. 3 months)
Sample 2: total sales $(x_{2,1}, x_{2,2}, ..., x_{2,n})$ per day for $n$ consecutive days after the price was reduced ($n$ is also approximately 90, i.e. 3 months)
They would like to know whether the price reduction influenced sales. From a graphical inspection of the data it looks as if it did not, but they would like to support this hypothesis with statistical arguments.
The first idea that came to my mind is that a variant of the two-sample $t$-test could be used in this case. But the problem is that the number of travellers visiting the airport is known for each day and it differs from day to day. So obviously if there are more people at the airport, sales are higher. Let's denote by $(y_{1,1}, y_{1,2}, ..., y_{1,m})$ the numbers of travellers visiting the airport on the days before the price reduction and by $(y_{2,1}, y_{2,2}, ..., y_{2,n})$ the numbers of travellers visiting the airport on the days after the price reduction. My idea is to "standardize" sales by dividing them by the numbers of travellers, i.e. $z_{i,j} = x_{i,j}/y_{i,j}$.
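To make the idea concrete, here is a minimal sketch of the standardization $z_{i,j} = x_{i,j}/y_{i,j}$ followed by a Welch (unequal-variance) two-sample $t$ statistic, using only the Python standard library. The function names `sales_per_traveller` and `welch_t` are my own, and the numbers are made up for illustration; they are not the airport data. Whether this test is actually appropriate here is exactly what I am asking below.

```python
from math import sqrt
from statistics import mean, variance


def sales_per_traveller(sales, travellers):
    """Return z_j = x_j / y_j for each day (hypothetical helper)."""
    if len(sales) != len(travellers):
        raise ValueError("sales and travellers must have the same length")
    return [x / y for x, y in zip(sales, travellers)]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up illustrative numbers, NOT the real records.
x1 = [30, 45, 28, 50]          # daily sales before the price reduction
y1 = [1000, 1500, 900, 1600]   # daily travellers before
x2 = [33, 40, 52, 29]          # daily sales after the price reduction
y2 = [1100, 1300, 1700, 1000]  # daily travellers after

z1 = sales_per_traveller(x1, y1)
z2 = sales_per_traveller(x2, y2)
t_stat, df = welch_t(z1, z2)
print(t_stat, df)
```

The sketch treats the daily ratios as independent observations with equal weight, which is the assumption I am unsure about.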
Questions I have:
- Does it make sense to apply such a standardization to the data, considering that I want to compare the resulting "standardized" samples $(z_{1,1}, z_{1,2}, ..., z_{1,m})$ and $(z_{2,1}, z_{2,2}, ..., z_{2,n})$? Would a two-sample $t$-test be suitable for such data? (What worries me is that each day's observed sales value effectively gets a different weight $1/y_{i,j}$ when it is standardized, and I am not sure whether this violates an assumption of the two-sample $t$-test.)
- Could a different type of analysis be applied to the data to decide whether sales were influenced by the reduction of the price of the service?