I've always looked at it like this:
If we consider older, more limited file systems, the block/cluster size had largely to do with the maximum number of clusters the file system could address.
For example, a 12-bit FAT entry puts an upper limit on the number of clusters that can be addressed: 111111111111 is the largest 12-bit binary number, which is 4095 in decimal (0xFFF in hexadecimal).
A 16-bit entry raises that limit to 65,535 (0xFFFF) clusters.
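To make the arithmetic concrete, here's a minimal Python sketch. It deliberately ignores the handful of reserved FAT entry values, so the real limits are slightly lower than what it prints:

```python
# Rough sketch: the FAT entry width caps the number of addressable
# clusters, and together with the cluster size that caps volume size.
# (Reserved entry values are ignored for simplicity.)

def max_volume_bytes(entry_bits: int, cluster_bytes: int) -> int:
    """Upper bound on volume size for a given FAT entry width."""
    max_clusters = (1 << entry_bits) - 1  # 2^12 - 1 = 4095, 2^16 - 1 = 65535
    return max_clusters * cluster_bytes

for bits in (12, 16):
    total = max_volume_bytes(bits, 4096)
    print(f"{bits}-bit entries, 4 KiB clusters: ~{total / 2**20:.0f} MiB max")
# 12-bit entries, 4 KiB clusters: ~16 MiB max
# 16-bit entries, 4 KiB clusters: ~256 MiB max
```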
So one way of overcoming a disk-size limit imposed by the number of addressable clusters is to increase the number of bits used to address a cluster/block. Another way, however, is to increase the cluster or block size.
Increasing the number of clusters we can address increases overhead. Increasing the cluster/block size, on the other hand, increases waste: store a 1 KB file in a 4 KB cluster and we waste 3 KB. Or store a 13 KB file in 4 KB clusters and we again waste 3 KB, since we have to allocate 4 clusters (16 KB) to the file.
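The wasted space ("slack") is easy to compute; a small sketch reproducing the two examples above:

```python
import math

def slack_bytes(file_size: int, cluster_size: int) -> int:
    """Bytes wasted when file_size is stored in whole clusters."""
    clusters = math.ceil(file_size / cluster_size)  # clusters allocated
    return clusters * cluster_size - file_size      # allocated minus used

KB = 1024
print(slack_bytes(1 * KB, 4 * KB))   # 3072 -> 1 KB file wastes 3 KB
print(slack_bytes(13 * KB, 4 * KB))  # 3072 -> 4 clusters (16 KB) for 13 KB
```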
So it's a trade-off between overhead and the potentially wasted space that comes with large block/cluster sizes. For example, if we know in advance that the file system will mostly hold large files, we can opt for a large cluster size and benefit from the reduced overhead.
Pages act as a 'middle man' between the OS and storage, but unlike the block/cluster size, the page size can't be set by something like file system formatting; it's fixed by the hardware and OS. Efficiency requires common ground between the page size and the block/cluster size, and since the page size is the fixed value, it's the page size that determines the minimum practical block size.
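That fixedness is visible from user space: you can query the page size, but you can't choose it. A minimal Python sketch (note that os.sysconf is Unix-only):

```python
import mmap
import os

# The page size can only be queried, not chosen, by user space.
print(mmap.PAGESIZE)               # e.g. 4096 on typical x86-64 systems
print(os.sysconf("SC_PAGE_SIZE"))  # same value via POSIX sysconf (Unix-only)
```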