Let $ X $ be the $ q \times q $ shift matrix sending $ |y \rangle \mapsto |y+1 \rangle $, where the ket index $ y=0,\dots, q-1 $ is taken mod $ q $. Let $ Z $ be the diagonal $ q \times q $ clock matrix sending $ |y \rangle \mapsto (e^{2 \pi i /q})^y|y \rangle $. Then $ \mathrm{P}(1,\mathbb{Z}_q)=\langle X,Z \rangle $ is the modular Pauli group for one qudit of dimension $ q $ (traditionally a global phase is also added as a generator, but global phases will be irrelevant throughout this question). Similarly, the modular Pauli group on $ n $ qudits, each of dimension $ q $, is $$ \mathrm{P}(n,\mathbb{Z}_q):=\langle \{X_i,Z_i:i=1,\dots, n \} \rangle $$ where $ X_i $ is the tensor product with $ X $ acting on the $ i $th qudit and the identity on all the other qudits, and $ Z_i $ is defined in the same way from $ Z $. Then we can define the $ n $ qudit modular Clifford group as $$ \mathrm{Cl}(n,\mathbb{Z}_q):=\{g \in \mathrm{SU}(q^n): g h g^{-1} \in \mathrm{P}(n,\mathbb{Z}_q) \text{ for all } h \in \mathrm{P}(n,\mathbb{Z}_q) \} $$ In other words, it is the normalizer of the modular Pauli group in the special unitary group. In this definition I use $ \mathrm{SU}(q^n) $ instead of $ \mathrm{U}(q^n) $ so that the modular Clifford group will be finite (since the question is about unitary designs, which are by definition finite sets of unitary matrices). There is no loss of generality because $ \mathrm{U}(q^n)=e^{i \theta} \mathrm{SU}(q^n) $, that is, every unitary is a global phase times a special unitary, and global phase is irrelevant.
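To fix conventions, here is a minimal NumPy sketch of the generators above. It assumes the clock matrix acts as $ Z|y \rangle = \omega^y |y \rangle $ with $ \omega = e^{2 \pi i /q} $; the function names `shift_matrix`, `clock_matrix`, and `embed` are made up for illustration and do not come from any library.

```python
from functools import reduce

import numpy as np


def shift_matrix(q):
    """X|y> = |y+1 mod q>: entry (y+1 mod q, y) is 1."""
    return np.roll(np.eye(q, dtype=complex), 1, axis=0)


def clock_matrix(q):
    """Z|y> = omega^y |y> with omega = exp(2*pi*i/q) (assumed convention)."""
    omega = np.exp(2j * np.pi / q)
    return np.diag(omega ** np.arange(q))


def embed(op, i, n):
    """Tensor product with `op` on qudit i (1-indexed) and identity elsewhere."""
    q = op.shape[0]
    factors = [np.eye(q, dtype=complex)] * n
    factors[i - 1] = op
    return reduce(np.kron, factors)


q = 4
X, Z = shift_matrix(q), clock_matrix(q)
omega = np.exp(2j * np.pi / q)

# The defining commutation relation Z X = omega X Z, and X^q = Z^q = 1.
print(np.allclose(Z @ X, omega * X @ Z))
print(np.allclose(np.linalg.matrix_power(X, q), np.eye(q)))
print(np.allclose(np.linalg.matrix_power(Z, q), np.eye(q)))
```

With these conventions, `embed(X, i, n)` and `embed(Z, i, n)` play the role of $ X_i $ and $ Z_i $, and operators on different qudits commute.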
It is well known that $ \mathrm{Cl}(n,\mathbb{Z}_q) $ is a 2-design whenever $ q $ is prime. In fact, the qubit Clifford group $ \mathrm{Cl}(n,\mathbb{Z}_2) $ is a 3-design. It is even known that $ \mathrm{Cl}(n,\mathbb{Z}_2) $ is almost a 4-design, in the sense that its frame potential is one away from the minimum value that would make it a 4-design.
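As a sanity check on the qubit statements, here is a rough sketch that enumerates the single-qubit Clifford group up to global phase and evaluates its frame potential. It assumes the standard formula for a finite group $ G $, $ F_t(G) = \frac{1}{|G|} \sum_{g \in G} |\mathrm{tr}\, g|^{2t} $, and the usual generators $ H $ and $ S $. The helper names are hypothetical, and this covers only $ n=1 $, $ q=2 $, not the composite case the question is about.

```python
import numpy as np


def canonical_key(u, decimals=6):
    """Hashable key identifying a unitary up to global phase."""
    flat = u.ravel()
    pivot = flat[np.argmax(np.abs(flat) > 1e-9)]  # first nonzero entry
    normalized = flat * (abs(pivot) / pivot)      # rotate pivot to be real, positive
    return tuple(np.round(normalized, decimals).tolist())


def generate_projective_group(generators):
    """Breadth-first closure of the generators, one matrix per phase class."""
    identity = np.eye(generators[0].shape[0], dtype=complex)
    seen = {canonical_key(identity): identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for u in frontier:
            for g in generators:
                v = g @ u
                key = canonical_key(v)
                if key not in seen:
                    seen[key] = v
                    next_frontier.append(v)
        frontier = next_frontier
    return list(seen.values())


def frame_potential(group, t):
    """F_t = average of |tr g|^(2t); insensitive to global phases."""
    traces = np.array([np.trace(u) for u in group])
    return float(np.mean(np.abs(traces) ** (2 * t)))


H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.diag([1, 1j]).astype(complex)

clifford = generate_projective_group([H, S])
print(len(clifford))
for t in (1, 2, 3, 4):
    print(t, round(frame_potential(clifford, t), 6))
```

For one qubit this finds 24 elements modulo phase, with frame potentials 1, 2, 5, 15 for $ t=1,2,3,4 $. The Haar values in dimension 2 are 1, 2, 5, 14, so the group matches through $ t=3 $ and misses by one at $ t=4 $.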
But what happens when $ q $ is not prime?
In Theorem III.1 of "Clifford groups are not always 2-designs" it is claimed that $ \mathrm{Cl}(n,\mathbb{Z}_q) $ is a 2-design if and only if $ q $ is prime. In other words, $ \mathrm{Cl}(n,\mathbb{Z}_q) $ for $ q $ composite is never a 2-design. A similar definition seems to be used in "Generators for single qudit Clifford, d=4", where explicit generators of $ \mathrm{Cl}(1,\mathbb{Z}_4) $ show that $ \mathrm{Cl}(1,\mathbb{Z}_4) $ is not a 2-design.
However, in Section 12.2.1, "The Clifford group is a unitary 2-design", of the dissertation of Markus Heinrich it is claimed that $ \mathrm{Cl}(n,\mathbb{F}_q) $ is a 2-design for any prime power $ q $. This paper by Heinrich's advisor David Gross also seems to claim that $ \mathrm{Cl}(n,\mathbb{F}_q) $ is a 2-design for any prime power $ q $.
I assume this difference arises purely from the definitions. Can someone explain how the definitions differ? Why do the generators in "Generators for single qudit Clifford, d=4" seem to show that $ \mathrm{Cl}(1,\mathbb{Z}_4) $ is not a 2-design, while the proof in Heinrich/Gross claims that $ \mathrm{Cl}(1,\mathbb{F}_4) $ is a 2-design?
1/3/2023: This is probably terrible manners but I love the answer from Markus Heinrich so much that I took his new notation $ \mathrm{Cl}(n,\mathbb{Z}_q) $ and $ \mathrm{Cl}(n,\mathbb{F}_q) $ and I went back into the original question and applied it in all the appropriate places. Same thing with the use of the term modular that he suggests.