In 1996, Irving Good himself recalled:
One of the related problems close to philosophy is the estimation of the probability of one category of a multinomial when the order of the cells is irrelevant. [... This] led me on to the development of a hyperprior for the hyperparameter $k$, which I discussed at some length in my book The Estimation of Probabilities (Good, 1965) and in several later works.
As pointed out by AChem, Good does not use the term "hyperparameter" in his 1965 book. Instead, he speaks of a "flattening constant". In the 1980 paper "Some history of the hierarchical Bayesian methodology", Good explicitly claims to have invented the latter term, without claiming credit for the former: