Here is a simple and intuitive explanation.
Start with a binomial distribution (the data model):
$$\mathbb{P}[X=x]=\binom{n}{x}\theta^{x}(1-\theta)^{n-x}$$
where the support is $x=0,1,2,...,n$
This is a *discrete* random variable: $X$ is the variable, $n$ is known, and $\theta \in [0,1]$ is a parameter.
Now change the point of view and look at this expression as a (continuous) function of the variable $\theta$.
Since it is now a function of $\theta$, we can discard every multiplicative factor that does not depend on $\theta$, getting
$$f(\theta|x)=\theta^{x}(1-\theta)^{n-x}=\theta^{(x+1)-1}(1-\theta)^{(n-x+1)-1}$$
Now $\theta$ is the variable and $x$ is a parameter (the observed data), and we recognize the kernel of a Beta distribution:
$$f(\theta\mid x) \propto \text{Beta}(x+1,\; n-x+1)$$
From here it is easy to verify that the Beta family is exactly the conjugate prior of the binomial model: if the prior is $\text{Beta}(a,b)$, the posterior is proportional to $\theta^{a+x-1}(1-\theta)^{b+n-x-1}$, which is again a Beta distribution, namely $\text{Beta}(a+x,\; b+n-x)$.
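If you want to convince yourself numerically, here is a minimal sketch (the prior parameters and the data below are made-up numbers for illustration): it multiplies a Beta prior by the binomial likelihood, normalizes the product on a grid, and compares it to the claimed closed-form posterior $\text{Beta}(a+x,\, b+n-x)$.

```python
import numpy as np
from scipy.stats import beta, binom

# Illustrative choices (not from any particular data set)
a, b = 2.0, 5.0        # Beta prior parameters
n, x = 10, 7           # observed data: x successes in n trials

theta = np.linspace(0.001, 0.999, 500)

# Unnormalized posterior: prior * likelihood, normalized numerically
unnorm = beta.pdf(theta, a, b) * binom.pmf(x, n, theta)
unnorm /= np.trapz(unnorm, theta)

# Claimed closed-form posterior: Beta(a + x, b + n - x)
closed = beta.pdf(theta, a + x, b + n - x)

# The two curves should coincide up to numerical error
print(np.max(np.abs(unnorm - closed)))
```

The printed maximum absolute difference is essentially zero, confirming the conjugacy on this example.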
**Important observation:** to identify the conjugate prior family for a statistical model, there is a very useful factorization theorem.
Let's suppose that the model can be written in the following way:
$$p(\mathbf{x}\mid\theta)=g[t(\mathbf{x}),n,\theta]\cdot\psi(\mathbf{x})$$
for all $\mathbf{x},\theta$, and assuming that $g$ is integrable in $\theta$ over all of $\Theta$.
Then the family
$$\pi(\theta)\propto g(s,m,\theta)$$
(where the hyperparameters $s$ and $m$ play the roles of $t(\mathbf{x})$ and $n$) is the conjugate prior.
Applying this theorem to the binomial model, you immediately identify
$$g[t(\mathbf{x}),n,\theta]=\theta^{t}(1-\theta)^{n-t},\qquad t(\mathbf{x})=x,\qquad \psi(\mathbf{x})=\binom{n}{x}$$
thus the conjugate prior must be of the form
$$\theta^{a}(1-\theta)^{b}$$
which is obviously the kernel of a Beta distribution (to make it a density you have to multiply it by the normalization constant, of course, but that is not a problem, as the Beta distribution is a known density).
Here you can find a very useful table of the most common models with their priors, posteriors, parameters, and so on.