The general idea behind the measure-theoretic formulation of probability is that assigning probabilities to events is in many ways similar to assigning a measure to subsets of a measurable space.
This similarity is especially striking under the "sample space/event space" formalism:
Given an experiment $E$ whose outcome is random or uncertain,
one defines
- the sample space $\Omega$ as a set that contains every possible outcome of the experiment $E$;
- the event space $\mathscr F$ as a collection of subsets $A\subset\Omega$ (each of which is called an event).
One then defines a function $\Pr:\mathscr F\to[0,1]$ that assigns to every event $A$ its probability $\Pr(A)$.
If you think about the properties that $\Pr$ should satisfy (for example,
$\Pr(\Omega)=1$; or,
when $A$ and $B$ cannot both occur simultaneously,
the probability of $[A$ or $B]$ is $\Pr(A)+\Pr(B)$),
you will notice that they closely match the axioms in the definition of a measure.
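For comparison, here is a sketch of the standard definition, under the assumption (not stated above) that $\mathscr F$ is a $\sigma$-algebra on $\Omega$. A measure is a function $\mu:\mathscr F\to[0,\infty]$ such that

$$\mu(\varnothing)=0\qquad\text{and}\qquad\mu\Big(\bigcup_{n=1}^\infty A_n\Big)=\sum_{n=1}^\infty\mu(A_n)\quad\text{for pairwise disjoint }A_1,A_2,\ldots\in\mathscr F,$$

and a probability measure is a measure $\Pr$ with $\Pr(\Omega)=1$. As an illustrative example (mine, not part of the formalism itself), take a fair six-sided die: $\Omega=\{1,\ldots,6\}$, $\mathscr F$ the collection of all subsets of $\Omega$, and $\Pr(A)=|A|/6$. The events $\{1,2\}$ and $\{5\}$ cannot both occur, and indeed

$$\Pr(\{1,2\}\cup\{5\})=\tfrac36=\tfrac26+\tfrac16=\Pr(\{1,2\})+\Pr(\{5\}).$$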
As for the specific reasons why the mathematical objects studied in measure theory are defined the way they are,
you can look at the math.stackexchange question "What is the Definition of a Measurable Set?". My answer to that question also includes a reference for further reading.
Also,
if you have lots of time on your hands and you are interested in the history of probability and the role of measure theory in this history,
you can read "The Sources of Kolmogorov’s Grundbegriffe" by Shafer and Vovk.