Comparing BPSK and DPSK, I noticed that BPSK has a lower BER. Why is this?
You can think of DPSK as a binary data stream that's been run through a differential encoder and then modulated with BPSK. I.e., if your original stream is a sequence \$x[k]\$, then the transmitted differential stream is \$x_d[k] = x[k] \oplus x_d[k-1]\$: each transmitted bit is the previous transmitted bit, flipped whenever the data bit is 1.
To receive DPSK, you demodulate it like BPSK, then take the first backward difference (which, because the arithmetic is modulo 2, is the same as the sum of adjacent bits): for a demodulator output stream \$\hat x_d\$, you compute \$\hat x[k] = \hat x_d[k] \oplus \hat x_d[k - 1]\$.
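As a minimal sketch of that encode/decode pair, here it is in Python on plain lists of 0/1 bits. The function names and the leading reference bit (`ref`) are my own assumptions for illustration; a real modem would do this on symbols, not Python lists.

```python
def diff_encode(bits, ref=0):
    """x_d[k] = x[k] XOR x_d[k-1], preceded by one reference bit."""
    out = [ref]
    for b in bits:
        out.append(b ^ out[-1])
    return out


def diff_decode(received):
    """x_hat[k] = x_hat_d[k] XOR x_hat_d[k-1]; yields one bit fewer than received."""
    return [cur ^ prev for cur, prev in zip(received[1:], received[:-1])]


data = [1, 0, 1, 1, 0, 0, 1]
tx = diff_encode(data)   # reference bit followed by the encoded stream
rx = diff_decode(tx)     # with no channel errors this recovers the data
print(tx, rx)
```

With no errors on the channel, `rx` comes back equal to `data`.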
Each demodulated bit feeds two consecutive decoded bits, so a single bit error in your "raw" stream results in two bit errors in your final decoded stream. So -- at low error rates, roughly twice the bit error rate of plain BPSK.
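To see the doubling concretely, here is a small sketch that flips one bit of a made-up demodulator output and counts how many decoded bits change. The bit pattern and the error position are arbitrary assumptions.

```python
def diff_decode(received):
    """x_hat[k] = x_hat_d[k] XOR x_hat_d[k-1]."""
    return [cur ^ prev for cur, prev in zip(received[1:], received[:-1])]


received = [0, 1, 1, 0, 1, 1, 1, 0]   # hypothetical error-free demodulator output
corrupted = list(received)
corrupted[3] ^= 1                      # a single raw bit error

clean = diff_decode(received)
noisy = diff_decode(corrupted)

# Positions where the decoded streams disagree
error_positions = [k for k, (a, b) in enumerate(zip(clean, noisy)) if a != b]
print(error_positions)
```

The one bad raw bit shows up in both decoded bits that depend on it. Two raw errors right next to each other partially cancel, which is why the factor of two is only approximate.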
Are there other advantages of using one over the other?
Yes. Often when you're sending PSK, you don't have an absolute phase reference. You have to generate your own phase reference from the incoming signal (see carrier recovery). This phase reference naturally has a 180 degree ambiguity. DPSK takes care of that -- it only pays attention to the phase changes, so the ambiguity drops out.
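A quick way to convince yourself: a 180 degree error in the phase reference inverts every demodulated bit, and the differential decoder doesn't care. This is only a sketch on a made-up bit pattern.

```python
def diff_decode(received):
    """x_hat[k] = x_hat_d[k] XOR x_hat_d[k-1]."""
    return [cur ^ prev for cur, prev in zip(received[1:], received[:-1])]


received = [0, 1, 1, 0, 1, 1, 1, 0]    # hypothetical demodulator output
inverted = [b ^ 1 for b in received]   # same signal, reference locked 180 degrees off

# XOR of two inverted bits equals XOR of the original bits
same_output = diff_decode(received) == diff_decode(inverted)
print(same_output)
```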
There have been schemes proposed to do something along the lines of "Well, I'm using forward error correction anyway, so I'll just feed the corrector the true bit stream and an inverted bit stream. Then whichever one has fewer detected errors -- wins!"
The problem with that is that if you have enough noise in your signal that you need forward error correction, you probably have enough noise that the carrier recovery circuit will occasionally slip by 180 degrees. With plain old DPSK, such a slip causes a single bit error (not even two!). However, with the BPSK -> smart error correction scheme, you get a block of erroneous bits that's up to twice the block size of a block error correction code, or the coherence length of a convolutional code. Since good codes tend to be really long, what would have been just one little error ends up being a whole bunch of useless bits.
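Here is a sketch of the phase-slip case, modelling the slip as "every demodulated bit from some index onward gets inverted". The bit pattern and the slip index are assumptions, and the FEC side of the comparison is not modelled at all; this only shows what lands on the decoder's input.

```python
def diff_decode(received):
    """x_hat[k] = x_hat_d[k] XOR x_hat_d[k-1]."""
    return [cur ^ prev for cur, prev in zip(received[1:], received[:-1])]


received = [0, 1, 1, 0, 1, 1, 1, 0]   # hypothetical error-free demodulator output
slip_at = 4                            # carrier recovery slips 180 degrees here
slipped = received[:slip_at] + [b ^ 1 for b in received[slip_at:]]

# Plain BPSK: every bit after the slip is wrong
bpsk_errors = [k for k, (a, b) in enumerate(zip(received, slipped)) if a != b]

# DPSK: only the decoded bit that straddles the slip is wrong
clean = diff_decode(received)
noisy = diff_decode(slipped)
dpsk_errors = [k for k, (a, b) in enumerate(zip(clean, noisy)) if a != b]
print(bpsk_errors, dpsk_errors)
```

Plain BPSK stays inverted until something notices and corrects the polarity; the differentially decoded stream has one bad bit at the slip and then carries on.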