I am looking for a fast way to find the modular multiplicate inverse of an integer $a$ in mod $p$. I am mainly interested in $p=115792089210356248762697446949407573530086143415290314195533631308867097853951$, which is the prime used for the finite field over which the elliptic curve P-256 is defined by the NIST (https://www.nsa.gov/ia/_files/nist-routines.pdf).
Take for example $a=23212057121572626849771762094316348576132063248523924165380765792316089469237$. What is the fastest way to calculate $a^{-1}$ mod $p$? I am currently using the extended Euclidean algorithm, but it is quite slow. I need the speed, since it's an operation that I will need to use a lot for encryption.
A file of 1664 bytes for example, requires about 20000 uses of the extended Euclidean, which takes about 10 seconds on my computer, which is quite long to encrypt 1664 bytes.
The algorithm I'm using is the following (taken from "Guide to Elliptic Curve Cryptography":