Can someone intuitively explain this "Information Filter" formulation, quoted from Wikipedia? In particular, I struggle to understand why $\mathbf{I}_k = \mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{H}_k$ and $\mathbf{i}_k = \mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{z}_k$ become something we can directly add as "information" to the $\mathbf{Y}$ matrix.
In cases where the dimension of the observation vector $\mathbf{z}$ is larger than the dimension of the state vector $\mathbf{x}$, the information filter can avoid the inversion of a larger matrix in the Kalman gain calculation at the price of inverting a smaller matrix in the prediction step, thus saving computing time. In the information filter, or inverse covariance filter, the estimated covariance and estimated state are replaced by the information matrix and the information vector respectively. These are defined as:

$$\begin{align} \mathbf{Y}_{k \mid k} &= \mathbf{P}_{k \mid k}^{-1} \\ \hat{\mathbf{y}}_{k \mid k} &= \mathbf{P}_{k \mid k}^{-1}\hat{\mathbf{x}}_{k \mid k} \end{align}$$
Similarly, the predicted covariance and state have equivalent information forms, defined as:

$$\begin{align} \mathbf{Y}_{k \mid k-1} &= \mathbf{P}_{k \mid k-1}^{-1} \\ \hat{\mathbf{y}}_{k \mid k-1} &= \mathbf{P}_{k \mid k-1}^{-1}\hat{\mathbf{x}}_{k \mid k-1} \end{align}$$
and the measurement information matrix and measurement information vector, which are defined as:

$$\begin{align} \mathbf{I}_k &= \mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{H}_k \\ \mathbf{i}_k &= \mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{z}_k \end{align}$$
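My own partial attempt at the intuition, in case it helps frame the question: writing out the measurement likelihood as a function of $\mathbf{x}$ seems to show exactly these two quantities appearing as the "natural" (information-form) parameters of a Gaussian:

$$\begin{align} -2\log p(\mathbf{z}_k \mid \mathbf{x}) &= (\mathbf{z}_k - \mathbf{H}_k\mathbf{x})^\textsf{T} \mathbf{R}_k^{-1} (\mathbf{z}_k - \mathbf{H}_k\mathbf{x}) + \text{const} \\ &= \mathbf{x}^\textsf{T} \underbrace{\left(\mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{H}_k\right)}_{\mathbf{I}_k} \mathbf{x} - 2\,\mathbf{x}^\textsf{T} \underbrace{\mathbf{H}_k^\textsf{T} \mathbf{R}_k^{-1} \mathbf{z}_k}_{\mathbf{i}_k} + \text{const} \end{align}$$

Since the posterior is the product of the prior and the likelihood, the log-densities (and hence their quadratic and linear coefficients in $\mathbf{x}$) add, which would explain why $\mathbf{I}_k$ and $\mathbf{i}_k$ simply sum onto $\mathbf{Y}_{k \mid k-1}$ and $\hat{\mathbf{y}}_{k \mid k-1}$. But I'd appreciate confirmation or a cleaner way to see it.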
The information update now becomes a trivial sum:

$$\begin{align} \mathbf{Y}_{k \mid k} &= \mathbf{Y}_{k \mid k-1} + \mathbf{I}_k \\ \hat{\mathbf{y}}_{k \mid k} &= \hat{\mathbf{y}}_{k \mid k-1} + \mathbf{i}_k \end{align}$$
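To convince myself the two forms agree, I wrote a small NumPy sketch (all matrices and the measurement below are made-up toy values, not from any real system) comparing the standard Kalman update against the information-form sum above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: 2-D state, 3-D measurement (arbitrary assumed values)
H = rng.normal(size=(3, 2))                    # observation model H_k
R = np.diag([0.5, 1.0, 2.0])                   # measurement noise covariance R_k
P_prior = np.array([[2.0, 0.3], [0.3, 1.0]])   # predicted covariance P_{k|k-1}
x_prior = np.array([1.0, -2.0])                # predicted state x_{k|k-1}
z = rng.normal(size=3)                         # a measurement z_k

# --- standard Kalman update (inverts the 3x3 innovation covariance S) ---
S = H @ P_prior @ H.T + R
K = P_prior @ H.T @ np.linalg.inv(S)
x_post = x_prior + K @ (z - H @ x_prior)
P_post = (np.eye(2) - K @ H) @ P_prior

# --- information-form update: just add I_k and i_k ---
Y_prior = np.linalg.inv(P_prior)               # Y_{k|k-1} = P_{k|k-1}^{-1}
y_prior = Y_prior @ x_prior                    # y_{k|k-1}
I_k = H.T @ np.linalg.inv(R) @ H               # measurement information matrix
i_k = H.T @ np.linalg.inv(R) @ z               # measurement information vector
Y_post = Y_prior + I_k
y_post = y_prior + i_k

# Converting back to covariance form recovers the Kalman result
print(np.allclose(np.linalg.inv(Y_post), P_post))            # True
print(np.allclose(np.linalg.inv(Y_post) @ y_post, x_post))   # True
```

Both checks print `True` for me, so the additive update does reproduce the usual Kalman update; I'm asking about the intuition for *why*, not whether it holds.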