Derivation of the time-domain Wiener filter
Derivation of the Wiener filter using linear algebra
The Wiener filter is a linear minimum mean-square error (LMMSE) estimator, so we seek a linear combination of the sequence $x(n)$ that minimizes the MSE. Let us define the mean-square error as
$$\mathrm{MSE} \triangleq \mathrm{E}\left[\left(s(n) - \hat{s}(n)\right)^2\right]$$
where $s(n)-\hat{s}(n)$ is the error signal between the desired signal $s(n)$ and the estimated one $\hat{s}(n)$. What is particular about the Wiener filter is that $\hat{s}(n)$ is a linear combination of the noisy signal $\mathbf{x}(n)$, such that $\hat{s}(n) = \mathbf{w}^T\mathbf{x}(n)$ (hence the name LMMSE). Therefore, the MSE for the Wiener filter is
$$\mathrm{MSE} = \mathrm{E}\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right].$$
As we are interested in obtaining $\mathbf{w}$, we minimize the MSE and take the argument at the minimum, hence
$$\mathbf{w}^* = \operatorname*{arg\,min}_{\mathbf{w}} \mathrm{E}\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right].$$
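Before deriving the closed-form solution, it can help to see the objective numerically. Below is a minimal sketch (Python/NumPy) that evaluates the empirical MSE for a few choices of $\mathbf{w}$; the AR(1)-plus-noise model and all variable names are illustrative assumptions, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (illustrative only): the desired signal s(n) is an AR(1) process,
# observed in unit-variance white noise as x(n) = s(n) + v(n).
N, M = 10_000, 4                        # number of samples, filter length
s = np.zeros(N)
for n in range(1, N):
    s[n] = 0.9 * s[n - 1] + rng.standard_normal()
x = s + rng.standard_normal(N)

def mse(w, x, s, M):
    """Empirical MSE of s_hat(n) = w^T x(n) with
    x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T."""
    X = np.stack([x[M - 1 - i : len(x) - i] for i in range(M)])
    return np.mean((s[M - 1 :] - w @ X) ** 2)

print(mse(np.zeros(M), x, s, M))        # estimate s(n) by 0: MSE = power of s
print(mse(np.eye(M)[0], x, s, M))       # estimate s(n) by x(n): MSE = noise power
```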
Obtaining the optimum solution $\mathbf{w}^*$ is fairly straightforward given some fundamental knowledge of multivariate calculus. We essentially only need to take the derivative with respect to $\mathbf{w}$, set it equal to zero and solve for $\mathbf{w}$ - just like the way you would usually solve an optimization problem. Of course, in the multivariate case, i.e. where our solution $\mathbf{w}$ is a vector, we compute the gradient instead of taking the derivative with respect to a single coefficient of $\mathbf{w}$. The gradient of the MSE is
$$\nabla_{\mathbf{w}} \mathrm{MSE} = \nabla_{\mathbf{w}} \mathrm{E}\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right] = \mathrm{E}\left[\nabla_{\mathbf{w}} \left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right].$$
It is important to note that the expectation operator $\mathrm{E}[\cdot]$ is linear, which means that we may move the gradient operator (which is also linear) inside the expectation operator. Taking the gradient of $\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2$ is straightforward as we can simply use the chain rule. Using the chain rule and setting the gradient equal to zero, we get
$$\nabla_{\mathbf{w}} \mathrm{MSE} = \mathrm{E}\left[-2\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)\mathbf{x}(n)\right] = \mathbf{0}.$$
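If you want to convince yourself that this chain-rule step is right, you can compare the expression above (with the expectation replaced by a sample mean) against a finite-difference approximation of the gradient. A small sketch on randomly generated stand-in data:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 3, 100_000

# Stand-in data (illustrative): columns of X are realizations of x(n).
X = rng.standard_normal((M, N))
s = X[0] + 0.1 * rng.standard_normal(N)
w = rng.standard_normal(M)

def mse(w_):
    return np.mean((s - w_ @ X) ** 2)

# Chain-rule expression with E[.] replaced by a sample mean:
# gradient = mean over n of -2 (s(n) - w^T x(n)) x(n)
e = s - w @ X
grad_chain_rule = (-2 * X * e).mean(axis=1)

# Central finite differences on the empirical MSE, one coefficient at a time
eps = 1e-6
grad_fd = np.empty(M)
for j in range(M):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[j] += eps
    w_minus[j] -= eps
    grad_fd[j] = (mse(w_plus) - mse(w_minus)) / (2 * eps)

print(np.max(np.abs(grad_chain_rule - grad_fd)))  # small: the two agree
```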
We multiply into the parentheses and expand to
$$\nabla_{\mathbf{w}} \mathrm{MSE} = \mathrm{E}\left[-2 s(n)\mathbf{x}(n) + 2\,\mathbf{x}(n)\mathbf{x}^T(n)\mathbf{w}\right] = \mathbf{0},$$
where we note that $\mathbf{x}(n)\mathbf{x}^T(n)\mathbf{w} = \left(\mathbf{w}^T\mathbf{x}(n)\right)\mathbf{x}(n)$. Dividing by 2 and using the linearity of the expectation, this becomes

$$-\mathrm{E}\left[s(n)\mathbf{x}(n)\right] + \mathrm{E}\left[\mathbf{x}(n)\mathbf{x}^T(n)\right]\mathbf{w} = \mathbf{0},$$
and here it is also important to note that $\mathbf{w}$ is not a random vector, which means that we can move $\mathbf{w}$ outside the expectation operator. We now have $\mathrm{E}\left[s(n)\mathbf{x}(n)\right]$, which is the cross-correlation vector between $s(n)$ and $\mathbf{x}(n)$, and $\mathrm{E}\left[\mathbf{x}(n)\mathbf{x}^T(n)\right]$, which is the autocorrelation matrix. Let $\mathbf{R}_{xx} = \mathrm{E}\left[\mathbf{x}(n)\mathbf{x}^T(n)\right]$ and $\mathbf{r}_{xs} = \mathrm{E}\left[s(n)\mathbf{x}(n)\right]$; then solving for $\mathbf{w}$ gives
$$\mathbf{w} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xs}.$$
Therefore,

$$\operatorname*{arg\,min}_{\mathbf{w}} \mathrm{E}\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right] = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xs}.$$
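In practice, $\mathbf{R}_{xx}$ and $\mathbf{r}_{xs}$ are unknown and are replaced by sample estimates. A minimal sketch, reusing the illustrative AR(1)-plus-noise model from above:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 50_000, 8

# Same illustrative model: AR(1) desired signal in unit-variance white noise.
s = np.zeros(N)
for n in range(1, N):
    s[n] = 0.9 * s[n - 1] + rng.standard_normal()
x = s + rng.standard_normal(N)

# Column n of X is x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T
X = np.stack([x[M - 1 - i : N - i] for i in range(M)])
s_trim = s[M - 1 :]

R_xx = X @ X.T / X.shape[1]      # sample estimate of E[x(n) x^T(n)]
r_xs = X @ s_trim / X.shape[1]   # sample estimate of E[s(n) x(n)]
w = np.linalg.solve(R_xx, r_xs)  # w = R_xx^{-1} r_xs

s_hat = w @ X
print("MSE without filtering:  ", np.mean((s_trim - X[0]) ** 2))
print("MSE with Wiener filter: ", np.mean((s_trim - s_hat) ** 2))
```

Note that solving the linear system $\mathbf{R}_{xx}\mathbf{w} = \mathbf{r}_{xs}$ with `np.linalg.solve` avoids forming $\mathbf{R}_{xx}^{-1}$ explicitly, which is both cheaper and numerically safer than `np.linalg.inv`.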
Derivation of the Wiener filter without linear algebra
We recommend getting comfortable with multivariate calculus and linear algebra, as that is by far more convenient than deriving the solution to the Wiener filter without linear algebra. And as you will see, you will most likely end up using linear algebra to get to the final solution anyway.
Nonetheless, the optimization problem we are trying to solve in this case is
$$w_0^*, \ldots, w_{M-1}^* = \operatorname*{arg\,min}_{w_0, \ldots, w_{M-1}} \mathrm{E}\left[\left(s(n) - \sum_{i=0}^{M-1} w_i x(n-i)\right)^2\right]$$
To solve this optimization problem, we take the partial derivative with respect to the Wiener filter coefficients. That is
$$\frac{\partial}{\partial w_j}\mathrm{MSE} = \mathrm{E}\left[\frac{\partial}{\partial w_j}\left(s(n) - \sum_{i=0}^{M-1} w_i x(n-i)\right)^2\right].$$
Again we use the chain rule such that
$$\frac{\partial}{\partial w_j}\mathrm{MSE} = \mathrm{E}\left[-2\left(s(n) - \sum_{i=0}^{M-1} w_i x(n-i)\right)x(n-j)\right] = 0,$$

which, after dividing by 2 and expanding, gives

$$0 = \mathrm{E}\left[-s(n)x(n-j) + \sum_{i=0}^{M-1} w_i\, x(n-i)x(n-j)\right] = -\mathrm{E}\left[s(n)x(n-j)\right] + \sum_{i=0}^{M-1} w_i\, \mathrm{E}\left[x(n-i)x(n-j)\right],$$
where $r_{s x_j} = \mathrm{E}\left[s(n)x(n-j)\right]$ is the cross-correlation between $s(n)$ and $x(n-j)$, and $r_{x_i x_j} = \mathrm{E}\left[x(n-i)x(n-j)\right]$ is the autocorrelation between $x(n-i)$ and $x(n-j)$. We now have
$$\sum_{i=0}^{M-1} w_i\, r_{x_i x_j} = r_{s x_j},$$
which is
$$w_0 r_{x_0 x_j} + \ldots + w_j r_{x_j x_j} + \ldots + w_{M-1} r_{x_{M-1} x_j} = r_{s x_j},$$
and as we can see, the solution for $w_j$ also depends on all the other coefficients of the Wiener filter (no surprise here). However, if we take the partial derivative with respect to all $M$ coefficients, we end up with $M$ linear equations in $M$ unknowns, i.e.
$$\frac{\partial}{\partial w_0}\mathrm{MSE} = 0 \;\Rightarrow\; w_0 r_{x_0 x_0} + \ldots + w_{M-1} r_{x_{M-1} x_0} = r_{s x_0}$$

$$\vdots$$

$$\frac{\partial}{\partial w_{M-1}}\mathrm{MSE} = 0 \;\Rightarrow\; w_0 r_{x_0 x_{M-1}} + \ldots + w_{M-1} r_{x_{M-1} x_{M-1}} = r_{s x_{M-1}}.$$
In this form, linear algebra comes in very handy, as we may put the equations into matrix-vector form, making the system easy to solve.
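To make that concrete, here is a sketch that assembles the $M$ equations coefficient by coefficient, exactly as derived above, and hands the result to a linear solver; with the same toy data it solves the same system as the matrix-vector form $\mathbf{R}_{xx}\mathbf{w} = \mathbf{r}_{xs}$.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 50_000, 4

# Same illustrative AR(1)-plus-noise data as before.
s = np.zeros(N)
for n in range(1, N):
    s[n] = 0.9 * s[n - 1] + rng.standard_normal()
x = s + rng.standard_normal(N)

# Assemble the M normal equations coefficient by coefficient:
#   sum_i w_i r_{x_i x_j} = r_{s x_j},  for j = 0, ..., M-1
A = np.empty((M, M))   # A[j, i] = sample estimate of E[x(n-i) x(n-j)]
b = np.empty(M)        # b[j]    = sample estimate of E[s(n) x(n-j)]
for j in range(M):
    xj = x[M - 1 - j : N - j]          # the delayed sequence x(n-j)
    b[j] = np.mean(s[M - 1 :] * xj)
    for i in range(M):
        A[j, i] = np.mean(x[M - 1 - i : N - i] * xj)

w = np.linalg.solve(A, b)  # the same system as R_xx w = r_xs above
print(w)
```

For a stationary $x(n)$, $r_{x_i x_j}$ depends only on the lag difference $i-j$, so the system matrix is Toeplitz, and specialized solvers such as Levinson-Durbin can be used instead of a general one.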