Derivation of the time domain Wiener Filter

Derivation of the Wiener filter using linear algebra

The Wiener filter is a linear minimum mean-square-error (LMMSE) estimator; thus, we seek a linear combination of the sequence $x(n)$ that minimizes the mean-square error (MSE). Let us define the MSE as

$$\mathrm{MSE} \triangleq E\left[\left(s(n) - \hat{s}(n)\right)^2\right]$$

where $s(n) - \hat{s}(n)$ is the error signal between the desired signal $s(n)$ and the estimated one, $\hat{s}(n)$. What is particular to the Wiener filter is that $\hat{s}(n)$ is a linear combination of the noisy signal $x(n)$ such that $\hat{s}(n) = \mathbf{w}^T\mathbf{x}(n)$ (hence the name LMMSE), where $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-M+1)]^T$ contains the $M$ most recent samples of $x(n)$ and $\mathbf{w}$ holds the $M$ filter coefficients. Therefore, the MSE for the Wiener filter is

$$\mathrm{MSE} = E\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right].$$

As we are interested in obtaining $\mathbf{w}$, we minimize the MSE and take the argument at the minimum, hence

$$\mathbf{w} = \underset{\mathbf{w}}{\arg\min}\; E\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right].$$
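To make the objective concrete, here is a minimal NumPy sketch that evaluates the empirical MSE of the estimate $\hat{s}(n) = \mathbf{w}^T\mathbf{x}(n)$ for a candidate filter; the function name `empirical_mse` and its arguments are just for illustration and are not part of the derivation.

```python
import numpy as np

def empirical_mse(w, x, s):
    """Empirical MSE of the estimate s_hat(n) = w^T x(n), where
    x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T holds the M most recent samples."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    M = len(w)
    sq_errors = []
    for n in range(M - 1, len(x)):
        x_vec = x[n - M + 1:n + 1][::-1]   # [x(n), x(n-1), ..., x(n-M+1)]
        s_hat = w @ x_vec                  # linear estimate s_hat(n) = w^T x(n)
        sq_errors.append((s[n] - s_hat) ** 2)
    return np.mean(sq_errors)
```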

Obtaining the optimum solution $\mathbf{w}$ is fairly straightforward, given some fundamental knowledge of multivariate calculus. We essentially only need to take the derivative with respect to $\mathbf{w}$, set it equal to zero, and solve for $\mathbf{w}$, just like the way you would usually solve an optimization problem. Of course, in the multivariate case, i.e., where our solution $\mathbf{w}$ is a vector, we compute the gradient instead of taking the derivative with respect to each coefficient of $\mathbf{w}$ separately. The gradient of the MSE is

$$\nabla_{\mathbf{w}}\,\mathrm{MSE} = \nabla_{\mathbf{w}}\, E\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right] = E\left[\nabla_{\mathbf{w}}\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right]$$

It is important to note that the expectation operator $E[\cdot]$ is linear, which means that we may move the gradient operator (which is also linear) inside the expectation operator. Taking the gradient of $(s(n) - \mathbf{w}^T\mathbf{x}(n))^2$ is straightforward, as we can simply use the chain rule. Using the chain rule, the gradient of the MSE is

$$\nabla_{\mathbf{w}}\,\mathrm{MSE} = E\left[-2\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)\mathbf{x}(n)\right] = \mathbf{0}.$$

We multiply $\mathbf{x}(n)$ into the parenthesis and expand to

$$\nabla_{\mathbf{w}}\,\mathrm{MSE} = E\left[-2\,s(n)\,\mathbf{x}(n) + 2\,\mathbf{x}(n)\mathbf{x}^T(n)\,\mathbf{w}\right] = \mathbf{0},$$

where we have used that $\left(\mathbf{w}^T\mathbf{x}(n)\right)\mathbf{x}(n) = \mathbf{x}(n)\mathbf{x}^T(n)\,\mathbf{w}$. Dividing by $2$ and splitting the expectation into its two terms yields

$$-E\left[s(n)\,\mathbf{x}(n)\right] + E\left[\mathbf{x}(n)\mathbf{x}^T(n)\right]\mathbf{w} = \mathbf{0},$$

and here it is also important to note that $\mathbf{w}$ is not a random vector, which means that we can move it outside the expectation operator. We now have $E[s(n)\,\mathbf{x}(n)]$, which is the cross-correlation vector between $s(n)$ and $\mathbf{x}(n)$, and $E[\mathbf{x}(n)\mathbf{x}^T(n)]$, which is the autocorrelation matrix of $\mathbf{x}(n)$. Let $\mathbf{R}_{xx} = E[\mathbf{x}(n)\mathbf{x}^T(n)]$ and $\mathbf{r}_{xs} = E[s(n)\,\mathbf{x}(n)]$; solving for $\mathbf{w}$ then gives

$$\mathbf{w} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{xs}.$$

Therefore

$$\underset{\mathbf{w}}{\arg\min}\; E\left[\left(s(n) - \mathbf{w}^T\mathbf{x}(n)\right)^2\right] = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{xs}.$$
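As a sanity check on this result, here is a small NumPy sketch that estimates $\mathbf{R}_{xx}$ and $\mathbf{r}_{xs}$ by sample averages and then solves for $\mathbf{w}$. The signals and all numerical values are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up example: a sinusoidal desired signal observed in white noise.
N, M = 10_000, 8                                   # number of samples, filter length
s = np.sin(2 * np.pi * 0.05 * np.arange(N))        # desired signal s(n)
x = s + 0.5 * rng.standard_normal(N)               # noisy observation x(n)

# Data matrix whose rows are x(n) = [x(n), x(n-1), ..., x(n-M+1)]^T for n >= M-1.
X = np.column_stack([np.roll(x, i) for i in range(M)])[M - 1:]
s_valid = s[M - 1:]

# Sample estimates of R_xx = E[x(n) x(n)^T] and r_xs = E[s(n) x(n)].
R_xx = X.T @ X / len(X)
r_xs = X.T @ s_valid / len(X)

# Wiener solution w = R_xx^{-1} r_xs.
w = np.linalg.solve(R_xx, r_xs)

# At the optimum the gradient -2 E[s(n) x(n)] + 2 E[x(n) x(n)^T] w vanishes.
print(np.max(np.abs(-2 * r_xs + 2 * R_xx @ w)))    # ~0 up to numerical precision
```

Note that the linear system is solved with `np.linalg.solve` rather than by explicitly forming $\mathbf{R}_{xx}^{-1}$, which is the numerically preferable way of evaluating $\mathbf{R}_{xx}^{-1}\mathbf{r}_{xs}$.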

Derivation of the Wiener filter without linear algebra

We recommend getting comfortable with multivariate calculus and linear algebra, as it is by far more convenient than deriving the solution to the Wiener filter without linear algebra. And, as you will see, you will most likely end up using linear algebra to get to the final solution anyway.

Nonetheless, the optimization problem we are trying to solve in this case is

$$w_0, \ldots, w_{M-1} = \underset{w_0, \ldots, w_{M-1}}{\arg\min}\; E\left[\left(s(n) - \sum_{i=0}^{M-1} w_i\, x(n-i)\right)^2\right]$$

To solve this optimization problem, we take the partial derivative with respect to each of the Wiener filter coefficients $w_j$. That is,

$$\frac{\partial\,\mathrm{MSE}}{\partial w_j} = E\left[\frac{\partial}{\partial w_j}\left(s(n) - \sum_{i=0}^{M-1} w_i\, x(n-i)\right)^2\right]$$

Again we use the chain rule such that

$$\frac{\partial\,\mathrm{MSE}}{\partial w_j} = E\left[-2\left(s(n) - \sum_{i=0}^{M-1} w_i\, x(n-i)\right)x(n-j)\right] = 0$$

$$= E\left[-2\,s(n)\,x(n-j) + 2\sum_{i=0}^{M-1} w_i\, x(n-i)\,x(n-j)\right]$$

$$= -2\,E\left[s(n)\,x(n-j)\right] + 2\sum_{i=0}^{M-1} w_i\, E\left[x(n-i)\,x(n-j)\right]$$

where $r_{s x_j} = E[s(n)\,x(n-j)]$ is the cross-correlation between $s(n)$ and $x(n-j)$, and $r_{x_i x_j} = E[x(n-i)\,x(n-j)]$ is the autocorrelation between $x(n-i)$ and $x(n-j)$. Dividing by $2$ and rearranging, we now have

$$\sum_{i=0}^{M-1} w_i\, r_{x_i x_j} = r_{s x_j}$$

which is

$$w_0\, r_{x_0 x_j} + \cdots + w_j\, r_{x_j x_j} + \cdots + w_{M-1}\, r_{x_{M-1} x_j} = r_{s x_j}$$

and, as we can see, the solution for $w_j$ also depends on all the other coefficients of the Wiener filter (no surprise here). However, if we take the partial derivative with respect to all $M$ coefficients, we end up with $M$ linear equations in $M$ unknowns, i.e.,

$$\frac{\partial\,\mathrm{MSE}}{\partial w_0} = 0 \;\Rightarrow\; w_0\, r_{x_0 x_0} + \cdots + w_{M-1}\, r_{x_{M-1} x_0} = r_{s x_0}$$

$$\vdots$$

$$\frac{\partial\,\mathrm{MSE}}{\partial w_{M-1}} = 0 \;\Rightarrow\; w_0\, r_{x_0 x_{M-1}} + \cdots + w_{M-1}\, r_{x_{M-1} x_{M-1}} = r_{s x_{M-1}}.$$

In this form, linear algebra comes in very handy, as we can collect the equations into matrix-vector form, which makes the system easy to solve.
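For completeness, here is a minimal sketch of this procedure, where the expectations are replaced by sample averages; the function name `wiener_coefficients` and its arguments are hypothetical and only serve to illustrate the $M$ equations above.

```python
import numpy as np

def wiener_coefficients(x, s, M):
    """Solve the M normal equations sum_i w_i r_{x_i x_j} = r_{s x_j}, j = 0, ..., M-1,
    with the expectations replaced by sample averages."""
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    n = np.arange(M - 1, len(x))           # indices where all delayed samples exist
    R = np.empty((M, M))                   # R[j, i] ~ r_{x_i x_j} = E[x(n-i) x(n-j)]
    r = np.empty(M)                        # r[j]    ~ r_{s x_j}   = E[s(n) x(n-j)]
    for j in range(M):
        r[j] = np.mean(s[n] * x[n - j])
        for i in range(M):
            R[j, i] = np.mean(x[n - i] * x[n - j])
    return np.linalg.solve(R, r)           # solve the M equations in M unknowns
```

For a wide-sense stationary $x(n)$, $r_{x_i x_j}$ depends only on the lag $i - j$, so the matrix of autocorrelations is Toeplitz and corresponds to $\mathbf{R}_{xx}$ from the vector derivation above.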
