EE 559 Homework week 6

1. For the 2-class perceptron with margin algorithm, using basic sequential gradient descent with a fixed increment, prove convergence for linearly separable training data by modifying the perceptron convergence proof covered in class. You may write out the proof, or you may take the 3-page proof from lecture (included in this HW folder, as updated with corrections) and mark up the proof to show all changes as needed. If you mark up the existing proof, be sure to mark everything that needs changing (e.g., if a change propagates through the proof, be sure to make all changes for a complete answer).
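For reference, here is a minimal sketch (Python/NumPy) of the algorithm whose convergence is being analyzed: basic sequential gradient descent with a fixed increment for the 2-class perceptron with margin. It assumes the training points are already augmented and reflected (class-2 points multiplied by −1); the variable names, margin value, and stopping rule are illustrative choices, not taken from the lecture notes.

    import numpy as np

    def margin_perceptron(Z, margin=1.0, eta=1.0, max_epochs=100):
        # Z: (N, d) array of augmented, reflected training points, so that
        # "correctly classified with margin" means w @ z > margin for every row z.
        w = np.zeros(Z.shape[1])             # illustrative starting point
        for epoch in range(max_epochs):
            updated = False
            for z in Z:                      # basic sequential: one point at a time, fixed order
                if w @ z <= margin:          # misclassified or inside the margin
                    w = w + eta * z          # fixed-increment update (eta held constant)
                    updated = True
            if not updated:                  # every point satisfies w @ z > margin
                break
        return w

The convergence proof in question shows that, for linearly separable data, the inner update fires only a finite number of times.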
2. You are given the following training data points in three pattern classes S1, S2, and S3:
{(0, 1, −1, 2)} ∈ S1;  {(1, 1, 1, 1), (2, 1, 1, 1)} ∈ S2;  {(−1, 1, 0, −1)} ∈ S3
Note that in our notation (throughout this class), for convenience we can write (x1, x2, x3, x4), with commas, to denote a column vector (of dimension 4 in this case).
(a) Find linear discriminant functions that correctly classify the training data, using the multiclass perceptron algorithm with the maximal value method (given in Discussion Week 6). Use augmented space (so first augment the data). There are few enough iterations that this can be done by hand, or you may write code to do it if you prefer (a rough template is sketched after this problem). Use the following assumptions and starting point: assume the data points have already been shuffled, so use the training data in the order given above; use η(i) = 1 ∀ i, and initial weight vectors w(1)(0) = −1, w(2)(0) = 1, w(3)(0) = 0.
(b) From this 5-dimensional feature space, consider points that lie in the plane P defined by all x such that x = (1, x1, x2, 0, 0). Give the decision rule for points that lie in this plane. Plot, in the 2-space (x1, x2), the decision boundaries and decision regions in plane P.
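If you do write code for part (a), the following rough template (Python/NumPy) shows the maximal value method: classify each point by the largest discriminant value, and on an error add ηx to the correct class's weight vector and subtract ηx from the winning class's. The data below are the problem's points, augmented with a leading 1; reading the scalar initial values −1, 1, 0 as constant 5-vectors is an interpretation on my part, and ties are broken by np.argmax (lowest index), which may differ from the Discussion Week 6 convention, so verify before relying on the output.

    import numpy as np

    def multiclass_perceptron(X, labels, W0, eta=1.0, max_epochs=100):
        # X: (N, d) augmented points; labels: class indices 0..C-1;
        # W0: (C, d) initial weight vectors, one row per class.
        W = np.array(W0, dtype=float)
        for epoch in range(max_epochs):
            errors = 0
            for x, c in zip(X, labels):
                j = int(np.argmax(W @ x))    # class with the maximal discriminant value
                if j != c:                   # misclassified: reward correct class, penalize winner
                    W[c] += eta * x
                    W[j] -= eta * x
                    errors += 1
            if errors == 0:                  # all training points classified correctly
                break
        return W

    X = np.array([[1,  0, 1, -1,  2],        # S1
                  [1,  1, 1,  1,  1],        # S2
                  [1,  2, 1,  1,  1],        # S2
                  [1, -1, 1,  0, -1]])       # S3
    labels = np.array([0, 1, 1, 2])
    W0 = np.vstack([-np.ones(5), np.ones(5), np.zeros(5)])   # assumed reading of the initial weights
    W = multiclass_perceptron(X, labels, W0)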
3. Suppose you set up a training algorithm to use a modified MSE criterion:
J(w) = (1/N) ‖Xw − b‖₂² + λ ‖w‖₂²
in which the purpose of the new term is to prefer small ‖w‖₂ if λ > 0.
(a) Find ∇_w J(w) using gradient relations.
(b) Find the optimal w = ŵ by solving ∇_w J(w) = 0. Compare your result to the pseudoinverse solution.
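If you want to sanity-check the gradient you derive in (a), a small finite-difference comparison like the one below can help; the function names, the random test data, and the choice λ = 0.1 are all illustrative, not part of the assignment.

    import numpy as np

    def J(w, X, b, lam):
        # Modified MSE criterion: (1/N)·‖Xw − b‖₂² + λ‖w‖₂²
        N = X.shape[0]
        r = X @ w - b
        return (r @ r) / N + lam * (w @ w)

    def numerical_gradient(f, w, eps=1e-6):
        # Central-difference estimate of the gradient, for checking a derived formula.
        g = np.zeros_like(w, dtype=float)
        for i in range(w.size):
            e = np.zeros_like(w, dtype=float)
            e[i] = eps
            g[i] = (f(w + e) - f(w - e)) / (2 * eps)
        return g

    rng = np.random.default_rng(0)
    X, b, lam = rng.normal(size=(6, 3)), rng.normal(size=6), 0.1
    w = rng.normal(size=3)
    print(numerical_gradient(lambda v: J(v, X, b, lam), w))   # compare with your ∇_w J(w)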
4. Starting from the MSE criterion function, derive a learning algorithm using the basic sequential gradient descent technique, as follows:
(a) Find an expression for J_n(w), and from that derive an expression for ∇_w J_n(w).
(b) Complete the derivation to get the sequential gradient descent algorithm based on MSE. Compare with the Widrow-Hoff learning algorithm given in lecture (a sketch of that rule, as commonly stated, is included below for reference).
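For the comparison in 4(b): the Widrow-Hoff (LMS) rule, as it is commonly stated, updates the weight vector one sample at a time using w ← w + η(i)(b_n − wᵀx_n)x_n. A minimal sketch follows; the initialization, learning rate, and stopping rule are illustrative choices rather than the lecture's exact formulation.

    import numpy as np

    def widrow_hoff(X, b, eta=0.01, epochs=50, w0=None):
        # X: (N, d) augmented training points; b: (N,) target values.
        # Sequential update: w <- w + eta * (b_n - w @ x_n) * x_n
        w = np.zeros(X.shape[1]) if w0 is None else np.array(w0, dtype=float)
        for _ in range(epochs):
            for x_n, b_n in zip(X, b):
                w += eta * (b_n - w @ x_n) * x_n   # reduce the squared error on x_n
        return w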