Note above that the weights U, V, W may be shared across the network (in an RNN, the same U, V, W are reused at every time step).
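A minimal sketch of what weight sharing looks like in a plain RNN forward pass. The roles assigned here are assumptions, since the notes do not define them at this point: U maps input to hidden state, W maps previous hidden state to hidden state, V maps hidden state to output, and tanh is the hidden activation. The function name rnn_forward and the toy numbers are hypothetical.

```python
import math

def matvec(M, v):
    # Multiply matrix M (list of rows) by vector v
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_forward(xs, U, W, V, h0):
    # Run a vanilla RNN over the sequence xs.
    # The same U, W, V objects are used at every time step (weight sharing).
    h = h0
    outputs = []
    for x in xs:
        # h_t = tanh(U x_t + W h_{t-1})
        pre = [a + b for a, b in zip(matvec(U, x), matvec(W, h))]
        h = [math.tanh(p) for p in pre]
        # o_t = V h_t
        outputs.append(matvec(V, h))
    return outputs, h

# Toy sizes (assumed): 2-dim input, 2-dim hidden state, 1-dim output
U = [[0.5, 0.0], [0.0, 0.5]]
W = [[0.1, 0.0], [0.0, 0.1]]
V = [[1.0, -1.0]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

outputs, h_final = rnn_forward(xs, U, W, V, h0=[0.0, 0.0])
print(len(outputs))  # one output vector per time step
```

Because the loop never creates new weights, the number of parameters does not grow with the sequence length; only the hidden state h changes from step to step.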
What is the difference between an RNN and an LSTM? What exactly is an LSTM (Long Short-Term Memory network)? There’s a good article below.
Embeddings
Word2vec is a group of related models that are used to produce word embeddings. Word2vec
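Identity:
f(x) = x
Derivative f’(x) = 1
ReLU (rectified linear unit):
f(x) = 0 for x < 0
f(x) = x for x >= 0
Derivative f’(x) = 0 for x < 0
f’(x) = 1 for x >= 0 (the derivative is not defined at x = 0; using 1 there is a convention)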
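The two activation functions above and their derivatives can be sketched in a few lines of Python. The function names are hypothetical, and relu_grad follows the convention in these notes of returning 1 at x = 0 (some libraries return 0 there instead).

```python
def identity(x):
    # f(x) = x
    return x

def identity_grad(x):
    # f'(x) = 1 for every x
    return 1.0

def relu(x):
    # f(x) = 0 for x < 0, f(x) = x for x >= 0
    return x if x >= 0 else 0.0

def relu_grad(x):
    # f'(x) = 0 for x < 0, 1 for x >= 0 (value at x = 0 is a convention)
    return 1.0 if x >= 0 else 0.0

for x in (-2.0, 0.0, 3.0):
    print(x, identity(x), identity_grad(x), relu(x), relu_grad(x))
```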
See a lot more here: https://en.wikipedia.org/wiki/Activation_function
The following notes still need to be sorted.
One of the first things to do is add meaningful names.