The "Attention Is All You Need" paper mentions positional encoding but leaves out some details. I am going to write down my understanding of those details.
The formulas are the following:
PE(pos, 2i) = sin(pos / 10000^(2i/d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
The paper says that i is the dimension index and d_model is the dimension of the embedding. But if i really were the dimension index, then for the last index i = d_model - 1, 2i would be out of bounds. So that explanation cannot be correct.
Instead, 2i and 2i+1 here denote the even and odd dimension indices: at the even dimension indices, apply the sine function; at the odd dimension indices, apply the cosine function. So i ranges over [0, d_model/2), and each i generates 2 dimensions.
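To make the indexing concrete, here is a minimal sketch in plain NumPy (the function name positional_encoding and the explicit loops are my own illustration, not code from the paper) that fills a (seq_len, d_model) matrix exactly as the formulas describe:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Build the sinusoidal positional-encoding matrix of shape (seq_len, d_model)."""
    pe = np.zeros((seq_len, d_model))
    for pos in range(seq_len):
        for i in range(d_model // 2):               # i ranges over [0, d_model/2)
            angle = pos / (10000 ** (2 * i / d_model))
            pe[pos, 2 * i] = np.sin(angle)          # even dimension index: sine
            pe[pos, 2 * i + 1] = np.cos(angle)      # odd dimension index: cosine
    return pe
```

Note that each i produces one sine value and one cosine value, which is exactly the "2 dimensions per i" reading above.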
Once the PE (positional encoding) value for a position has been computed, it is added to the embedding of the input, per the diagram on page 3:
new_embedding[pos, 2i] = embedding[pos, 2i] + PE(pos, 2i)
new_embedding[pos, 2i+1] = embedding[pos, 2i+1] + PE(pos, 2i+1)
The embedding variable here is the embedding of each word in a sentence, and pos is the position of the word within the sentence. (The positions index into a sentence, not the whole dictionary.)
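Continuing the sketch above (the sizes and the random word embeddings are made up purely for illustration; in a real model the embeddings are learned), the addition is just element-wise over the whole sentence:

```python
seq_len, d_model = 5, 8                              # example sizes, chosen arbitrarily
rng = np.random.default_rng(0)
embedding = rng.normal(size=(seq_len, d_model))      # stand-in for learned word embeddings

# Add the positional encoding for each position to the word embedding at that position.
new_embedding = embedding + positional_encoding(seq_len, d_model)
# new_embedding[pos, 2i]   == embedding[pos, 2i]   + PE(pos, 2i)
# new_embedding[pos, 2i+1] == embedding[pos, 2i+1] + PE(pos, 2i+1)
```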
This part of the StatQuest video clearly explains how the embedding is calculated.