Original link (I am the original author): https://yonggie.github.io/posts/2023/12/blog-post-1/
Motivation
To alleviate errors that accumulate over time, the method predicts $k$ future actions at each step and uses them for the current prediction.
Explanation
A Conditional Variational Autoencoder (CVAE), with Transformers as the encoder and decoder.
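For reference, the textbook CVAE objective can be written as below. This is the standard formulation, not necessarily the exact loss used in the paper; the symbols $a$ (action sequence), $c$ (condition), $z$ (latent), $\phi$ and $\theta$ (encoder and decoder parameters) are notation assumed here for illustration.

$$
\log p_\theta(a \mid c) \;\ge\; \mathbb{E}_{q_\phi(z \mid a, c)}\big[\log p_\theta(a \mid z, c)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid a, c)\,\|\,p_\theta(z \mid c)\big)
$$

Note that in this formulation the same condition $c$ appears in the encoder $q_\phi(z \mid a, c)$, the prior $p_\theta(z \mid c)$, and the decoder $p_\theta(a \mid z, c)$.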
Settings
action = $\{imgs, joints\}$
Training
Inference
Manually set K, together with K running-average probabilities (weights) used to ensemble the current action embedding (to tackle accumulated errors).
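A minimal sketch of how such a K-step ensemble could work, assuming each step stores a chunk of K predicted actions and the current action is a weighted average of every stored prediction that targets the current timestep. The function name `temporal_ensemble`, the `decay` parameter, and the exponential weighting are hypothetical choices for illustration, not taken from the paper.

```python
import math


def temporal_ensemble(chunks, t, decay=0.1):
    """Weighted average of all stored predictions that target timestep t.

    chunks[s] is the list of K actions predicted at step s; entry i of that
    list is the action proposed for timestep s + i. Each action is a list
    of floats.
    """
    K = len(chunks[0])
    # Only the last K chunks can contain a prediction for timestep t.
    first = max(0, t - K + 1)
    # Oldest chunk first: chunk s proposes chunks[s][t - s] for timestep t.
    candidates = [chunks[s][t - s] for s in range(first, t + 1)]
    # Assumed weighting: exponential decay, so older predictions weigh more.
    weights = [math.exp(-decay * i) for i in range(len(candidates))]
    total = sum(weights)
    dim = len(candidates[0])
    return [
        sum(w * c[d] for w, c in zip(weights, candidates)) / total
        for d in range(dim)
    ]


# Usage: 4 steps, K = 3, 2-dimensional actions; chunk s is filled with s.
chunks = [[[float(s)] * 2 for _ in range(3)] for s in range(4)]
action = temporal_ensemble(chunks, t=3, decay=0.0)  # plain mean of 1, 2, 3
```

With `decay=0.0` the ensemble reduces to a plain running average over the overlapping predictions; a larger `decay` shifts weight toward the oldest prediction.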
Questions
We see that the condition is not the same, even though the paper claims to use a “Conditional VAE”. It is a mathematically wrong approach in the first place, and that is before even considering the [CLS] and [POS_EMD] auxiliary inputs.
Thoughts & Comments
- Transformer as CVAE encoder and decoder
- K-step temporal ensemble