01

Encoders × decoders

The same utterance is animated from four discrete speech representations, each paired with both decoder architectures.


02

AVTTS results


03

Citation