Training wav2vec on Multiple Languages From Scratch
Large amounts of parallel speech-text data are unavailable for most languages, which motivated the development of wav2vec for ASR systems. Training consists of self-supervised pretraining followed by low-resource finetuning. The model architecture includes a multi-layer convolutional feature encoder, qua
10 slides
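
To make the "multi-layer convolutional feature encoder" mentioned in the summary more concrete, here is a minimal sketch of how such a stack turns raw audio samples into a much shorter sequence of frames. The summary does not give layer sizes, so the (kernel, stride) pairs below are an assumption borrowed from the publicly described wav2vec 2.0 configuration, and the function names are hypothetical helpers written for this illustration.

```python
# ASSUMPTION: (kernel_size, stride) per layer, as in the public wav2vec 2.0
# setup. The slide summary itself does not state these values.
ASSUMED_CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def encoder_output_frames(num_samples, layers=ASSUMED_CONV_LAYERS):
    """Frames produced by a stack of unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in layers:
        if length < kernel:
            return 0  # input too short for this layer to emit any frame
        length = (length - kernel) // stride + 1
    return length


def total_stride(layers=ASSUMED_CONV_LAYERS):
    """Hop, in input samples, between consecutive output frames."""
    hop = 1
    for _, stride in layers:
        hop *= stride
    return hop


def receptive_field(layers=ASSUMED_CONV_LAYERS):
    """Number of input samples that a single output frame covers."""
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


if __name__ == "__main__":
    sample_rate = 16_000  # assumed sampling rate in Hz
    print("frames per second of audio:", encoder_output_frames(sample_rate))
    print("hop in ms:", 1000 * total_stride() / sample_rate)
    print("window in ms:", 1000 * receptive_field() / sample_rate)
```

Under these assumed settings each frame covers about 25 ms of audio with a 20 ms hop, which is why the later stages of the model operate on a sequence hundreds of times shorter than the raw waveform.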