Today, we are one step closer to the immortal celebrity future we have long been promised ( since April ). Meta has unveiled Voicebox, its generative text-to-speech model that promises to do for the spoken word what ChatGPT and Dall-E, respectfully, did for text and image generation.
Essentially, its a text-to-output generator just like GPT or Dall-E — just instead of creating prose or pretty pictures, it spits out audio