Date: 2024-03-30

OpenAI has unveiled Voice Engine, a speech synthesis model that can generate a synthetic voice from a single 15-second audio sample of a person speaking. The generated voice can read text on command, either in the language of the original sample or in other languages. To evaluate beneficial applications and the safeguards they require, OpenAI has opened limited access to the technology and is working with a small group of companies across several sectors.

Partners that have already experimented with Voice Engine include Age of Learning, an education technology company; HeyGen, a visual storytelling platform; Dimagi, which builds software for frontline health workers; Livox, developer of an AI communication app; and the Lifespan health system. These collaborations have let OpenAI and its partners explore practical applications of the technology, such as pre-scripted voice-over content and real-time, personalized responses for students, written by GPT-4.

Jeff Harris, a member of OpenAI's product team for Voice Engine, said that development of the model began in late 2022. The model was trained on a mix of licensed and publicly available data, and it already powers the preset voices in OpenAI's text-to-speech API and the Read Aloud feature of ChatGPT. Access to Voice Engine itself, however, will be limited to around ten developers, which underscores OpenAI's caution in introducing the technology.

Text-to-audio generation, and AI-based voice cloning in particular, is evolving rapidly, with companies such as Podcastle and ElevenLabs standing out for their products. This growing interest, however, collides with ethical and security concerns about misuse of the technology, as reflected in the US Federal Communications Commission's recent ban on robocalls that use AI-cloned voices without consent.

OpenAI requires its partners to adhere to strict usage policies: they may not impersonate individuals or organizations without consent, they must obtain explicit and informed consent from the original speaker, and they may not let individual users create their own voices. In addition, all generated audio clips carry a watermark so that their origin can be traced, and use of the synthetic voices is actively monitored. In response to the potential risks, OpenAI proposes several preventative measures: phasing out voice-based authentication for access to bank accounts, adopting policies that protect the use of people's voices in AI, expanding public education about deepfakes, and developing systems that track the origin of AI-generated content.