Generative artificial intelligence has slowly and steadily spread into all walks of life the world over, whether in the form of text-based generators such as OpenAI’s ChatGPT and Google Bard, or image generators such as Midjourney and DALL-E.
Recently, however, there has also been a big push towards voice-based generative artificial intelligence, with tools such as Meta’s AudioCraft. The latest entrant on the scene is Apple, with the exciting, albeit less talked-about, “Personal Voice” feature for iOS 17, which is already available in the public beta.
What is Personal Voice?
The Personal Voice feature for iPhones is designed for people who are at risk of losing their ability to speak over time, or who have recently been diagnosed with a condition such as ALS (Amyotrophic Lateral Sclerosis). It may also be useful for introverts who are shy about speaking in front of large groups.
It gives users 150 random sentences to read aloud. Once they are done, it learns and clones their voice. From then on, users can have any text on the device’s screen read aloud in that voice, without having to actually speak themselves.
Personal Voice works in tandem with another feature called Live Speech, a text-to-speech tool. It allows users to type anything they’d like to say during voice and FaceTime calls, and even in-person conversations, and have it spoken out loud. Live Speech, Apple says, can use the voice cloned by the Personal Voice feature, so users don’t have to fall back on the iPhone’s computerised, somewhat unnatural-sounding built-in voices.
How to set up Personal Voice?
Personal Voice is extremely easy to set up, and the process takes about 15-20 minutes, depending on the user’s reading speed. It is important to remember here that the feature is currently only available with the public beta version of iOS 17. Here’s how to set it up.
1. Head to the “Settings” app.
2. Scroll down and tap the “Accessibility” menu.
3. From there, search for “Personal Voice.”
4. Tap on the “Personal Voice” option. At the top of the menu, you should see an option which says “Create a Personal Voice.” Tap on it.
5. Here, press the “Continue” button and give your voice a name in the window which pops up next.
6. Now, hit the “Record” button and read aloud the 150 sentences generated by the device.
7. Once you have finished, you will receive a prompt confirming that the recording is complete.
8. Now, connect your phone to the charger and lock its screen so that the device can learn and process your voice.
Bear in mind that your iPhone may take anywhere between a few hours and a few days to process your voice, since the feature is still in beta. You can check on the progress at any time by heading back to the Personal Voice settings. Also, remember that processing stays on hold unless the device’s screen is locked.
How to use your recorded voice?
Once your voice has been processed, here’s how to use it to speak.
1. Head over to “Settings.”
2. Scroll down and tap on the “Accessibility” menu, and select “Live Speech” there.
3. Here, toggle on Live Speech.
4. In the subsequent window tap on “Voices” and choose the voice created using Personal Voice. It should appear at the top of the list.
5. Once this is enabled, you can triple-tap the sleep/wake button on your iPhone to open the Live Speech text-to-speech tool. Now, all you have to do is type whatever you want read out loud in the box, or choose any of your pre-set phrases.
While Personal Voice is certainly an extremely handy tool, one also needs to bear in mind that artificially generated audio can be misused to dupe people. Audio deepfakes are artificially generated speech, made using generative AI, that is designed to resemble real voices. A recent study conducted by University College London found that humans were able to detect audio deepfakes only about 73% of the time when they were made to listen to 50 deepfake samples generated using a text-to-speech algorithm trained on English and Mandarin datasets.
Disclaimer: This post as well as the layout and design on this website are protected under Indian intellectual property laws, including the Copyright Act, 1957 and the Trade Marks Act, 1999 and is the property of Infiniti Retail Limited (Croma). Using, copying (in full or in part), adapting or altering this post or any other material from Croma’s website is expressly prohibited without prior written permission from Croma. For permission to use the content on the Croma’s website, please connect on contactunboxed@croma.com
Atreya Raghavan