I Switched To ChatGPT's Voice Mode. Here Are 7 Reasons Why It's Better Than Typing
I really didn't expect much the first time I tapped the tiny wavelength icon to try ChatGPT's Voice Mode. I figured it was just another AI gimmick. After all, I've been disappointed by voice assistants before -- but this isn't Siri.
Voice Mode slips effortlessly into the give-and-take of a real human conversation, catching my pauses, half-finished thoughts and throw-away "ums." I can figure out what I'm making for dinner while inching through LA traffic or brush up on my Polish while wiping down counters in my apartment. All without breaking the conversational flow or ever reaching for my keyboard.
ChatGPT, from OpenAI, isn't the only chatbot going hands-free. Google's Gemini Live offers the same "talk over me, and I'll keep up" vibe. Anthropic's Claude has a beta version of its voice mode on its mobile apps, complete with on-screen bullet points as it speaks, and Perplexity's iOS and Android assistant also answers spoken questions and launches apps like OpenTable or Uber on command.
But even with everyone racing to master real-time AI conversation, ChatGPT remains my go-to. Whatever your chatbot of choice, take a break from the typing and try out the voice option. It's far more useful than you think.
(Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
What exactly is Voice Mode?
Voice chat (or "voice conversations") is ChatGPT's hands-free mode that lets you talk to the AI model and hear it talk back to you, no typing required. There's a voice icon that you'll find in the mobile, desktop and web app on the bottom-right of any conversation you're in. If you press the button, you can say your question aloud and ChatGPT will transcribe it, reason over it and reply. As soon as it's done talking, it starts listening again, creating a natural back-and-forth dialogue.
Just remember: Voice Mode runs on the same large language model as regular ChatGPT, so it can still hallucinate or get facts wrong. You should always double-check anything important.
OpenAI offers two versions of these voice conversations: Standard Voice (the default, lightweight option, free to use) and Advanced Voice (available only to paid users).
Standard Voice first converts your speech to text and processes it with GPT-4o (and GPT-4o mini), taking a little bit longer to talk back to you. Advanced Voice, on the other hand, uses natively multimodal models, meaning it "hears" you and generates audio, so the conversation is more natural and done in real time. It can pick up on cues other than the words themselves, like the speed you're talking or the emotion in your voice, and adjust to this.
Note: Free users can access a daily preview of Advanced Voice.
7 reasons you should start using ChatGPT's Voice Mode feature
1. It's genuinely conversational
When I talk to ChatGPT, I'm not hunting for the right word or backspacing after every typo, the way I do when typing. I'm just speaking, like I would with any friend or family member, with all the "ummmmms," "likes" and other awkward breaks that come with it. Voice Mode rolls with all of my half-finished thoughts, though, and responds with either a fully fleshed-out answer or a question to help me home in on what I need. This effortless give-and-take feels much more natural than typing.
2. You can use ChatGPT hands-free
Obviously, I still need to open the ChatGPT app and tap on the Voice Mode button to start, but once I begin, I no longer have to use my hands to continue a conversation with the AI chatbot. I can be stuck in traffic and brainstorm a vacation that I want to take later this year. I can ask about flights, hotels, landmarks, restaurants and anything else, without touching my phone, and that conversation is saved within the app, so that I don't have to remember everything that ChatGPT tells me.
3. It's good for learning a new language with real-time translation
I mentioned earlier that I use Voice Mode to practice languages, something it excels at. I can speak in English and have ChatGPT respond in flawless Polish, complete with pronunciation tips. Just ask Voice Mode, "Can you help me practice my (language)" and it'll respond with a few ways it can help you, like conversation starters, basic vocabulary or numbers. And it remembers where you left off, so you can, in a way, take lessons; no Duolingo needed.
4. Get answers about things you see in the real world
This one is exclusive to Advanced Voice, and it's probably my favorite Voice Mode feature. Thanks to its multimodal superpowers, I can turn on my phone's camera or take a video/photo and ask ChatGPT to help me. For example, I had trouble recognizing a painting I found at a thrift store, and the owner had no idea where it came from. I pulled up voice chat, turned on my camera and asked Voice Mode where the painting was from. In seconds, it could tell me the title of the painting, the artist's name and when it was painted.
5. It's a better option for people with certain disabilities
For anyone with low vision or dyslexia, talking for sure beats typing. Voice Mode can transcribe your speech and then read the answer aloud at whatever pace you choose (you can adjust this in your settings or ask ChatGPT to slow down). The hands-free option also helps anyone with motor-skill challenges, because all you need is one tap to start and another to stop, with no extensive typing on a keyboard.
6. Faster brainstorming
Sometimes I get a burst of ideas, and I think faster than I can type, so ChatGPT's Voice Mode is perfect for spitballing story ideas, figuring out a new layout for my living room or deciding on interesting meals to cook for the week. Because I'm thinking aloud instead of staring at my phone, my ideas flow much more easily and quickly, especially with ChatGPT's instant follow-ups. It helps keep the momentum rolling until I've got a polished idea for whatever I'm brainstorming.
7. Instant summaries you can listen to
Drop a 90-page PDF in the chat, like a movie script or textbook, ask for a summary and have the AI read it aloud to you while you fold laundry. It's like turning any document (I even do Wikipedia pages) into a podcast -- on demand.
Voice Mode isn't just a neat trick; it's a quicker, more natural way to use ChatGPT. Whether you're translating street signs, brainstorming an idea or catching up on the news aloud, talking to ChatGPT feels less like using a chatbot and more like having a conversation with a bite-sized expert. Once you get used to thinking out loud, you might never go back to your keyboard.
Confirmed: Nintendo Has Freed The F Bomb And You Can Swear In Switch 2 Speech-to-text GameChat
It's been a whirlwind 24 hours (depending on where you are and who is counting) since the Nintendo Switch 2 launch, and there's plenty for players to do with one of the most highly anticipated consoles in years. So much to do, in fact, that our Nintendo Switch 2 review is still in progress. But there's one handy new addition you might not be aware of: the speech-to-text GameChat feature for talking with friends – wait, did someone just say fuck? And did Nintendo then actually print that?
It's the age-old tale; give people the option to speak to a device that will listen, and they're going to curse at it. Just ask the chat log history between me and our home's first Alexa. Of course, Nintendo is so family-friendly that you might think it'd have made a workaround for this, no? Some sort of censoring on its prized Switch 2? Well, no – looks like you can curse up a storm at your brand-new Switch 2 and it'll translate straight into GameChat.
In a Bluesky post, Switch 2 player David Howe shared a screenshot of a speech-to-text conversation between himself and a friend, with the console documenting their entire verbal revelation that they could, in fact, curse.
"CONFIRMED: you can say fuck in game chat speech-to-text," Howe wrote, accompanied by photographic evidence.
This was further confirmed by our own Rollin Bishop, who took to his own brand-new Switch 2 and spent some time cursing at it, only to find it displayed those words right back to him, and to our brand director Sam Loveridge, who signed up as the swearing test dummy.
For anyone else looking to do the same, GameChat's speech-to-text feature isn't the default. Pressing the C-button will bring up GameChat, while its speech-to-text feature can be enabled in accessibility settings. It is, swearing aside, a great accessibility feature. If it's at all confusing, don't worry; we've got you covered with our own Nintendo Switch 2 GameChat guide.
This novelty may wear off within a week, but for now, a good number of Switch 2 consoles are likely to be berated by a most creative and awful string of curse words. Of course, the Switch 2 is just the latest in a long line of consoles that have been verbally abused by players; most of those just didn't have ears.
Text-to-speech With Feeling - This New AI Model Does Everything But Shed A Tear
Not so long ago, generative AI could only communicate with human users via text. Now it's increasingly being given the power of speech -- and this ability is improving by the day.
On Thursday, AI voice platform ElevenLabs introduced v3, described on the company's website as "the most expressive text-to-speech model ever." The new model can exhibit a wide range of emotions and subtle communicative quirks -- like sighs, laughter, and whispering -- making its speech more humanlike than the company's previous models.
In a demo shared on X, v3 was shown generating the voices of two characters, one male and the other female, who were having a lighthearted conversation about their newfound ability to speak in more humanlike voices.
There's certainly none of the Alexa-esque flatness of tone, but the v3-generated voices tend to be almost excessively animated, to the point that their laughter is more creepy than charming -- take a listen yourself.
The model can also speak more than 70 languages, compared with the 29-language limit of its predecessor, v2. It's available now in public alpha, and its price tag has been slashed by 80% until the end of this month.
AI-generated voice has become a major focus of innovation as tech developers look toward the future of human-machine interaction.
Automated assistants like Siri and Alexa have long been able to speak, of course, but as anyone who routinely uses these systems can attest, their voices are very mechanical, with a rather narrow range of emotional cadence and tones. They're useful for handling quick and easy tasks, like playing a song or setting an alarm, but they don't make great conversation partners.
Some of the latest text-to-speech (TTS) AI tools, on the other hand, have been engineered to speak in voices that are maximally realistic and engaging.
Users can prompt v3, for example, to speak in voices that are easily customizable through the use of "audio tags." Think of these as stylistic filters that modify the output, and which can be inserted directly into text prompts: "Excited," "Loudly," "Sings," "Laughing," "Angry," and so on.
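To make the idea concrete, here is a minimal sketch of how a tagged prompt might be assembled before it's sent to a text-to-speech model. It assumes the tags are written in lowercase inside square brackets ahead of the line they modify; that syntax, along with the helper names `tag` and `build_prompt`, is an illustrative assumption rather than anything confirmed here, and the sketch makes no call to ElevenLabs itself.

```python
from typing import Optional

def tag(style: str, text: str) -> str:
    """Prefix one line of dialogue with an audio tag.

    Assumption: tags are lowercase and wrapped in square brackets.
    """
    return f"[{style.lower()}] {text}"

def build_prompt(lines: list[tuple[Optional[str], str]]) -> str:
    """Join (style, text) pairs into one prompt; a style of None means no tag."""
    return "\n".join(tag(style, text) if style else text for style, text in lines)

# Hypothetical two-voice script, loosely modeled on a lighthearted chat.
script = [
    ("Excited", "We can finally talk like people!"),
    (None, "That is one way to put it."),
    ("Laughing", "Okay, maybe I got carried away."),
]

prompt = build_prompt(script)
print(prompt)
```

The point is simply that the tags travel inside the text itself, so changing the delivery of a line means editing the prompt, not adjusting a separate setting.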
ElevenLabs isn't the only company racing to build more lifelike TTS models, which big tech companies are selling as a more intuitive and accessible way to interact with AI.
In late May, ElevenLabs competitor Hume AI unveiled its Empathic Voice Interface (EVI) 3 model, which allows users to generate custom voices by describing them in natural language. Similarly nuanced conversational abilities are also now on offer through Google's Gemini 2.5 Pro and Flash models.