Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: speech

Here’s the Jimmy Kimmel clip that got him pulled off the air

Disney gave in to threats from FCC chairman and occasional speech regulator Brendan Carr this evening, announcing that Jimmy Kimmel Live! would be pulled off the air “indefinitely.” Carr was unhappy that Kimmel characterized the alleged shooter of Charlie Kirk as “anything other than” a member of the “MAGA g

CorentinJ: Real-Time Voice Cloning (2021)

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. This was my master's thesis. SV2TTS is a deep learning framework in three stages. In the first stage, one creates a digital representation of a voice from a few seconds of audio. In the second and third stages, this representation is used as reference to generate speech given arbitrary text.

VibeVoice: A Frontier Open-Source Text-to-Speech Model

VibeVoice is a novel framework designed for generating expressive, long-form, multi-speaker conversational audio, such as podcasts, from text. It addresses significant challenges in traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and natural turn-taking. A core innovation of VibeVoice is its use of continuous speech tokenizers (Acoustic and Semantic) operating at an ultra-low frame rate of 7.5 Hz.
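To give a rough sense of what a 7.5 Hz frame rate means for long-form audio, the arithmetic can be sketched as below. This is only an illustration, not VibeVoice code: the function name `frames_for_duration` is hypothetical, and the 50 Hz comparison rate is an assumption chosen for contrast, not a figure from the summary.

```python
# Frame rate quoted in the VibeVoice summary.
VIBEVOICE_FRAME_RATE_HZ = 7.5

# Hypothetical comparison rate, chosen only for illustration (not from the source).
ASSUMED_BASELINE_FRAME_RATE_HZ = 50.0


def frames_for_duration(seconds: float, frame_rate_hz: float) -> int:
    """Return the number of tokenizer frames needed to cover `seconds` of audio."""
    if seconds < 0 or frame_rate_hz <= 0:
        raise ValueError("seconds must be >= 0 and frame_rate_hz must be > 0")
    return round(seconds * frame_rate_hz)


if __name__ == "__main__":
    for label, seconds in [("1 minute", 60), ("1 hour", 3600)]:
        low = frames_for_duration(seconds, VIBEVOICE_FRAME_RATE_HZ)
        base = frames_for_duration(seconds, ASSUMED_BASELINE_FRAME_RATE_HZ)
        print(f"{label}: {low} frames at 7.5 Hz vs {base} at the assumed 50 Hz")
```

Under these assumptions, a lower frame rate means a shorter frame sequence for the same audio, which is presumably part of why the summary ties it to scalability for long-form output.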

An inner-speech decoder reveals some mental privacy issues

Most experimental brain-computer interfaces (BCIs) that have been used for synthesizing human speech have been implanted in the areas of the brain that translate the intention to speak into the muscle actions that produce it. A patient has to physically attempt to speak to make these implants work, which is tiresome for severely paralyzed people. To get around this, researchers at Stanford University built a BCI that could decode inner speech—the kind we engage in during silent reading and use for al

A mind–reading brain implant that comes with password protection

A brain implant can decode a person’s internal chatter — but the device works only if the user thinks of a preset password. The mind-reading device, or brain–computer interface (BCI), accurately deciphered up to 74% of imagined sentences. The system began decoding users’ internal speech — the silent dialogue in people’s minds — only when they thought of a specific keyword. This

Scientists Say They’ve Found a Way to Vocalize the “Inner Voices” of People Who Can’t Speak

New advances in brain-computer interface (BCI) technology may make speech easier than ever for those who've lost the ability to speak. In a new, groundbreaking study published in the journal Cell, researchers from Stanford University claimed that they have found a way to decode the "inner speech" of those who can no longer vocalize, making it far less difficult to talk with friends and family than previous BCIs that required them to e

New Brain Interface Interprets Inner Monologues With Startling Accuracy

Scientists can now decipher brain activity related to the silent inner monologue in people’s heads with up to 74% accuracy, according to a new study. In new research published today in Cell, scientists from Stanford University decoded imagined words from four participants with severe paralysis due to ALS or brainstem stroke. Aside from being absolutely wild, the findings could help people who are unable to speak communicate more easily using brain-computer interfaces (BCIs), the researchers say

I tested this new AI podcast tool to see if it can beat NotebookLM - here's how it did

The Speechify text-to-speech app lets its more than 50 million users worldwide convert any text, including documents, articles, PDFs, and images, into audio, with over 200 voices to choose from. Now, the company is delving into a new type of audio: AI-generated podcasts. Starting today, Speechify users will be able to turn any content into a "lecture-style" podcast. They'll also get access to a

Conversations remotely detected from cell phone vibrations, researchers report

UNIVERSITY PARK, Pa. — An emerging form of surveillance, “wireless-tapping,” explores the possibility of remotely deciphering conversations from the tiny vibrations produced by a cell phone’s earpiece. With the goal of protecting users’ privacy from potential bad actors, a team of computer science researchers at Penn State demonstrated that transcriptions of phone calls can be generated from radar measurements taken up to three meters, or about 10 feet, from a phone. While accuracy remains limit

I tested 3 text-to-speech AI models to see which is best - hear my results

ZDNET's key takeaways: There are now several AI tools available that can generate humanlike speech. Some AI voices can now whisper, laugh, and perform other expressive feats. TTS tools vary in terms of their level of realism and their intended audiences. Synthetic voices generated by artificial intelligence are, for better or worse, becoming commonplace. Meanwhile, the number of companies developing this technology is growing rapidly. Recent innovations in AI, s

Balabolka is a free text-to-speech tool for Windows

The on-screen text can be saved as a WAV, MP3, MP4, OGG or WMA file. The program can read the clipboard content, view the text from AZW, CHM, DjVu, DOC, EPUB, FB2, HTML, LIT, MOBI, ODT, PRC, PDF and RTF files, customize font and background colour, and control reading from the system tray or by global hotkeys. The program uses various versions of Microsoft Speech API (SAPI); it lets the user alter a voice's parameters, including rate and pitch. The user can apply a special substitution list to impro

Mistral’s Voxtral goes beyond transcription with summarization, speech-triggered functions

Mistral released an open-sourced voice model today that could rival paid voice AI, such as those from ElevenLabs and Hume AI, which the company said bridges the gap between proprietary speech recognition models and the more open, yet error-prone versions. Voxtral, which Mistral will release under an Apache 2.0 license, is available in a 24

Voxtral – Frontier open source speech understanding models

Voice: the original UI. Voice was humanity’s first interface—long before writing or typing, it let us share ideas, coordinate work, and build relationships. As digital systems become more capable, voice is returning as our most natural form of human-computer interaction. Yet today’s systems remain limited—unreliable, proprietary, and too brittle for real-world use. Closing this gap demands tools with exceptional transcription, deep understanding, multilingual fluency, and open, flexible deploy

Building voice AI that listens to everyone: Transfer learning and synthetic speech in action

Have you ever thought about what it is like to use a voice assistant when your own voice does not match what the system expects? AI is not just reshaping how we hear the world; it is transforming who gets to be heard. In the age of conversational AI, accessibility has become a crucial benchmark for innovation. Voice assistants, transcriptio

X CEO Linda Yaccarino is stepping down after two years

Linda Yaccarino is stepping down as CEO of X, apparently effective immediately. She posted the news, naturally, on X, saying "I’m immensely grateful to [Elon Musk] for entrusting me with the responsibility of protecting free speech, turning the company around, and transforming X into the Everything App." She went on to say that "the historic business turn around we have accomplished together has been nothing short of remarkable." Reasonable minds can differ on whether any of those things have happene

A proof-of-concept neural brain implant providing speech

Stephen Hawking, a British physicist and arguably the most famous man suffering from amyotrophic lateral sclerosis (ALS), communicated with the world using a sensor installed in his glasses. That sensor used tiny movements of a single muscle in his cheek to select characters on a screen. Once he typed a full sentence at a rate of roughly one word per minute, the text was synthesized into speech by a DECtalk TC01 synthesizer, which gave him his iconic, robotic voice. But a lot has changed since

Supreme Court upholds Texas porn law that caused Pornhub to leave the state

The Supreme Court today upheld a Texas law that requires age verification on porn sites, finding that the state's age-gating law doesn't violate the First Amendment. The 6–3 decision delivered by Justice Clarence Thomas rejected an appeal by the Free Speech Coalition, an adult-industry lobby group. Pornhub disabled its website in Texas last year because of the state law. The Supreme Court's conservative majority decided that the law should be reviewed under the standard of intermediate scrutin

Supreme Court Says Age Verification Laws for Porn Sites Are Constitutional

The U.S. Supreme Court ruled on Friday that state laws requiring age verification for porn sites are constitutional. The case, known as Free Speech Coalition v. Paxton (Ken Paxton is the Attorney General of Texas), was decided 6-3 with the court’s three liberal justices dissenting. The Texas law, which requires age verification using a credit card or a government-issued ID document, went into effect in 2023 and Pornhub started blocking access to the site in the Lone Star State in protest.

DeepSpeech Is Discontinued (2020)

This project is now discontinued. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. Documentation for installation, usage, and training models is available on deepspeech.readthedocs.io. For the latest release, including pre-trained models and checkpoints, see the latest release on GitHub.

The NO FAKES act has changed, and it's worse

A bill purporting to target the issue of misinformation and defamation caused by generative AI has mutated into something that could change the internet forever, harming speech and innovation from here on out. The Nurture Originals, Foster Art and Keep Entertainment Safe (NO FAKES) Act aims to address understandable concerns about generative AI-created “replicas” by creating a broad new intellectual property right. That approach was the first mistake: rather than giving people targeted tools to

Apple devices offer amazing speech to text transcription in developer betas, shows test

If you ever need to transcribe audio or video to text, most current apps are powered by OpenAI’s Whisper model. You’re probably using this model if you use apps like MacWhisper to transcribe meetings or lectures, or to generate subtitles for YouTube videos. But iOS 26 and Apple’s other developer betas include the company’s own transcription frameworks – and a test suggests that they match Whisper’s accuracy while running at more than twice the speed … If you’ve ever used the built-in dictation

X sues New York over hate speech disclosure law

Social media company X has filed a lawsuit against the state of New York over a law governing hate speech. The social network's Global Government Affairs account posted about the suit, claiming the law's required disclosures infringe on First Amendment protections for free speech. The Stop Hiding Hate Act, which is slated to take effect this week, would require social media companies to report on how they define and moderate content including hate speech, misinformation, disinformation, harassm

X sues to block copycat NY content moderation law after California win

Last year, X won its fight to block a California law requiring social media companies to report on efforts to remove hate speech and other kinds of content the state deemed harmful. Now, X has sued to stop New York from enforcing a law that it claims is a "carbon copy" of California's—which resulted in a settlement blocking the California law after a court ruled it likely violated the First Amendment. In a complaint filed Tuesday, X revealed that the New York lawsuit came after New York lawmak

Answering the Nintendo Switch 2’s lingering accessibility questions

One of the biggest surprises of the Nintendo Switch 2’s reveal was its proposed accessibility. For years, Nintendo has been known for accidentally stumbling on accessibility solutions while stubbornly refusing to engage with the broader subject. Yet, in the Switch 2, there appeared a more holistic approach to accessibility for which disabled players have been crying out. This was supported by a webpage dedicated to the Switch 2’s hardware accessibility. However, specifics were thin and no furth

The Steve Jobs Archive shares stories, videos, and notes of his famous commencement speech

Thursday marks the 20th anniversary of Steve Jobs’ famous Stanford commencement speech, and the Steve Jobs Archive has marked the occasion by uploading an HD version of the speech, publishing notes Jobs emailed to himself, and sharing details about the leadup to the speech. You can see everything on a page on the Steve Jobs Archive’s website and watch the HD video on YouTube.

How Steve Jobs Wrote the Greatest Commencement Speech Ever

In early June 2005, Steve Jobs emailed his friend Michael Hawley a draft of a speech he had agreed to deliver to Stanford University’s graduating class in a few days. “It’s embarrassing,” he wrote. “I'm just not good at this sort of speech. I never do it. I'll send you something, but please don't puke.” The notes that he sent contained the bones of what would become one of the most famous commencement addresses of all time. It has been viewed over 120 million times and is quoted to this day. Pr

See the bullet points behind Steve Jobs’ famous Stanford speech, plus new video

Today the Steve Jobs Archive is commemorating the 20th anniversary of Jobs’ famous commencement address to the 2005 graduating class at Stanford University. A newly enhanced video version of the speech has been released alongside a digital exhibit containing artifacts, including Jobs’ own personal bullet points of speech ideas. The Steve Jobs Archive just launched a new digital exhibit titled “Stay Hungry, Stay Foolish,
