Adobe's New AI Is All About Audio. How to Create Music for Your Videos with Firefly

Much of the news and many of the product updates Adobe dropped this week were, unsurprisingly, centered on generative AI. But while most of this year has seen massive leaps in image and video generation, Adobe is focusing on elevating its AI offerings in another area: AI audio.

The two new features, generate soundtrack and generate speech, do exactly what their names suggest: You can create background music and turn written scripts into voice-overs for your video. But each comes with hands-on controls that make AI audio less of a gamble and more of a useful tool for creators of all skill levels. Both are available in beta now.

Adobe is also releasing a beta version of its latest, fifth-gen Firefly Image Model. It promises to be better at producing photorealistic images, and it now supports prompt-based editing. There's also a new beta Firefly video editor with a multitrack timeline that's meant to help you compile AI-generated clips. Adobe is also expanding its partnerships to include two more AI companies, ElevenLabs and Topaz Labs.

For even more AI news, you can learn about the AI assistants coming to Photoshop and Express.

Here's an example of how you're prompted to write your AI music description. (Image: Adobe)

Generate music and soundtracks

Music licensing is complicated, especially for commercial use. So let me start with the part that matters most: Any music generated with Firefly's generate soundtrack is given a universal license, which means you can use it for any purpose, indefinitely. Adobe creates its AI tools by using content (in this case, audio) that it has permission to use for AI training. So in theory, you shouldn't have Firefly AI audio removed from YouTube or other platforms or get a dreaded copyright strike.

"This is a unique time in the world where music licensing is on the top of everybody's mind and creators are just either frustrated because they're trying to do the best thing for their content, or they're confused," Jay LeBoeuf, Adobe's head of AI audio, said in an interview.
"So we're just hoping to remove the confusion."

In a demo, Firefly rejected a prompt that included an artist's name because it violated its user guidelines over copyright concerns. Because the model isn't trained on Taylor Swift's music, for example, it can't create music similar to hers.

Now, the fun stuff: Generate soundtrack is the first AI music tool from Adobe, and it's designed to take the guesswork out of describing what you want. You upload your video, and the AI analyzes it. Based on its assessment, Firefly will write a prompt it thinks may work well for your video. It's a Mad Libs-style prompt, and you can swap out the descriptors as you see fit. The prompt has three parts: the general vibe, the style (think genre) and the purpose (commercial, experimental, etc.). You can also adjust the tempo and energy level.

Once you're happy with your prompt, click generate, and less than two minutes later, four instrumental-only variations will be ready for you to play. Your audio will be as long as your video, but you can edit that as needed. You can upload videos that are up to five minutes long.

How to generate music with Firefly

You can try your hand at creating AI instrumental music for your videos now. Generate soundtrack and generate speech are both available through Firefly, and they're in beta. Check to see if your Adobe plan includes access to Firefly, and if it doesn't, you can get a plan starting at $10 per month.

1. Open Firefly on web.
2. Click Generate on the left side menu.
3. Click Generate soundtrack from the cards available below the chat window.
4. Upload your video using the left side menu.
5. Firefly will then analyze your video and write an appropriate prompt in the left side menu.
6. If you don't like what Firefly came up with, you can click the "X" and type in your preferred prompt. You can also pick from suggested vibes, styles and purposes from the left side menu.
7. Scroll down and adjust the energy, tempo and duration as needed.
8. Click generate.

Once you have a soundtrack you like, you can download the complete video (or just the soundtrack) to your computer.

This is an example of four music soundtracks Firefly made for an AI video I made of some people partying on a beach. (Screenshot by Katelyn Chedraoui/CNET)

Generating speech

Generating speech in Firefly is simple, and it includes a lot of features that'll make it useful for nearly any project. You type the words you want the AI voice to read into a text window, or you can upload a script of up to 7,500 characters, roughly a 15- to 20-minute video. Once your script is in, you can choose from 50 voices, each tagged with an approximate age and gender, including nonbinary options. You can generate speech in 20 different languages.

But the fun part is what you can do to fine-tune your prompt. Speech is more than just reading words on a page. When we read long passages or talk with others, we naturally add emphasis, emotion and rhythm to our speech. With the new program, you can do the same, adding pauses where you want the AI to take a breather and highlighting sections where the tone should shift.

If you're like me and nobody pronounces your name right on the first try, you can use the "fix pronunciation" tool to ensure there aren't any flubs. Select the name or proper noun and then add a phonetic breakdown, and the AI will use that to smooth out the pronunciation. These tools, along with your hands-on ability to adjust specific sections, are meant to give you more control, something other text-to-speech programs don't always offer.

"It's a way for us to provide lifelike speech to creators, to small business owners, to educators, to everybody that really just has a story to tell, and maybe they're not as comfortable as we are just pulling out a mic and talking," said LeBoeuf.

Firefly audio is a brand-new AI model. But that's not your only option.
Adobe has been steadily adding to its roster of third-party AI models this year, for both AI video and image. It's expanding those choices again by including ElevenLabs' multilingual V2 model as an option for generating speech.

For more, check out how Adobe's Project Indigo camera app works, now with iPhone 17 support.
