What you need to know
- Google’s Gemini app can now handle audio uploads on Android, iOS, and the web, the feature users have requested the most.
- Supported formats include MP3, M4A, and WAV, with the app transcribing audio, summarizing key points, and extracting actionable insights.
- Users can upload up to 10 audio files at once, but their total length can’t exceed 10 minutes, and other Gemini usage limits still apply.
Last month, signs popped up that Google was working on letting the Gemini app handle audio uploads. This much-requested feature is now live across Android, iOS, and the web.
The update supports MP3, M4A, and WAV files. Once you upload a file, Gemini will transcribe the audio, pull out the key points, and give you a clear summary (via 9to5Google).
This feature can be accessed via the plus menu in Gemini’s mobile app or “Upload files” on the web. Once you upload an audio clip, the app analyzes it, turning meetings, interviews, lectures, or voice notes into easy-to-digest summaries and key takeaways.
Top user request comes to life
Josh Woodward, VP of Google Labs and Gemini, shared on X that this has been the feature users have requested the most.
However, according to Google’s support page, you can upload up to 10 audio files at once, but their combined length can’t exceed 10 minutes. Other Gemini usage limits still apply, so keep that in mind before sending a batch of files.
The audio upload limits aren’t infinite but are fairly generous compared to video. Free users get 10 minutes for audio, which is double the five-minute video cap. Meanwhile, paid users get three times the one-hour video limit, which works out to three hours of audio.
Another limit to keep in mind is the file count. You can upload up to 10 files per prompt, and this covers everything from code folders with up to 5,000 files to GitHub repos and ZIPs with up to 10 compressed files. The new audio feature counts toward this 10-file total, so it doesn’t expand the overall limit.
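For anyone batching recordings, the limits above reduce to simple arithmetic. The Python sketch below is only an illustration of that arithmetic: it is not part of Gemini or any Google API, the names `Attachment` and `check_batch` are made up for this example, and the numbers are the ones reported above, which may change.

```python
from dataclasses import dataclass

# Limits as reported in this article; illustrative values, not an official API.
MAX_FILES_PER_PROMPT = 10
FREE_AUDIO_MINUTES = 10          # combined audio length for free users
PAID_AUDIO_MINUTES = 3 * 60      # three times the one-hour video limit


@dataclass
class Attachment:
    """Hypothetical description of one uploaded file."""
    name: str
    audio_minutes: float = 0.0   # leave at 0 for non-audio files


def check_batch(attachments, paid=False):
    """Return a list of problems; an empty list means the batch fits."""
    problems = []

    # Audio files share the same 10-file-per-prompt total as other uploads.
    if len(attachments) > MAX_FILES_PER_PROMPT:
        problems.append(
            f"{len(attachments)} files exceeds the {MAX_FILES_PER_PROMPT}-file limit"
        )

    # The audio cap applies to the combined length, not to each file.
    cap = PAID_AUDIO_MINUTES if paid else FREE_AUDIO_MINUTES
    total = sum(a.audio_minutes for a in attachments)
    if total > cap:
        problems.append(
            f"{total:g} minutes of audio exceeds the {cap}-minute limit"
        )

    return problems


if __name__ == "__main__":
    batch = [Attachment("standup.mp3", 6), Attachment("interview.wav", 5)]
    print(check_batch(batch))             # over the free cap: 11 > 10 minutes
    print(check_batch(batch, paid=True))  # fits within the paid cap
```

The point of the example is that two short clips can still be rejected together on the free tier, because the 10-minute cap is measured across the whole batch.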
Beyond transcription, Gemini can highlight key points, distinguish speakers, and pull out action items or quotes. This, in turn, makes any audio file a neatly structured, searchable document.