
07 July 2024
Zaker Adham
Summary
Kyutai, a French AI company, has unveiled its latest innovation, an AI-powered chatbot named Moshi. The new chatbot offers features akin to the delayed Advanced Voice Mode of ChatGPT’s GPT-4o. Notably, Moshi can recognize and interpret the user's tone of voice, and it can operate offline.
Revolutionary Features of Moshi
Moshi is built on a robust 7B parameter large language model (LLM) called Helium. It is currently available to the public and can mimic various accents and 70 different emotional speaking styles. One of its standout capabilities is handling two audio streams simultaneously, allowing it to listen and speak at the same time.
Named after the Japanese phone greeting, Moshi responds in just 200 milliseconds, faster than the 232 to 320 millisecond response time of GPT-4o’s Advanced Voice Mode. Kyutai has trained Moshi to understand the subtleties of human conversation.
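To put the quoted latencies in perspective, the gap can be worked out directly. The short Python sketch below uses only the figures reported above (200 ms for Moshi, 232 to 320 ms for GPT-4o); the function and constant names are illustrative, not part of any Kyutai or OpenAI tooling.

```python
# Latency figures quoted in the article, in milliseconds.
MOSHI_MS = 200
GPT4O_RANGE_MS = (232, 320)


def latency_gap(baseline_ms: int, candidate_ms: int) -> tuple[int, float]:
    """Return (absolute gap in ms, gap as a percentage of the baseline)."""
    gap = baseline_ms - candidate_ms
    return gap, 100.0 * gap / baseline_ms


for baseline in GPT4O_RANGE_MS:
    gap_ms, gap_pct = latency_gap(baseline, MOSHI_MS)
    print(f"vs {baseline} ms: {gap_ms} ms faster ({gap_pct:.1f}%)")
```

On these numbers, Moshi's reported response time is 32 to 120 ms shorter, roughly 14% to 38% below the GPT-4o range.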
The company even collaborated with a professional voice artist to enhance voice quality. Developed by a team of eight researchers in just six months, Moshi is a comparatively small model and was trained on 100,000 synthetic dialogues generated with text-to-speech technology.
Future Prospects and Open Source Goals
Kyutai aims to make Moshi an open-source project, so that users can run the chatbot without privacy concerns. Although faster than GPT-4o, Moshi is currently a research prototype that showcases rapid response times and the ability to replicate tones and voices. Kyutai is also developing an AI-powered audio identification, watermarking, and signature tracking system to integrate with Moshi. While it may not yet rival ChatGPT, Moshi represents significant progress in the development of offline, open-source AI models.