Apple Researchers Discover Method to Enable Quicker, More Natural Conversations with Siri

While Apple has recently relied on Google’s Gemini technology to address some of its AI limitations, the company’s research teams in Cupertino continue to pursue new ways to improve Siri’s performance.

A recent paper by Apple researchers aims to make interactions with Siri quicker and more natural-sounding, a further step in the ongoing effort to refine the digital assistant.

Unlocking Faster Responses with Acoustic Similarity Groups

Traditionally, AI voice models generate speech from tokens, brief segments of phonetic sound that each span mere milliseconds. These models select each token autoregressively, one after another, which often results in a noticeable delay when responding. The approach can also lead to awkward pronunciations because of the limited selection of phonetic snippets used for training.
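To make the sequential nature of autoregressive generation concrete, here is a minimal Python sketch. It is an illustration only, not code from Apple's paper: the vocabulary size, the `next_token_weights` scoring rule, and the function names are all hypothetical stand-ins for a trained model.

```python
import random

VOCAB_SIZE = 6  # hypothetical number of distinct speech tokens


def next_token_weights(history):
    """Stand-in for a trained model (assumption): favors tokens close to the previous one."""
    prev = history[-1] if history else 0
    return [1.0 / (1 + abs(token - prev)) for token in range(VOCAB_SIZE)]


def generate(num_steps, rng):
    """Pick tokens one at a time; each choice depends on everything chosen so far."""
    history = []
    for _ in range(num_steps):
        weights = next_token_weights(history)
        token = rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]
        history.append(token)
    return history
```

Because every step waits on the one before it, the steps cannot simply run in parallel, which is one reason each individual selection needs to be fast.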

In their latest study, Apple researchers propose an alternative: replacing the conventional token-matching system with Acoustic Similarity Groups (ASGs). ASGs group speech tokens by perceptual similarity in sound, with some overlap between groups. By running a probabilistic search within these groups, AI models can identify the most suitable speech token much more rapidly.
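The general idea of searching within overlapping groups can be sketched as a two-stage, coarse-to-fine sample: first choose a group, then choose a token inside it. This is a toy illustration under stated assumptions, not the algorithm from Apple's paper, whose details are not given here. The group contents in `GROUPS`, the scores, and all function names are hypothetical.

```python
import math
import random

# Hypothetical overlapping groups of token ids (tokens 2 and 4 belong to two groups).
GROUPS = {
    "g0": [0, 1, 2],
    "g1": [2, 3, 4],
    "g2": [4, 5, 6, 7],
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_masses(token_probs, groups):
    """Sum token probability per group, then renormalize because overlap double-counts tokens."""
    raw = {name: sum(token_probs[t] for t in members) for name, members in groups.items()}
    total = sum(raw.values())
    return {name: mass / total for name, mass in raw.items()}


def pick_token(token_scores, groups, rng):
    """Stage 1: sample a group by its mass. Stage 2: sample a token within that group."""
    probs = softmax(token_scores)
    masses = group_masses(probs, groups)
    names = list(masses)
    group = rng.choices(names, weights=[masses[n] for n in names], k=1)[0]
    members = groups[group]
    token = rng.choices(members, weights=[probs[t] for t in members], k=1)[0]
    return group, token
```

This toy version still scores every token up front, so it shows only the structure of the search, not a speed gain. A real system would presumably save time by scoring groups first and then evaluating only the tokens in the chosen group.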

Although this proposal may not be revolutionary, it underscores Apple’s commitment to advancing its AI and machine learning capabilities. It also suggests that Apple intends to build a fully integrated AI solution for its devices and to move away from reliance on third-party technologies such as Google’s Gemini.
