Aravind Ganapathiraju: The AI enthusiast who balances tech with the great outdoors

Almost a quarter century into speech tech and AI, Aravind Ganapathiraju, ex-VP of Applied AI at Uniphore, believes conversational AI is finally seeing its golden era.

Tuesday, January 28, 2025, 9 min read

Speech is fundamental to our interactions, and in today’s Artificial Intelligence (AI)-driven age, it includes humans talking to machines. We frequently engage with virtual agents and smart speakers, like Alexa. Long before voice commands became mainstream, this was an area of deep interest for Aravind Ganapathiraju, ex-Vice President of Applied AI at Uniphore, a conversational automation technology company.

With around 25 years of experience in speech, Aravind’s journey began early in college. “I’m so into speech that I can see a waveform and can almost tell what was spoken,” says Aravind, who works remotely from Hyderabad, managing teams in Bengaluru and Chennai.

Born and raised in Hyderabad, Aravind graduated in electronics and communication engineering from NIT Trichy and pursued his master’s and PhD in the US at Mississippi State University, focusing on signal processing, particularly voice. In the 1990s, computing power favored voice research over vision. “It was a natural gravitation towards voice,” Aravind recalls.

This early start gave Aravind a strong foundation for the present day, where speech is key to human-machine interactions. Besides, a decade in Seattle and close interactions with friends at Amazon exposed him to cutting-edge developments. “I’ve seen Amazon’s speech research grow from a small team in the early 2000s, to launching Alexa. Amazon has shown how to innovate and reach the masses, a spirit that continues with Amazon Web Services (AWS),” Aravind shares.

AWS is a key infrastructure partner for Uniphore. “What I love about AWS is their close collaboration with customers. They walked with us through our journey of building speech models,” says Aravind. Having spent long years in the field of AI, he’s witnessed its evolution and its impact on human-machine interactions. Today, AI excels in summarization, knowledge retrieval, real-time alerts, and more, enhancing both human-machine and human-human interactions.

The Golden Age of Conversational AI

Aravind believes we’re in the golden era of conversational AI. “Problems we once thought insurmountable are now being solved, opening up new applications we couldn’t imagine five years ago,” he says. For example, language translation, once a complex task, is now seamless with conversational AI. Aravind recently experienced this firsthand in Japan, effortlessly communicating through voice-to-voice translation.

The journey began in the 1980s with basic voice commands for tasks like opening, closing or number recognition, limited by the compute power of the time. The 2000s saw advancements with Large Vocabulary Continuous Speech Recognition (LVCSR), bringing accuracy closer to human levels. LVCSR, used in voice assistants and transcription services, is now ubiquitous, improving interactions even between people who don’t share a common language.

On the progress so far, Aravind believes voice services have become essential for both enterprise and consumer AI. “Speech technology now matches human capabilities. Any minor gaps in understanding can be bridged with customization, enabling more complex use cases,” he explains.

For instance, if a voice assistant at a contact center knows you’ve already made three calls and this is your fourth, it can sound more apologetic. “This intelligence will be built-in, moving forward,” says Aravind, who joined Uniphore four years ago, after working at Genesys in areas like Automatic Speech Recognition, Speech Signal Processing and Natural Language Processing (NLP).

Aravind specializes in Applied AI, which differs from core AI. While core AI is a broad term, Applied AI focuses on bringing product ideas to reality through a methodical approach: choosing algorithms and models that deliver the required accuracy and cost-effectiveness, and running thorough evaluations to gauge market readiness.

For speech, sky's the limit

Speech recognition has definitely improved, as millions of users experience in their daily interactions. As for the future, Aravind believes “the sky's the limit. In countries like India with multiple languages, it makes it easier to interact.” Self-service voice options are already hitting 80% accuracy and the remaining 20% are getting better. For the latter, humans will still come in to help, but will be increasingly assisted by AI. That will help companies improve customer experience, as the cost of losing a customer is just too high. It is in situations like these that AI is helping improve things, understanding the context of what the caller wants and delivering more accurate results.

Aravind sees more capable Large Language Models (LLMs) improving the user experience. These will also be multi-modal, removing the limitations of unimodal LLMs. On the speech side, Aravind sees similar improvements, where better LLMs can help deal with the complexities.

The AWS Advantage

AWS is also helping Aravind in this journey. “They really think about how we can improve things. They have identified areas where we can reduce costs for model building, inferencing and production. The one big component we use on a daily basis is Amazon SageMaker. Every model that we churn out goes through Amazon SageMaker,” says Aravind.

Amazon SageMaker, which Uniphore uses to great effect, is a cloud-based machine-learning platform for creating, training and deploying Machine Learning (ML) models in the cloud. It can also be used to deploy ML models on embedded systems and edge devices.

Having a top-notch partner like AWS definitely helps, more so in an evolving space like AI. There are so many aspects of AI that can empower enterprises at different stages, and different business units within those enterprises. For example, Uniphore continues to innovate for sales enablement as well. This area differs from the contact center, because sales conversations happen between two professionals rather than between an agent and a customer. As a result, the conversations are a lot more direct. On the other hand, they can make or break a deal. “We are building AI that can look at emotions, that can look at other cues, visual cues, auditory cues, tonal cues to say, what is the chance of the sale actually happening based on this conversation? And as a salesman, what can you do differently or better to make this a successful interaction,” says Aravind.

Similarly, AI tools are expanding to human resources, improving outcomes in shorter time spans. Can we use AI to automate a lot of the recruiting process? While that has been happening for some time now, many candidates and companies believe AI may shortlist candidates in a rule-based way, matching qualifications and experience to job roles while overlooking aspects like candidates’ communication abilities or soft skills. Luckily, things are improving here as well. “I would say that is the biggest strength of AI as it can look beyond the words that are spoken, during the interview or how they are spoken. We can actually look at the emotional signals. Very soon we may be able to predict the truthfulness of the candidate while they're answering and also gauge their confidence levels,” points out Aravind. AI will also be able to assess communication skills, identify gaps in a candidate’s knowledge and recommend learning programs to help the individual upskill.

Looking ahead, Aravind says, “It's just a matter of us waking up and acknowledging that the time of conversation is upon us. Let's embrace it rather than thinking of it as another shiny technology that only the elites will use. It is truly for the common good.” He personally prefers voice interactions and regularly reviews human-machine conversations to identify areas for improvement.

AI and a love for the outdoors

When he’s not reviewing conversations, Aravind is a passionate outdoorsman. An avid sports enthusiast, Aravind has always been interested in football. His daughter is a university-level soccer player in the US. He loves coaching as well. His most recent interest is teaching people about, and spreading awareness of, Ultimate Frisbee. It's a non-contact, gender-agnostic, football-like sport for all ages, where kids and adults play at the same time. “It really brings communities together and you can learn a lot about strategy so that's been my passion recently and I'm also a little bit into hiking around Hyderabad,” says Aravind. He hikes primarily in and around the city, and over the next decade he plans to explore India before going hiking in other countries.

Aravind is also deeply involved with AI research and reads about new developments in the field. Within Uniphore, there are weekly research seminars where a team member who has either studied a paper or tried out a demo presents what's out there, along with his or her point of view on its applicability to Uniphore and its products. “That's been very informative because we don't have time to do this every week. But if someone else has done it, then at least it points us in a direction that can guide the entire team for the next three to six months,” says Aravind.

New developments in the area of speech continue to inspire and keep Aravind motivated to do more. And this has been so since his college days. Back then, in the 1990s at Mississippi State University, “we liked the David versus Goliath kind of feeling. We were the small guys building an open-source large vocabulary system and competing against MIT, CMU, Johns Hopkins at a US level. At that time, it was driven by the government. But that really propelled us. I've always loved this. A small team can do as much as a big team. You just need to have the right vision and the right commitment to make it happen,” recalls Aravind.

That spirit has always driven him, and speech was a natural modality he always enjoyed. Even today, Aravind keeps abreast of new developments via AWS offsites and connects with colleagues in the field. He also listens to five to ten customer calls daily to understand the challenges in transcription and translation, driven by his enduring passion for speech technology.