IT 06 - Automatically Detecting the Presence of Manatee Species from Their Vocalizations

Our project addresses a real challenge faced by marine researchers: identifying manatee presence through underwater vocalizations is incredibly time-consuming. Researchers often spend days manually reviewing long recordings just to find a few calls. This slows down conservation efforts and leaves valuable data underused. To solve this, we developed an AI-powered system that automatically detects manatee vocalizations in audio recordings.

Our system uses machine learning models to classify manatee calls in real time, helping researchers not only detect presence but also understand behavior. Different call types (squeaks, high squeaks, squeals, and chirps) are linked to behaviors such as resting, feeding, or stress. Automating this process enables faster analysis and deeper ecological insights.

We used a real-world dataset provided by marine biologist Dr. Beth. It included over 40 hours of hydrophone recordings from locations in Florida and more than 3,000 labeled manatee vocalizations. The data also contained background noise, overlapping sounds, and other marine species, making the task more challenging—but also more realistic.

To prepare the data, we first applied bandpass filters to isolate the frequency range of manatee calls, using two passband settings (500–5000 Hz and 500–8000 Hz). We then used Ephraim-Malah spectral subtraction to reduce background noise. Because these operations can shift audio timing, we implemented a custom timestamp correction method to keep the extracted features aligned with their labels.
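As an illustration of the band-pass step only, here is a minimal sketch using SciPy's Butterworth design with zero-phase filtering. The filter type, the order, the 48 kHz sampling rate, and the helper names `bandpass` and `rms` are all assumptions made for this example and are not taken from the project; the noise-reduction stage and the custom timestamp correction are not reproduced.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(audio, fs, low_hz, high_hz, order=4):
    """Zero-phase Butterworth band-pass filter (illustrative choice)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # sosfiltfilt runs the filter forward and then backward, so the output
    # is not delayed relative to the input and label times stay valid.
    return sosfiltfilt(sos, audio)


def rms(x):
    """Root-mean-square amplitude of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


fs = 48_000                            # assumed sampling rate in Hz
t = np.arange(fs) / fs                 # one second of samples
hum = np.sin(2 * np.pi * 100 * t)      # low-frequency noise, outside both bands
tone = np.sin(2 * np.pi * 2000 * t)    # stand-in for a call, inside both bands
mix = hum + tone

narrow = bandpass(mix, fs, 500, 5000)  # first passband setting
wide = bandpass(mix, fs, 500, 8000)    # second passband setting

# Compare against the clean tone away from the edges of the recording.
mid = slice(fs // 4, -fs // 4)
print("residual (500-5000 Hz):", rms(narrow[mid] - tone[mid]))
print("residual (500-8000 Hz):", rms(wide[mid] - tone[mid]))
```

A forward-backward filter is one way to avoid introducing a delay at this stage; a causal filter would shift the signal and require the kind of timestamp correction described above.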

Next, we extracted acoustic features using a sliding-window approach: Mel-frequency cepstral coefficients (MFCC), gammatone-frequency cepstral coefficients (GFCC), spectral entropy, and energy. These features served as input for two different models: a Hidden Markov Model (HMM) and a CNN-LSTM neural network. A separate HMM was trained for each call type; the HMMs achieved between 75% and 90% accuracy and performed best on squeak detection.
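The sliding-window idea can be sketched with NumPy alone. The example below computes only two of the listed features (log energy and spectral entropy) and shows one simple way to assign a label to each frame from call start and end times. The frame length, hop length, 16 kHz sampling rate, synthetic signal, and function names are assumptions for illustration; MFCC and GFCC extraction and the HMM training are not shown.

```python
import numpy as np


def frame_signal(audio, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(audio) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return audio[idx]


def frame_features(audio, frame_len=1024, hop_len=512):
    """Return per-frame [log energy, spectral entropy], shape (n_frames, 2)."""
    frames = frame_signal(audio, frame_len, hop_len) * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    total = power.sum(axis=1, keepdims=True) + 1e-12
    log_energy = np.log(total[:, 0])
    prob = power / total                      # normalized spectrum per frame
    entropy = -(prob * np.log2(prob + 1e-12)).sum(axis=1)
    return np.column_stack([log_energy, entropy])


def frame_labels(n_frames, frame_len, hop_len, fs, calls):
    """Label a frame 1 if its centre lies inside any (start_s, end_s) call."""
    centres = (np.arange(n_frames) * hop_len + frame_len / 2) / fs
    labels = np.zeros(n_frames, dtype=int)
    for start_s, end_s in calls:
        labels[(centres >= start_s) & (centres <= end_s)] = 1
    return labels


# Synthetic one-second clip: weak noise with a 3 kHz tone from 0.4 s to 0.6 s.
fs = 16_000
rng = np.random.default_rng(0)
audio = 0.1 * rng.standard_normal(fs)
t = np.arange(fs) / fs
call = (t >= 0.4) & (t < 0.6)
audio[call] += np.sin(2 * np.pi * 3000 * t[call])

features = frame_features(audio)
labels = frame_labels(len(features), 1024, 512, fs, calls=[(0.4, 0.6)])
print(features.shape, labels.sum())
```

Tonal frames concentrate their power in a few frequency bins, so their spectral entropy is lower than that of noise-only frames, which is why the feature is useful for separating calls from background.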

The CNN-LSTM model significantly outperformed the HMMs. By combining convolutional layers (for local pattern detection) with LSTM layers (for temporal modeling), it achieved 90% to 100% accuracy in both freshwater and saltwater environments, even in noisy conditions. It also generalized well to test audio not seen during training.
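To make the combination of convolutional and LSTM layers concrete, here is a minimal PyTorch sketch of such an architecture. It is not the project's model: the framework, layer sizes, kernel size, the 40 features per frame, and the five output classes (four call types plus background) are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class CnnLstm(nn.Module):
    """Hypothetical CNN-LSTM classifier over a sequence of feature frames."""

    def __init__(self, n_features=40, n_classes=5, conv_channels=32, lstm_hidden=64):
        super().__init__()
        # 1-D convolution over time picks up short local patterns.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, conv_channels, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # The LSTM models how those patterns evolve across the clip.
        self.lstm = nn.LSTM(conv_channels, lstm_hidden, batch_first=True)
        self.head = nn.Linear(lstm_hidden, n_classes)

    def forward(self, x):
        # x: (batch, time, n_features); Conv1d expects (batch, channels, time).
        x = self.conv(x.transpose(1, 2))
        x = x.transpose(1, 2)              # back to (batch, time / 2, channels)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])          # class scores from the last hidden state


model = CnnLstm()
batch = torch.randn(8, 100, 40)            # 8 clips, 100 frames, 40 features each
logits = model(batch)
print(logits.shape)
```

Setting `n_classes=2` in this sketch would give a presence/absence classifier with the same structure.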

We also developed a binary classification version of the model that simply detects whether a manatee is present or not. This version is especially useful for quick screening of long recordings and achieved up to 100% accuracy in some tests.
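For screening long recordings, per-frame presence scores from a binary detector have to be turned into time ranges a researcher can jump to. The sketch below shows one simple way to do that by thresholding and merging nearby detections. The function name, the threshold, the merge gap, and the example scores are hypothetical and do not come from the project.

```python
def detect_segments(probs, hop_s, threshold=0.5, min_gap_s=1.0):
    """Convert per-frame presence probabilities into (start_s, end_s) segments."""
    segments = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                                  # a detection begins
        elif p < threshold and start is not None:
            segments.append((start * hop_s, i * hop_s))  # a detection ends
            start = None
    if start is not None:                              # still active at the end
        segments.append((start * hop_s, len(probs) * hop_s))

    # Merge detections separated by less than min_gap_s into one segment.
    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] < min_gap_s:
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
    return merged


# Made-up scores for ten frames spaced 0.5 s apart.
probs = [0.1, 0.8, 0.9, 0.2, 0.7, 0.1, 0.1, 0.1, 0.1, 0.6]
segments = detect_segments(probs, hop_s=0.5)
for start_s, end_s in segments:
    print(f"possible manatee call from {start_s:.1f} s to {end_s:.1f} s")
```

Merging short gaps keeps a single call that briefly dips below the threshold from being reported as several separate detections.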

What sets our project apart is its end-to-end completeness. We worked with real, noisy data and built a robust pipeline from filtering and feature extraction to training and evaluation. Our models were designed to handle real-world variability and were tested under multiple frequency settings for reliability.

This system has the potential to make a real impact in marine conservation. Instead of spending weeks reviewing audio, researchers can now focus on the important parts—accelerating research and improving protection for endangered manatees. Looking ahead, we plan to add more call types, refine model accuracy, and package our system into a user-friendly application. Regular retraining with new data will ensure the system stays accurate and adaptive over time.

In short, this project brings together AI, audio processing, and conservation science to solve a meaningful problem. We hope it becomes a valuable tool for researchers working to protect these gentle creatures and their habitats.

Updated 1 year ago
