Using machine learning to identify and monitor native Hawaiian bird species through audio analysis, with an eye toward supporting conservation efforts across the islands.
Vacations are meant for unwinding, but sometimes curiosity gets in the way, and I mean that in the best sense. On a recent two-week trip to Hawaii, surrounded by lush forests and unfamiliar bird sounds, I found myself very curious about the birds around me. On a couple of occasions, I noticed the same calls in the morning and again in the afternoon, and by day four I had started to recognize patterns for a few birds. I wish I had known about the Merlin app at the time, which analyzes audio and images for bird identification. That gap in my knowledge pushed me to challenge myself, and I asked: am I capable of building a system that can identify bird sounds in real time? Could I build something small and helpful, using the tools I know, to decode this biodiversity, even in my downtime? Game on! I love challenges, and I set out to gather the resources I would need to make this work.
A Spark, a New Model, and Unlikely Connections

While listening and recording, I recalled Google DeepMind's announcement of "Perch," its new bird sound identification model. One rabbit hole led to another: I discovered eBird's reporting platform, explored open datasets like Xeno-Canto, referenced BirdNET's classifier, and started sketching out a workflow. Could these be connected to answer my simplest question: "Which bird?"
I had spent the last year learning about and aggregating climate-adjacent news, and helping small businesses optimize emissions data collection and management workflows in support of GHG reporting (including supplier-disclosure packages), blending emerging technology (aka AI) with everyday operations. Applying the same mindset, I thought: why not try mapping these AI and automation tools onto biodiversity?
All I had at this point was deep curiosity and a concept. So I started organizing my recordings from my Apple Watch and, drawing on my product, project, and digital operations background, broke the endeavor into a four-stage pipeline for acoustic bird species classification. Something that sounded so scientific, built for environmental scientists, suddenly felt accessible to citizens and non-technical folks.
First, Audio Preprocessing to standardize and segment the raw data. All audio files were resampled to 32 kHz mono for consistency. Each recording was segmented into 5-second windows with 50% overlap, capturing temporal variation while maintaining computational efficiency; this window size aligns with typical bird vocalization patterns.
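If you want to reproduce this stage, here is a minimal sketch using librosa and NumPy; the file path and helper name are just placeholders, not my exact code.

```python
import librosa
import numpy as np

SAMPLE_RATE = 32_000   # 32 kHz mono, matching the pipeline
WINDOW_SEC = 5.0       # 5-second analysis windows
HOP_SEC = 2.5          # 50% overlap between consecutive windows

def segment_recording(path: str) -> np.ndarray:
    """Load a recording, resample to 32 kHz mono, and slice it into
    overlapping 5-second windows (shape: [n_windows, 160000])."""
    audio, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    win = int(WINDOW_SEC * SAMPLE_RATE)
    hop = int(HOP_SEC * SAMPLE_RATE)
    # Pad short clips so a recording under 5 seconds still yields one window.
    if len(audio) < win:
        audio = np.pad(audio, (0, win - len(audio)))
    windows = [audio[i:i + win] for i in range(0, len(audio) - win + 1, hop)]
    return np.stack(windows)

# Example usage (hypothetical file):
# windows = segment_recording("recordings/morning_walk.wav")
```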
Second, Feature Extraction to derive the acoustic features needed for ML classification. Librosa made this possible, extracting 92 features that captured both the spectral and temporal characteristics unique to each species' vocalizations. I built this in Google Colab on Google Cloud's compute. Big shout-out to Gemini's co-pilot for not only helping me debug and fix errors, but also answering my questions about potential downstream impacts. This helped me make better decisions and explore additional resources.
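The full 92-feature recipe won't fit here, but a representative sketch of spectral and temporal descriptors in librosa looks like this; the specific features and counts below are illustrative, not my exact set.

```python
import librosa
import numpy as np

def extract_features(window: np.ndarray, sr: int = 32_000) -> np.ndarray:
    """Build a feature vector for one 5-second window by summarizing
    several librosa descriptors. Illustrative subset, not the exact 92."""
    mfcc = librosa.feature.mfcc(y=window, sr=sr, n_mfcc=20)        # timbre
    centroid = librosa.feature.spectral_centroid(y=window, sr=sr)  # brightness
    rolloff = librosa.feature.spectral_rolloff(y=window, sr=sr)    # energy skew
    zcr = librosa.feature.zero_crossing_rate(window)               # noisiness
    # Summarize each frame-level series with its mean and standard deviation.
    parts = []
    for feat in (mfcc, centroid, rolloff, zcr):
        parts.append(feat.mean(axis=1))
        parts.append(feat.std(axis=1))
    return np.concatenate(parts)
```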
Third, ML Classification, from model to species prediction. By this stage, I was gaining confidence. What started as a maybe was slowly shaping up to be the most useful thing I had built using generative AI. The Perch bird classification model processed the extracted features to generate species probability distributions. The model was trained on a comprehensive database of Hawaii-specific bird species, including both native and introduced populations.
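For anyone trying this at home, here is a hedged sketch of the scoring step, assuming the publicly released Perch checkpoint on TensorFlow Hub (google/bird-vocalization-classifier). Note that the public release consumes raw 5-second, 32 kHz windows directly, and the version pin and call signature may differ from what you find.

```python
import numpy as np
import tensorflow_hub as hub

# Public Perch release on TF Hub; the version number is an assumption.
MODEL_URL = "https://tfhub.dev/google/bird-vocalization-classifier/4"
model = hub.load(MODEL_URL)

def classify_window(window: np.ndarray) -> np.ndarray:
    """Score one 5-second, 32 kHz window (160,000 samples) and return
    a probability distribution over the model's species classes."""
    batch = window[np.newaxis, :].astype(np.float32)
    # infer_tf returns species logits plus an embedding vector.
    logits, _embedding = model.infer_tf(batch)
    logits = np.asarray(logits)
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    return (exp / exp.sum(axis=-1, keepdims=True))[0]
```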
Fourth, and last, a Context Layer for Hawaiian habitat and prevalence adjustments. I learned that this piece is critical in biodiversity conservation given the specificity of elevation ranges and temporal activity patterns. These were incorporated to adjust confidence scores, reducing false positives for rare or out-of-range species while boosting likely candidates.
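Conceptually, this layer can be a simple reweighting of model scores by local priors, followed by renormalization. The sketch below uses hypothetical species names and prior values; real priors would come from eBird frequency data or local surveys.

```python
# Hypothetical prevalence priors for one site (elevation band, month).
SITE_PRIORS = {
    "Apapane": 0.30,     # common at this elevation
    "Iiwi": 0.15,
    "Akiapolaau": 0.01,  # rare here; down-weight to curb false positives
}
DEFAULT_PRIOR = 0.05     # species with no local record

def adjust_scores(model_probs: dict[str, float]) -> dict[str, float]:
    """Multiply model confidence by site priors, then renormalize so
    the adjusted scores still sum to one."""
    weighted = {
        sp: p * SITE_PRIORS.get(sp, DEFAULT_PRIOR)
        for sp, p in model_probs.items()
    }
    total = sum(weighted.values()) or 1.0
    return {sp: w / total for sp, w in weighted.items()}
```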
Results
While I was able to build this pipeline end to end, I hit critical issues during species identification. The experiment revealed several gaps that must be closed to make it more useful:
- Species name mapping incomplete – class IDs need taxonomic labels
- No dual-model validation to cross-check Perch predictions (based on further primary and secondary research, validation remains one of the most impactful gaps)
- Limited ground truth – manual verification data not available
- Temporal metadata missing for seasonal pattern analysis
Given these gaps, the program risks producing false positives that could mislead conservation efforts: environmental noise could trigger incorrect detections, and overlapping vocalizations could reduce accuracy in high-density environments. These risks are not to be overlooked, and they highlight the importance of consistent field work.
Learnings

Curiosity > Comfort: Gaps Are Good

These walls are just new invitations to learn: they reveal what matters, where friction lives, and which problems need new thinking.
Looking Forward: More Than an App

This project is one small step, a case study in how curious tech can serve conservation. It isn't perfect, but it's a seed for deeper modeling, real-time monitoring, citizen science projects, and collaborative habitat management.
I’m actively exploring several other automation and climate-data projects, while seeking meaningful part-time or full-time opportunities in tech-for-good. If you care about birds, conservation, machine learning, or just learning through building, please reach out to me. I’d love to collaborate, share notes, or learn together.
Feel free to comment, DM, or connect—let’s see what else we can figure out alongside Hawaii’s birds.
Sources:
| Resource | Overview |
|---|---|
| arxiv.org/html/2508.04665v1, arxiv.org/pdf/2508.04665.pdf | Presents Perch 2.0, a leading bioacoustic ML model, showcasing fine-grained species classification and broad transfer-learning value for conservation. |
| jpinfotech.org/deep-learning-based-dual-modal-bird-species-identification-using-audio-and-images/, YouTube: hLBWUqtef0Y | Describes systems that use both audio and images, via dual-mode deep neural networks, for highly accurate bird species identification. |
| arxiv.org/html/2503.15576v1 | Explores advanced bird song detection algorithms that improve automated species identification and ecological survey quality. |
| sciencedirect.com/science/article/pii/S1574954121000273 | Details BirdNET, an AI-powered tool for scalable and global bird sound identification in biodiversity research. |
| pmc.ncbi.nlm.nih.gov/articles/PMC11036034/ | Assesses how different temporal sampling approaches affect accuracy and coverage in community-wide bird acoustic surveys. |
| pmc.ncbi.nlm.nih.gov/articles/PMC12105146/ | Examines environmental and seasonal drivers of wild bird vocalization patterns for bioacoustic monitoring best practices. |
| sciencedirect.com/science/article/pii/S2351989424001999 | Studies springtime spatio-temporal patterns in bird diversity, highlighting metadata importance in ecological audio analysis. |
| sciencedirect.com/science/article/pii/S1574954125002638 | Introduces improved deep learning models for bird song detection in large-scale acoustic monitoring. |
| support.google.com/websearch/answer/12412910?hl=en | Explains why Google search results vary based on user context, personalization, and algorithm updates. |
| reddit.com/r/Bard/comments/1hf86ts/how_many_deep_search_is_crazy/ | Highlights community discussion on the scope and impact of AI-driven deep web search capabilities. |
| nymag.com/intelligencer/article/google-ai-mode-search-results-bury-the-web.html | Discusses the effects of AI-powered search interfaces on web discoverability and content strategy. |
| iptwins.com/2025/10/02/combatting-fake-information-in-ai-search-results-how-new-search-technologies-are-exploited/ | Addresses risks of misinformation in AI-generated search results and emerging mitigation tactics. |
| phys.org/news/2024-05-people-accurate-results-stakes-high.html | Reports research showing users rely on accurate, trustworthy search results for high-stakes decisions. |
| themarkup.org/google-the-giant/2020/07/28/how-we-analyzed-google-search-results-web-assay-parsing-tool | Outlines methods for investigating and understanding how search rankings and display work at Google. |
| pmc.ncbi.nlm.nih.gov/articles/PMC11723449/ | Explores AI-based image feature extraction for improved bird species detection. |
| sciencedirect.com/science/article/pii/S157495412300287X | Introduces few-shot learning approaches to detect animal sounds with very limited training data. |
| YouTube: Y-CMp3n8Y8Y | Reviews public critiques of AI-driven search accuracy and highlights key pitfalls in summary generation. |
| blog.google/products/search/ai-overviews-update-may-2024/ | Shares Google’s latest update and clarifications on recent changes to search AI overview features. |