A trained musician can look at a musical score and imagine the sound of an entire orchestra. The score is a visual representation of the sounds. In an analogous way, we can represent birdsong by an image, and analysis of the image can tell us the species of birds singing. This is what happens with Merlin Bird ID. In a recent episode of Mooney Goes Wild on RTÉ Radio 1, Niall Hatch of Birdwatch Ireland interviewed Drew Weber of the Cornell Lab of Ornithology, a developer of Merlin Bird ID. This phone app enables a large number of birds to be identified.
An especially interesting feature of Merlin is Sound ID: just stand in a park, by the sea or in the hills and press “record”. Sound ID will compare the birdsongs to a large databank of recorded sounds and quickly identify the species. Sound ID is a major advance in sound identification and machine learning. It can identify about 250 European bird species and numerous more exotic species.
How does Sound ID work?
A recording of birdsong is essentially a graph of air pressure against time, replete with information but difficult to interpret. The idea behind Sound ID is to use a computer vision model to identify bird vocalisations. To do this, an algorithm called the short-time Fourier transform (STFT) converts the audio signal into an image called a spectrogram.
A spectrogram is a diagram with time on the horizontal axis and frequency on the vertical. It is very like a musical score, which has time on one axis and pitch on the other. Notes sounding at the same time appear as a vertical stack, a chord.
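To make this concrete, here is a small Python sketch, not Merlin's own code, showing how a recording might be turned into a spectrogram with a standard STFT routine; the file name and the parameter values are placeholders.

```python
# Illustrative sketch only: turn an audio recording into a spectrogram image.
# The file name and parameter values are placeholders, not Merlin's settings.
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

rate, audio = wavfile.read("robin_song.wav")   # air pressure sampled against time
if audio.ndim > 1:
    audio = audio.mean(axis=1)                 # mix stereo down to mono

# Short-time Fourier transform: slide a short window along the signal and
# take a Fourier transform of each slice.
freqs, times, Z = stft(audio, fs=rate, nperseg=512, noverlap=384)

# The spectrogram is the magnitude of the STFT, usually shown on a log scale:
# time runs along the horizontal axis, frequency along the vertical.
spectrogram = 20 * np.log10(np.abs(Z) + 1e-10)
print(spectrogram.shape)                       # (frequency bins, time frames)
```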
Once the audio has been converted into a spectrogram, it can be fed into a standard computer vision model, which is trained to identify bird vocalisations from their visual signatures in the spectrogram. Computer image analysis is highly advanced and can break the image into manageable pieces; each piece can then be compared with a database of birdsongs.
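As a rough illustration of what those "manageable pieces" might be, the sketch below slices a spectrogram into fixed-width windows along the time axis; the width and hop are invented values, not Merlin's.

```python
# Illustrative sketch: cut a long spectrogram into fixed-width pieces along
# the time axis so that each piece can be classified on its own.
import numpy as np

def split_spectrogram(spec, width=512, hop=256):
    """spec has shape (frequency_bins, time_frames); return overlapping slices."""
    pieces = []
    for start in range(0, max(1, spec.shape[1] - width + 1), hop):
        pieces.append(spec[:, start:start + width])
    return pieces

spec = np.random.rand(128, 2048)        # stand-in for a real spectrogram
pieces = split_spectrogram(spec)
print(len(pieces), pieces[0].shape)     # 7 pieces of shape (128, 512)
```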
The spectrogram image is processed by a model called a deep convolutional neural network. This network is tuned by examining a large number of birdsongs. It can also recognise extraneous background noises, such as traffic and human speech, and filter them out.
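To give a flavour of what such a network looks like, here is a minimal sketch in PyTorch; the layer sizes and the species count are illustrative assumptions, not Merlin's actual design.

```python
# Minimal sketch of a convolutional network that maps a spectrogram image to
# scores for each bird species. Layer sizes and species count are assumptions.
import torch
import torch.nn as nn

class BirdSongCNN(nn.Module):
    def __init__(self, n_species=250):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # collapse the image to one vector
        )
        self.classifier = nn.Linear(64, n_species)

    def forward(self, x):                         # x: (batch, 1, 128, 512) spectrograms
        return self.classifier(self.features(x).flatten(1))

model = BirdSongCNN()
scores = model(torch.randn(4, 1, 128, 512))       # four fake spectrograms
print(scores.shape)                               # torch.Size([4, 250])
```

Stacking several convolutional layers lets a network of this kind pick out progressively larger visual patterns in the spectrogram, from individual notes to whole phrases.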
Merlin’s Sound ID tool is trained on audio recordings in which each bird is vocalising. Ornithologists select the precise moments when birds are singing and tag those sounds with the corresponding species. The neural network uses a large number of parameters, called weights, to fit the data. A method called the gradient descent algorithm works out how to adjust the weights so that the model’s predictions match the labels supplied by the Sound ID experts.
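In outline, gradient descent repeatedly nudges the weights downhill on a measure of error until the predictions agree with the experts' labels. Here is a bare-bones sketch of such a training loop, using a tiny placeholder model and random stand-in data rather than real tagged recordings.

```python
# Bare-bones sketch of training by gradient descent: repeatedly nudge the
# weights so that the model's predictions match the ornithologists' labels.
# A tiny placeholder model and random stand-in data are used here.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(128 * 512, 250))
optimiser = torch.optim.SGD(model.parameters(), lr=0.01)   # gradient descent
loss_fn = nn.CrossEntropyLoss()

spectrograms = torch.randn(8, 1, 128, 512)     # tagged clips (stand-in data)
labels = torch.randint(0, 250, (8,))           # species chosen by the experts

for step in range(10):
    optimiser.zero_grad()
    loss = loss_fn(model(spectrograms), labels)   # how wrong are the predictions?
    loss.backward()                               # gradient of the loss with respect to the weights
    optimiser.step()                              # adjust the weights downhill
```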
Several choices must be made when constructing a spectrogram: the length of the audio clip, the STFT window length, the vertical scaling and the pixel dimensions of the spectrogram image. Following extensive testing, Sound ID was set to use a window length of 512 samples, with 128 samples for the STFT, and an image size of 128 x 512 pixels. This achieves a good balance between speed and model accuracy.
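The window length, in particular, involves a trade-off: a longer window resolves frequencies more finely but blurs timing, while a shorter one does the opposite. A small sketch with an invented test tone, not Merlin's settings, makes this visible.

```python
# Illustrative sketch of the window-length trade-off: longer STFT windows give
# finer frequency resolution but coarser time resolution, and vice versa.
import numpy as np
from scipy.signal import stft

rate = 22050
t = np.arange(0, 2.0, 1.0 / rate)
tone = np.sin(2 * np.pi * (2000 + 1000 * t) * t)   # invented rising test tone

for window in (128, 512, 2048):
    freqs, times, Z = stft(tone, fs=rate, nperseg=window)
    print(f"window {window:4d}: "
          f"{len(freqs)} frequency bins x {len(times)} time frames")
```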
There is a wealth of information and technical details on the Cornell website. Merlin Bird ID is available free of charge. Once installed on your phone, it runs offline, without needing a network connection, and lets you record and identify the birds around you.
A UCD course on recreational mathematics, AweSums: The Wonder, Utility and Fun of Mathematics, will be presented this autumn by Prof Peter Lynch — registration is open at www.ucd.ie/lifelonglearning
- Peter Lynch is emeritus professor at the School of Mathematics & Statistics, University College Dublin. He blogs at thatsmaths.com