Newsreel Asia

Analysis of Independent Forensic Report on ‘Manipur Tapes’

Report Concludes with 93% Certainty That the Voice Belongs to the Chief Minister

February 4, 2025

Leaked audio recordings, known as “Manipur Tapes” and purportedly featuring the voice of Manipur Chief Minister N. Biren Singh, were examined by an independent forensic lab, Truth Labs Forensic Services. The examination concluded with over 93% certainty that the voice in the recordings belongs to the Chief Minister. Based on the report, below is an account of how the tapes were examined and how the forensic experts arrived at their conclusion.

The forensic examination was commissioned by the Kuki Organisation for Human Rights Trust, represented by lawyer Prashant Bhushan, for a Supreme Court petition, as reported by The Hindu. The examination was led by a retired Superintendent of Police with 25 years of experience in digital evidence analysis. He was assisted by an expert who has over eight years of experience and has examined and reported on more than 200 cases involving audio-video forensics.

The lab’s website states that its reports have been used as evidence by the Supreme Court, six High Courts, trial courts, police, the Central Bureau of Investigation, the National Investigation Agency, the Central Reserve Police Force and about 200 central and state government ministries, departments and public sector undertakings.

The recordings, allegedly captured during a closed-door meeting, were brought into the public domain by The Wire in August 2024, weeks after being submitted to the inquiry commission established by the Union Home Ministry to investigate the ongoing ethnic violence in Manipur.

The Manipur state government has dismissed the audio as “doctored.” The Hindu quoted a Manipur government source as saying there is “no need to comment” on the findings of the independent forensic report, and that it “has nothing to do with CM N. Biren Singh.”

Though submitted to the Supreme Court as “a clear and strong prima facie evidence” by the petitioners, the independent report is unlikely to have legal implications, as the Supreme Court, on Feb. 3, sought the report of a forensic analysis by a government agency.

Content

In the recording, the Chief Minister is purportedly heard speaking about Union Home Minister Amit Shah’s visit to Manipur weeks after the onset of the violence, which began on May 3, 2023. Singh allegedly recounts Shah asking, “Biren ji! ... Arre! Tum bomb marta hai (Are you using bombs)?” Following this, Shah reportedly directed him to cease using bombs, and reinforced this directive by involving the Director General of Police and others. After Shah departed, Singh purportedly told his team, “Hoi! Chupke se karna hai, open nahi karna hai (It should be used covertly, not openly).” He allegedly added that those doubting him could verify this with the “commandos” on the front line.

The voice is also heard saying, “[Police] commandos … those underground people at frontline … made all of them join together. I am telling you … revealing the truth, the PLA …, Pambei’s people (a faction of the insurgent group United National Liberation Front, or UNLF), PREPAK (another insurgent group) and every other … with commandos, I let them all join together.”

PREPAK, or People’s Revolutionary Party of “Kangleipak,” was founded in 1977 to advocate for the independence of Manipur from Indian governance. “Kangleipak” is the historical name for the region before Manipur became part of India in 1949. The Pambei faction of the UNLF is a splinter group within one of Manipur’s oldest insurgent organisations.

Authenticity

The independent forensic examination involved two key processes: verifying the authenticity of the leaked audio recordings and determining whether the voice in the recordings matched that of Biren Singh as heard in two YouTube videos.

To determine whether the leaked audio recordings were authentic or had been tampered with, forensic experts conducted a series of technical analyses.

The first step involved examining the file properties and file headers, which contain metadata about how and when an audio-video file was created or modified. If an audio-video file is edited or altered, these headers can show inconsistencies, such as repeated or missing elements that indicate manipulation.

The forensic team checked for such anomalies but found no repeated headers in the examined files.
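As a rough illustration of what a repeated-header check can look like, the sketch below counts container markers in a file's raw bytes. It assumes a WAV container and a crude form of tampering (two files joined end to end). The report does not say which file format or tool the lab used, so the function names, the in-memory test files and the choice of the `RIFF` and `fmt ` markers are all assumptions made for illustration.

```python
import io
import wave


def count_marker(data: bytes, marker: bytes) -> int:
    """Count non-overlapping occurrences of a byte marker in a file's contents."""
    return data.count(marker)


def make_wav_bytes(num_frames: int = 1000) -> bytes:
    """Build a small silent mono 16-bit WAV in memory (illustrative input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(b"\x00\x00" * num_frames)
    return buf.getvalue()


# Hypothetical inputs: one untouched file, and one made by naively
# concatenating two files, which leaves a second header in the byte stream.
clean = make_wav_bytes()
spliced = clean + make_wav_bytes()

for name, data in (("clean", clean), ("spliced", spliced)):
    print(name, count_marker(data, b"RIFF"), count_marker(data, b"fmt "))
```

Here the untouched file contains one of each marker and the concatenated file contains two. Careful editing software rewrites headers cleanly, which is presumably why a header check is only the first of several steps.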

Next, the forensic team conducted an “acoustic event analysis,” which involved listening carefully to the recordings to identify various background noises, speech patterns and operational sounds, such as pauses, stops or starts in the recording. These elements help determine if the recording is continuous and natural or if there are abrupt breaks that might indicate tampering.

The experts examined specific technical attributes of the audio, such as the noise-to-harmonic ratio, which helps detect whether artificial modifications were made, and the percentage of non-voiced frames, which refers to silent portions or sections with only background noise.
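To make the second attribute concrete, here is a minimal sketch that estimates the share of frames containing no speech-level signal. It uses a simple energy threshold as a stand-in for a real voicing detector. The frame length, the threshold and the synthetic test signal are assumptions, not values from the report.

```python
import math


def frame_rms(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS level."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    return [math.sqrt(sum(x * x for x in f) / frame_len) for f in frames]


def percent_low_energy_frames(samples, frame_len=400, threshold=0.02):
    """Percentage of frames whose RMS level falls below the (assumed) threshold."""
    rms = frame_rms(samples, frame_len)
    if not rms:
        return 0.0
    quiet = sum(1 for r in rms if r < threshold)
    return 100.0 * quiet / len(rms)


# Synthetic signal: one second of a 200 Hz tone, then one second of silence.
RATE = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * n / RATE) for n in range(RATE)]
silence = [0.0] * RATE
signal = tone + silence

print(percent_low_energy_frames(signal))
```

With half the signal silent, half of the frames fall below the threshold. Real speech would need a detector that also separates voiced sounds from unvoiced consonants and background noise.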

To further confirm the authenticity of the recordings, the forensic team analysed the tone consistency, fluency, loudness and speaking mode of the speaker.

The team conducted waveform and spectrographic analysis to visually inspect the structure of the audio. A waveform represents the loudness of an audio signal over time, while a spectrogram shows the distribution of different sound frequencies. These visual tools help forensic experts identify unnatural breaks, cut points or mismatches in the sound profile, which would indicate tampering.

If sections of the recording had been artificially inserted or cut, the background elements might change unnaturally and the waveform and spectrogram would show sudden gaps, mismatched patterns or repeated sections.

The examination of the waveform and spectrogram found no abrupt shifts or changes. It did reveal variations in background noise and tone, but these were consistent with a recording device being moved during the recording process.
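The difference between an abrupt change and a gradual one can be sketched numerically. The example below measures the background level frame by frame and flags any frame-to-frame jump larger than a chosen step. It is a simplified stand-in for visual waveform and spectrogram inspection, not the lab's procedure. The 6 dB step, the frame size and both synthetic noise signals are assumptions.

```python
import math
import random


def frame_levels_db(samples, frame_len=800):
    """Return the RMS level of each non-overlapping frame, in decibels."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        levels.append(20 * math.log10(max(rms, 1e-9)))
    return levels


def abrupt_jumps(levels, max_step_db=6.0):
    """Indices of frames whose level differs sharply from the previous frame."""
    return [i for i in range(1, len(levels))
            if abs(levels[i] - levels[i - 1]) > max_step_db]


rng = random.Random(0)
FRAME = 800

# Hypothetical spliced recording: background noise level changes instantly.
spliced = ([rng.gauss(0, 0.01) for _ in range(10 * FRAME)]
           + [rng.gauss(0, 0.05) for _ in range(10 * FRAME)])

# Hypothetical moved-device recording: the same overall change, but gradual.
total = 20 * FRAME
gradual = [rng.gauss(0, 0.01 + 0.04 * n / total) for n in range(total)]

print(abrupt_jumps(frame_levels_db(spliced)))
print(abrupt_jumps(frame_levels_db(gradual)))
```

The instant change is flagged at the frame where the two segments meet, while the gradual drift produces no flags even though the total change in level is the same.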

Voice Analysis

Further, to determine whether two voice recordings belong to the same person, the experts analysed pitch and intensity contours, which are patterns that show how a person’s voice rises and falls while speaking. Every person has unique speech characteristics, including the way they stress certain syllables, how their voice fluctuates in tone and how loudly or softly they pronounce words. These characteristics remain relatively consistent when the same person speaks under similar conditions.

For this analysis, forensic experts focused on specific words and phrases that appeared both in the leaked audio recording and in two YouTube videos of Biren Singh. These words were perhaps chosen because they offered enough variety in pronunciation, stress and tone to make a meaningful comparison.

Experts examined how the speaker pronounced each word in both recordings, checking for similarities in intonation patterns—the natural rhythm and melody of speech. They compared how the pitch (highness or lowness of the voice) and intensity (loudness or softness) changed when these words were spoken. If two different people had spoken the words, there would likely be noticeable differences in these contours.

The forensic analysis found that the pitch and intensity patterns for these words were very similar in both recordings.
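A contour comparison of this kind can be sketched on synthetic signals. The example below extracts a pitch value (by a basic autocorrelation search) and an intensity value (RMS level) for each short frame, then correlates the resulting contours between signals. Every name and parameter here is an assumption made for illustration. The report does not describe the lab's software, and real speech requires far more robust pitch estimation than this.

```python
import math

RATE = 8000   # samples per second (assumed)
FRAME = 400   # 50 ms analysis frames (assumed)


def glide(f_start, f_end, a_start, a_end, num_samples=4000):
    """Synthesize a tone whose frequency and amplitude change linearly."""
    out, phase = [], 0.0
    for n in range(num_samples):
        t = n / num_samples
        phase += 2 * math.pi * (f_start + (f_end - f_start) * t) / RATE
        out.append((a_start + (a_end - a_start) * t) * math.sin(phase))
    return out


def pitch_hz(frame, fmin=75.0, fmax=400.0):
    """Estimate pitch from the strongest autocorrelation lag in a speech-like range."""
    lo, hi = int(RATE / fmax), int(RATE / fmin)
    best_lag = max(range(lo, hi + 1),
                   key=lambda lag: sum(frame[n] * frame[n + lag]
                                       for n in range(len(frame) - lag)))
    return RATE / best_lag


def contours(samples):
    """Return per-frame (pitch, intensity) contours for a signal."""
    pitch, intensity = [], []
    for start in range(0, len(samples) - FRAME + 1, FRAME):
        frame = samples[start:start + FRAME]
        pitch.append(pitch_hz(frame))
        intensity.append(math.sqrt(sum(x * x for x in frame) / FRAME))
    return pitch, intensity


def pearson(a, b):
    """Pearson correlation between two equal-length contours."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical "utterances": two with a similar rising shape, one falling.
pitch_a, level_a = contours(glide(100, 150, 0.3, 0.6))
pitch_b, level_b = contours(glide(105, 145, 0.2, 0.5))
pitch_c, level_c = contours(glide(150, 100, 0.6, 0.3))

print("similar shape:", pearson(pitch_a, pitch_b), pearson(level_a, level_b))
print("different shape:", pearson(pitch_a, pitch_c), pearson(level_a, level_c))
```

The two rising signals correlate strongly even though their absolute pitch and loudness differ, while the falling signal correlates negatively. Correlation is only one possible similarity measure and is used here for brevity.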

Report’s Conclusion

The forensic experts used a “Chi-square test,” a statistical method used to determine whether there is a significant relationship between two sets of categorical data. It measures how well observed results match expected results if there were no real association between the variables being tested.
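The mechanics of the test can be shown on a small table of counts. The sketch below computes the chi-square statistic for a two-by-two table and converts it to a p-value. The counts are invented purely to show the arithmetic and do not come from the report, which does not say how speech features were grouped into categories.

```python
import math


def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail probability for a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Invented counts for illustration only (not data from the forensic report).
table = [[30, 10],
         [10, 30]]

stat = chi_square_statistic(table)
print(stat, p_value_df1(stat))
```

A large statistic and a small p-value indicate that the observed counts would be unlikely if the two variables were unrelated. The p-value formula used here holds only for a two-by-two table, which has one degree of freedom.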

By analysing various speech characteristics, such as pitch, tone, pronunciation and other acoustic features, the forensic team calculated that there was a 93% probability that both recordings featured the same speaker. This means that if the voices were actually from different people, there would be only a 7% chance of seeing this level of similarity in their speech characteristics.

This was used to generate a likelihood ratio (13.28), meaning the observed evidence was 13.28 times more likely if the voices came from the same speaker than if they came from two different individuals.
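The two published figures appear to be arithmetically linked: dividing 93% by the remaining 7% gives roughly 13.29, or 13.28 when truncated to two decimals. Whether the report derived the ratio this way is an assumption. Strictly, a ratio of probabilities for the two hypotheses equals a likelihood ratio only when both hypotheses are treated as equally likely beforehand.

```python
import math

p_same = 0.93                # reported probability of a same-speaker match
p_different = 1 - p_same     # remaining probability, about 0.07

odds = p_same / p_different  # about 13.2857
truncated = math.floor(odds * 100) / 100

print(odds, truncated)
```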