How DataEQ predicted Trump and Brexit
In 2016, DataEQ (formerly known as BrandsEye) grabbed the attention of international media when it used its unique combination of AI and human crowds to predict that Donald Trump would beat Hillary Clinton in the race for the White House.
In the weeks and days before the election, DataEQ noticed that traditional polls did not seem to capture what it had observed on social media: a significant and intensely emotional dissatisfaction with establishment politicians and with Hillary Clinton. Having predicted the outcome of the Brexit referendum in June, DataEQ analysts recognised that a similarly surprising result could occur in the election. The day before Americans headed to the polls, DataEQ CEO JP Kloppers told BBC News that DataEQ’s analysis of social media was pointing to a victory for Trump.
In June, DataEQ began collecting public social media posts, mostly from Twitter, about the election from the battleground states. While such early public opinions don’t necessarily reflect actual voting-day behaviour, collecting data months before the election helped DataEQ structure and fine-tune its data analysis process.
In the months that followed, approximately 37.6 million social media posts from some four million authors were collected for analysis. DataEQ’s geolocation algorithms identified the state in which the author was located and distributed only posts from the key battleground states to the DataEQ Crowd of human verifiers for sentiment evaluation.
Crowd verifiers evaluated the sentiment contained in each post (positive, negative or neutral) and the candidate to which it pertained. All social media posts matching the candidates’ names, as well as those mentioning the US election itself, were collected. Data from outside the US was excluded.
Human evaluation is critical to accurately understanding social media conversation. Online chatter is rife with sarcasm, implied and hidden meanings, abbreviated phrases, slang, innuendo, humour and a host of other complications. Understanding many of these intricacies is typically beyond the ability of an algorithm. As a result, DataEQ used its crowdsourcing approach, in which real people carried out the micro-jobs of assigning sentiment to the candidates (for or against) in return for micropayments. This approach achieves higher accuracy than sentiment analysis that relies solely on AI.
Once the sentiment had been correctly assigned in all of the 11 battleground states, the methodology was simple. Advocacy for a candidate was counted from all unique authors supporting that candidate or speaking against their opponent. The neutral posts were excluded. This advocacy was then divided by the total emotive (for or against) conversation to determine a percentage. In each of the key battleground states, if a candidate’s percentage was greater than 50%, DataEQ predicted that the candidate would win that state.
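The calculation described above can be illustrated with a minimal Python sketch. It is not DataEQ’s actual implementation: the function names, the (author, candidate, sentiment) tuple layout and the rule that an author counts once per candidate they favour are all assumptions made for illustration.

```python
# Illustrative sketch only; names and data layout are hypothetical.
# Assumes a two-candidate race, so speaking against one candidate
# counts as advocacy for the other.
OPPONENT = {"Trump": "Clinton", "Clinton": "Trump"}


def advocacy_shares(posts):
    """Return each candidate's share of emotive advocacy in one state.

    posts: iterable of (author, candidate, sentiment) tuples, where
    sentiment is "positive", "negative" or "neutral".
    """
    advocates = {name: set() for name in OPPONENT}
    for author, candidate, sentiment in posts:
        if sentiment == "neutral":
            continue  # neutral posts are excluded
        if sentiment == "positive":
            favoured = candidate
        elif sentiment == "negative":
            favoured = OPPONENT[candidate]
        else:
            raise ValueError(f"unknown sentiment: {sentiment!r}")
        # A set keeps each author unique per favoured candidate (assumption).
        advocates[favoured].add(author)

    total = sum(len(authors) for authors in advocates.values())
    if total == 0:
        return {name: 0.0 for name in advocates}
    return {name: len(authors) / total for name, authors in advocates.items()}


def predicted_winner(shares):
    """Return the candidate whose share exceeds 50%, or None on a tie."""
    for name, share in shares.items():
        if share > 0.5:
            return name
    return None


state_posts = [
    ("author_a", "Trump", "positive"),
    ("author_b", "Clinton", "negative"),  # counts as advocacy for Trump
    ("author_c", "Clinton", "positive"),
    ("author_d", "Trump", "neutral"),     # excluded
    ("author_a", "Trump", "positive"),    # repeat author, counted once
]
shares = advocacy_shares(state_posts)
print(predicted_winner(shares))
```

In this made-up sample, two unique authors favour Trump and one favours Clinton, so Trump’s share of the emotive conversation is two thirds and the sketch calls the state for Trump.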
The methodology also gave greater weighting to the most recent conversations, as sentiment shifts throughout the campaign. If what people say determines how they will act, then it stands to reason that what they say most recently, unsolicited, on social media could reflect how they will act at that moment. Consequently, DataEQ collected and analysed data right up to, and including, the day before Election Day, and weighted the data according to its proximity to Election Day itself. This was the final piece of the puzzle, and DataEQ then called the outcome of the 11 battleground states that would determine the 45th US President.
The results speak for themselves. In contrast to many traditional polls, DataEQ’s weighted average matched the outcome of nine of the 11 states. As with DataEQ’s Brexit prediction, accurate, human-integrated evaluation of social media conversation proved to be a meaningful way to understand public opinion.
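The article does not say which weighting function DataEQ used. As one possible illustration, the sketch below applies an exponential decay with a hypothetical seven-day half-life, so a post made a week before Election Day counts half as much as one made on the day itself. The function names, the half-life and the (post_day, favoured_candidate) layout are assumptions, not DataEQ’s method.

```python
# Illustrative sketch only; the decay function and half-life are assumptions.
from datetime import date


def recency_weight(post_day, election_day, half_life_days=7.0):
    """Weight a post by its proximity to Election Day (exponential decay)."""
    days_before = (election_day - post_day).days
    if days_before < 0:
        raise ValueError("post is dated after Election Day")
    return 0.5 ** (days_before / half_life_days)


def weighted_share(posts, candidate, election_day, half_life_days=7.0):
    """Recency-weighted share of emotive posts favouring `candidate`.

    posts: iterable of (post_day, favoured_candidate) tuples, with
    neutral posts already removed.
    """
    favoured_weight = 0.0
    total_weight = 0.0
    for post_day, favoured_candidate in posts:
        weight = recency_weight(post_day, election_day, half_life_days)
        total_weight += weight
        if favoured_candidate == candidate:
            favoured_weight += weight
    return favoured_weight / total_weight if total_weight else 0.0


election_day = date(2016, 11, 8)
sample_posts = [
    (date(2016, 11, 7), "Trump"),    # the day before the election
    (date(2016, 10, 1), "Clinton"),  # more than five weeks earlier
]
trump_share = weighted_share(sample_posts, "Trump", election_day)
print(f"{trump_share:.1%}")
```

Unweighted, the two made-up posts would split the conversation evenly. With the decay applied, the more recent post dominates, which is the effect the weighting is meant to capture.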
Online media has empowered millions of citizens around the globe. The democratisation of public opinion means that anyone with internet access can volunteer their opinion. Organisations are struggling to keep up with the pace, scale and volatility of stakeholder and public opinion. Being able to measure and understand how and why people feel the way they do is therefore of high strategic value and can provide a significant competitive advantage.
DataEQ uses a proprietary mix of search algorithms, AI and crowdsourcing to mine and structure online conversation for sentiment. DataEQ’s Crowd of human contributors enriches the sentiment data by surfacing the specific topics driving sentiment. DataEQ provides this data to organisations around the world to help them make customer-centred operational and strategic improvements.