At their most recent debate, the two main US Democratic Party candidates, Bernie Sanders and Hillary Clinton, tried to defuse a heated dispute between their campaigns over a breach of the Clinton campaign’s voter data, which Sanders staffers allegedly accessed inappropriately. As punishment, the Sanders campaign was briefly denied access to the Democratic Party’s master voter file, and it’s now suing the party for $600,000 per day of lost access.
While the two candidates exchanged conciliatory statements in public, their staff remain at loggerheads. One Clinton operative likened the breach to “the opposing general getting your battle plans.”
The campaigns' outrage at the incident shows just how dependent on data US election campaigns have become. Now-President Barack Obama’s 2008 campaign famously took the use of voter data to new heights, and his 2012 re-election effort was similarly supported by a whole team of data experts – whereas Mitt Romney’s campaign to defeat him was hobbled when his data operation utterly collapsed on election day.
The past two elections have given the Democrats a serious edge in the data game, and this cycle’s Republican contenders are making every effort to keep up. Ted Cruz has already got into hot water for using tens of millions of Facebook users' data to construct psychological profiles. Even Donald Trump, as maverick as any candidate in living memory and equipped with only a bare-bones campaign machine, knows better than to do without data.
The rationale for all this is plain enough: the more information you collect and process about voters, the better you can target your message and calibrate your speeches to win them over. Data now drives almost every decision about how to run advertising and where to look for potential donors.
Having the right data allows campaigns to build mathematical models of the electorate to a very fine level of detail, and to react quickly to shifts in voters' opinions, which can also be “mined” from Twitter, Facebook and the like.
Now they can establish to a remarkable degree how likely a particular voter is to choose a particular candidate – and that allows campaigns to target their attention and messaging incredibly finely. It is now possible to automatically place voters in a pre-defined set of classes (“college student not owning a gun who is pro-immigration and pro-gay marriage”, for example) and target such classes using different strategies.
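To make the idea of pre-defined classes concrete, here is a minimal sketch of rule-based segmentation. It is only an illustration, not how any campaign actually works: the `Voter` attributes, the class names and the strategies attached to them are all hypothetical, and real systems would more likely rely on statistical models than on hand-written rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    # Illustrative attributes only; real voter files look different.
    college_student: bool
    owns_gun: bool
    pro_immigration: bool
    pro_gay_marriage: bool


# Pre-defined classes, checked in order; the first matching rule wins.
# Both the class names and the rules are assumptions made for this example.
SEGMENT_RULES = [
    (
        "student_no_gun_pro_immigration_pro_gay_marriage",
        lambda v: v.college_student
        and not v.owns_gun
        and v.pro_immigration
        and v.pro_gay_marriage,
    ),
    ("gun_owner", lambda v: v.owns_gun),
]

# Hypothetical outreach strategy for each class.
STRATEGIES = {
    "student_no_gun_pro_immigration_pro_gay_marriage": "campus events and social media ads",
    "gun_owner": "direct mail on gun policy",
    "unclassified": "general campaign message",
}


def assign_segment(voter: Voter) -> str:
    """Return the name of the first class whose rule matches the voter."""
    for name, rule in SEGMENT_RULES:
        if rule(voter):
            return name
    return "unclassified"


def strategy_for(voter: Voter) -> str:
    """Look up the outreach strategy for the voter's class."""
    return STRATEGIES[assign_segment(voter)]
```

A voter matching the example class from the text, `Voter(True, False, True, True)`, would be routed to the first strategy, while anyone who matches no rule falls back to the general message.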
Far from being purely cynical and mechanical, this sort of data-driven approach does actually offer benefits to voters. It’s one way to push back on the echo-chamber effect that obstructs campaign messaging, in which people keep seeing the same type of message over and over again. When campaigns can target undecided voters, those voters get to see a greater diversity of content, which helps counter the “filter bubble” that keeps them from seeing anything unexpected that might change their thinking.
Lie of the land
Using data in this way is not at all simple. Big data necessarily contains a lot of “noise”; the real expense isn’t in collecting large amounts of data, but rather in extracting valuable information from it.
However much data is available, it’ll always be incomplete. It will be a sample and will therefore have a bias, meaning it can never represent the entire voter population with perfect accuracy. People who are on Twitter, or who can be reached through services such as Amazon MTurk, a popular crowdsourcing platform used to collect data and to run surveys, are not representative of either the general population or the electorate.
Any effort to collect data from Facebook and run surveys online also has to contend with serious data quality problems. Our recent research has shown that about half of the people on such online platforms provide low-quality or fake data.
Another drawback of data gathering is its cost. Asking people to provide information on Amazon MTurk costs money, and when the number of people is large, the total can become prohibitive.
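A rough back-of-the-envelope calculation shows how quality problems and cost compound. The sketch below assumes a flat payment per response and ignores platform fees; the function name, the payment of $0.50 and the target of 10,000 usable responses are illustrative assumptions, not real figures. Only the "about half" usable rate comes from the research mentioned above.

```python
import math


def survey_cost(usable_needed: int, pay_per_response: float, usable_fraction: float):
    """Estimate how many responses to buy, and what they cost in total.

    usable_fraction is the expected share of responses that are good
    enough to keep. This is an expectation, not a guarantee.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    # Low-quality responses still have to be paid for, so buy extra.
    responses_to_buy = math.ceil(usable_needed / usable_fraction)
    total_cost = responses_to_buy * pay_per_response
    return responses_to_buy, total_cost


# Hypothetical numbers: 10,000 usable responses at $0.50 each,
# with about half of all responses expected to be usable.
responses, cost = survey_cost(10_000, 0.50, 0.5)
print(responses, cost)  # 20000 10000.0
```

Under these assumptions, halving the usable share doubles the bill: the same 10,000 usable responses would cost $5,000 if every response could be kept.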
But however challenging it is to handle, data is only going to become more important for campaigners. And if it’s used correctly, it might help improve the state of democracy, too – since the more candidates know about their voters, the better they can understand their needs.
Gianluca Demartini receives funding from the Engineering and Physical Sciences Research Council (EPSRC) and is a member of the Association for Computing Machinery (ACM).
Gianluca Demartini, Senior Lecturer in Data Science, University of Sheffield
This article was originally published on The Conversation. Read the original article.






