How to Read a Voter File - Scoring Party Affinity and Turnout Propensity - A Practical Guide to Voter Targeting for Campaigns
Lots of candidates guess at who to talk to. Many will knock every door, call every phone number, and try to win over every voter. But many registered voters never show up on election day, and many who do will always vote for the same party. They're unpersuadable. You need to know exactly who to talk to first, and to do that you'll need to narrow down your list. Here's how I read a voter file and how you can do it yourself.
Start With Two Questions
When I look at a voter file, I'm trying to answer two questions as fast as possible:
- Which party does this voter lean toward?
- How likely are they to actually show up?
Together they tell you who to prioritize and what message they should hear. Get those two questions wrong and you're burning volunteer hours on the wrong doors.
Party Affinity
I sort voters into five groups: Hard Rep, Soft Rep, Independent, Soft Dem, and Hard Dem.
Hard partisans have voted in their party's primary consistently and recently. Soft partisans have voted in a party primary at some point, or registered with a party but rarely participate in primaries. Everyone else, including people who've voted in primaries for both parties, goes in the Independent column.
Propensity to Vote
Propensity is simpler. It's just a score based on how often someone has shown up to vote in past general elections. I run it on a three-point scale: one point for the most recent general, one point for each of the two most recent elections of the same type (presidential or gubernatorial). A score of three means they vote every time and zero means they never do.
How to Score a Voter File in Excel
You don't need a data firm or a programming background to do basic voter analysis. Excel gets you most of the way there for most smaller districts. However, you do need to be well versed in formulas.
Propensity Score
Most voter files use a blank cell to indicate someone has not voted, and so I will show you how to handle those. Just be aware that this will not work in all instances.
Write a SUM function and nest an IF(ISBLANK) for each election year you want to include. If the voter field for that year is blank, they didn't vote; if it's filled, give them a point.
=SUM(IF(ISBLANK(B2),0,1), IF(ISBLANK(C2),0,1), IF(ISBLANK(D2),0,1))
Party Affiliation Score
This section applies only to primary votes, so you can feel free to use the most recent years.
Write a SUM of IF functions that check which party ballot a voter pulled in each primary year. Run the same formula twice: once for Republican primaries, once for Democratic. That gives you two columns to compare.
=SUM(IF(B2="Republican Ballot",1,0), IF(C2="Republican Ballot",1,0), IF(D2="Republican Ballot",1,0))
Rolling Scores Into a Label
Once you have raw scores, you want a single column that names each voter's category. That requires nested IF functions. The logic follows this path: if they're registered Republican and voted in multiple Republican primaries, they're Hard Rep. Voted in fewer? Soft Rep. Voted in Democratic primaries despite registration? Independent. And so on through the full five-category system.
Excel Has a Row Limit
Excel tops out at 1,048,576 rows. Most single-district files fit fine. Statewide files won't. If your file is large, you'll need Python or SQL, or you contract someone to run it for you. Excel will also crash without warning on large operations. Save Constantly!
Building Your Target Universe
Once you have party affinity scores and propensity scores, combining them gives you your target universe. There are three basic categories:
Swing Targets
High propensity voters with conflicint or unknown party affiliation. These are the people you send your best volunteers to. They're likely to show up, and you don't know yet if they're voting for you. Persuasion happens here.
GOTV Targets
Voters who lean your direction but have a low propensity to vote. Your job is to get them to the polls. Depending on your state, break this further into absentee, early vote, and Election Day subgroups and contact each one at the right time.
Opposition / Dissuasion
High propensity voters who lean against you. Most campaigns ignore them, which is usually right. Some campaigns send dissuasive messaging to reduce their enthusiasm. Know who they are even if you don't contact them.
Setting It Up in Excel
Use IF, AND, and OR to assign each voter a target label based on their scores. The logic reads like this: if party is Soft Rep, Independent, or Soft Dem, and propensity is above 2, label them a Swing Target. If party is Soft Rep or Hard Rep and propensity is below 3, label them a Rep GOTV Target. And so on.
=IF(AND(OR(E2="Soft Rep",E2="Ind",E2="Soft Dem"),F2>2),"Swing Target",IF(AND(OR(E2="Soft Rep",E2="Hard Rep"),F2<3),"Rep GOTV Target",IF(AND(OR(E2="Soft Dem",E2="Hard Dem"),F2<3),"Dem GOTV Target","None")))
The logic works in three steps. First, it checks whether the voter leans Republican or Democratic by looking at their party label. Second, it checks whether their propensity score is above or below a threshold. Third, it combines those two checks to assign a label. A Swing Target has a mixed or unknown party lean and a high propensity score. A GOTV Target leans one direction but has a low propensity score. Anyone who doesn't meet either condition gets labeled None.
Reporting With Pivot Tables
Once your file is scored and labeled, a pivot table is the fastest way to get a count of each universe. Select all your data (Ctrl+A), go to Insert, and add a pivot table. Put party affiliation as your rows, propensity as your columns, and count any field everyone has, like Voter ID. Filter out inactive voters using the report filter. That gives you a clean breakdown of your district by segment.
A Few Things to Watch
- Always filter out inactive voters before you report counts. They inflate your universe and waste your field team's time.
- If you drag a field into both the Report Filter box and the Row/Column Labels box, unfilter it when you remove the label. Excel remembers the filter even when the label is gone and will quietly mess up your next report.
- Use conditional formatting on your pivot table counts to make it immediately clear which segments are largest. Home, Conditional Formatting, and pick a color scale.
When to Move Beyond Excel
Excel works well for a single district or a smaller county file. Once your file gets large, complex, or spans multiple geographies, you need something more powerful. That means SQL, Python, or both.
SQL
SQL is the language most database systems use. It's relatively easy to pick up and lets you run queries against large files that would freeze Excel instantly.
The commands you'll use most: SELECT to pull records, WHERE to filter by criteria, AND and OR to combine filters, JOIN to merge tables (like your voter file and a surnames file), and GROUP BY to count totals by segment.
Python
Python is better than SQL for manipulation and analysis. The library you need is Pandas, which works like a supercharged Excel. You load your voter file into a dataframe and then write a few lines to clean columns, score propensity, label targets, and export a new file.
A few lines of Python can change the capitalization of millions of cells, merge datasets on a shared ID, and create a new labeled column across every row in seconds. The same task in Excel would take an hour and might crash twice.
For visualization, Python can generate charts and maps directly from your dataframe. For geographic work, QGIS is free and lets you build heat maps by precinct or district from a shapefile. The U.S. Census publishes shapefiles for most legislative districts, states, and counties.
Which Tool Do You Actually Need?
- Single district, under 500,000 voters: Excel does the job. Learn the formulas in this guide and you're set.
- Multi-district or statewide analysis: Python with Pandas. Start with Jupyter Notebook if you have no Python experience.
- Working with commercial campaign software databases: you'll need SQL. Most campaign platforms store data in SQL-based systems.
- Maps and geographic targeting: QGIS. Free, well-documented, and works with Census shapefiles out of the box.
Python and SQL Together
You don't have to pick one. PostgreSQL and MySQL both have Python libraries. If you're pulling data from a field app and doing in-depth analysis on it, you'll likely end up using both. Python for manipulation and analysis, SQL for querying the underlying database.
If your party or organization already uses a specific language, learn that one first. The concepts carry over easily once you know one of them.
FAQ
You work from primary ballot history instead. In open-primary states, you can see which ballot a voter pulled each cycle. That's your affinity signal. Someone who has pulled a Republican ballot in three of the last four primaries is functionally a Soft or Hard Republican regardless of what their registration says.
Label them Independent. A voter who pulls a Republican ballot in one cycle and a Democratic ballot in another is not a hard partisan. They belong in your persuasion universe if they have high propensity, or in your general watch list if they don't. Don't try to force them into a party bucket.
I usually go back three elections of the same type: the last three presidential cycles for a presidential race, the last three gubernatorial cycles for a gubernatorial. Going further back reduces the relevance of the score. Old participation data tells you less about what someone will do this November than recent behavior does.
Save constantly and work on one column at a time. If you're running a VLOOKUP or a nested IF across a large file, let it finish before doing anything else. If crashes are frequent, your file is probably too large for Excel to handle reliably. That's when you move to Python.
Start with whichever one your organization already uses. If you're starting from scratch and just want to analyze voter files, Python with Pandas gets you further faster. SQL becomes necessary when you're pulling data out of a commercial database system. Most serious data work eventually uses both.
Tools and Resources
These are the tools and references I use when working through voter data analysis with candidates and campaign managers.
Data Tools
Want Someone to Run This For You?
Not every campaign has the time or staff to build out a full voter targeting model from scratch. I do this work directly for candidates and campaign teams. If you want clean targeting and a clear universe before your next door push, let's talk.
Schedule a Strategy CallLast updated: June 2026.