Wednesday, September 16, 2009

Seat Projection Explanation (The Long Version)

After I started doing seat projections in July, I took a post to try to explain how my model works. However, I keep getting comments questioning my projections, so without too much overlap, I'm going to try again. I'll try to work this through from the beginning of my thought process.

Being an avid follower of politics on both sides of the border, I often get envious of the amount of information available to American politics buffs. One of the things that particularly made me envious was Charlie Cook's Partisan Voting Index (PVI). In Canada, when we say a riding is "safe" we don't really have a quantitative measure; in the US, the PVI gives a decent idea.

Originally, I set out to recreate a PVI type of model for Canadian politics. A parliamentary system changes the available data, so what I decided to do, with my non-existent stats background, was compare each party's vote share in each riding over the last three elections (for the five major parties, plus other candidates) to that party's results federally and provincially. I chose the last three elections because the parties and ridings are the same, which makes things relatively simple. From that comparison, I derived a general positive or negative rating for each party in each riding. I quickly realized that these numbers are relatively meaningless by themselves but are fairly easily translated into a projection given some decent regional polling.

In other words, to generate the ratings, I calculated the following for each previous election:

Riding Rating (2004) = Party Vote Share in Riding (2004) - (x% of Federal Vote Share (2004) + y% of Provincial Vote Share (2004))

I then combined the three elections as a weighted sum:

Riding Rating = a% of Riding Rating (2004) + b% of Riding Rating (2006) + c% of Riding Rating (2008)
(where a% + b% + c% = 100%)
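
For anyone who prefers code to formulas, here is a minimal Python sketch of the rating step. It assumes the federal and provincial terms are added together, as in the formula that solves for riding vote share. The function names, the weights (x, y, a, b, c) and the input numbers are placeholders made up for illustration, not the values actually used in the model.

```python
def election_rating(riding_share, federal_share, provincial_share, x, y):
    """Rating for one party in one riding for a single election.

    x and y are the federal and provincial weights as fractions
    (0.4 means 40%). The values passed in below are placeholders.
    """
    return riding_share - (x * federal_share + y * provincial_share)


def riding_rating(ratings_by_year, year_weights):
    """Weighted combination of the per-election ratings."""
    total = sum(year_weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("year weights must add up to 100%")
    return sum(year_weights[year] * ratings_by_year[year] for year in year_weights)


# Hypothetical party: 35% in the riding, 30% federally, 20% provincially.
example_rating = election_rating(35.0, 30.0, 20.0, x=0.4, y=0.6)

# Hypothetical per-election ratings, with the most weight on 2008.
combined = riding_rating(
    {2004: 1.0, 2006: 2.0, 2008: 3.0},
    {2004: 0.2, 2006: 0.3, 2008: 0.5},
)
print(round(example_rating, 2), round(combined, 2))
```

A positive rating means the party runs ahead of its weighted federal and provincial numbers in that riding, and a negative rating means it runs behind.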

When I fell upon threehundredeight, it gave me a decent way of having a reliable polling aggregate without having to botch one together myself. The numbers I generate are a mixture of the federal and provincial polling added to that negative or positive rating I mentioned above.

In other words, I rearranged the formula above to solve for riding vote share:

Party Share in Riding (Today) = Riding Rating + (x% of Federal Vote Share (Today) + y% of Provincial Vote Share (Today))
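
As a rough sketch of that rearrangement in Python (the function name, weights and poll numbers below are hypothetical, chosen only to show the arithmetic):

```python
def project_share(rating, federal_poll, provincial_poll, x, y):
    """Projected riding vote share for one party.

    rating is the party's riding rating; x and y are the federal and
    provincial weights as fractions. All inputs here are illustrative.
    """
    return rating + (x * federal_poll + y * provincial_poll)


X, Y = 0.4, 0.6  # placeholder weights, not the model's actual values

# Sanity check on the algebra: a rating derived from a past election,
# fed back in with that same election's federal and provincial results,
# should return the riding share it was built from.
past_riding, past_federal, past_provincial = 35.0, 30.0, 20.0
rating = past_riding - (X * past_federal + Y * past_provincial)
round_trip = project_share(rating, past_federal, past_provincial, X, Y)

# With different (hypothetical) poll numbers today, the projection moves.
today = project_share(rating, 33.0, 25.0, X, Y)
print(round(round_trip, 2), round(today, 2))
```

The round trip is the whole point of the rearrangement: the rating is whatever is left over after the federal and provincial numbers are accounted for, so adding it back to fresh polling gives a riding-level estimate.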

Other projections use different methods to come to their conclusions. A lot of seat projections don't go riding by riding and just use regional and federal polling to estimate the number of seats a party is going to pick up based on previous experience. When there is a riding-by-riding component, it is usually based on non-polling data. For instance, when Linda Duncan defeated Rahim Jaffer in Edmonton--Strathcona, some projections predicted that result, which would only have been discernible from the facts on the ground or from a local poll commissioned because of the local events.

Since I have neither the ambition to keep tabs on 308 races nor access to local polling, I pretty much put my numbers out without any changes. I can't tell, sitting in Toronto, whether a star candidate will be a success (see Thomas Mulcair) or a failure (see Glen Murray), so I don't try to guess. The projections that I've put out so far have only two changes from what my numbers tell me, and I've noted them before. The first is in Cumberland--Colchester--Musquodoboit Valley, where Bill Casey's retirement makes my projection of 40.44% for an independent candidate look silly. The second is in Nunavut, where there is no regional polling and there was a massive change in voter preference last time out.

As I've mentioned previously, my model seems to work best when riding results are fairly consistent with federal and provincial trends. If a party's fortunes rise or fall massively, the riding becomes more difficult to assess. This is more true if the change was between 2006 and 2008 than if it was between 2004 and 2006, because of the heavy weight I give to the 2008 result. Thus, there are probably ten or so ridings where I don't really trust my numbers. Outremont and Edmonton--Strathcona spring to mind.

Whether or not my model will prove accurate on election day is not yet tested. After the next election, I'll put out an accuracy measure. Well, probably two accuracy measures. One will be based on my last projection before the election, which will rely on pre-election polling. For the other, I'd like to insert the actual federal and provincial election results into my model and see what that would have produced compared to the actual riding results. While that may be a little bit of revisionist history, it does more accurately isolate my model from the combined accuracy of our various public polling firms. Until I can test the model with an election, you'll just have to take my projections for what they are.

Finally, if you're interested (and if you've read this far you might be), here's the data for the Conservative-held riding of Pontiac (QC) from my most recent projection as a random example:

Riding Rating:
CPC +2.98
LPC -1.59
NDP -0.42
GPC -0.06
BQ -12.57

Federal Poll (from threehundredeight):
CPC 33.2
LPC 32.1
NDP 15.8
GPC 9.3
BQ 9.2

Quebec Poll (from threehundredeight):
CPC 16.1
LPC 30
NDP 10.9
BQ 36.8

Projected Vote Share:
CPC 25.92
LPC 29.25 *
NDP 12.44
GPC 7.26
BQ 24.23

* Projected winner
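
For readers who want to check the arithmetic, the Pontiac figures above can be run through the projection formula in a few lines of Python. One loud caveat: the federal and provincial weights are not stated anywhere in this post. The 40% federal / 60% provincial split used below is simply a pair of values that reproduces the published CPC, LPC and NDP numbers, and the BQ figure matches the riding rating plus the Quebec poll alone, so the sketch treats the BQ that way. Both are inferences from the numbers, not a confirmed description of the model. The GPC is left out because no Quebec figure for it is listed.

```python
# Figures copied from the Pontiac (QC) example above.
rating = {"CPC": 2.98, "LPC": -1.59, "NDP": -0.42, "BQ": -12.57}
federal = {"CPC": 33.2, "LPC": 32.1, "NDP": 15.8, "BQ": 9.2}
quebec = {"CPC": 16.1, "LPC": 30.0, "NDP": 10.9, "BQ": 36.8}

# ASSUMPTION: weights inferred from the published numbers, not stated in the post.
X_FEDERAL = 0.4
Y_PROVINCIAL = 0.6


def project(party):
    """Projected Pontiac vote share for one party under the assumed weights."""
    if party == "BQ":
        # ASSUMPTION: the BQ figure matches rating + Quebec poll only.
        return rating[party] + quebec[party]
    return rating[party] + X_FEDERAL * federal[party] + Y_PROVINCIAL * quebec[party]


projection = {party: round(project(party), 2) for party in rating}
winner = max(projection, key=projection.get)

for party, share in projection.items():
    marker = " *" if party == winner else ""
    print(f"{party} {share:.2f}{marker}")
```

Under those assumptions the four projected shares come out the same as the ones listed above, with the LPC ahead.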

I hope that clarifies things for everyone. I'll try to get a new projection out in the next few days (assuming Eric at threehundredeight updates Thursday or Friday). No promises, between the by-election tomorrow and the holiday this weekend.

