Earlier in this chapter, we used preelection polls with
a probability model to predict Obama’s electoral vote share in the 2008 US
election. In this exercise, we will apply a similar procedure to the Intrade
betting market data analyzed in an exercise in chapter 4 (see section 4.5.1).
The variable names and descriptions of the 2008 Intrade data set appear in table 4.9. Recall that each row of
the data set represents daily trading information about the contracts for
either the Democratic or Republican nominee’s victory in a particular state.
The 2008 election results data are available as pres08.csv, with variable names
and descriptions appearing in table 4.1.
1. We analyze the contract of the Democratic Party nominee winning a given
state $j$. Recall from section 4.5.1 that the data set contains the contract
price of the market for each state on each day $i$ leading up to the election.
We will interpret PriceD as the probability $p_{ij}$ that the Democrat would
win state $j$ if the election were held on day $i$. To treat PriceD
as a probability, divide it by 100 so it ranges from 0 to 1. How accurate is
this probability? Using only the data from the day before Election Day
(November 4, 2008) within each state, compute the expected number of electoral
votes Obama is predicted to win and compare it with the actual number of
electoral votes Obama won. Briefly interpret the results. Recall that the
actual number of electoral votes for Obama is 365, not 364, which is the sum of
electoral votes for Obama based on the results data. The total of 365 includes
a single electoral vote that Obama garnered from Nebraska’s 2nd Congressional
District. McCain won Nebraska’s 4 other electoral votes because he won the
state overall.
data(pres08, package = "qss")
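The expected total in question 1 follows from linearity of expectation: it is the sum over states of the win probability times the electoral votes at stake. The following is a minimal sketch of that arithmetic only, assuming three made-up states; the data frame `toy` and its values are hypothetical and are not drawn from the Intrade or election results data.

```r
# Hypothetical toy inputs: three made-up states, not the real data.
toy <- data.frame(
  state  = c("A", "B", "C"),
  PriceD = c(90, 50, 10),   # contract price on a 0-100 scale
  EV     = c(10, 20, 30)    # electoral votes at stake
)

# Convert the price to a probability between 0 and 1.
toy$prob <- toy$PriceD / 100

# Expected electoral votes: sum over states of probability times votes.
expected_ev <- sum(toy$prob * toy$EV)
expected_ev
```

With the real data, the same calculation would first need the market prices restricted to a single day and matched to each state's electoral votes.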
2. Next, using the same set of probabilities used
in the previous question, simulate the total number of electoral votes Obama is
predicted to win. Assume that the election in each state is a Bernoulli trial
where the probability of success (Obama winning) is $p_{ij}$, the PriceD-based
probability from the previous question. Display the results using
a histogram. Add the actual number of electoral votes Obama won as a solid
line. Briefly interpret the result.
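One way to set up such a simulation is sketched below: each simulated election draws one Bernoulli outcome per state with `rbinom()` and sums the electoral votes of the states won. The probabilities, vote counts, number of simulations, seed, and the position of the vertical line are all hypothetical placeholders chosen for illustration.

```r
set.seed(1234)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy inputs, not the real data.
prob <- c(0.9, 0.5, 0.1)   # assumed win probability per state
EV   <- c(10, 20, 30)      # assumed electoral votes per state
sims <- 1000               # number of simulated elections

# Each simulated election draws one Bernoulli outcome per state
# and adds up the electoral votes of the states that were won.
total_ev <- rep(NA, sims)
for (s in 1:sims) {
  win <- rbinom(length(prob), size = 1, prob = prob)
  total_ev[s] <- sum(win * EV)
}

hist(total_ev, freq = FALSE, xlab = "Simulated electoral votes",
     main = "Toy simulation")
abline(v = 40, lwd = 2)  # placeholder value standing in for an observed total
```

The mean of `total_ev` should be close to the analytical expectation `sum(prob * EV)`, which gives a quick sanity check on the simulation.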
3. In prediction markets, people tend to exaggerate
the likelihood that the trailing or “long shot” candidate will win. This means
that candidates with a low (high) $p_{ij}$ have a true probability that is
lower (higher) than their predicted $p_{ij}$. Such a discrepancy could
introduce bias into our predictions, so we want to adjust our probabilities to
account for it. We do so by reducing the probability for candidates who have a
less than 0.5 chance of winning and increasing it for those who have a greater
than 0.5 chance. We will calculate a new probability
$p^*_{ij} = \Phi(1.64 \times \Phi^{-1}(p_{ij}))$, where $\Phi(\cdot)$ is the
CDF of a standard normal random variable and $\Phi^{-1}(\cdot)$ is its inverse,
the quantile function. The R functions pnorm() and qnorm() can be used to
compute $\Phi(\cdot)$ and $\Phi^{-1}(\cdot)$, respectively. Plot $p_{ij}$, used
in the previous questions, against $p^*_{ij}$. In addition, plot this
function itself as a line. Explain the nature of the transformation.
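A probit-scale adjustment of this kind can be sketched with `pnorm()` and `qnorm()` as follows. The helper name `adjust_prob`, the input probabilities, and the default multiplier `k = 1.64` are assumptions made for illustration; any `k` greater than 1 pushes probabilities away from 0.5 while leaving 0.5 itself unchanged.

```r
# Stretch a probability on the probit scale: p* = pnorm(k * qnorm(p)).
# k = 1.64 is an assumed multiplier; any k > 1 pushes p away from 0.5.
adjust_prob <- function(p, k = 1.64) {
  pnorm(k * qnorm(p))
}

p <- c(0.1, 0.25, 0.5, 0.75, 0.9)   # hypothetical market probabilities
p_star <- adjust_prob(p)

plot(p, p_star, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Market probability", ylab = "Adjusted probability")
curve(adjust_prob(x), from = 0, to = 1, add = TRUE)  # the function itself
abline(0, 1, lty = "dashed")                          # 45-degree reference
```

The dashed 45-degree line is an optional reference: points below it were adjusted downward and points above it upward.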
4. Using the new probabilities $p^*_{ij}$, repeat questions 1 and 2.
Do the new probabilities improve predictive performance?
5. Compute the expected number of Obama’s electoral
votes using the new probabilities $p^*_{ij}$ for each of the last 120 days of
the campaign. Display the results as a time-series plot. Briefly interpret the plot.
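The per-day calculation amounts to grouping state-level rows by date and summing probability times electoral votes within each group. Below is a minimal sketch of that grouping step, assuming a long-format data frame with one row per state per day; the data frame `toy`, its column names, and its values are hypothetical.

```r
# Hypothetical long-format toy data: one row per state per day.
toy <- data.frame(
  day   = rep(as.Date("2008-11-01") + 0:2, each = 2),
  state = rep(c("A", "B"), times = 3),
  prob  = c(0.6, 0.4, 0.7, 0.3, 0.8, 0.2),  # assumed adjusted probabilities
  EV    = rep(c(10, 20), times = 3)
)

# Expected electoral votes for each day: sum of prob * EV within the day.
toy$exp_ev <- toy$prob * toy$EV
daily <- aggregate(exp_ev ~ day, data = toy, FUN = sum)

plot(daily$day, daily$exp_ev, type = "l",
     xlab = "Date", ylab = "Expected electoral votes")
```

`aggregate()` is one of several options here; `tapply()` or an explicit loop over dates would give the same daily totals.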
6. For each of the last 120 days of the campaign,
conduct a simulation as in question 2, using the new probabilities
$p^*_{ij}$. Compute the quantiles of
Obama’s electoral votes at 2.5% and 97.5% for each day. Represent the range
from 2.5% to 97.5% for each day as a vertical line, using a loop. Also, add the
estimated total number of Obama’s electoral votes across simulations. Briefly
interpret the result.
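The day-by-day simulation can be organized as a loop that, for each day, simulates many elections, records the 2.5% and 97.5% quantiles and the mean, and then draws one vertical segment per day. The sketch below assumes a small matrix of probabilities with days in rows and states in columns; all inputs, the seed, and the number of simulations are hypothetical.

```r
set.seed(1234)  # arbitrary seed for reproducibility

# Hypothetical toy inputs: rows are days, columns are states.
prob <- rbind(c(0.6, 0.4, 0.5),
              c(0.7, 0.3, 0.5),
              c(0.8, 0.2, 0.5))
EV     <- c(10, 20, 30)
sims   <- 1000
n_days <- nrow(prob)

lower <- upper <- avg <- rep(NA, n_days)
for (d in 1:n_days) {
  # sims x states matrix of Bernoulli draws for day d
  win <- matrix(rbinom(sims * length(EV), size = 1,
                       prob = rep(prob[d, ], each = sims)),
                nrow = sims)
  total <- as.vector(win %*% EV)   # electoral votes in each simulation
  lower[d] <- quantile(total, 0.025)
  upper[d] <- quantile(total, 0.975)
  avg[d]   <- mean(total)
}

plot(1:n_days, avg, ylim = c(0, sum(EV)), pch = 19,
     xlab = "Day", ylab = "Simulated electoral votes")
for (d in 1:n_days) {
  lines(c(d, d), c(lower[d], upper[d]))  # 2.5% to 97.5% range
}
```

Drawing the state outcomes as a matrix avoids a nested loop over simulations; a plain double loop would work equally well for a data set of this size.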