Consider a regression of y on x1, x2 and x3. You are told that x1 and x3 are positively
correlated but x2 is uncorrelated with the other two variables.
(a) [3] What, if anything, can you say about the relative magnitudes of the estimated
coefficients on each of the three explanatory variables?
(b) [6] What, if anything, can you say about the precision with which we can estimate
these coefficients?
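
As a reference point for questions about precision, the standard expression for the sampling variance of an OLS slope estimator can be written as below. This is a sketch that assumes homoskedastic errors and the usual Gauss-Markov conditions, which the question itself does not state; the symbols $SST_j$ and $R_j^2$ are introduced here for illustration.

```latex
\operatorname{Var}(\hat{\beta}_j)
  = \frac{\sigma^2}{SST_j \,\bigl(1 - R_j^2\bigr)},
\qquad
SST_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2
```

Here $\sigma^2$ is the error variance and $R_j^2$ is the R-squared from regressing $x_j$ on the other explanatory variables. Correlation among regressors enters only through $R_j^2$, and the spread of $x_j$ in the sample enters only through $SST_j$.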

  2. Consider a regression of y on two explanatory variables, x1 and x2, which are potentially
    correlated (though not perfectly). Say that x1 can take on any value between 1 and
    [...]. A researcher draws a random sample of observations, with information on y, x1
    and x2. She runs a regression on this sample, which we refer to as regression A.
    She then takes the subset of the data where x1 is restricted to only take values between
    1 and 50, but there is no restriction on x2. She runs another regression, which we refer
    to as regression B.
    (a) [4] Do you expect the estimated coefficients to differ between regressions A and
    B? Explain.
    (b) [5] Do you expect any difference in the precision of the estimated coefficients
    between regressions A and B? Explain.
  3. [5] Consider a regression on three explanatory variables, x1, x2 and x3. Consider two
    possible F-tests for
    (a) the joint significance of x2 and x3.
    (b) the joint significance of x1, x2 and x3.
    In which of these is the null hypothesis more likely to be rejected? Provide both an
    intuitive and a mathematical explanation.
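
For the mathematical part, one common way to write the F-statistic for $q$ exclusion restrictions is sketched below. It assumes the model includes an intercept and $k = 3$ slope coefficients, so that $q = 2$ in case (a) and $q = 3$ in case (b); the subscripts $r$ and $ur$ (restricted and unrestricted) are notation introduced here.

```latex
F = \frac{(SSR_{r} - SSR_{ur})/q}{SSR_{ur}/(n - k - 1)}
  = \frac{(R^2_{ur} - R^2_{r})/q}{(1 - R^2_{ur})/(n - k - 1)}
```

The unrestricted model is the same in both tests, so the comparison turns on how much the restricted sum of squared residuals rises relative to the number of restrictions imposed.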
  4. [10] A researcher has data on a number of different airline routes. In particular, for
    each route, he observes the average fare on the route, avfare, and the number of airlines
    operating on the route, carriers. The researcher would like to obtain a measure of
    consumer demand on each route, but is unable to access this variable. He is concerned
    that the lack of this variable will cause a simple regression of avfare on carriers to
    generate an estimate that is biased downward. Do you agree?
    In order to answer this question you will need to make assumptions based on economic
    analysis. State your assumptions clearly, even if you are unsure about them, and then
    explain whether the slope coefficient will indeed be biased, and if so, how.
  5. Suppose you need to estimate a regression using matrix algebra. A potential X matrix
    of explanatory variables is as shown below. There is also a [5 × 1] vector, y, of the
    dependent variable that is not shown.
    x1   x2   x3   x4   x5
     2    4    8   52   44
     2    7   14   47   48
     3    2    4   51   23
     6    0    0   49   47
     8    6   12   47   58
    (a) [4] Suppose you want to regress y on only x1, x2 and x3. Explain what problems,
    if any, you would encounter in doing so.
    (b) [4] Suppose you want to regress y on x1, x2 and x4. Explain whether you would
    encounter any problem doing so. If not, describe in detail how you would go
    about using matrix algebra to do so.
    (c) [4] Suppose you want to regress y on x1, x2, x4 and x5. However, a colleague points
    out that this is not a square matrix. Explain whether this is a valid concern. If
    not, describe in detail how you would go about using matrix algebra to do so.
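
A quick way to see whether X'X can be inverted is to compute its determinant for the chosen columns. The sketch below does this with plain Python and exact integer arithmetic. It assumes no intercept column, purely for brevity, and the helper names `gram` and `det` are hypothetical, introduced only for this illustration.

```python
# Columns copied from the X matrix in question 5.
x1 = [2, 2, 3, 6, 8]
x2 = [4, 7, 2, 0, 6]
x3 = [8, 14, 4, 0, 12]
x4 = [52, 47, 51, 49, 47]


def gram(columns):
    """Return X'X for a design matrix supplied as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]


def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, pivot in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * pivot * det(minor)
    return total


# Assumption: no intercept column is included in either design matrix.
det_123 = det(gram([x1, x2, x3]))
det_124 = det(gram([x1, x2, x4]))

print("x3 == 2 * x2 in every row:", all(c == 2 * b for b, c in zip(x2, x3)))
print("det of X'X for (x1, x2, x3):", det_123)
print("det of X'X for (x1, x2, x4):", det_124)
```

When the determinant is nonzero, the OLS coefficients follow from the usual formula (X'X)^(-1) X'y, which requires only that X'X be square and invertible, not that X itself be square.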
  6. [5] Consider the results from regressing the log of wages, lwage on years of education,
    educ, and years spent in the workforce, exper:
    lwage = 0.532 + 0.094 educ + 0.026 exper    (1)
    Suppose that each additional year of education must necessarily reduce workforce experience by one year. What is the marginal effect of an additional year’s education on
    wages? [Use the exact, rather than the approximate, percentage interpretation.]
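
The difference between the approximate and exact percentage interpretations in a log-level model can be sketched numerically. The snippet below takes one reading of the question, in which an extra year of education changes log wages by the coefficient on educ minus the coefficient on exper; the variable names are hypothetical.

```python
import math

b_educ = 0.094   # coefficient on educ in equation (1)
b_exper = 0.026  # coefficient on exper in equation (1)

# Assumption: one more year of education means exactly one fewer year of experience.
net_log_change = b_educ - b_exper

# Approximate interpretation: 100 * change in log wage.
approx_pct = 100 * net_log_change
# Exact interpretation: 100 * (exp(change in log wage) - 1).
exact_pct = 100 * (math.exp(net_log_change) - 1)

print(f"approximate: {approx_pct:.2f}%   exact: {exact_pct:.2f}%")
```

The two figures diverge more as the change in the log of the dependent variable gets larger, which is why the question specifies which one to report.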
  7. [5] Consider using a one-tailed as well as a two-tailed test of a null hypothesis regarding an estimated regression coefficient. For the same significance level, which test is
    more stringent? In other words, for which test does rejection of the null hypothesis
    automatically imply rejection in the other test as well, but not vice versa? Explain.
  8. Explain whether, and how, the critical value of a t-test, for a given significance level
    is affected by:
    (a) [3] The number of observations.
    (b) [3] The number of explanatory variables.
  9. Consider the following OLS model of women’s labour force participation:
    inlf = β0 + β1 kids0_2 + β2 kids2_6 + β3 educ + β4 faminc
    where inlf is a dummy variable for whether the woman is in the labour force, kids0_2
    is the number of children aged between 0 and 2, kids2_6 is the number of children
    between the ages of 2 and 6, educ is the number of years of education of the woman
    and faminc is the family’s total income.
    Suppose a researcher is interested in testing whether the presence of infants (age 0 to 2)
    has the same effect on being in the labour force as the presence of children aged between
    2 and 6. The alternative is one-sided: that infants have a greater disincentive
    effect than slightly older children.
    (a) [4] Write down the null and alternative hypotheses formally.
    (b) [8] Can you directly test the null hypothesis if given the coefficients and standard
    errors from this regression? If so, explain how. If not, explain what you would
    need to do instead.
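
One way to set this up is sketched below. It assumes that both child-count coefficients are negative, so that a "greater disincentive" for infants means a more negative $\beta_1$; the parameter $\theta$ is notation introduced here.

```latex
H_0:\ \beta_1 = \beta_2
\qquad
H_1:\ \beta_1 < \beta_2

t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{\operatorname{se}(\hat{\beta}_1 - \hat{\beta}_2)},
\qquad
\operatorname{se}(\hat{\beta}_1 - \hat{\beta}_2)
  = \sqrt{\widehat{\operatorname{Var}}(\hat{\beta}_1)
        + \widehat{\operatorname{Var}}(\hat{\beta}_2)
        - 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_1, \hat{\beta}_2)}

\text{With } \theta = \beta_1 - \beta_2:\quad
\mathit{inlf} = \beta_0 + \theta\,\mathit{kids0\_2}
  + \beta_2\,(\mathit{kids0\_2} + \mathit{kids2\_6}) + \dots
```

The covariance term is not recoverable from the two reported standard errors alone. In the reparametrized regression, the coefficient on kids0_2 is $\theta$ itself, so its reported standard error can be used directly in a one-sided t-test.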
  10. Suppose you have data on a sample of recent house sales, for each of which you observe
    the house price, price, the square feet of the house, sqrft, and a dummy variable for
    whether the house has a garage or not, garage. You want to run a regression of the
    log of the house price, with the other two variables as explanatory variables.
    (a) [4] Briefly explain what signs you expect to find on the two slope coefficients.
    (b) [3] Do you expect a positive or negative correlation between sqrft and garage?
    Explain your reasoning.
    (c) [6] Suppose you run the regression, and also include an interaction between the
    two right-hand side variables, and obtain the following results (standard errors
    not shown):
    log(price) = 2.465 + 0.64 log(sqrft) + 0.08 garage + 0.011 log(sqrft) × garage    (2)
    Calculate the marginal effect of having a garage for a 2000 square foot house.
    (d) [6] Suppose you also now obtain data on coveredlot, which is the total built upon
    area of the lot. In other words this is the sum of the square footage of the house
    and of the garage. Would it be sensible to add this variable to the regression
    above? What problems, if any, might you encounter in estimating the magnitude
    or precision of the coefficient on this new variable?
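
With an interaction term, the effect of the garage dummy depends on the size of the house. The sketch below evaluates it at 2000 square feet using the coefficients in equation (2). It assumes that "log" denotes the natural logarithm, and the variable names are hypothetical.

```python
import math

b_garage = 0.08        # coefficient on garage in equation (2)
b_interaction = 0.011  # coefficient on log(sqrft) x garage in equation (2)
sqrft = 2000

# Assumption: "log" in equation (2) is the natural logarithm.
# Change in log(price) when garage switches from 0 to 1 at this house size.
log_effect = b_garage + b_interaction * math.log(sqrft)

# Exact percentage change in price implied by that change in log(price).
exact_pct = 100 * (math.exp(log_effect) - 1)

print(f"change in log(price): {log_effect:.4f}")
print(f"exact percentage change in price: {exact_pct:.2f}%")
```

Because garage is a dummy, the "marginal effect" here is a discrete comparison between otherwise identical houses with and without a garage, not a derivative.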
  11. [8] You have monthly data on gasoline prices in two cities—Vancouver and Toronto,
    for the years 2006–2010. In each month of each year, you observe the average price
    of gasoline in each city. Prices in Vancouver are usually higher than in Toronto, but
    the cities follow similar price trends, as prices rise in the summer months and respond
    similarly to demand and cost shocks. However, there are month-to-month fluctuations
    for various reasons.
    Starting from January 1, 2008, Vancouver imposed a carbon tax which was expected
    to be reflected in higher gasoline prices. Explain how you would use a difference-in-differences
    framework to estimate the effect of the carbon tax. Carefully define any
    new variables you need based on the data provided. Then, write down a line of R
    code which will run the regression you need. Make sure you point out which regression
    coefficient is the desired estimate.
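
A generic difference-in-differences specification for this kind of two-city, before-and-after setting can be sketched as follows. The variable names $\mathit{vancouver}$ and $\mathit{post}$ are hypothetical and not part of the data description: the first is a dummy equal to 1 for Vancouver observations, the second a dummy equal to 1 for months from January 2008 onward.

```latex
\mathit{price}_{ct} = \beta_0
  + \beta_1\,\mathit{vancouver}_c
  + \beta_2\,\mathit{post}_t
  + \beta_3\,(\mathit{vancouver}_c \times \mathit{post}_t)
  + u_{ct}
```

Under the assumption that the two cities would have followed parallel price trends without the tax, $\beta_3$ is the difference-in-differences estimate. In R's formula syntax, a term of the form `vancouver * post` passed to `lm()` expands to both main effects plus their interaction.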