A1 Consider a three-period panel regression model

$$Y_{i1} = \beta_0 + \beta_1 X_{i1} + \alpha_i + \varepsilon_{i1}$$

$$Y_{i2} = \beta_0 + \beta_1 X_{i2} + \alpha_i + \varepsilon_{i2}$$

$$Y_{i3} = \beta_0 + \beta_1 X_{i3} + \alpha_i + \varepsilon_{i3}$$

where $Y_{it}$ and $X_{it}$ denote the values of an outcome and a regressor, respectively, for the $i$th individual in period $t$, with $i = 1, \dots, n$ and $t = 1, 2, 3$; $\alpha_i$ is an individual-specific unobserved effect, and $\varepsilon_{it}$ is an unobserved error term. Assume that we have $n$ individuals randomly sampled from the population.

(a) Suppose we pool all the observations so that there are 3 observations per individual, corresponding to the 3 time periods, and thus a total of $3n$ observations altogether. Now, suppose we regress $Y_{it}$ on $X_{it}$ (including an intercept) using all $3n$ observations. Under what assumption will this pooled OLS estimator give us consistent estimates of $\beta_1$?

(b) Now consider the second-differenced estimator obtained by regressing $(Y_{i3} - Y_{i2}) - (Y_{i2} - Y_{i1})$ on the regressor $(X_{i3} - X_{i2}) - (X_{i2} - X_{i1})$. Under what conditions will this estimator be unbiased for the causal effect of $X$ on $Y$?

(c) Under what condition would you prefer the estimator in part (a) over the estimator in part (b)?
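As a sketch of the algebra behind the second-differenced regression, and assuming only the three model equations above (common $\beta_0$, $\beta_1$ and a time-invariant $\alpha_i$), differencing twice gives the following. It is offered as a worked step, not as a model answer.

```latex
\begin{align*}
Y_{i3} - Y_{i2} &= \beta_1 (X_{i3} - X_{i2}) + (\varepsilon_{i3} - \varepsilon_{i2}), \\
Y_{i2} - Y_{i1} &= \beta_1 (X_{i2} - X_{i1}) + (\varepsilon_{i2} - \varepsilon_{i1}), \\
(Y_{i3} - Y_{i2}) - (Y_{i2} - Y_{i1})
  &= \beta_1 \bigl[(X_{i3} - X_{i2}) - (X_{i2} - X_{i1})\bigr]
   + (\varepsilon_{i3} - 2\varepsilon_{i2} + \varepsilon_{i1}).
\end{align*}
```

Both $\beta_0$ and $\alpha_i$ already cancel in the first differences, so the second difference contains neither term.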

A2 Consider the regression model $Y_i = \beta_0 + \beta_1 X_i + U_i$ for an I.I.D. sample with $N = 1000$ observations. Suppose $U_i \sim \text{I.I.D.}(0, \sigma^2)$ and $X_i$ are I.I.D. for $i = 1, 2, \dots, 1000$, and that $X_i$ is independent of $U_i$. Let $\hat\beta_1$ denote the OLS estimator of $\beta_1$, and consider another estimator $\tilde\beta_1$ of $\beta_1$, constructed in the following way:

$$\tilde\beta_1 = \frac{Y_3 + Y_1 - 2Y_2}{X_3 + X_1 - 2X_2}$$

You can assume that the $X_i$ are continuously distributed and that $X_3 + X_1 - 2X_2$ never takes the value 0.

(a) Is $\tilde\beta_1$ an unbiased estimator of $\beta_1$? Why?

(b) Can $\tilde\beta_1$ be a better estimator than the OLS estimator? Why?

(c) Is $\tilde\beta_1$ a consistent estimator of $\beta_1$? Why?

A3 Consider the regression model for whether a student gets a first class mark in the final examination:

$$First_i = \begin{cases} 1 & \text{if } \beta_0 + \beta_1 Male_i + U_i > 0, \\ 0 & \text{otherwise} \end{cases}$$

for an I.I.D. sample with 331 observations, where $Male_i$ is a dummy variable which equals 1 if student $i$ is male and is zero otherwise, and $First_i$ is a dummy variable which equals 1 if student $i$ got a first class mark and is zero otherwise. Table 1 at the end of this question gives the distribution of First by gender.

(a) Suppose the unobserved component $U_i$ is independent of $Male_i$ and follows a normal distribution with mean 0 and variance 1. What is the probability of getting a first class degree for male and for female students in terms of $\beta_0$ and $\beta_1$?

(b) Given your answer to part (a), how would you estimate $\beta_0$ and $\beta_1$ based on the information provided in Table 1? (Hint: remember that the cumulative distribution function for the normal distribution is strictly increasing.)

(c) Is it reasonable to assume that $U_i$ is independent of $Male_i$?

Table 1

|           | Female | Male | Total |
|-----------|--------|------|-------|
| First     | 19     | 34   | 53    |
| Not First | 97     | 181  | 278   |
| Total     | 116    | 215  | 331   |
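One possible route for part (b) can be sketched numerically. It assumes that the sample proportions of firsts among women and men are used as estimates of $\Phi(\beta_0)$ and $\Phi(\beta_0 + \beta_1)$, which are then inverted. The variable names are illustrative and not part of the question.

```python
from statistics import NormalDist

# Counts read off Table 1 (firsts and totals by gender).
first_female, total_female = 19, 116
first_male, total_male = 34, 215

# Sample proportions, used here as estimates of Phi(beta0) and Phi(beta0 + beta1).
p_female = first_female / total_female
p_male = first_male / total_male

# The standard normal CDF is strictly increasing, so it can be inverted.
std_normal = NormalDist()  # mean 0, variance 1
beta0_hat = std_normal.inv_cdf(p_female)
beta1_hat = std_normal.inv_cdf(p_male) - beta0_hat

print(f"beta0_hat = {beta0_hat:.3f}, beta1_hat = {beta1_hat:.3f}")
```

Because the model has one parameter per gender cell, plugging the estimates back into the normal CDF reproduces the two sample proportions.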

A4 The following problem concerns the phenomenon of "job-lock" in the United States, where employees cannot leave their present job because of employer-provided health insurance. If they leave the present job and take up a new job, then the new health insurance will not cover them for pre-existing health conditions, leading to a job-lock. In order to study this phenomenon empirically, a researcher estimates the following probit model on male workers in the construction industry who are between 35 and 45 years old and ethnically white:

$$\Pr(Y_i = 1 \mid HI_i) = \Phi(\alpha_0 + \alpha_1 HI_i)$$

where $Y_i$ is a dummy variable which equals 1 if individual $i$ changed jobs in 2013 and is zero otherwise, $HI_i$ is a dummy which equals 1 if individual $i$ was covered by employer-provided health insurance in 2012 and is zero otherwise, and $\Phi(\cdot)$ is the standard normal cumulative distribution function.

(a) How would you test whether there is any job-lock, based on this equation? Can you think of a reason why this test may not be a satisfactory indicator of job-lock?

(b) Now let $P_i$ be a dummy variable taking the value 1 if individual $i$ has a chronic medical condition which will not be covered by the new insurance plan if he changed jobs, and zero otherwise. Consider the equation

$$\Pr(Y_i = 1 \mid HI_i, P_i) = \Phi(\beta_0 + \beta_1 HI_i + \beta_2 P_i + \beta_3 P_i \times HI_i).$$

How would you test the presence of job-lock using estimates of parameters appearing in this equation? Why?

(c) Would you prefer the method based on the equation in part (b) or the one in part (a)? Why?

A5 A researcher considers the following expectations-augmented Phillips curve

$$\pi_t - \pi_t^e = \beta (U_t - \mu) + e_t,$$

where $\pi_t$ is inflation, $\pi_t^e$ is expected inflation with the expectation formed in year $t-1$, $U_t$ is the unemployment rate, $\mu$ is the natural rate of unemployment, and $e_t$ is a supply shock. The researcher has data on UK annual CPI inflation and the unemployment rate for the period 1989-2014. OLS regression of $\Delta\pi_t = \pi_t - \pi_{t-1}$ on $U_t$ gives

$$\widehat{\Delta\pi}_t = \underset{(0.89)}{1.61} - \underset{(0.12)}{0.25}\, U_t, \quad R^2 = 0.1535, \quad T = 25$$

with standard errors in parentheses.

(a) Explain how the model estimated by OLS is related to the expectations-augmented Phillips curve. What would be your estimate of $\mu$?

(b) Under what assumptions is the OLS estimator of the coefficient on $U_t$ unbiased? Discuss the validity of these assumptions.

(c) Test the hypothesis that $U_t$ does not affect $\Delta\pi_t$ using a 5% significance level test. Give two reasons why the results of your test might be unreliable.
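The arithmetic behind parts (a) and (c) can be sketched as follows. The sketch assumes that expected inflation equals last year's inflation, $\pi_t^e = \pi_{t-1}$, so that the estimated intercept corresponds to $-\beta\mu$ and the slope to $\beta$. The variable names are illustrative.

```python
# Reported OLS output for the regression of the change in inflation on U_t.
intercept_hat = 1.61
slope_hat = -0.25
slope_se = 0.12
n_obs = 25

# Under pi_t^e = pi_{t-1}: delta_pi_t = -beta*mu + beta*U_t + e_t,
# so intercept = -beta*mu and therefore mu = -intercept / beta.
mu_hat = -intercept_hat / slope_hat

# t-ratio for the null hypothesis that the coefficient on U_t is zero.
t_stat = slope_hat / slope_se
dof = n_obs - 2  # two estimated coefficients

print(f"mu_hat = {mu_hat:.2f}, t = {t_stat:.3f} with {dof} degrees of freedom")
```

The t-ratio would then be compared with the two-sided 5% critical value of a t distribution with the printed degrees of freedom, which itself relies on the classical assumptions that part (c) asks you to question.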

A6 A researcher studies demand for cash in the UK. She runs a regression of the logarithm of the cash in circulation, $\log M_t$, on the logarithm of nominal household final consumption expenditure, $\log C_t$, using quarterly data from 1985q1 to 2006q1. The OLS results (standard errors in parentheses) are:

$$\widehat{\log M_t} = \underset{(0.26)}{0.596} + \underset{(0.026)}{1.10}\, \log C_t, \quad R^2 = 0.9570, \quad T = 85$$

(a) How would you interpret the estimated coefficient on $\log C_t$?

(b) A colleague points out that both $\log M_t$ and $\log C_t$ contain a clear time trend. What might be the consequences of this fact for the validity of the above regression results?

(c) Reestimating the regression with a time trend gives

$$\widehat{\log M_t} = \underset{(1.01)}{20.08} + \underset{(0.0015)}{0.03}\, t - \underset{(0.11)}{0.96}\, \log C_t, \quad R^2 = 0.9923, \quad T = 85 \qquad (1)$$

so that the researcher becomes very puzzled about the sign of the estimated coefficient on $\log C_t$. The colleague gets the residuals $\hat u_t$ from (1), and obtains the following OLS result:

$$\widehat{\Delta \hat u_t} = \underset{(0.001)}{0.0006} - \underset{(0.030)}{0.064}\, \hat u_{t-1}$$

where $\Delta \hat u_t = \hat u_t - \hat u_{t-1}$. What does this result tell us about the validity of (1)? Can you propose a better specification for the regression model describing the demand for cash?

A7 In order to understand the relation between TV watching and obesity among children, a researcher estimates the following equation by OLS, with standard errors reported in parentheses:

$$ltvyest_i = \underset{(0.02)}{0.99} + \underset{(0.001)}{0.01}\, bmi_i + \underset{(0.011)}{0.018}\, age_i - \underset{(0.005)}{0.0009}\, agesq_i - \underset{(0.012)}{0.034}\, female_i + U_i$$

where $ltvyest_i$ denotes the log of hours spent watching TV by the $i$th child on the day before the survey was taken, $female_i$ is a dummy which equals 1 if the child is female and is 0 otherwise, $age_i$ is the child's age, $agesq_i$ is the square of the child's age, $bmi_i$ is the child's body mass index (weight for height), and $U_i$ is an unobserved error term.

(a) Assuming the Gauss-Markov assumptions hold, what is the interpretation of the coefficient -0.034 on female?

(b) What do the above estimates imply about how watching TV varies with age, all else being equal?

(c) Provide an example of an omitted variable which would imply that the estimated coefficient of bmi in the above equation is biased for the causal effect of bmi on watching TV.
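Two small calculations may help when thinking about parts (a) and (b). They take the point estimates at face value and ignore sampling uncertainty. The variable names are illustrative.

```python
import math

# Point estimates from the reported regression (dependent variable in logs).
coef_female = -0.034
coef_age = 0.018
coef_agesq = -0.0009

# With a log dependent variable, the exact percentage difference in hours
# between girls and boys, other regressors held fixed, is 100*(exp(b) - 1).
pct_diff_female = 100 * (math.exp(coef_female) - 1)

# The age profile is coef_age*age + coef_agesq*age^2; its slope
# coef_age + 2*coef_agesq*age is zero at the turning point below.
turning_age = -coef_age / (2 * coef_agesq)

print(f"female vs male: {pct_diff_female:.2f}% hours")
print(f"age at which the estimated age profile turns: {turning_age:.1f}")
```

The exact percentage difference is close to 100 times the coefficient itself because the coefficient is small in absolute value.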

SECTION B

B1 The following question pertains to the effect of background characteristics on the probability that an applicant is admitted to study economics at a selective UK university. TSA is an aptitude test with two components, Critical Thinking and Problem Solving, in each of which the maximum possible mark is 100. We draw a random sample of 800 applicants and record their characteristics and whether they were admitted. The summary statistics are reported in Table 2, at the end of this question. Finally, a logit regression of admission on these background characteristics yields the output reported in Table 3, which appears on the next page. Based on this output, please answer the following questions:

(a) A hypothesis test for the joint significance of the two dummy variables and their interaction yields a chi-square test statistic with p-value equal to 0.0028. What can we infer from this?

(b) What is the predicted probability of being admitted for a male, independent school applicant who has scored exactly the average mark in the two TSA components? What is the predicted probability of being admitted for a female, non-independent school applicant who has scored exactly the average mark in the two TSA components?

(c) How would you test whether the difference in probabilities in part (b) is zero?

(d) If your test suggests that the admission probability is lower for male, independent school applicants, would this imply discrimination against this demographic group?

(e) What intercept and slope-coefficient estimates would we get if our dummy regressors were female, independent and their interaction, instead of male, independent and their interaction?

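Table 2

| Variable     | Description                            | Mean  | Min | Max |
|--------------|----------------------------------------|-------|-----|-----|
| got in       | 1 if admitted, 0 otherwise             | 0.37  | 0   | 1   |
| tsa critical | TSA Critical score                     | 68.83 | 44  | 95  |
| tsa problem  | TSA Problem-Solving score              | 61.68 | 36  | 95  |
| indep        | 1 if from indep school, 0 otherwise    | 0.46  | 0   | 1   |
| male         | 1 if male, 0 otherwise                 | 0.61  | 0   | 1   |
| indep male   | indep × male                           | 0.30  | 0   | 1   |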

Table 3

| Y = got in   | Coeff  | Std Error | Coeff/Std Error |
|--------------|--------|-----------|-----------------|
| tsa critical | 0.10   | 0.01      | 8.86            |
| tsa problem  | 0.12   | 0.01      | 9.25            |
| indep        | 0.002  | 0.30      | 0.01            |
| male         | -0.17  | 0.26      | -0.66           |
| indep male   | -0.66  | 0.38      | -1.77           |
| constant     | -14.31 | 1.06      | -13.35          |

Maximized log-likelihood = -372.1331, N = 800, LR chi-sq(5) = 307.91, p-value = 0.00001.
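The calculation requested in part (b) can be sketched as below, assuming the logit index is the linear combination of the Table 3 coefficients and that "average mark" refers to the sample means in Table 2. The dictionary keys and the helper names `logistic` and `predicted_prob` are illustrative and not part of the question.

```python
import math

# Logit coefficients from Table 3.
coef = {
    "constant": -14.31,
    "tsa_critical": 0.10,
    "tsa_problem": 0.12,
    "indep": 0.002,
    "male": -0.17,
    "indep_male": -0.66,
}

# Sample means of the two TSA components from Table 2.
mean_tsa_critical = 68.83
mean_tsa_problem = 61.68


def logistic(z: float) -> float:
    """Logistic CDF, mapping the logit index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predicted_prob(indep: int, male: int) -> float:
    """Predicted admission probability at the mean TSA scores."""
    index = (
        coef["constant"]
        + coef["tsa_critical"] * mean_tsa_critical
        + coef["tsa_problem"] * mean_tsa_problem
        + coef["indep"] * indep
        + coef["male"] * male
        + coef["indep_male"] * indep * male
    )
    return logistic(index)


p_male_indep = predicted_prob(indep=1, male=1)
p_female_nonindep = predicted_prob(indep=0, male=0)
print(f"male, independent: {p_male_indep:.4f}")
print(f"female, non-independent: {p_female_nonindep:.4f}")
```

These are point predictions only. Part (c) concerns the sampling uncertainty of their difference, which this sketch does not address.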

B2 A researcher estimates the following ADL model using OLS:

$$\widehat{vc}_t = \underset{(138.7)}{675.6} + \underset{(0.11)}{0.49}\, vc_{t-1} + \underset{(0.006)}{0.015}\, cl_t - \underset{(0.007)}{0.018}\, cl_{t-1},$$

where $vc_t$ is the monthly number of violent crimes in Cambridgeshire and $cl_t$ is the number of people in East England who claim unemployment benefits during month $t$. The estimates of the standard errors are given in parentheses. The OLS estimates are based on 61 observations, starting from December 2010 up to December 2015. The average number of violent crimes and the average number of unemployment benefit claims over this period were 738 and 89647, respectively.

(a) What is the estimated value of the long-run change in the expected value of violent crimes given a permanent increase in the number of unemployment claims by 1000? Is this estimate economically significant?

(b) Under what assumptions are the OLS estimators of the coefficients of the ADL model consistent and asymptotically normal?

(c) Let $\hat u_t$ be the residuals from the above regression. The OLS regression of $\hat u_t$ on a constant, $vc_{t-1}$, $cl_t$, $cl_{t-1}$ and $\hat u_{t-1}$ yields

$$\widehat{\hat u}_t = \underset{(224.4)}{109.2} + \underset{(0.176)}{0.086}\, vc_{t-1} - \underset{(0.008)}{0.003}\, cl_t + \underset{(0.008)}{0.004}\, cl_{t-1} - \underset{(0.217)}{0.171}\, \hat u_{t-1}$$

What does this tell us about the validity of the original OLS results?

(d) The researcher next estimates the following OLS regression:

$$\widehat{\Delta vc}_t = \underset{(45.3)}{63.7} + \underset{(0.67)}{1.75}\, t - \underset{(0.074)}{0.156}\, vc_{t-1} - \underset{(0.03)}{0.25}\, \Delta vc_{t-1}$$

What does this result imply about the validity of the assumptions that you discussed in (b)?

(e) To forecast $vc_t$, the researcher estimates the following regression

$$\widehat{\Delta^2 vc}_t = \underset{(12.5)}{2.92} - \underset{(0.106)}{0.597}\, \Delta^2 vc_{t-1}$$

where $\Delta^2 vc_t = \Delta vc_t - \Delta vc_{t-1} = vc_t - 2vc_{t-1} + vc_{t-2}$. The values of $vc_t$ for December, November, and October of 2015 were, respectively, 997, 1007, 1068. What is the forecast of $vc_t$ for January 2016?
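The arithmetic for parts (a) and (e) can be sketched as follows. The sketch assumes the usual ADL long-run multiplier (sum of the coefficients on the claims terms divided by one minus the coefficient on lagged crimes) and reads the forecasting regression as a regression of the second difference of $vc_t$ on its own lag with slope $-0.597$. The variable names are illustrative.

```python
# Part (a): long-run effect implied by the ADL estimates.
b_vc_lag = 0.49    # coefficient on vc_{t-1}
b_cl = 0.015       # coefficient on cl_t
b_cl_lag = -0.018  # coefficient on cl_{t-1}

long_run_per_claim = (b_cl + b_cl_lag) / (1 - b_vc_lag)
long_run_effect_1000 = 1000 * long_run_per_claim

# Part (e): one-step forecast from the second-difference regression.
vc_oct, vc_nov, vc_dec = 1068, 1007, 997

# Second difference for December 2015: vc_t - 2*vc_{t-1} + vc_{t-2}.
d2_dec = vc_dec - 2 * vc_nov + vc_oct

# Predicted second difference for January 2016.
d2_jan_hat = 2.92 - 0.597 * d2_dec

# Undo the differencing: vc_t = d2_t + 2*vc_{t-1} - vc_{t-2}.
vc_jan_hat = d2_jan_hat + 2 * vc_dec - vc_nov

print(f"long-run change for +1000 claims: {long_run_effect_1000:.2f}")
print(f"forecast of vc for January 2016: {vc_jan_hat:.1f}")
```

Whether the long-run effect is economically significant is a judgement that the code does not settle; the sample average of 738 violent crimes per month reported above gives a natural scale for comparison.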

B3 We are interested in understanding how married couples' hours of work are related, using data from $n$ randomly sampled households. Consider the simultaneous structural equations

$$hhours_i = \beta_0 + \beta_1\, whours_i + \beta_2\, kids_i + U_i \qquad (1)$$

$$whours_i = \gamma_0 + \gamma_1\, hhours_i + \gamma_2\, kids_i + V_i \qquad (2)$$

Here $hhours_i$ denotes the husband's weekly hours of work in the $i$th sampled household, $whours_i$ denotes the wife's weekly hours of work, $kids_i$ denotes the number of children the couple has, while $U_i$ and $V_i$ denote the error terms. Please answer the following questions.

(a) How would you interpret the coefficient $\gamma_1$? You do not need to discuss any economic model of utility maximization here.

(b) Solve for $hhours_i$ and $whours_i$ in terms of $kids_i$, $U_i$ and $V_i$ by solving the two equations. The resulting equations are called the "reduced form" equations. Explain why an OLS regression of $hhours_i$ on $whours_i$ and $kids_i$ (and a constant) gives us biased estimates of the causal effect of the wife's hours of work on the husband's hours of work.

(c) Under what conditions can we use kids as an instrument for whours and estimate $\beta_1$ by two-stage least squares? Can you write down one situation where these assumptions may fail?

(d) Finally, suppose that the covariance between $U$ and $V$ across households is zero, and suppose kids is a valid IV for whours in equation (1). Can you describe a way to consistently estimate $\gamma_1$, the coefficient of hhours in the second equation, i.e., equation (2)? You will need to consult the reduced form for hhours in part (b) of this question.

(e) Is it reasonable to assume that the covariance between $U$ and $V$ is zero?


