Stats (Batting)

Category: Business and Industry

Date Submitted: 02/11/2011 09:30 PM

SUMMARY

Our initial research question was: how many runs does a baseball player score on average (our dependent variable), given At Bats, Hits, and Strike Outs (our independent variables)? Our initial model was:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$

$Y$ = Runs

$X_1$ = At Bats

$X_2$ = Hits

$X_3$ = Strike Outs

Reaching a conclusion took several steps. We created an additional interaction variable, Hits*At Bats, to test in our regression. We also tested our independent variables against each other for multicollinearity.
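As a rough illustration of these two steps, the sketch below builds the Hits*At Bats variable and computes a pairwise Pearson correlation as a simple multicollinearity check. The player rows are hypothetical placeholders, not the data used in this report, and the 0.8 cutoff is only a common rule of thumb that we assume here for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical season totals for five players (illustration only).
at_bats = [400, 450, 500, 550, 600]
hits = [100, 120, 140, 150, 180]

# Interaction variable: the product of Hits and At Bats for each player.
hits_x_at_bats = [h * ab for h, ab in zip(hits, at_bats)]

# A high absolute correlation between two predictors hints at multicollinearity.
r = pearson(at_bats, hits)
THRESHOLD = 0.8  # assumed rule-of-thumb cutoff, not taken from the report
print(f"corr(At Bats, Hits) = {r:.3f}, flagged = {abs(r) > THRESHOLD}")
```

With real data the same check would be repeated for every pair of independent variables.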

During our research we came across other important batting statistics. We believed that On Base Percentage (OBP) might be significant to our results, so we created two dummy variables to see how batters with an OBP lower than and greater than 0.36 affected average runs.

$X_4$ = 1 if OBP < 0.36, 0 otherwise

$X_5$ = 1 if OBP > 0.36, 0 otherwise
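The dummy coding above can be sketched in a few lines. The function name `obp_dummies` is our own label for illustration. Note that, as the variables are defined, a batter with an OBP of exactly 0.36 receives 0 for both.

```python
def obp_dummies(obp, cutoff=0.36):
    """Return (x4, x5): x4 flags OBP below the cutoff, x5 flags OBP above it."""
    x4 = 1 if obp < cutoff else 0
    x5 = 1 if obp > cutoff else 0
    return x4, x5


# Example OBP values chosen only for illustration.
for obp in (0.30, 0.36, 0.40):
    print(obp, obp_dummies(obp))
```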

After pulling all of our variables together, we needed a final regression in which all variables were significant. We used backward elimination, removing the non-significant variables one at a time, to arrive at our final regression equation. Our final regression consisted of the independent variables Strike Outs, Hits*At Bats, and the OBP < 0.36 dummy.
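The control flow of backward elimination can be sketched as below. This is only an outline under stated assumptions: `fit_p_values` stands for whatever routine refits the regression on the remaining variables and returns their P-values, and `fake_fit` is a stand-in with invented P-values and placeholder variable names, not results from our data.

```python
def backward_eliminate(features, fit_p_values, alpha=0.05):
    """Drop the least significant variable until every P-value is below alpha."""
    remaining = list(features)
    while remaining:
        p_values = fit_p_values(remaining)  # refit on the current variable set
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining variable is significant
        remaining.remove(worst)
    return remaining


def fake_fit(names):
    """Stand-in for a real regression fit; the P-values are made up."""
    table = {"var_a": 0.70, "var_b": 0.001, "var_c": 0.20, "var_d": 0.01}
    return {name: table[name] for name in names}


kept = backward_eliminate(["var_a", "var_b", "var_c", "var_d"], fake_fit)
print(kept)
```

In a real run the P-values change after each refit, which is why the variables are removed one at a time rather than all at once.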

METHODOLOGY

The first step to answering our initial question was to run a regression of our initial model. The overall model was significant, with a P-value of 5.68774E-18, which was less than our alpha of 5%. We then tested each variable separately for significance. At Bats was not significant, with a P-value of 0.733. Both Hits and Strike Outs were significant, with P-values of 1.37311E-06 and 3.393E-8 respectively.
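The decision rule behind these tests is that a result is significant when its P-value falls below alpha. A minimal sketch using the P-values reported above (the helper name `is_significant` is our own):

```python
ALPHA = 0.05  # the 5% significance level used in this report

# P-values reported for the initial regression.
p_values = {
    "Overall model": 5.68774e-18,
    "At Bats": 0.733,
    "Hits": 1.37311e-06,
    "Strike Outs": 3.393e-8,
}


def is_significant(p_value, alpha=ALPHA):
    """A result is significant when its P-value is below alpha."""
    return p_value < alpha


decisions = {name: is_significant(p) for name, p in p_values.items()}
for name, significant in decisions.items():
    print(f"{name}: {'significant' if significant else 'not significant'}")
```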

From the results of our first regression we created an estimated regression equation:

Y = -5.078 - 0.012 (At Bats) + 0.468 (Hits) + 0.209 (Strike Outs)
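To show how the estimated equation is used, the sketch below plugs in a batting line and returns the predicted runs. The function name `predict_runs` and the example player (500 At Bats, 150 Hits, 100 Strike Outs) are hypothetical; only the coefficients come from our regression.

```python
def predict_runs(at_bats, hits, strike_outs):
    """Predicted runs from the estimated initial regression equation."""
    return -5.078 - 0.012 * at_bats + 0.468 * hits + 0.209 * strike_outs


# Hypothetical player, chosen only to illustrate the arithmetic:
# -5.078 - 6.0 + 70.2 + 20.9 = 80.022
example = predict_runs(at_bats=500, hits=150, strike_outs=100)
print(f"Predicted runs: {example:.3f}")
```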

Once we examined our regression results we could interpret our variables:

Intercept - when all other variables...