MANAGERIAL REPORT
The purpose of this analysis was to develop a regression model to predict mortality. Researchers at General Motors collected data on 60 U.S. Standard Metropolitan Statistical Areas (SMSAs) in a study of whether air pollution contributes to mortality. These data were randomly sorted into two equal groups of 30 cities. A regression model to predict mortality was built from the first set of data and validated on the second set.
The following variables were found to be the key drivers in the model:
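The random split described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name `split_cities`, the fixed seed, and the placeholder city labels are all assumptions made for the example.

```python
import random


def split_cities(cities, seed=0):
    """Shuffle the cities and return two equal halves (build set, validation set)."""
    shuffled = list(cities)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded so the split is reproducible
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Placeholder labels standing in for the 60 SMSAs (hypothetical names).
cities = [f"SMSA_{i:02d}" for i in range(1, 61)]
build_set, validation_set = split_cities(cities)
```

The first half would be used to build the model and the second half held back for validation, so that no city influences both steps.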
• Mean July temperature in the city (degrees F)
• Mean relative humidity of the city
• Median education
• Percent of white collar workers
• Median income
• Sulfur dioxide pollution potential
The objective in this analysis was to find the line, using the variables mentioned above, for which the sum of the squared deviations between the observed and predicted values of mortality is smaller than for any other straight-line model, assuming the differences between the observed and predicted values of mortality average to zero. Once found, this “least squares line” can be used to estimate mean mortality or to predict mortality for given values of the above variables. Each of the key variables was checked for a bell-shaped symmetry about the mean (normality), for the linear (straight-line) nature of the data when graphed, and for equal variance (the average squared deviation of measurements about the mean). After determining whether to exclude data points, the following model was determined to be the best model:
Predicted mortality = -3276.108 + 862.9355x1 - 25.37582x2 + 0.599213x3 + 0.0239648x4 + 0.01894907x5 - 41.16529x6 + 0.3147058x7 + ...
See the list of independent variables on TAB #1. This model was validated against the second set of data, where it was determined that, at the 95% confidence level, there is sufficient evidence to conclude that the model is useful for predicting mortality.
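The least squares idea behind the model can be illustrated with a single predictor. The sketch below is not the model from this report: the function names are hypothetical and the five data points are made-up numbers chosen only to show that the fitted line has a smaller sum of squared errors than a nearby line.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared deviations between observed y and the line's predictions."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up illustrative data, not taken from the study.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x, y)
best = sum_squared_errors(x, y, b0, b1)
shifted = sum_squared_errors(x, y, b0 + 0.1, b1)  # any other line does worse
residual_mean = sum(yi - (b0 + b1 * xi) for xi, yi in zip(x, y)) / len(x)
```

Here `best` is smaller than `shifted`, and `residual_mean` is zero up to rounding, which matches the property that the errors of a least squares line average to zero. The multi-variable model in the report follows the same principle with more than one predictor.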
Although this model, when validated, is deemed suitable for estimation and prediction, as noted by the 5% error ratio (TAB #2), there are significant concerns about it. First, although the R² value on TAB #3 shows that the model explains 53.1% of the sample variability, adjusting this value for the number of parameters in the model reduces the explained variability to 38.2% (TAB #3). The remaining variability is attributed to random error. Second, some of the independent variables appear to contribute redundant information because they are correlated with other independent variables, a condition known as multicollinearity. Third, an outlying observation (a value lying more than three standard deviations from the mean) was found to be influencing the estimated coefficients.
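The drop from 53.1% to 38.2% follows from the standard adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below assumes n = 30 cities in the model-building set and k = 7 independent variables (x1 through x7 in the fitted equation). Both counts are inferred rather than stated alongside TAB #3, but they do reproduce the reported figure.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Assumed values: n = 30 cities, k = 7 predictors (inferred, not stated in the report).
adj = adjusted_r_squared(0.531, n=30, k=7)
print(round(adj, 3))  # 0.382
```

With only 30 observations, each added predictor costs a noticeable share of the remaining degrees of freedom, which is why the adjustment is so large here.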
In addition to the problems observed above, it is unknown how the sample data was obtained. It is assumed that the values of the independent variables were uncontrolled, which indicates observational data. With observational data, a statistically significant relationship between a response y and a predictor variable x does not necessarily imply a cause and effect relationship. This is why a designed experiment would produce better results. With a designed experiment, we could, for instance, control the time period that the data corresponds to. Data covering a longer period of time would improve the consistency of the data and reduce the effect of any extreme or unusual data in the current time period. Also, we do not know how the cities were selected. Assuming that the percentage of white-collar workers is negatively correlated with pollution, the optimal selection of cities would include an equal number of white-collar cities and non-white-collar cities. Furthermore, assuming a correlation between high temperature and mortality, an optimal selection of cities would include an equal number of northern cities and southern cities.
CONCLUSIONS AND RECOMMENDATIONS
The model has been tested and validated on a second set of data. Although the model has some limitations, it appears to provide good results at the 95% confidence level. If time had permitted, different combinations of independent variables could have been tested in order to increase the R² value and decrease the multicollinearity mentioned above. However, until more time can be allocated to this project, the results obtained from this model can be deemed appropriate.
In order to select the best model, several exercises were implemented. Sometimes, data transformations are performed on y