cd "C:\Users\Aki\Documents\stata"
log using march7.log, replace

**************MARCH 7. INFERENTIAL STATISTICS*************
**doing statistical tests

**STATISTICS OF TWO CATEGORICAL VARIABLES
sysuse auto, clear
tab rep78 foreign
tab rep78 foreign, chi2
//Pearson's chi-squared test checks whether two categorical variables, x (rep78) and y (foreign), are related
//H0: there is no relationship (the variables are independent)
//if the p-value is small, reject H0: there is some relationship between the two variables

**testing means
**e.g. do men earn more than women? (statistically)
**are foreign cars more expensive than domestic? (is there any statistical difference?)
**is the difference in mean prices between foreign and domestic cars significantly different from 0?
su price

**is the average price of cars different from 6000?
ttest price == 6000
//cannot reject H0: the mean price of cars is not statistically different from 6000

**are foreign cars more expensive than domestic?
ttest price, by(foreign)
//H0: mean price (domestic) = mean price (foreign)
//OR: mean price (domestic) - mean price (foreign) = 0
//cannot reject H0: there is no statistically significant difference in average prices of foreign and domestic cars
su price if foreign==1
su price if foreign==0
//the two-sample t-test compares two means; H0 is that the means are equal

***bivariate correlation and regression
**Pearson's correlation (bivariate correlation)
correlate price mpg weight
//correlations between mpg and price and between mpg and weight are negative
//correlation between weight and price is positive
//the strongest correlation (in absolute value) is between weight and mpg
pwcorr price mpg weight, sig
//sig also reports the p-value of each correlation
//all correlations here are statistically significant

***analysis of variance (ANOVA) and tests for variances
use nlsw, clear
sum wage if country==1
sum wage if country==2
sum wage if country==3
oneway wage country, tab
//the ANOVA table is shown in the second (bottom) part of the output
//between-groups comparison: H0 is that the mean wage is the same in all countries
//a high F and a small p-value mean that at least one country's mean wage differs from the others
//Bartlett's test: H0 is that the variances are equal. With a small p-value, reject H0 and conclude that the variances are not equal
anova wage country
//analysis of variance: tests whether the mean of a dependent variable is the same across two or more unrelated groups
//it does this by comparing between-group variation with within-group variation
//the dependent variable should be continuous, the independent variable should be categorical / dummy
//the dependent variable should be approximately normally distributed
histogram wage
anova wage country##south
//two-way ANOVA: main effects of country and south plus their interaction
pwmean wage, over(country) effects
//pairwise comparisons of means, assuming equal variances (!), to determine which groups differ from each other

log close
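
**a minimal sketch, not part of the original session: checking the equal-variance assumption behind ttest
**assumes the built-in auto data; it runs after log close, so its output is not written to march7.log
sysuse auto, clear
sdtest price, by(foreign)
//variance-ratio (F) test; H0: domestic and foreign cars have equal price variances
robvar price, by(foreign)
//Levene's test (W0) plus median and trimmed-mean variants; less sensitive to non-normality than the F test
ttest price, by(foreign) unequal
//two-sample t-test without the equal-variance assumption (Satterthwaite's degrees of freedom)
display "two-sided p-value = " r(p)
//r(p) is a result stored by ttest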