//set the working folder and start a log
cd "C:\Users\Aki\Documents\stata"
log using jan31

***********EXPLORING DATA*******************************************************
**view and edit raw data

**1. load data
sysuse auto, clear

**2. browse it
browse
browse make price

**3. describing and summarizing data
describe                 //shows the data file name, observation count, number of variables, variable names and labels, storage types, etc.
summarize                //summary stats for all variables (obs, mean, st dev, min, max)
sum length trunk turn    //summary stats for specific variables

**you can restrict summarize with different conditions
summarize price if foreign==1    //summary stats for price if the car is produced abroad
summarize price if foreign==0    //summary stats for price if the car is domestic

codebook foreign    //dummy (binary) variable
codebook price      //integer
codebook rep78      //most probably a categorical variable

summarize price if rep78<3     //price of cars with repair record less than 3 (=1 or =2)
summarize price if rep78>=3    //price of cars with repair record >=3 (also includes cars with missing rep78; see MISSING VALUES below)
summarize price if rep78!=3    //price of cars with repair record not equal to 3 (=1, 2, 4, 5, or missing)

summarize price if foreign==1 & rep78>=3    //cars that are foreign AND have rep record >=3 (match BOTH conditions)
summarize price if foreign==1 | rep78>=3    //cars that are foreign OR have rep record >=3 (match ANY of these conditions)

sum foreign    //less informative for binary and categorical variables (the mean of a dummy is the share of 1s)

***TABULATING AND CREATING TABLES******
**for categorical variables, tabulating is very useful
tab foreign
tab foreign, nolabel    //shows the numeric values behind the value labels
tab rep78

*two-way tabulation
tab foreign rep78
tab foreign rep78, col row    //adds column and row percentages

******MISSING VALUES************************
***Stata treats numeric missing values (.) as larger than any number
misstable summarize

*Why do we need to be careful with missing values?
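**A minimal sketch of the pitfall, assuming the auto data loaded above is still
**in memory (the counts it prints are not reproduced here). Because numeric
**missing (.) sorts above every number, a ">" or ">=" condition silently
**includes observations with missing values.
count if rep78>=5                      //counts cars with rep78==5 AND cars with missing rep78
count if rep78>=5 & !missing(rep78)    //counts cars with rep78==5 only
count if missing(rep78)                //equals the difference between the two counts above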
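tab rep78
tab rep78, miss                     //also shows the missing category
sum price if rep78>=5               //includes cars with missing rep78
sum price if rep78>=5 & rep78!=.    //excludes cars with missing rep78
sum price if rep78!=.               //only cars with a repair record
sum price if rep78==.               //only cars with a missing repair record

**to recode missing values, use mvencode/mvdecode
mvencode *, mv(-99)    //replace numeric missing values with -99
mvdecode *, mv(-99)    //turn -99 back into missing (.)

**close the log first so the SMCL file is complete, then convert it to plain text
log close
translate jan31.smcl jan31.log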