Aggregate
This function produces a new data set containing "summaries" of all of the numeric variables in the data set, calculated for each combination of the chosen categorical variables.
- Variables: if only one variable is specified, the new data set will have one row for each level of that variable. If two (or more) are specified, then there will be one row for each combination of levels. For example, the categorical variables gender = {male, female} and ethnicity = {white, black, asian, other} will result in a data set with 2 x 4 = 8 rows.
- Summaries: each row will have the chosen summaries for each numeric variable in the data set. For example, if the data set has the variables gender (cat) and height (num), and the user selects Mean and Sd, then the new data set will have the columns gender, height.Mean and height.Sd. The values in each row are for that combination of categorical variables; the row for gender = female will have the mean height of the females and the standard deviation of height for the females.

A visual example of this would be to drag height into the Variable 1 slot and gender into the Variable 2 slot. Clicking on "Get Summary" would provide the same information. The advantage of using Aggregate is that the summaries are calculated for every numeric variable in the data set, not just one of them.
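
To make the shape of the aggregated data set concrete, here is a minimal Python sketch of the same idea. It is not how Aggregate is implemented; the aggregate function, the SUMMARIES table and the toy data (including the weight variable) are all hypothetical and exist only for illustration. One difference to keep in mind: this sketch produces a row only for combinations that actually occur in the data.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy data, invented for this example.
rows = [
    {"gender": "female", "ethnicity": "white", "height": 160.0, "weight": 55.0},
    {"gender": "female", "ethnicity": "asian", "height": 170.0, "weight": 65.0},
    {"gender": "male", "ethnicity": "white", "height": 175.0, "weight": 70.0},
    {"gender": "male", "ethnicity": "white", "height": 185.0, "weight": 90.0},
]


def sample_sd(values):
    """Sample standard deviation; NaN when a group has fewer than two values."""
    return stdev(values) if len(values) > 1 else float("nan")


# Summary names mirror the Mean and Sd options described above.
SUMMARIES = {"Mean": mean, "Sd": sample_sd}


def aggregate(rows, by, summaries=("Mean", "Sd")):
    """Return one row per observed combination of the `by` variables."""
    # Treat every non-grouping column holding a number as a numeric variable.
    numeric = [
        name for name, value in rows[0].items()
        if name not in by and isinstance(value, (int, float))
    ]

    # Collect the rows belonging to each combination of levels.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[b] for b in by)].append(row)

    result = []
    for key in sorted(groups):
        out = dict(zip(by, key))
        for var in numeric:
            values = [member[var] for member in groups[key]]
            for name in summaries:
                # Column names follow the variable.Summary pattern.
                out[f"{var}.{name}"] = SUMMARIES[name](values)
        result.append(out)
    return result


by_gender = aggregate(rows, ["gender"])
for row in by_gender:
    print(row)
```

Calling aggregate(rows, ["gender", "ethnicity"]) instead gives one row per gender and ethnicity combination present in the toy data, with the same summary columns for both height and weight.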