The prior examples have assumed one line per unique subject/variable combination. This is not the typical way to enter data. A more typical layout (found, e.g., in Systat) has one row per subject. We need to "stack" the data to go from this standard input format to the form preferred by the analysis of variance. Consider the following analysis of 27 subjects in a memory study of the effect on recall of two presentation rates and two recall intervals. Each subject has two replications per condition. The first 8 columns are the raw data; the last 4 columns collapse across replications. The data are found in a file on the personality project server.
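As a rough illustration of what "stacking" means, here is a minimal sketch in Python with pandas. The column names (`rate1_int1` and so on) and the three rows of values are hypothetical, made up only to show the reshaping step; they are not the memory-study data described above.

```python
import pandas as pd

# Hypothetical wide-format data: one row per subject, one column per
# rate/interval condition. Names and values are invented for illustration.
wide = pd.DataFrame({
    "subject": [1, 2, 3],
    "rate1_int1": [8, 6, 7],
    "rate1_int2": [5, 4, 6],
    "rate2_int1": [9, 7, 8],
    "rate2_int2": [6, 5, 5],
})

# Stack: one row per subject/condition combination.
stacked = wide.melt(id_vars="subject", var_name="condition", value_name="recall")

# Split the condition label into the two within-subject factors.
stacked[["rate", "interval"]] = stacked["condition"].str.split("_", expand=True)
stacked = (
    stacked.drop(columns="condition")
    .sort_values(["subject", "rate", "interval"])
    .reset_index(drop=True)
)

print(stacked)
```

Each subject now contributes four rows (one per condition), which is the long form that most analysis of variance routines expect.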

Kempthorne uses the randomization distribution and the assumption of *unit treatment additivity* to produce a *derived linear model*, very similar to the textbook model discussed previously. [30] The test statistics of this derived linear model are closely approximated by the test statistics of an appropriate normal linear model, according to approximation theorems and simulation studies. [31] However, there are differences. For example, the randomization-based analysis results in a small but (strictly) negative correlation between the observations. [32] [33] In the randomization-based analysis, there is *no assumption* of a *normal* distribution and certainly *no assumption* of *independence*. On the contrary, *the observations are dependent*!
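The negative correlation can be seen in a toy setting. The sketch below is not Kempthorne's derivation; it only assumes that a fixed set of units is assigned to positions by a random permutation, and it enumerates every permutation exactly. The four unit effects are made-up numbers. Under additivity, adding constant treatment effects to the positions would shift the means but leave this correlation unchanged.

```python
from itertools import permutations

import numpy as np

# Hypothetical fixed unit effects for N = 4 experimental units (made-up values).
unit_effects = [2.0, 5.0, 7.0, 10.0]
n_units = len(unit_effects)

# Each row is one equally likely random assignment of units to positions.
assignments = np.array(list(permutations(unit_effects)))

# Correlation, over the whole randomization distribution, between the
# responses observed in the first two positions.
corr = np.corrcoef(assignments[:, 0], assignments[:, 1])[0, 1]

# For sampling without replacement from N units this equals -1 / (N - 1).
print(corr, -1 / (n_units - 1))
```

Because the units are drawn without replacement, a large value in one position makes a large value in another position slightly less likely, and the correlation shrinks toward zero as the number of units grows.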

Let’s make up a little story: let’s say we have three types of wine (A, B and C), and we would like to know which one is the best (on a scale of 1 to 7). We asked 22 friends to taste each of the three wines (blindfolded), and then to give a grade from 1 to 7. For example's sake, let’s say we asked them to rate each wine 5 times, and then averaged their results to give one number for each person's preference for each wine. This number, which is now an average of several numbers, will not necessarily be an integer.
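To make the shape of this made-up data set concrete, here is a small sketch that simulates it. The grades are random numbers generated with a fixed seed, not real ratings, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the simulated data are reproducible

n_friends, n_wines, n_tastings = 22, 3, 5

# Simulated integer grades from 1 to 7 (upper bound 8 is exclusive); not real data.
grades = rng.integers(1, 8, size=(n_friends, n_wines, n_tastings))

# Average the five tastings: one preference score per friend and wine.
preference = grades.mean(axis=2)

print(preference.shape)  # rows are friends, columns are wines A, B and C
print(preference[:3])    # averages are multiples of 0.2, often not integers
```

The result has one row per friend and one column per wine, which is the one-row-per-subject layout of a repeated-measures design.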