I have a large dataset of students and the classes they have taken. Each student has taken 12 to 18 of around 80 available classes. Using Excel (2013) I would like to find out, for any given pair of classes, how many students took both of them. I envisage a table with the 80 classes as both rows and columns, and then for each intersection I would see a count of how many students took that combination.
The data arrive as an Excel file with one row per student per class:
Student Class
Smith E101
Jones E101
Parker E101
Brown E102
Green E102
Smith E201
Jones E202
Parker E201
Brown E202
Green E203
...
Envisaged output:
E101 E102 E201 E202 E203 ...
E101 0 2 1 0
E102 0 0 1 1
E201 2 0 0 0
E202 1 1 0 0
E203 0 1 0 0
...
(Obviously I only need a diagonal half of the above, as the other half mirrors it.)
I've used a pivot table to get the data into a table with students as rows, and all possible classes as columns, showing a 1 where a student took a given class.
E101 E102 E201 E202 E203 ...
Smith 1 1
Jones 1 1
Parker 1 1
Brown 1 1
Green 1 1
...
But I'm then stuck on how to proceed, with as little manual intervention as possible, to my desired output.
Can anyone suggest a way to achieve the output I need in Excel? I've done fairly extensive searching but haven't found anything.
Or should I be looking to other software?