Reddit user Chitinid took the time to plot 200 of the most popular games listed on Board Game Geek and cluster them by type. The result reminds me of a busy radar screen, though thankfully these dots aren’t incoming hostile aliens.
The colour coding tells you which category each game is in. What about the x- and y-axes? Ah, those aren’t so straightforward, and there are no labels on them for a reason.
In this chart, the critical thing is how close together two dots are: the nearer they sit, the more similar the two board games.
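One way to see why unlabelled axes are not a problem: if only the distances between dots matter, then spinning the whole picture changes nothing of interest. The little sketch below checks that with three made-up points. The game names and coordinates are purely hypothetical and are not taken from Chitinid's chart.

```python
import math

# Hypothetical 2-D positions for three games (made-up numbers, not from the chart).
points = {
    "game_a": (1.0, 2.0),
    "game_b": (1.5, 2.5),
    "game_c": (8.0, -3.0),
}


def rotate(p, degrees):
    """Rotate a point about the origin by the given angle in degrees."""
    t = math.radians(degrees)
    x, y = p
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


def distance(p, q):
    """Straight-line distance between two points."""
    return math.dist(p, q)


# Spin the whole "map" by an arbitrary angle.
rotated = {name: rotate(p, 73) for name, p in points.items()}

for a, b in [("game_a", "game_b"), ("game_a", "game_c")]:
    before = distance(points[a], points[b])
    after = distance(rotated[a], rotated[b])
    print(f"{a} vs {b}: {before:.3f} before, {after:.3f} after rotation")
```

Every coordinate changes after the rotation, yet each pair of dots stays exactly as far apart as before, so "left", "right", "up" and "down" carry no meaning of their own.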
A popular method for exploring high-dimensional data is something called t-SNE, introduced by van der Maaten and Hinton in 2008. The technique has become widespread in the field of machine learning, since it has an almost magical ability to create compelling two-dimensional “maps” from data with hundreds or even thousands of dimensions. Although impressive, these images can be tempting to misread. The purpose of this note is to prevent some common misreadings.
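If you would like to see t-SNE squash many dimensions down to two for yourself, here is a minimal sketch using scikit-learn's `TSNE` class. It is not Chitinid's code, and the data is synthetic: two invented groups of 30 "games" with 20 random numeric features each. The parameter values (`perplexity=10`, `random_state=0`) are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Made-up data: two groups of 30 "games", each described by 20 numeric features.
# These are random numbers, not real board game statistics.
group_a = rng.normal(loc=0.0, scale=1.0, size=(30, 20))
group_b = rng.normal(loc=6.0, scale=1.0, size=(30, 20))
features = np.vstack([group_a, group_b])

# Perplexity must be smaller than the number of samples (60 here).
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(features)

# One (x, y) position per game, ready to be drawn as a scatter plot.
print(embedding.shape)
```

Each row of `embedding` is one dot on the map. Changing the perplexity or the random seed can produce a noticeably different picture from the same data, which is part of why these maps are easy to misread.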
The underlying dataset is public too. You can access the files in CSV format over at Kaggle if you want to have a bash at processing the data yourself.
Just colourful dots, or has the t-SNE analysis of all these board games helped surface anything insightful for you?