The mysterious correlation: a detective story
In data science, you sometimes stumble across an intriguing property in the data. I will tell you the story of one mysterious correlation: according to the Stack Overflow survey, developers who use spaces seem to have higher salaries than those who use tabs. Correlation doesn’t mean causation: using spaces won’t suddenly increase your salary. But what does it all mean? Join me on a detective investigation that will show you how to approach complex data science problems. I will show you some of the perils of correlation, model fitting and biases, and how to deal with them.
Evelina is a data scientist, machine learning researcher, and an avid conference speaker. She currently does most of her programming in R and F#, and she received the Microsoft MVP award for her work in the F# community. She started out as a programmer but became interested in machine learning early on and completed a PhD in mathematics at the University of Cambridge, developing new machine learning methods to analyze complex biomedical datasets. After that, she worked on data analysis in cancer research, and she is now joining the Alan Turing Institute as a data scientist.