By Ahmed El Deeb
Many technology companies now have teams of smart data scientists, versed in big-data infrastructure tools and machine learning algorithms. But every now and then, a data set with very few data points turns up and none of these algorithms seem to work properly anymore.
The New York Times has an article titled “For Big-Data Scientists, ‘Janitor Work’ Is Key Hurdle to Insights.”
Mostly I really like it. The fact that raw data is rarely usable for analysis without significant work is a point I try hard to
make with my students.
I tell them: “Do not underestimate the difficulty of data preparation.”