Good Data Science: The Art of Being Skeptical, to Allow Room for Creativity

Domain and technical knowledge aren’t sufficient criteria when looking for your new data science hire. We have found that a better characterization of a data scientist’s innate potential is the way they balance creativity and skepticism.

Good data science comes from a precarious balance between two diametrically opposed traits: skepticism and creativity. Lean too far towards skepticism and your work stagnates, paralyzed by uncertainty. Lean too far towards creativity and you waste resources chasing rainbows.

As our society becomes more reliant on data-driven insights, people are shifting careers and going back to school to pursue positions in data science. This influx of new candidates, combined with the increase in available positions, creates a situation where hiring managers aren’t certain what to look for and new data scientists aren’t certain where to focus their training. While knowledge and proficiencies are important metrics, good candidates can fill in gaps in their knowledge on the job and, more importantly, there is a large gap between knowing facts and being able to apply them effectively. We have found that a better characterization of a data scientist’s innate potential is the way they balance creativity and skepticism.

Data scientists need to be creative

This point shouldn’t be too surprising, as innovation throughout history has been driven by creative individuals capable of looking at problems in novel ways and envisioning solutions where others have failed. Further, data science tends to be very unstructured and abstract; many businesses struggle to apply machine learning because the process of going from data to results isn’t immediately clear. Data science is the art of looking at data, hearing a business objective, and then envisioning all the steps, features, and validations needed to get from data to success. Inevitably these best-laid plans will need revisions, each time requiring the data scientist to come up with new ideas, plans, and models. Feature engineering constantly requires researchers to design novel representations of the data that bring out key insights and patterns. Creativity allows data scientists to answer, “Well, now what?” when presented with negative results and new data. Creativity becomes even more important when working in isolation, absent any team to help with brainstorming. Creativity allows data scientists not only to explore multiple possible solutions, but also to envision the pitfalls and risks of each, thereby preventing costly tangents and dead-end solutions.

Creativity first needs skepticism

If creativity is the solution to problem solving, skepticism is the motivating factor. You can’t solve a problem unless you first realize the problem exists; the only way to innovate in your field is to identify its current deficiencies. Being skeptical of others’ work is a good place to start, but a great data scientist should be most critical and skeptical of their own hypotheses, models, and findings. Perhaps the biggest risk of any data science project is trusting results without thorough and robust interrogation; there are countless ways to get acceptable model accuracy from a completely worthless model. The same factors leading to creativity should naturally lead to skepticism: seeing all the confounders, conflicting hypotheses, and potential sources of error within the pipeline and code base. Trust is the biggest trap of modern data science; data scientists should constantly ask, “How am I being fooled by this result and how can I test that?”
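One common way a worthless model can post acceptable accuracy is class imbalance. The sketch below is illustrative only: the data is made up (1,000 cases, 5% positive), and the `accuracy` and `recall` helpers are hypothetical names written for this example rather than anything from a particular library. The "model" ignores its input and always predicts the majority class.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive cases that the model flagged as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical, imbalanced labels: 50 positives among 1,000 cases.
rng = random.Random(0)
y_true = [1] * 50 + [0] * 950
rng.shuffle(y_true)

# A "model" that has learned nothing: it always predicts the majority class.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)

print(f"accuracy: {baseline_accuracy:.2f}")  # accuracy: 0.95
print(f"recall:   {baseline_recall:.2f}")    # recall:   0.00
```

The 95% accuracy looks respectable, yet the model never finds a single positive case. Comparing any real model against a trivial baseline like this one, and checking a second metric such as recall, is one simple way to ask, “How am I being fooled by this result?”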

Balance: Creativity is the blue sky, skepticism is the trajectory

The truly great data scientist exhibits both of these traits in balance. Real-world data science applications are constrained by time, money, and data availability, and the data that is available is often very messy. Innovation therefore requires creative solutions that overcome these challenges within the unique constraints of each project and dataset. Solutions need to be innovative but also practical, explainable, and succinct. Focusing creativity with skepticism ensures a practical solution that solves not only the data science challenge but also the problem of making it work in the real world.

Written by:
Jonathan Gallion
VP of AI/ML
Published On:
December 21, 2020