When data scientist Joanne Lin was looking for patterns in adoption data from the Austin Animal Center, a surprising result jumped out: cats with names are much more likely to find a home than those without names. In fact, having a name turns out to be one of the most important determinants of whether a cat is adopted, second only to being spayed or neutered. Having a name is more important than how old the cat is or the color and pattern of its coat.
I love this story, as it shows the power of data scientists to effect change. Joanne’s discovery that 63 percent of cats with names are adopted from the shelter while only 17 percent of cats without names enjoy the same positive outcome is incredibly helpful information that can be put to use immediately by shelters everywhere to improve outcomes for cats in their care.
Data science is more than just the buzzword du jour. The discipline underlies some of the most significant social and economic drivers of our age, and mastery of its concepts opens fantastic opportunities to the rising generation of tech workers. It’s estimated that by 2020, there will be 2.7 million jobs in the field. Data scientists are among the country’s best paid and happiest workers.
But the challenge of meeting that demand is daunting.
First, most data science roles require some form of postgraduate education. While there is some debate within the data science community about the necessity of these programs, the proof lies in the job descriptions of current openings. Of the 80+ available entry-level data science jobs in Atlanta, for example, nearly every position requests more than a bachelor’s degree.
Compounding the issue, few people have the prerequisite skills to enroll in those postgraduate programs. The typical master’s or bootcamp student must know at least one programming language and have an understanding of statistics and probability. It’s estimated that there are about 1.26 million software engineers working in the U.S. today. If every single one of those folks decided to enroll in a postgraduate data science program, we’d still come up more than 50 percent short of the 2.7 million projected jobs.
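For readers who want to check that math, here is a minimal sketch in Python. It uses only the two estimates quoted in this piece. The variable and function names are made up for illustration, and it assumes, generously, that every software engineer would fill exactly one data science job.

```python
# Estimates quoted in the article; names here are illustrative only.
projected_jobs = 2_700_000      # data science jobs estimated by 2020
software_engineers = 1_260_000  # software engineers working in the U.S.


def shortfall_fraction(jobs: int, candidates: int) -> float:
    """Share of jobs left unfilled if every candidate took exactly one."""
    return max(jobs - candidates, 0) / jobs


gap = shortfall_fraction(projected_jobs, software_engineers)
print(f"Unfilled share: {gap:.1%}")  # prints "Unfilled share: 53.3%"
```

Even under that best-case assumption, a little over half of the projected jobs would go unfilled.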
And finally, few can actually define data science. The discipline is most commonly associated with the marketing machines of tech giants like Facebook and Google, and recent events like the Cambridge Analytica fiasco reinforce a perception of data science as a tool for manipulation. That’s why we at Thinkful launched the WTF is Data Science project, a free resource that presents the discipline in a friendly and accessible format. It’s also why we highlight our students’ work, like Joanne’s, or that of another graduate whose capstone project used neural network techniques to improve the identification of cervix types, which could in turn improve the treatment of precancerous conditions.
In May my husband and I adopted our first cat, and yes, the shelter had named her. I’m glad they did. Now I wonder how else we can use data science to get more cats into owners’ arms, where they belong!
Darrell Silver is CEO and co-founder of the career training company Thinkful, a successful serial entrepreneur, and an angel investor.