What's different about hiring data scientists in 2020?

It’s 2020 and the world has changed remarkably, including in how companies screen data science candidates. While many things have changed, one change stands out above the rest: the move from brain teasers to project-based data assessments. At The Data Incubator, we run a data science fellowship and are responsible for hundreds of data science hires each year. We have observed these assessments go from a rare practice to standard for over 80% of hiring companies. Many of the holdouts tend to be the largest (and traditionally most cautious) enterprises. At this point, they are at a serious competitive disadvantage in hiring.

Historically, data science hiring practices evolved from software engineering. A hallmark of software engineering interviewing is the dreaded brain teaser, puzzles like “How many golf balls would fit inside a Boeing 747?” or “Implement the quick-sort algorithm on the whiteboard.” Candidates will study for weeks or months for these, and the hiring website Glassdoor has an entire section devoted to them. In data science, the traditional coding brain teaser has been supplemented with statistical ones, such as “What is the probability that the sum of two dice rolls is divisible by three?” Over the years, companies have come to realize that these brain teasers are not terribly effective and have started cutting down on their use.

In their place, firms are focusing on project-based data assessments. These ask data science candidates to analyze real-world data provided by the company. Rather than having a single correct answer, project-based assessments are often more open-ended, encouraging exploration. Interviewees typically submit code and a write-up of their results. These have a number of advantages, both in terms of form and substance.

First, the environment for data assessments is far more realistic. Brain teasers unnecessarily put candidates on the spot or compel them to awkwardly code on a whiteboard. Because answers to brain teasers are readily Google-able, internet resources are off-limits. On the job, it is unlikely that you’ll be asked to code on a whiteboard or perform mental math with someone peering over your shoulder. It is inconceivable that you’ll be denied internet access during work hours. Data assessments also allow applicants to complete the assessment at a more realistic pace, using their favorite IDE or coding environment.

“Take-home challenges give you a chance to simulate how the candidate will perform on the job more realistically than with puzzle interview questions,” said Sean Gerrish, an engineering manager and author of "How Smart Machines Think."

Second, the substance of data assessments is also more realistic. By design, brainteasers are tricky or test knowledge of well-known algorithms. In real life, one would never write these algorithms by hand (you would use one of the dozens of solutions freely available on the internet) and the problems encountered on the job are rarely tricky in the same way. By giving candidates real data they might work with and structuring the deliverable in line with how results are actually shared at the company, data projects are more closely aligned with actual job skills.

Jesse Anderson, an industry veteran and author of "Data Teams," is a big fan of data assessments: “It's a mutually beneficial setup. Interviewees are given a fighting chance that mimics the real-world. Managers get closer to an on-the-job look at a candidate’s work and abilities.” Project-based assessments have the added benefit of assessing written communication strength, an increasingly important skill in the work-from-home world of COVID-19.

Finally, written technical project work can help avoid bias by de-emphasizing traditional but prejudicially fraught aspects of the hiring process. Resumes with Hispanic and African American names receive fewer callbacks than identical resumes with white names. In response, some minority candidates deliberately “whiten” their resumes to compensate. In-person interviews often rely on similarly problematic gut feel. By emphasizing an assessment closely tied to job performance, interviewers can focus their energies on actual qualifications, rather than relying on potentially biased “instincts.” Companies looking to embrace #BLM and #MeToo beyond hashtagging may consider how tweaking their hiring processes can lead to greater equality.

The exact form of data assessments varies. At The Data Incubator, we found that over 60% of firms provide take-home data assessments. These best simulate the actual work environment, allowing the candidate to work from home (typically) over the course of a few days. Another roughly 20% require interview data projects, where candidates analyze data as a part of the interview process. While candidates face more time pressure from these, they also do not feel the pressure to ceaselessly work on the assessment. “Take-home challenges take a lot of time,” explains Field Cady, an experienced data scientist and author of "The Data Science Handbook." “This is a big chore for candidates and can be unfair (for example) to people with family commitments who can't afford to spend many evening hours on the challenge.”

To reduce the number of custom data projects they must complete, smart candidates are preemptively building their own portfolio projects to showcase their skills, and companies are increasingly accepting these in lieu of custom work.

Companies relying on old-fashioned brainteasers are a vanishing breed. Of the recalcitrant 20% of employers still sticking with brainteasers, most are the larger, more established enterprises that are usually slower to adapt to change. They need to realize that the antiquated hiring process doesn’t just look quaint; it’s actively driving candidates away. At a recent virtual conference, one of my fellow panelists was a data science new hire who explained that he had turned down opportunities based on the firm’s poor screening process.

How strong can the team be if the hiring process is so outmoded? This sentiment is also widely shared by the Ph.D.s completing The Data Incubator’s data science fellowship. Companies that fail to embrace the new reality are losing the battle for top talent.