7 Must-Haves in your Data Science CV


Managing Riskified’s Data Science department entails a lot of recruiting: we’ve more than doubled in less than a year and a half. As the hiring manager for several of the positions, I also read through a lot of CVs. Recruiters reportedly screen a CV in 7.4 seconds; after recruiting for several years my own average is pretty fast, but not that extreme. In this post, I’m going to walk you through the personal heuristics (‘cheats’) that help me screen a resume. I can’t guarantee that others use the same heuristics, and the importance of each point will differ from role to role, but paying attention to these points can help you get through the CV screening stage.
Some of these heuristics may not seem fair, and they could overlook qualified candidates. I agree that talented Machine Learning practitioners who don’t invest in their CV could get rejected by this screen, but it’s the best tradeoff given the time available. Remember, a highly sought-after position may attract a hundred or more CVs. If you want an efficient process, the CV screen has to be quick.
Here are the 7 heuristics used to quickly screen your Data Science CV:

  1. Prior experience as a Data Scientist
    I’m going to quickly run through your CV to look at your previous positions and see which are marked as ‘Data Scientist’. There are some other adjacent terms (depending on the role I’m hiring for), such as ‘Machine Learning Engineer’, ‘Research Scientist’ or ‘Algorithm Engineer’. I don’t include ‘Data Analyst’ in this bucket as the day-to-day work is typically different from that of a Data Scientist and the Data Analyst title is an extremely broad term.
    If you’re doing data science work at your present job under some other creative job title, it’ll probably be in your best interest to have your title changed to Data Scientist. This is especially true for Data Analysts who are de facto Data Scientists. Remember, even if the CV contains descriptions of the projects you’ve worked on (and they include machine learning), a title other than Data Scientist adds unnecessary ambiguity.
    Additionally, if you’ve completed a data science bootcamp or a full-time master’s degree in the field, this will probably be considered the beginning of your data science experience (unless you worked in a similar role earlier, which will warrant questions at a later stage).
  2. Business-oriented achievements
    Ideally, I’d like to read what you did (technical aspects) and what the business outcome was. There’s a lack of technically savvy data scientists who can talk in business terms. If you can share the business KPIs that your work impacted, that’s a big thumbs-up in my book. For example, indicating your model’s improvement in AUC is alright, but addressing the conversion rate increase as a result of your model improvement means you ‘get it’ — the business impact is what really matters at the end of the day. Compare the following alternatives depicting the same work with a different emphasis (technical vs business):
    a. Bank loan default rate model — improved model’s Precision-Recall AUC from 0.94 to 0.96.
    b. Bank loan default rate model — increased business unit’s annual revenue by 3% ($500K annually) while maintaining constant default rates.
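As a rough illustration of how a line like (b) might be derived, the sketch below turns a change in the number of approved loans into a revenue figure and formats it as a CV bullet. The function names (`revenue_impact`, `cv_bullet`) and every input value are hypothetical, chosen only for illustration; they are not taken from any real model or bank.

```python
def revenue_impact(approved_before: int, approved_after: int, revenue_per_loan: int) -> int:
    """Extra annual revenue from additional approved loans (default rate assumed unchanged)."""
    return (approved_after - approved_before) * revenue_per_loan


def cv_bullet(project: str, impact: int, baseline_revenue: int) -> str:
    """Format a business-oriented CV line from the revenue impact."""
    uplift = impact / baseline_revenue
    return (f"{project}: increased annual revenue by {uplift:.0%} "
            f"(${impact:,}) while maintaining constant default rates.")


# Hypothetical inputs, for illustration only.
approved_before = 20_000   # loans approved per year with the old model
approved_after = 20_600    # loans approved per year with the new model
revenue_per_loan = 800     # average revenue per approved loan, in dollars

impact = revenue_impact(approved_before, approved_after, revenue_per_loan)
baseline = approved_before * revenue_per_loan
print(cv_bullet("Bank loan default rate model", impact, baseline))
```

The point of the exercise is the translation step: the technical result (a better model) only becomes a business statement once it is tied to volumes and revenue, which usually means asking the business side for those numbers.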
  3. Education
    What’s your formal education, and in what field? Is it from a well-known institution? For more recent grads, I’ll also look at their GPA and whether they received any excellence awards or honors, such as making the Rector’s or Dean’s list. Since Data Science is a wide-open field without any standardized tests or required knowledge, people enter the field in various ways. In my last blog, I wrote about the 3 main paths taken into the field, and based on your education and timing I’ll figure out which one you probably took. The timing helps me understand your story: how and when did you transition into data science? If you don’t have any formal education in data science, that’s fine, but you need to either demonstrate a track record of work in the field or hold an advanced degree in a similar field (or both).
  4. Layout / visual appeal
    I’ve seen some beautiful CVs (I’ve saved a few of these for personal inspiration), but I’ve also received text files (.txt) that lack any formatting. Working on your CV can be a pain, and if you’ve chosen data science as your career there’s a good chance you don’t enjoy creating aesthetic designs in your spare time. Without going overboard, you do want to look for a nice template that lets you get everything across in limited space. Use the space wisely: it’s useful to split the page and highlight specific sections that don’t fall under the chronological work/education experience. These can include the tech stack you’re familiar with, a list of personal projects, links to your GitHub or blog, and so on. A few simple icons can also help emphasize section headers.
    Many candidates use 1–5 stars or bar charts next to each language/tool they are familiar with. Personally, I’m not a big fan of this approach for several reasons:
    a. It’s extremely subjective: is your ‘5 stars’ the same as someone else’s ‘2 stars’?
    b. They mix languages with tools, and in the worst cases with soft skills. Saying you’re ‘4.5 stars’ at Leadership isn’t helpful. As a strong believer in a growth mentality, I find that claiming to max out a skill (especially a soft skill that is harder to quantify and harder to master) feels very presumptuous.
    I’ve also seen this approach abused even further by taking the subjective measures and turning them into a pie chart (30% python, 10% team-player, etc). While this was probably meant as a creative way to stand out, it demonstrates a lack of basic understanding of what different chart types are meant to show.
    Here are two examples of CVs I’ve found visually appealing, with details blurred for anonymity.
  5. Machine Learning variety
    There are two types of variety I look for:
    a. Type of algorithms: structured/classic ML vs Deep Learning. Some candidates have only worked with Deep Learning, including on structured data that would have been better suited to tree-based models. While there’s no problem per se with being an expert at DL, limiting your toolset can limit your solution. As Maslow said: “If the only tool you have is a hammer, you tend to see every problem as a nail.” At Riskified we deal with structured, domain-driven, feature-engineered data, which is best handled with various forms of boosted trees. Having someone whose entire CV points back to DL is an issue.
    b. ML domain: this is usually relevant in two domains that require deep expertise, computer vision and NLP. Experts in these fields are in demand, and in many cases their entire career will be focused on these domains. While this is crucial if you’re looking for someone to work in that field, it’s typically a bad fit for a more general data science role. So, if most of your experience is in NLP and you’re applying for a position outside that domain, try to emphasize positions/projects where you’ve worked on structured data to demonstrate variety.
  6. Tech Stack
    This can generally be broken down into languages, specific packages (scikit-learn, pandas, dplyr, etc), cloud platforms and their services (AWS, Azure, GCP), and other tools. Some candidates mix this up with algorithms or architectures they are familiar with (RNN, XGBoost, K-NN). On a personal note, I prefer that this section revolve around technologies and tools; when a specific algorithm is mentioned, it makes me wonder whether the candidate’s theoretical ML knowledge is limited to just those specific algorithms.
    Here, I’m looking at three things: how current the tech stack is (tools from the last few years are a positive sign that the candidate is hands-on and learning new skills), the breadth of the stack (is the candidate limited to a few specific tools, or familiar with quite a few things?) and the fit with our stack (how much will we need to teach them?).
  7. Projects
    Is there something you’ve worked on that you can share on GitHub? Any Kaggle competition or side project can be very helpful, since it lets the interviewer look at your code, types of preprocessing, feature engineering, EDA, choice of algorithm and countless other issues that need to be addressed in a real-life project. Add a link to your GitHub and Kaggle accounts so interviewers can dive into your code. If you don’t have much experience, there’s a good chance you’ll be asked about one or more of these projects. In some interviews I’ve conducted, the candidate didn’t remember much about the project, and we couldn’t develop a conversation about the choices they made and the reasons behind them. Be sure to brush up on the work you did, or keep it out of the CV. Similarly, make sure you present your best work and that you’ve put enough time and effort into it. It’s better to have 2–3 high-quality projects than 8–10 of medium (or lower) quality.
Summary
If you’re looking for a new data science position, take some time and go through the points in this article. It’s fine if you can’t check off all of these points, but the more you can, the better. Hopefully, these tips will help you stand out from the crowd and pass the CV screen with flying colors.
Good luck and happy job hunting!
Do you have a different approach when screening Data Science CVs?
Please share in the comments section below!