Overcoming Imposter Syndrome in Data Science

Jennetta George
7 min read · Mar 4, 2021

Let’s start off with some confessions, just to level the author-to-reader playing field.

  • I can’t tell you too much about GPUs.
  • Federated learning has been on my “to learn” list for longer than I’d like to admit.
  • Dockerfiles are NOT my best friend.

I could go on, but I’d like to still be qualified for data science jobs in the future, so I’ll cut it short. Why, you might ask, am I admitting some of my deepest, darkest shortcomings as a data scientist for all to see? Isn’t the point of posting on Medium to show off your skills to your extremely knowledgeable peers?

My goal with this post is to normalize not knowing in the field of data science and break free of the “fake it till you make it” mentality. As data scientists (and as humans), we will never be an encyclopedia of models, metrics, and compute wisdom. A good hiring manager will know that. So, let’s start cutting ourselves a break, ok?

Data Science is a New, Expansive, Conglomerate Field

Think back to college, when you were picking out a major. Neoclassical Art History, Poetry, Civil Engineering, Applied Statistics: in all of these fields, a rubric was set out from the start, and once you mastered a certain set of areas, you could consider yourself an expert in the field.

Most of these fields of study have been around for centuries, and they all have one thing in common: at some point, groups of academics and industry folks got together and decided what knowledge was the bare minimum to enter the professional arena in that field.

Now consider the history of data science. Yes, it has been around for some time; depending on your definition, we can safely say at least 30 years if we think of data science relating to data being manipulated in computer systems, and 50 years if we think about how long mathematicians have been prophesying about future computer-based scientific endeavors.

However, consider how drastically computer technology has advanced over those years. A data scientist even 20 years ago was not trying to wrap their head around which cloud computing system to host their deep learning model on, but today this is a commonly expected skill for a data scientist to have. While there are well-defined threads of statistical and mathematical theory that serve as the foundation for data science education, the amount of knowledge being heaped onto the obstreperous load of requirements for the average data scientist in the 2020s is growing faster than a person can process and learn.

It’s important to realize that up until a few years ago, “data science” was not a topic that you could major in, so no one is entering the job market with a standardized set of skills. Most of my colleagues started their data science professions with master’s degrees or Ph.D.s in bioscience, chemical engineering, statistics, English Lit, etc. We all had to learn on the job. And when we left that first data science job for a new one, we found out that the subset of skills company A needed a data scientist to possess was not the same as the subset company B required. So you never stop having to learn on the job, no matter how senior you get in your career.

These days, a data scientist is, in some circles, expected to be, at minimum, all of the following (or at least to have a very strong understanding of the work each one performs):

  • a Data Engineer: to understand the ETL process
  • a Statistician: to have a deep theoretical understanding of how to choose the right pre-processing tools, prediction model, and metrics for a given dataset
  • a DevOps Engineer: to glue the ETL, model building, and deployment processes together
  • a Business Analyst: to extract business insight from the models

“Data science is as much about knowing the tool as it is about having experience applying it to real-world problems, about having that ‘gut feeling’ that raises your eyebrows when the results are suspiciously positive (or just weird).” - Michael R. Berthold

Companies are having a hard time building strong data science teams because many are unsure of what they really want and unclear about what is really meant by data science. I once interviewed for a data science role at a company that had never had a data scientist work for them. My interviewers were software engineers who didn’t know much about data science. They asked me a few basic questions and offered me the job. A few months later, they realized that I was not a pro at building out serverless architecture on AWS, so they let me go.

This was a big lesson for me: 1) I needed to up my skills in AWS, but more importantly, 2) I needed to be pickier about the teams that I joined and better understand what I wanted out of my data science career.

There is a team out there that is looking for your exact set of skills; don’t let anyone intimidate you into thinking that your knowledge is not valuable.

“Fake It Till You Make It” Is No Longer Sustainable

I remember prepping for my first set of data science interviews many years ago. It was a full-time job in which I created a 30-page dictionary of terminology that I spent months memorizing to ensure I didn’t miss a single theory question. Much to my dismay, the first offer I took came after a single interview in which the interviewer only asked me to talk about my master’s thesis on category theory and didn’t ask me a single data science question.

Now I lead a group of data scientists, and I always try to be the first one in the group to admit when I don’t know something. Why? Because as a leader, you set the tone for the type of communication that is appreciated, encouraged, and accepted within your cadre, and thus I consider it my responsibility to help my teammates feel comfortable expressing any weaknesses they feel they have so that I can help get them the resources they need to succeed. Because what good is a group of scientists who have a bunch of knowledge gaps and are too afraid to vocalize that they need time and assistance to learn a skill? I would much prefer to hire a data scientist who can admit they don’t know everything but can demonstrate how they learn in a productive and collaborative manner.

That being said, I don’t recommend you walk into your next client meeting proclaiming everything that you don’t know and don’t plan on learning anytime soon. That’s a really easy way to find out how long it takes for HR to send out an “employee performance plan” email. Instead, if you feel like you are in a position where you are constantly feeling stressed out about being asked to know or do more than you are capable of, find someone in the company that you trust to confide in. Opening the door to this conversation will take a lot of weight off of your shoulders and will hopefully start to shift the culture within your workplace.

Take a Deep Breath and Stay Focused

Have you fallen into the pit of “I need to learn everything this week”? Where, all of a sudden, learning:

  • how to build an end-to-end machine learning pipeline
  • how to know if you really need to use a pretrained BERT model to optimize your NLP project
  • how to set up a graph database and ontology for the entire motor industry
  • how to invert a binary tree, just in case Facebook calls tomorrow and asks you to interview for a machine learning engineer role (even though you’ll probably never need to do this as a machine learning engineer and you’d much rather spend your time studying data science theory)

is all your top priority?

Can you relate to this set of open tabs?

Take a step back from the Safari-open-tab cliff and pour yourself a cup of hot tea. Preferably non-caffeinated.

Take a minute to prioritize what you really need to learn to be successful at your next project, and create a SMART plan to achieve it.

Sustainable knowledge acquisition takes time. Think back to college again: if a 101-level course in music theory took you 4 months to ace, what makes you think that this 401-level course in NLP, Neural Nets, or Kubernetes is going to be any quicker? You will learn everything you need to learn. And then you will realize that there are 100 other things that you need to learn next. If you’re stressed out now, just know that the stress will never stop.

Instead, try not to let the pleasures of learning and self-improvement cause you stress, and take some time to appreciate what a badass data scientist you already are!

And one last plug — keep an eye out for my next article on Figuring out What Kind of Data Scientist You Are to help narrow down your learning objectives and find focus and meaning in your career path.

Leave a comment below if you are new on your journey to data science mastery and can relate to any of the topics discussed in this article. Reach out on LinkedIn. Find mentors in your field so you feel supported. Let’s come together to destigmatize being okay with not knowing everything in the field of Data Science.

*Images in this article were created by the author on canva.com

Jennetta George

I started my Data Science career as a student of theoretical mathematics. Now I find passion in helping develop DS teams and creating education tech content.