This week I attended the Strata conference in New York and returned with the warm, confirmed feeling that this is where I belong, just like when I moved from Idaho to Washington a year ago. So far, working on the HoloLens team has been an exciting ride, learning things about software engineering, myself, and others. Throughout this journey I realized something very important that made the transition to data science all the more meaningful.
Features do not mean use
As software engineers, we build many features daily. We deploy to production, write dozens of tests, prepare reports… A lot of busy work that often does not translate into value. And while we do this, our companies pay us quite a bit of cash. I do not know about you, but I feel an extreme pang of guilt and an imposter-syndrome-y feeling when I am doing something at work that I know is unlikely to bring value to the team and the company. Working on a couple of anti-features at every company I have worked for made me ask a question. Is there a better way? Can I spend my time on something that will bring information forward and help the team make a decision?
That is when I started to realize that the answer is in the data. Pure engineering, with little data to justify that this is the right feature to build, is moot.
What is this data science you are talking about?
My time on the Microsoft Analog team and at the Strata conference showed me that data science is more than just a catchphrase.
Let’s take Netflix as an example. You have millions of events coming from your product. Every click on a movie, how much of it you watched, whether you watched it to the end, how long you stayed browsing the “Sci-fi movies” group – all this telemetry is collected like a myriad of little flowers into some type of storage. Those events then have to be aggregated, shaped, and made sense of.
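As a rough sketch of what that aggregation step looks like, here is a minimal example in Python. The event schema, field names, and the 95% "completion" threshold are all my own assumptions for illustration – the real Netflix pipeline is of course far larger and not public.

```python
from collections import defaultdict

# Hypothetical raw telemetry events: each one records a user, a movie,
# and how far into the movie they watched (0.0 to 1.0).
events = [
    {"user": "u1", "movie": "Alien", "watched_fraction": 1.0},
    {"user": "u2", "movie": "Alien", "watched_fraction": 0.4},
    {"user": "u1", "movie": "Solaris", "watched_fraction": 0.9},
]

def aggregate_by_movie(events):
    """Shape raw events into per-movie stats: view count and completions."""
    stats = defaultdict(lambda: {"views": 0, "completions": 0})
    for e in events:
        s = stats[e["movie"]]
        s["views"] += 1
        # Assumed heuristic: watching at least 95% counts as finishing.
        if e["watched_fraction"] >= 0.95:
            s["completions"] += 1
    return dict(stats)

print(aggregate_by_movie(events))
```

In a real system this shaping would happen in a batch or streaming pipeline over storage, not in memory, but the idea is the same: raw clicks in, business-meaningful aggregates out.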
Of course, just collecting the data and not trying to make sense of it gets you nowhere. You need to understand what questions, crucial to your business, you would like to ask of the data. Otherwise, you will have a knowledge dump of charts and tables that do not mean anything. This is where the Lean Analytics book was a great read.
Yet sometimes the data in front of you does not make any sense because there was an anomaly, or something went really wrong in your system of products and users. You need to go and do an ad-hoc investigation, following the breadcrumbs like Sherlock Holmes (my awesome manager’s trademarked analogy for the process) and finding an insight that may be a turning point for the entire business. I love this feeling!
Where I am going
As a data science practitioner, it is very important to understand which stages of the process make the data available to you for querying and discovery. I have worked at all the stages of that process since I graduated from college, and I am really excited to start on a new journey: to dig deeper into the data, analyze and understand the anomalies, and bring value to the business and the team.
For those of you who are curious as to why this post exists at all in the context of my own life: there has been a reshuffling in my team that ended up bringing me closer to the ad-hoc and ML modeling side of data science. This is an opportunity that I am extremely grateful for – an opportunity that is close to my heart as a researcher driven by infinite curiosity to understand why things work the way they do.
With that, be CodeBrave and DataBrave!