Using Software Engineering in Your Data Science Projects

Happy New Year everyone! I took a long break from blogging with all the vacations and end-of-year activities, but now I am happy to be back in the loop to deliver more insights about data science. At Microsoft HoloLens I have been diving head-first into the fun world of data, and being a software engineer, I am continuously looking for ways to improve code and structure data science projects for better reusability.

Recycle your code

One of the first things I noticed when creating my first Jupyter notebook data science project was that I kept performing the same analysis on different parts of the data. Following the software reusability tenet, I quickly converted the repeated parts of the code into well-named and well-documented functions and imported them into my Jupyter notebooks using the %run IPython magic command, which runs the Python code in the file you provide:

%run lib/decision_trees.py
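
To make this concrete, here is a minimal sketch of the kind of function such a file might hold. The function name and its contents are my assumptions for illustration, not the actual contents of lib/decision_trees.py; Gini impurity is simply a common calculation in decision tree work that tends to get repeated across notebooks.

```python
# lib/decision_trees.py (hypothetical contents, for illustration only)
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity of a sequence of class labels.

    0.0 means every label is the same; higher values mean a more mixed set.
    An empty sequence is treated as pure and returns 0.0.
    """
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # One minus the sum of squared class proportions.
    return 1.0 - sum((count / total) ** 2 for count in counts.values())
```

After the %run line above, a notebook cell could call gini_impurity(...) directly, because %run executes the file and leaves its names available in the notebook.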

Write libraries

Some of those functions were mission critical, i.e. decisions of the entire group or company would be made based on the numbers and insights I provided. There could be no room for error. While looking around, I did not see a lot of Jupyter notebooks written as software should be written: with tests. Fortunately, the work we did in the previous step, extracting reusable code blocks into their own functions, makes writing tests incredibly easy. If you do not know how to create unit tests or need a refresher, I would recommend taking a look at the book The Art of Unit Testing by Roy Osherove. It saved my skin once, maybe it will save yours. You could go even further and require the tests to be run any time you commit code to git, but I will talk about that in a future blog post.
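
As a rough sketch of what a test for an extracted function can look like, here is an example using Python's built-in unittest module. The conversion_rate function and its test cases are hypothetical, made up for illustration; in a real project the function would live in your library file and the tests in a separate test file.

```python
import unittest


def conversion_rate(converted, total):
    """Return the fraction of users who converted, as a float in [0, 1]."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= converted <= total:
        raise ValueError("converted must be between 0 and total")
    return converted / total


class ConversionRateTest(unittest.TestCase):
    def test_typical_case(self):
        self.assertAlmostEqual(conversion_rate(25, 100), 0.25)

    def test_rejects_empty_population(self):
        # Dividing by zero users should fail loudly, not return a number.
        with self.assertRaises(ValueError):
            conversion_rate(0, 0)

    def test_rejects_impossible_counts(self):
        # More conversions than users means the input data is wrong.
        with self.assertRaises(ValueError):
            conversion_rate(5, 3)
```

Saved to a file, these tests can be run with python -m unittest. The edge cases are the interesting part: a wrong number that quietly flows into a decision is worse than an error that stops the notebook.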

Find a way to get your data back

Imagine you performed an experiment. You found a really cool data insight. You committed the Jupyter notebook code to git and continued your life with cooler, fresher, more exciting projects. A couple of months later the team comes back to you: “We were wondering, did the statistic X change since you ran the experiment way back when?” You run back to the Jupyter notebook only to discover that while the code still works, you have no idea how you got and munged the data to run it. You have to spend an entire day reinventing the wheel of data extraction, which is beyond inefficient.

A lot of blogs out there advise putting the correctly shaped data together with the code you are committing. While this may work just fine for small public databases, when you have a couple hundred thousand rows with hundreds of columns of private user data, such a practice is not an option. Imagine cloning a git repo with hundreds of projects, each full of 1GB data files. I might as well go and play Witcher 3 now. Also, remember, your data may have to deal with NGP compliance and HIPAA and other blah blah important boring government edicts.

A much better way to deal with creating reusable data is to put all your data extraction and data munging into one script. This script could pull from your Hadoop storage, or a SQL server, or a publicly stored file and shape the data according to the needs of your insights extraction machine. Voilà!
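
Here is a minimal sketch of what such a script could look like. Everything specific in it is an assumption for illustration: the script name, the sessions table and its columns, and the use of Python's built-in sqlite3 module as a stand-in for whatever storage you actually pull from.

```python
# extract_data.py (hypothetical script name and schema, for illustration only)
import csv
import sqlite3

# Assumed query: the table and column names are invented for this sketch.
QUERY = """
    SELECT user_id,
           COUNT(*) AS session_count,
           AVG(duration_s) AS avg_duration_s
    FROM sessions
    WHERE duration_s IS NOT NULL
    GROUP BY user_id
    ORDER BY user_id
"""


def extract_rows(connection):
    """Pull and shape the data: one row per user with session statistics."""
    cursor = connection.execute(QUERY)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def write_csv(path, header, rows):
    """Write the shaped data to a local CSV file that stays out of git."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
```

A notebook, or a small main block in the script, would open a connection with sqlite3.connect(...), call extract_rows, and hand the result to write_csv. The point is that the query and the shaping steps are committed to git, while the data itself is not.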

In short

I could continue rambling about the perks of software engineering practices in data science for all eternity, but there is only so much time in the day. For now, remember to place your reusable Python code into functions, treat those functions like libraries and write unit tests for them, and finally, make sure you commit to git the steps you took to extract the data for your project, because in the end this will save you hours of work when you have to rerun your experiments. Go and play with data and be CodeBrave!


Spell: How to do hand tracking on HoloLens

I am finishing Off to Be the Wizard by Scott Meyer, a book about a guy who dabbles in programming, finds a file that allows him to modify reality, and travels to Medieval England to become a wizard. It is a fun and easy read, but the especially interesting part is how it describes the creation of magic through code. While reading, I could not stop thinking that it describes my experiences creating apps for HoloLens. You may not be modifying the real world per se in HoloLens, but you are seeing your hands do magic in the real world. Powerful stuff.

How can you create magic with HoloLens code?

HoloLens sees your hand as long as it is in front of the display (where the cameras are). How can you track your hand with code?


How to fight the nightly demons of doubt

Have you ever woken up in the middle of the night absolutely terrified about something you said/wrote/did? Did you offend someone? Are they going to think that you are stupid? Did you fail that compilers test? Have you lost your friend forever? Did you take the wrong path in life? Most likely one of those questions popped into your head at one time or another and kept bothering you for hours until you finally managed to get some sleep or until it was time to wake up and start the new day.

Those hours of self-torture used to leave me absolutely exhausted when I was at school. A perfectionist, I would spend all night debating if I should have answered a question worth 2 points on a 100 point test differently. How do you overcome this sleep sabotage and save some energy for the day instead of fighting demons of the night? Here are the things I discovered when I awoke after the sleepless nights of self-doubt that continue helping me to this day when the doubting mind decides to go rogue at night.


Recognizing Emotions with Cognitive Services

I love writing about the cool technologies I am working on, and being a machine learning buff in addition to being an AR/VR crazy head, today I am going to write about Microsoft Cognitive Services.

A lot of code has been created for us code creators, so we no longer have to spend weeks writing a loop to populate an array using assembly. This goes not only for low-level code, but for high-level code as well. Working for Microsoft as a dev, I can’t help noticing Microsoft's strategy of concentrating on cloud infrastructure and services. As a result, Microsoft is creating some really cool tools for Azure.


HoloLens: How far is the future seen in movies?

When thinking of the user interfaces of the future, what comes to your mind? I immediately start thinking of Tony Stark interacting with an epic holographic user interface while talking to Jarvis. Apparently this future is not very far off.

As promised in the post from the beginning of July, I am going to share what I have learned so far about HoloLens capabilities while hacking on an emulator during the Microsoft //oneweek hackathon.


Can you demo your creative work?

Today I am going to discuss something that has been at the top of my mind for the past several weeks.

How do you show what distinguishes you from other creators?

We are code brave. We have all been taught how to code at school, bootcamps, and Pluralsight tutorials. We learn very fast and code with a ton of unit tests. Our code looks beautiful and well-structured, and product managers love us for coding up their every whim (I certainly hope not!). We are all software businesswomen, and this should be enough, right? Is just knowing how to code and grok new technology enough to survive in this industry?


Spell: Let’s color the world with shaders!

After many Russian adventures, I am finally back home ready to discuss another cool technology and get the CodeBrave spirit running. Have you ever wondered how your 3D character is colored? How do the newest video games and animation studios achieve the perfect-looking shiny armor and soft leather? How does Cassandra from the Dragon Age Inquisition game screenshot below look so realistic?

Cassandra

Shaders. What kind of shady beasts are those?
