Using Software Engineering in Your Data Science Projects

Happy New Year, everyone! I took a long break from blogging with all the vacations and end-of-year activities, but now I am happy to be back in the loop to deliver more insights about data science. At Microsoft HoloLens I have been diving head-first into the fun world of data, and being a software engineer, I am continuously looking for ways to improve code and structure data science projects for better reusability.

Recycle your code

One of the first things I noticed when creating my first Jupyter notebook data science project was that I kept performing the same analysis on different parts of the data. Following the software reusability tenet, I quickly converted the repeated parts of the code into well-named and well-documented functions and imported them into my Jupyter notebooks using the %run IPython magic command, which runs the Python code in the file you provide:

%run lib/decision_trees.py
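
As a rough sketch of what a file like lib/decision_trees.py might hold, here is one small, documented function. The function name and its contents are hypothetical (the post does not show the actual library); the point is only that a repeated notebook calculation becomes a named function with a docstring.

```python
"""Hypothetical contents of lib/decision_trees.py (illustration only)."""
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity of a collection of class labels.

    0.0 means every label is the same class; higher values mean the
    labels are more mixed.
    """
    labels = list(labels)
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    counts = Counter(labels)
    # 1 minus the sum of squared class proportions.
    return 1.0 - sum((count / total) ** 2 for count in counts.values())
```

After running the %run line in a notebook cell, a function defined this way can be called directly from any later cell.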

Write libraries

Some of those functions were mission critical, i.e. decisions of the entire group or company would be made based on the numbers and insights I provided. There could be no room for error. While looking around, I did not see a lot of Jupyter notebooks written as software should be written – with tests. Fortunately, the work we did in the previous step, where we extracted reusable code blocks into their own functions, makes writing tests incredibly easy. If you do not know how to create unit tests or need a refresher, I would recommend taking a look at The Art of Unit Testing by Roy Osherove. It saved my skin once, maybe it will save yours. You could go even further and require the tests to be run any time you commit code to git, but I will talk about that in a future blog post.
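
Here is a minimal sketch of what such tests could look like with Python's built-in unittest module. The function under test, gini_impurity, is a hypothetical stand-in for one of your own extracted functions; it is defined inline only to keep the example self-contained, whereas in a real project it would be imported from your library file.

```python
import unittest
from collections import Counter


def gini_impurity(labels):
    """Hypothetical function under test; normally imported from your library."""
    labels = list(labels)
    if not labels:
        raise ValueError("labels must not be empty")
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


class GiniImpurityTest(unittest.TestCase):
    def test_single_class_has_zero_impurity(self):
        self.assertEqual(gini_impurity(["ghost", "ghost"]), 0.0)

    def test_even_two_class_split_is_one_half(self):
        self.assertAlmostEqual(gini_impurity(["ghost", "human"]), 0.5)

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            gini_impurity([])
```

Saved as a file such as test_decision_trees.py (an assumed name), the tests can be run with python -m unittest from the project folder.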

Find a way to get your data back

Imagine you performed an experiment. You found a really cool data insight. You committed the Jupyter notebook code to git and continued your life with cooler, fresher, more exciting projects. A couple of months later the team comes back to you: “We were wondering, did the statistic X change since you ran the experiment way back when?” You run back to the Jupyter notebook only to discover that while the code still works, you have no idea how you obtained and munged the data to run it. You have to spend an entire day reinventing the wheel of data extraction, which is beyond inefficient.

A lot of blogs out there advise putting the correctly shaped data together with the code you are committing. While this may work just fine for small public databases, when you have a couple hundred thousand rows with hundreds of columns of private user data, such a practice is not an option. Imagine cloning a git repo with hundreds of projects full of 1GB data files per project. I might as well go and play Witcher 3 now. Also, remember, your data may have to deal with NGP compliance and HIPAA and other blah blah important boring government edicts.

A much better way to deal with creating reusable data is to put all your data extraction and data munging into one script. This script could pull from your Hadoop storage, or a SQL server, or a publicly stored file and shape the data according to the needs of your insights extraction machine. Voilà!
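
A minimal sketch of such a script is below, assuming the data lives in a SQL database reachable through a standard Python DB-API connection (sqlite3 stands in for the real server here). The table name, column names and the per-user aggregation are all hypothetical; the idea is simply that extraction and munging live in committed code, while the data itself stays out of git.

```python
import sqlite3

# Hypothetical query: the table and column names are assumptions for illustration.
QUERY = """
    SELECT user_id, session_seconds
    FROM sessions
    WHERE session_seconds IS NOT NULL
"""


def extract(connection):
    """Pull the raw rows needed for the analysis from the data store."""
    return connection.execute(QUERY).fetchall()


def munge(rows):
    """Shape raw rows into total session minutes per user."""
    minutes_per_user = {}
    for user_id, seconds in rows:
        minutes_per_user[user_id] = minutes_per_user.get(user_id, 0.0) + seconds / 60
    return minutes_per_user


def build_dataset(connection):
    """Run extraction and munging in one repeatable step."""
    return munge(extract(connection))
```

Rerunning the experiment months later then comes down to opening a connection and calling build_dataset again, rather than reconstructing the steps from memory.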

In short

I could continue rambling about the perks of software engineering practices in data science for all eternity, but there is only so much time in the day. For now, remember to place your reusable Python code into functions, treat those functions like libraries and write unit tests for them, and finally, make sure you commit to git the steps you took to extract the data for your project, because in the end this will save you hours of work when you have to rerun your experiments. Go and play with data and be CodeBrave!

Poltergeist Statistics: Correlation Coefficient with pandas and numpy

As a data scientist in training, I get to do a lot of exploratory analysis these days, examining different variables in data and seeing how they may be related. There is a nice little trick you can do with data to understand this relationship, which comes from the magical field of statistics.

Imagine that as a part of your education at Hogwarts, you need to measure how ghosts and the messiness at Hogwarts are related. You buy some sort of ghost metronome in Diagon Alley to measure the presence of the bodiless undead. As your messiness measure, you break into Filch’s office and steal his meticulously collected records of the messes he had to clean up at Hogwarts.
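
To make the idea concrete, here is a small sketch of the Pearson correlation coefficient written with only the Python standard library. The post's title points to pandas and numpy, whose Series.corr and numpy.corrcoef compute the same quantity; the plain-Python version is used here just to show the arithmetic. The ghost and mess numbers are made up for illustration and are not data from the post.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (un-normalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up readings: ghost metronome counts and Filch's mess counts per day.
ghost_readings = [1, 2, 3, 4, 5]
mess_counts = [2, 4, 6, 8, 10]
correlation = pearson_correlation(ghost_readings, mess_counts)
```

A value near 1 means the two measures rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.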

Strata, Mindless tasks and why I am in data science

This week I went to the Strata conference in New York and returned with the warm, confirmed feeling that this is where I belong, just like when I moved from Idaho to Washington a year ago. So far, working on the HoloLens team has been an exciting ride, learning things about software engineering, myself and others. Throughout this journey I realized something very important that made the transition to data science only more meaningful.

Features do not mean use

As software engineers, we build many features daily. We deploy to production, write dozens of tests, prepare reports… Sometimes it is a lot of busy work that does not translate into value. And while we do this, our companies pay us quite a bit of cash. I do not know about you, but I feel an extreme pang of guilt and an imposter syndrome-y feeling when I am doing something at work that I know is unlikely to bring value to the team and the company. Working on a couple of anti-features at all the companies I have worked for made me ask a question. Is there a better way? Can I spend my time on something that will bring the information forward and help the team make a decision?

That is when I started to realize that the answer is in the data. Pure engineering, with little data to justify that this is the right feature to build, is moot.

Introverts and Why Open Office Culture Needs to Die

I am about to finish Quiet by Susan Cain and I cannot express how thrilled I am to have found this book! If you have not read it, or “have it on your list” like I did for many years, drop everything and go get that book (finish reading the post though). Especially if you are an introvert. This book will change your worldview. It changed mine because I suddenly realized what has been bothering me for the past few years working in the software industry.

Prototyping with HoloLens Emulator: Class Notes Edition

Have you been waiting for that letter from Hogwarts to invite you to become a wizard? Then you are in the right place! In this blog post I will guide you through detailed steps of becoming a magician and learning how to develop your first prototypes for HoloLens. HoloLens is a holographic computer through which you can see the real world and modify it with the power of code.

You will create a prototype for a HoloLens application that can be used in music concerts and festivals. In this prototype, the music fans will be able to discover musical instruments in the room where bands will be playing and interact with the musical instruments to hear music samples.
 
You will learn how to:
– Set up a Windows Holographic project in Unity
– Use spatial mapping to place a hologram of a guitar in the room
– Map a tap gesture to the guitar hologram to interact with it and open a UI menu
– Map a tap gesture to the UI menu to interact with it and play spatial sound
– Deploy the prototype onto the HoloLens emulator

Part 0. Installation Chapter

What do you need to install to start developing prototypes for HoloLens?

For more details on how to install the tools, please consult the Microsoft HoloLens page.

Part 1. Hello, Holocube!

Once your tools are installed, we are ready to create a Unity project for the HoloLens application!

  1. Launch Unity.
  2. Create a new project.
    1. Click on New to create a new Unity project.
    2. Select 3D and name the project IntroToHoloLens.
    3. Click on the Create project button.
  3. Add HoloToolkit to the project.
    1. Download the HoloToolkit-Unity Asset Package (I have Unity 5.5.2f1 installed, so I downloaded HoloToolkit-Unity for Unity 5.5.2f1+).
    2. Import HoloToolkit package into your project: Assets->Import Package -> Custom Package.
    3. Select the HoloToolkit asset package you just downloaded and click on Import.
    4. You should now have a HoloToolkit menu item in Unity.
    5. Set up Project settings for your HoloLens project (if it prompts you to restart Unity, do so): HoloToolkit -> Configure -> Apply HoloLens Project Settings.
    6. Set up scene settings for your HoloLens project: HoloToolkit -> Configure -> Apply HoloLens Scene Settings.
  4. Set up your first scene.
    1. Create a new Scene: File -> New Scene.
    2. Remove the default Main Camera and Directional Light objects.
    3. Add the HoloLensCamera.prefab (found under HoloToolkit/Input/Prefabs).
    4. Add a Cube object to the scene: Create->3D Object -> Cube.
    5. Set the position of the Cube to (0, 0, 5) and rotation to (45, 45, 45).
  5. Build.
    1. Open Build settings: File -> Build Settings…
    2. Click on Add Open Scenes.
    3. Click on Windows Store.
    4. Click on Switch Platform.
    5. Enable Unity C# projects.
    6. Click on Build.
    7. Create a new folder, name it App and click on Select Folder.
  6. Deploy.
    1. Once the build completes, you should see a new explorer window pop up. Go to the App folder you created in the previous step, and double click on IntroToHoloLens.sln. We can now deploy the application to the HoloLens emulator!
    2. In Visual Studio, switch Debug to Release, ARM to x86 and Local Machine to HoloLens Emulator.
    3. To start the emulator, click on Debug -> Start Without Debugging.
  7. Enjoy.
    1. When the emulator launches and your project gets deployed, you will see your first hologram. Hello, Cube!

Part 2. Your Input matters – even in the emulator!

  1. Add the Cursor.prefab (found under HoloToolkit/Input/Prefabs/Cursor).
  2. Create an empty object in your scene. Rename it ‘Managers’.
  3. Add the InputManager.prefab (found under HoloToolkit/Input/Prefabs) as a child to your new ‘Managers’ Object.
  4. Let’s add the functionality to move the Cube using the power of HoloLens Tap gesture!
    1. Click on the Cube in the Hierarchy list, and look in the Inspector panel for a conspicuous button named ‘Add Component’.
    2. Type ‘GuitarInputManager’ to create a new C# script that will handle interaction with the Cube.
    3. Open the script in Visual Studio and replace it with the following code:
  5. using HoloToolkit.Unity.InputModule;
     using UnityEngine;

     // Handles tap (click) input on the object this script is attached to.
     public class GuitarInputManager : MonoBehaviour, IInputClickHandler
     {
         public virtual void OnInputClicked(InputClickedEventData eventData)
         {
             Debug.Log("Input clicked");
         }
     }
    
  6. Since we are deploying to an emulator, and not a real device, we will not be able to see the environment surrounding us. However, we can emulate the room using the emulator’s pre-uploaded room scans and visualize them via mesh drawing.
    1. To enable spatial mapping in your scene, add the SpatialMapping.prefab (found under HoloToolkit/SpatialMapping/Prefabs) to your ‘Managers’ object.
    2. Enable the SpatialPerception capability: Edit/Project Settings/Player -> Inspector -> Publishing Settings/Capabilities.
  7. Finally, let’s build and deploy the application to the emulator following steps 5-6 in Part 1! In the emulator you will see the cube surrounded by surreal looking triangles, the spatially mapped room. You can click on the cube using the right mouse button to simulate the air tap gesture, and in the Debug window of Visual Studio you should see the “Input clicked” text printed out.

Part 2. Where is my guitar asset?

At this point you are probably wondering: “Wait, you promised there would be a guitar and music playing and we would have a wizard party with martinis.” And you are right, I did promise some of those things. Let’s implement them!

  1. Find the guitar asset in the Unity Asset Store and import it into the project (uncheck the Scenes folder when importing, since we already have a cool scene in our project). Note that to download an asset from the Unity Asset Store into your project, you need a Unity account.
  2. It’s time to replace the boring cube with a cool guitar.
    1. Delete the cube object, and drag the Guitar.prefab (found under Assets\Prefabs) into the Hierarchy panel.
    2. Set the guitar Position to (0, 0, 5) and Scale to (0.5, 0.5, 0.5). Leave the rest at the defaults.
  3. With Guitar selected, add the GuitarInputManager component to Guitar.
  4. Add a Mesh Collider component to Guitar.
  5. You can test deploy the app once again to make sure that the guitar shows up correctly, is tappable, and that the ‘Input clicked’ message is displayed in the debug window.
  6. The Input clicked message is not the most exciting or realistic feature a guitar can have, so let’s add something cooler and more useful. For this tutorial, when the Guitar object is tapped, information about a band will be shown, giving the user the option to play a music sample for the band. Off we go! Let’s add the band info panel.
    1. In the Hierarchy panel, click on Create->UI->Text. This will create a Canvas object with a Text object as its child as well as the EventSystem object to handle UI events.
    2. Change the Canvas object to map to World Space, and have 10 Dynamic Pixels Per Unit to make sure the text is displayed with higher resolution.
    3. Change the Canvas Position, Width, Height and Scale properties to match the image below. [Image: 3-canvas]
    4. Change the Text Position, Width, Height, Scale, Text and Font Size properties to match the image below. [Image: 3-text]
    5. The last UI component we need to add is a button that will allow us to play the music sample for the band. Right click on the Canvas object, and select UI -> Button to add the button.
    6. Change the button Position, Width, Height, Scale and Color settings to match the image below. [Image: 3-button]
    7. Expand the button object, click on its Text object, and set the button text to a cute little Play triangle character ▶.
  7. Build and deploy to see a Guitar and band info billboard appear in your virtual room! Clicking the button will not do anything right now. We will add that functionality in the next steps.

Part 3. Time to have a wizard guitar party

Now that we have a guitar and a UI menu for playing music samples, it is time to wire up the code to display the menu when the guitar is tapped and play a music sample when the Play button is pressed.

  1. Modify the GuitarInputManager code to add a billboard game object, an AudioSource object and an IsMusicPlaying flag. Add a PlaySample function to play the music when the play button is clicked. The final version of GuitarInputManager should look like this:
  2. using HoloToolkit.Unity.InputModule;
     using UnityEngine;

     public class GuitarInputManager : MonoBehaviour, IInputClickHandler
     {
         // Assigned in the Inspector: the band info Canvas and the guitar's AudioSource.
         public GameObject billboard;
         public AudioSource BandSample;

         private bool IsMusicPlaying;

         // Show the band info billboard when the guitar is tapped.
         public virtual void OnInputClicked(InputClickedEventData eventData)
         {
             Debug.Log("Input clicked");
             billboard.SetActive(true);
         }

         // Toggle between playing and pausing the music sample.
         public void PlaySample()
         {
             IsMusicPlaying = !IsMusicPlaying;
             if (IsMusicPlaying)
             {
                 BandSample.Play();
             }
             else
             {
                 BandSample.Pause();
             }
         }
     }
    
  3. Add an AudioSource component to Guitar and set AudioClip to the Across the Universe midi music file, which can be downloaded here. Make sure to uncheck “Play on Awake”.
  4. Drag the Audio Source created in step 3 onto the GuitarInputManager’s BandSample field. Drag the Canvas object created in Part 2 onto the Billboard field.
  5. Disable the Canvas object in the Inspector.
  6. Add spatial sound capability so that when we turn away from the guitar, the sound comes from the correct location, and being far away from the guitar makes the music sound quieter, just like in the real world.
    1. Set the MS HRTF Spatializer in your audio settings:
      Edit -> Project Settings -> Audio -> Spatializer.
    2. In the Guitar Inspector settings, check Spatialize to enable spatial sound and drag the Spatial Blend slider all the way to the right for 3D sound.
  7. Finally, select the Button object and look at the Inspector panel. Let’s map the OnClick event at the bottom of the Inspector panel to the PlaySample function we created in step 1. Select the Guitar object, and in the function drop down choose GuitarInputManager->PlaySample.
  8. We are all set for the final build, deploy and enjoy! When you launch the application, the menu will be hidden. Try tapping on the guitar, and the menu will appear. Tapping on the play button will start the iconic Across the Universe track by The Beatles.

Congratulations! You have taken the first step to becoming a HoloLens wizard and filling this world with holograms.

How’s Seattle?

I moved from Idaho to the greater Seattle area for a job over half a year ago (wow, it has been seven months already!), and when I travel to Boise the first thing I get asked is ‘how is Seattle?’ Given the frequency of this question, I decided to do a small write-up of the good and the bad of living in the Seattle area as I have experienced it.

Cost of living

Let’s get this out of the way first. We had a house in Boise, we sold it, and still could not afford a house close to work. Housing here is expensive, and if you are used to living close to work or downtown, you have to rent to save up.

Women in tech, women working and women’s rights

This was one of the biggest reasons why I fled Idaho. The number of technical women I knew and worked with was very low. Way too many women left the workforce in droves to raise children, leaving me with no mentors to learn from and no path to follow. And way too many men in the Idaho government messed with women’s reproductive rights. It was extremely depressing.

For the first time since 2007, when I worked for minimum wage doing filing and answering international student emails, I have a rockstar woman boss. There are also a ton of mentors who I can learn from and look up to. Women stay in the workforce and rock it. Plus, the Seattle area is as blue as anything can be, shutting down any man who opens his mouth to talk about cancelling abortions. The Seattle area is truly a heaven for a career-oriented technical woman.

Food is amazing

40% of Bellevue, the city where I currently live, is a minority race or ethnicity. A huge and pleasant change from very non-diverse Idaho. Moving to Bellevue came with an awesome perk: the food. I love the availability of Thai, Indian and Japanese restaurants here. While there were a couple of good non-American restaurants in Boise, I had never tried good Mexican or Italian food before moving to Bellevue! Now the foodie in me is very happy.

Things to do

That’s a big one. If you look in the Stranger, a Seattle alternative newspaper, a concert is taking place every day. And most importantly, the concerts feature the bands I actually like. Metal, alternative rock, faerie folk music – it is all here. In the 10 years I lived in Boise, only one of my favorite bands was scheduled to perform (Nightwish), and lo and behold – they cancelled the performance the day before.

The nature freak in me is ecstatic too. The Evergreen State has the most amazing scenery and beautiful hikes with the cleanest seashore air you will ever experience.

Verdict

Moving to Seattle was one of the most awesome decisions my husband and I made. Given we are liberal career-oriented technical people who love good food, fun music/geek events and hikes in nature, this is a haven.

Here is my advice to people who are feeling stuck/trapped/depressed in Idaho: move to where you will feel at home. Having a giant cheap home where you are alienated because of your beliefs, culture, or gender is simply not worth it.

Everybody lies, constantly

Do people know what they are talking about?

Understanding that most of the students who confidently discussed the awesome projects they were working on were actually bluffing was the biggest realization in my life.

When I was a master’s student working in the machine learning and artificial intelligence fields, I ran out of printer paper in my lab and had to walk to the undergrad student lab, where a couple of students were working on a project. As they worked, one of the students recited facts about machine learning and the others nodded, obviously fascinated by his extensive knowledge. None of the facts he claimed with such confidence were true or had any solid ground, yet the students sitting next to the talkative guy did not know any better and believed what he had to say.

I walked out of that lab with my belief system shattered. Since then I have doubted anything a man says to me with confidence. The answer is no, people do not know what they are talking about.

What about experts?
