Journey of Analytics

Deep dive into data analysis tools, theory and projects

Category: career

Career resources

DataScience Portfolio Ideas for Students & Beginners

A lot has been written on the importance of a portfolio if you are looking for a DataScience role. Ideally, you should document your learning journey so that you can reuse code, write well-documented code and also improve your data storytelling skills.

However, most students and beginners get stumped on what to include in their portfolio, as their projects are the same ones that their classmates, bootcamp associates and seniors have created. So, in this post I am going to tell you what projects you should have in your portfolio kitty, and give you a list of ideas that you can use to construct a collection of projects that will help you stand out on LinkedIn, GitHub and in the eyes of prospective hiring managers.

You can find many interesting projects on the “Projects” page of my website JourneyofAnalytics. I’ve also listed 50+ sources for free datasets in this blogpost.

In this post though, I am classifying projects based on skill level along with sample ideas for DIY projects that you can attempt on your own.

On that note, if you are already looking for a job, or about to do so, do take a look at my book “DataScience Jobs“, available on Amazon. This book will help you reduce your job search time and quickly start a career in analytics.

Since I prefer R over Python, all the project lists in this post will be coded in R. However, feel free to implement these ideas in Python, too!

a. Entry-level / Rookie Stage

  1. If you are just starting out, and are not yet comfortable even with the syntax, your main aim is to learn how to code along with DataScience concepts. At this stage, just try to write simple scripts in R that can pull data, clean it up, calculate mean/median and create basic exploratory graphs. Pick any competition dataset on Kaggle.com and look at the highest voted EDA script. Try to recreate it on your own, then read through the code and understand its hows and whys. One excellent example is the Zillow EDA by Philipp Spachtholz.
  2. This will not only teach you the code syntax, but also how to approach a new dataset and slice/dice it to identify meaningful patterns before any analysis can begin.
  3. Once you are comfortable, you can move on to machine learning algorithms. Rather than Titanic, I actually prefer the Housing Prices Dataset. Initially, run the sample submission to establish a baseline score on the leaderboard. Then apply every algorithm you can look up and see how it works on the dataset. This is the fastest way to understand why some algorithms work on numerical target variables versus categorical versus time series.
  4. Next, look at the kernels with decent leaderboard score and replicate them. If you applied those algorithms but did not get the same result, check why there was a mismatch.
  5. Now pick a new dataset and repeat. I prefer competition datasets since you can easily see how your score moves up or down. Sometimes simple decision trees work better than complex Bayesian logic or XGBoost. Experimenting will help you figure out why.
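
The "pull data, clean it up and calculate mean/median" step above can be sketched in a few lines. The post works in R; this is a minimal Python equivalent using only the standard library. The inline CSV, the column names and the helper functions `load_numeric_column` and `summarize` are all hypothetical stand-ins, not taken from any real competition dataset.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a downloaded competition CSV.
RAW_CSV = """id,sale_price,lot_area
1,200000,8000
2,,9500
3,150000,7000
4,310000,NA
5,180000,6500
"""


def load_numeric_column(csv_text, column):
    """Return one column as floats, skipping blank or 'NA' cells."""
    values = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row.get(column) or "").strip()
        if cell and cell.upper() != "NA":
            values.append(float(cell))
    return values


def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


prices = load_numeric_column(RAW_CSV, "sale_price")
price_summary = summarize(prices)
print(price_summary)  # count 4, mean 210000.0, median 190000.0
```

Dropping missing rows is only one possible cleaning choice; on a real dataset you would want to check how many rows are lost before deciding between dropping and imputing.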

Sample ideas –

  • Survey analysis: Pick up a survey dataset like the Stack Overflow developer survey and complete a thorough EDA – men vs women, age and salary correlation, cities with highest salary after factoring in currency differences and cost of living. Can your insights also be converted into an eye-catching infographic? Can you recreate this?
  • Simple predictions: Apply any algorithms you know on the Google Analytics revenue predictor dataset. How do you compare against the baseline sample submission? Against the leaderboard?
  • Automated reporting: Go for end-to-end reporting. Can you automate a simple report, or create a formatted Excel or PDF chart using only R programming? Sample code here.
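
To make the "compare against the baseline" question concrete, here is a minimal Python sketch (the post itself uses R). It assumes an error metric such as RMSE; the actual competition metric may differ. The target values, the predictions and the `rmse` helper are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical training targets, hold-out targets and model predictions.
train_targets = [9.0, 13.0, 15.0, 15.0]
actual = [10.0, 12.0, 14.0, 16.0]
model_predictions = [11.0, 12.0, 13.0, 16.0]

# Naive baseline: always predict the mean of the training targets.
baseline_value = sum(train_targets) / len(train_targets)
baseline_predictions = [baseline_value] * len(actual)

baseline_rmse = rmse(actual, baseline_predictions)
model_rmse = rmse(actual, model_predictions)

print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"model RMSE:    {model_rmse:.3f}")
print("beats baseline" if model_rmse < baseline_rmse else "does not beat baseline")
```

If a model cannot beat a constant prediction like this one, that is usually a sign of a data or feature problem rather than a reason to try a fancier algorithm.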

b. Senior Analyst/Coder

  1. At this stage simple competitions should be easy for you. You don’t need to be in the top 1%; even being in the top 30-40% is good enough. Although if you can win a competition, even better!
  2. Now you can start looking at non-tabular data projects like NLP sentiment analysis, image classification, API data pulls and even dataset mashups. This is also the stage when you probably feel comfortable enough to start applying for roles, so building unique projects is key.
  3. For sentiment analysis, nothing beats Twitter data, so get the API keys and start pulling data on a topic of interest. You might be limited by the daily pull limits on the free tier, so check if you need 2 accounts, and aggregate data over a couple of days or even a week. A starter example is the sentiment analysis I did during the Rio Olympics supporting Team USA.
  4. You should also start dabbling in RShiny and automated reports, as these will help you in actual jobs where you need to present idea mockups and standardize weekly/daily reports.

Sample ideas –

  • Twitter Sentiment Analysis: Look at the Twitter sentiments expressed before big IPO launches and see whether the positive or negative feelings correlated with a jump in prices. There are dozens of apps that look at the relation between stock prices and Twitter sentiments, but for this you’d need to be a little more creative since the IPO will not have any historical data to predict the first day dips and peaks.
  • API/RShiny Project: Develop an RShiny dashboard using the Yelp API, showing the most popular restaurants around airports. You can merge a public airport dataset with filtered data from the Yelp API. A similar example (with code) is included in this Yelp College App dashboard.
  • Lyrics Clustering: Try doing some text analytics using song lyrics from this dataset with 50,000+ songs. Do artists repeat their lyrics? Are there common themes across all artists? Do male singers use different words versus female solo tracks? Do bands focus on a totally different theme? If you see your favorite band or lead singer, check how their work has evolved over the years.
  • Image classification starter tutorial is here. Can you customize the code and apply to a different image database?
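
For the sentiment analysis idea, a simple starting point is lexicon-based scoring: count positive and negative words in each tweet. The sketch below is in Python rather than R and is only an illustration. The word lists, the sample tweets and the function names are hypothetical, and a real project would use a published sentiment lexicon and text pulled from the API.

```python
import re

# Tiny hypothetical lexicon; a real project would use a published one.
POSITIVE = {"great", "love", "win", "amazing", "proud"}
NEGATIVE = {"bad", "hate", "lose", "awful", "disappointing"}


def sentiment_score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical tweets standing in for data pulled from the API.
tweets = [
    "Love this team, what an amazing win!",
    "Awful start, I hate watching them lose.",
    "The race starts at noon.",
]

labels = [label(sentiment_score(tweet)) for tweet in tweets]
for tweet, tweet_label in zip(tweets, labels):
    print(f"{tweet_label:>8}: {tweet}")
```

Word counting ignores negation and sarcasm ("not great" scores as positive), which is one reason to compare it against a trained classifier once the basics work.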

c. Expert Data Scientist

  1. By now, you should be fairly comfortable with analyzing data from different datasource types (image, text, unstructured), building advanced recommender systems and implementing unsupervised machine learning algorithms. You are now moving from the analyze stage to the build stage.
  2. You may or may not already have a job by now. If you do, congratulations! Remember to keep learning and coding so you can accelerate your career further.
  3. If you have not, check out my book on how to land a high-paying ($$$) Data Science job within 90 days.
  4. Look at building deep learning models using Keras and apps using artificial intelligence. Even better, can you fully automate your job? No, you won’t “downsize” yourself. Instead your employer will happily promote you since you’ve shown them a superb way to improve efficiency and cut costs, and they will love to have you look at other parts of the business where you can repeat the process.

Sample project ideas –

  • Build an App: College recommender system using public datasets and web scraping in R. (Remember to check terms of service as you do not want to violate any laws!) The goal is to recreate a report like the Top 10 cities to live in, but from a college perspective.
  • Start thinking about what data you need – college details (names, locations, majors, size, demographics, cost), outlook (Christian/HBCU/minority), student prospects (salary after graduation, time to graduate, diversity, scholarship, student debt ) , admission process (deadlines, average scores, heavy sports leaning) and so on. How will you aggregate this data? Where will you store it? How can you make it interactive and create an app that people might pay for?
  • Upwork Gigs: Look at Upwork contracts tagged as intermediate or expert, esp. the ones with $500+ budgets. Even if you don’t want to bid, just attempt the project on your own. If you fail, you will know you still need to master some more concepts; if you succeed, it will be a superb confidence booster and learning opportunity.
  • Audio Processing: Use the VOX celebrity dataset to identify the speaker from audio/speech samples. Audio files are an interesting datasource with applications in customer recognition (think bank call centers to prevent fraud), parsing for customer complaints, etc.
  • Build your own package: Think about the functions and code you use most often. Can you build a package around it? The most trending R-packages are listed here. Can you build something better?
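
One simple way to start on the college recommender idea is a weighted score over a few normalized attributes. The Python sketch below is an assumption about how such a ranking could work, not a description of any existing app. The college records, field names and weights are all made up for illustration.

```python
# Hypothetical college records; a real app would aggregate these from
# public datasets.
COLLEGES = [
    {"name": "College A", "annual_cost": 20000, "median_salary": 50000, "grad_rate": 0.80},
    {"name": "College B", "annual_cost": 40000, "median_salary": 70000, "grad_rate": 0.90},
    {"name": "College C", "annual_cost": 10000, "median_salary": 40000, "grad_rate": 0.60},
]

# Hypothetical user preferences: a negative weight penalizes a field.
WEIGHTS = {"annual_cost": -0.4, "median_salary": 0.4, "grad_rate": 0.2}


def min_max_scale(values):
    """Scale values to the 0-1 range; a constant column maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)
    return [(value - low) / (high - low) for value in values]


def rank_colleges(colleges, weights):
    """Return (name, score) pairs sorted from best to worst weighted score."""
    scores = [0.0] * len(colleges)
    for field, weight in weights.items():
        scaled = min_max_scale([college[field] for college in colleges])
        for index, scaled_value in enumerate(scaled):
            scores[index] += weight * scaled_value
    ranked = sorted(zip(colleges, scores), key=lambda pair: pair[1], reverse=True)
    return [(college["name"], round(score, 3)) for college, score in ranked]


ranking = rank_colleges(COLLEGES, WEIGHTS)
for name, score in ranking:
    print(f"{name}: {score}")
```

Letting users adjust the weights themselves is a natural first interactive feature for a dashboard built on top of this.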

Do you have any other interesting ideas? If so, feel free to contact me with your ideas or send me a link to the GitHub repo.

How to Become a Data Scientist

This question and its variations are among the most searched datascience topics on Google. As a practicing datascience professional, and a manager to boot, I get asked this question by dozens of people every week.

This post is my honest and detailed answer.

Step 1 – Coding & ML skills

  • You need to master programming in either R or Python. If you don’t know which to pick, pick R, or toss a coin. [Or listen to me and pick R, as it is used at top firms like NASDAQ, JPMorgan and many more.] Also, when I say master, you need to know more than writing a simple calculator or “Hello World” function. You should be able to perform complex data wrangling, pull data from databases, write custom functions and apply algorithms, even if someone wakes you up at midnight.
  • By ML, I mean the logic behind machine learning algorithms. When presented with a problem, you should be able to identify which algorithm to apply and write the code snippet to do this.
  • Resources – Coursera, Udacity, Udemy. There are countless others, but these 3 are my favorites. Personal recommendation, basic R from Coursera (JHU) and Machine learning fundamentals from Kirill’s course on Udemy.

Step 2 – Build your portfolio.

  • Recruiters and hiring managers don’t know you exist, and having an online portfolio is the best way to attract their attention. Also, once employers do come calling, they will want to evaluate your technical expertise, so a portfolio helps.
  • The best way to showcase your value to potential employers is to establish your brand via projects on Github, LinkedIn and your website.
  • If you do not have your own website, create one for free using WordPress or Wix.
  • Stumped on what to post in your project portfolio?
  • Step 1 – Start by looking in the kernels section of www.kaggle.com, where tons of folks have leveraged free datasets to create interesting visualizations. Also enroll in any active competitions and navigate to the discussion forums. You will find very generous folks who have posted starter scripts and detailed exploratory analysis. Fork a script and try to replicate the solution. My personal recommendation would be to begin with the Titanic contest or the housing prices set. My professional website journeyofanalytics also houses some interesting project tutorials, if you want to take a look.
  • Step 2 – Pick a similar dataset from Kaggle or any other open source site, and apply the code to the new dataset. Bingo, a totally new project and ample practice for you.
  • Step 3 – Work your way up to image recognition and text processing.

Step 3 – Apply for jobs strategically.

  • Please don’t randomly apply to every single datascience job in the country. Be strategic and use LinkedIn to reach out to hiring managers. Remember, it’s better to hear “NO” directly from the hiring manager than to apply online and wait for an eternity.
  • Competition is getting fierce, so be methodical. Books like “Data Science Jobs” will help you pinpoint the best jobs in your target city, and also connect with hiring managers for jobs that are not posted anywhere else.
  • Yes, I wrote the book listed above – this is the book I wished I had when I started in this field! Unlike other books on the market with random generalizations, this book is written specifically for jobseekers in the datascience field. Plus, I’ve successfully helped a dozen folks land lucrative jobs (data analyst/data scientist roles) using the strategies outlined in this book. This book will help you cut your datascience job search time in half!
  • Upwork is a fabulous site to get gigs to tide you until you get hired full-time. It is also a fabulous way of being unique and standing out to potential employers! As a recruiter once told me, “it is easier to hire someone who already has a job, than to evaluate someone who doesn’t!”
  • If your first job is not your dream job, do not despair. Earn and learn; every company, big or small, will teach you valuable skills that will help you get better and snag your ideal role next year. I do recommend staying in a role for at least 12 months before switching, otherwise you won’t have anything impactful to discuss in the next interview.

Step 4 – Continuous learning.

  • Even if you’ve landed the “data scientist” job you always wanted, you cannot afford to rest on your laurels. Keep your skills current by attending online classes and conferences, and by reading up on tech changes.
    Udemy, again, is my go-to resource to stay abreast of technical skills.
  • Network with others to know how roles are changing, and what skills are valuable.

Finally, being in this field is a rewarding experience, and also quite lucrative. However, no one can get to the top without putting in sufficient effort and time. So, master the basics and apply your skills, and you will definitely meet with success.

If you are looking to establish a career in datascience, then don’t forget to take a look at my book – “Data Science Jobs”, now available on Amazon.

Data Science Job in 90 days – Book Review

Are you an R-programmer or Datascience enthusiast looking for a break in the datascience field? If so, my latest book “Data Science Jobs – land a lucrative job in 90 days” will help you find one quickly.

Imagine reducing your job hunt time by 2 weeks, or even 4 weeks. The strategies in this book are designed to do just that. [Amazon book link here.]

As an analytics manager I get countless requests for job search advice and resume feedback, often from brilliant students who are somehow unable to find a job in this exciting field. There are tons of books on the internet on how to learn the skills to become a data scientist/data analyst, but none to prepare folks for the frustrating job search.

I give the same advice in response to all these requests and am delighted to say that a dozen people were able to land their dream roles with companies like LinkedIn, Walmart Labs, Comcast and others. I decided to publish the book so others can also benefit from the same advice.

Target Reader Audience

  • Students with solid knowledge of programming in R or Python looking to find a role as a data analyst/scientist or BI developer. A background in computer science or math will help, but is not necessary.
  • International students on F-1/OPT visas looking for employment after a graduate degree in analytics.
  • Employed professionals looking to pivot their career, or seeking better pay/manager/location.

Book Summary

The book aims to provide you with creative techniques to get your resume directly in the hands of hiring managers, instead of relying hopelessly on online application systems that rarely produce a response. Don’t be fooled by the length of the book – it is deliberately kept short so that jobseekers can read through quickly and apply these principles in their job search.

The book chapters provide detailed guidelines on these broad themes:

  • Personal Branding – Create an online profile that helps you bubble up when hiring managers look for candidates. Make the jobs come to you! Tips to tweak your resume to achieve the same. For project inspiration, look at learning communities like R-bloggers, Kaggle, etc.
  • LinkedIn – Secret ways to leverage LinkedIn to engage hiring managers. Do NOT simply accept connections or indiscriminately apply to every open job. How to use LinkedIn to improve personal SEO!
  • Strategic Networking – How to actively reach out to the decision-makers who can hire you!
  • Niche sites – Hiring managers understand that the best venues to hire talent are the datascience communities where folks go to learn. The book lists these niche job boards on sites like R-bloggers, Kaggle.com and many others.
  • Upwork – Despite popular opinion (about the site’s ineffectiveness), this site is a quick way to earn money and position yourself for your dream role.
  • And many more…

In conclusion, this book is a condensed guide with practical strategies to make the job search process less stressful, and help readers quickly get hired. So get the ebook on Amazon, and get started on a lucrative career!

Top 5 Secret Tips to make the most of your online MBA

Online and executive MBAs are becoming more common and widely respected as more professionals balk at the hefty price tag of regular degrees, and the opportunity cost of leaving a cushy job.

With many employers also supporting tuition reimbursements, and colleges becoming better at supporting their online students, this is a trend that shows no signs of slowing down.

However, it can get lonely and many online students do not fully leverage the opportunities of an online MBA and often feel a little disillusioned at not being able to attract the networking groups that traditional in-person programs offer. All articles on how to make the most of an MBA seem catered towards those who have the luxury of doing their courses full-time.

So here are the top 5 tips to help you make the most of your online MBA. If you are taking any other online degree, then fear not, these techniques still apply.

Why are they “secret”? Because no one ever tells you, because “everyone” is supposed to know. But students never do, until it’s too late to do anything about it… So without further ado, here are the tips to maximize your online degree or MBA program.

  1. Time Management.
  2. Go beyond the minimum required coursework.
  3. Leverage LinkedIn to forge better connections.
  4. Tell your manager.
  5. Showcase your skills.

What makes me qualified to talk about online degree programs? I am 80% of the way through my online MBA while holding a full-time job and maintaining a ~3.6 GPA. I have received an A or B in almost all my classes. Despite not taking any classes last summer, I am on track to complete my MBA in less than 20 months, having started in Aug 2017. A lot of folks have asked me this question, so I figured a post might be helpful to others who don’t know me personally and are looking for the same info.

1. Time Management

If you are working full-time and/or have a family, then one of the first things you will notice is that you are pressed for time. BIG TIME. Do not worry, this is a skill you need to master as an MBA, and future manager. You will have to juggle and excel at working on multiple (and often conflicting) priorities, so it is best you learn this well now.

To me personally, studying was easy, finding time to do it was incredibly challenging. So how can you cope?

  1. First make a list of your daily routine. Include office hours, commute time, travel plans, kids’ activities, etc.
  2. Now assign time for studies. You may find that you need to delegate some stuff to your partner, older kids, or give up some items altogether. Or you may realize that taking 2 courses per semester is a stretch.
  3. Get creative. If you commute by train, can you read on the train? If you commute by car, there are software programs that can “read” textbook content as audio files, so you can listen in your car. Do you need to block 2 hours on the weekend when someone else can look after the kids, so you can head to the library to study in silence?
  4. Some employers are a little accommodating too. For example, one friend told me how his manager allowed him to book a huddle room for 2 hours per week, so he could sit and watch his class videos.
  5. I used to buy used textbooks in bad condition so I could literally tear off 10 pages of a chapter and carry them around everywhere. That way I could read while waiting in doctor’s clinics, for connecting trains, and once even in a serpentine queue at the post office. I had to bind some of these books before selling them back; others I donated. The “A” grades I received were worth the effort.
  6. Guard your time ruthlessly. Once you’ve found time for studying, do not use it for personal appointments, cleaning the house, getting a facial or any of the 101 things that we all have in our to-do list. Study time is for study ONLY.
  7. Create a calendar with deadlines for quizzes, discussions, papers due, etc. Keep additional deadlines 1 or 2 days before the due date, so you have buffer for completing them.
  8. Know the scores needed to make the grade. If your employer is paying partially or fully, know what the cutoff is. Some employers want a B+ or above, others will settle for a C or above. However, colleges also have a mandatory threshold for grad students, and you can go on probation or have to pay for recovery grade classes if you receive less than a B in too many classes.
  9. If your class allows, try to complete as much work in the first 2-3 weeks, so you have leeway to lose scores in the midterms and finals, which are typically harder. For my first class, I realized A-grade was scored at 95%! What????!! I lost 20 points out of 1000 in the first quiz itself, so I knew within 2 weeks that an A for that class was not possible. However, I did make it to a B (85%-95%) and thanks to an optional assignment, just scraped into the A-grade. Being familiar with your syllabi is crucial to get this done.

2. Go beyond minimum coursework.

This may sound ironic, given that finding time for regular work itself is challenging. However, you can still do it by being smart about your work and your time.

  1. For online discussions, if you have to post 2 peer responses, post 3. Most colleges have a smartphone app, so you can easily do this in small chunks of time (lunchbreak or waiting for a boring meeting to start).
  2. Look for responses that are completely counter to your argument, so you can really view the topic from a fresh perspective.
  3. Do try to work on at least a couple of self-assessment questions from the back of the chapters, and look them up on Google Scholar or regular Google. If in doubt, ask your teaching assistant or professor via email. Most of them will be delighted to help you understand.

3. Leverage LinkedIn.

This is a no-brainer, but I despair at the people who still fail to follow it.

  1. Update your LinkedIn profile with the degree program you are pursuing. If you don’t have a LinkedIn profile, then for heaven’s sake create one.
  2. All major colleges have student groups, and alumni groups. Add yourself to both. Then contribute meaningful conversations, without spamming folks.
  3. A lot of professors, assistant professors and teaching assistants are also on LinkedIn. So connect with them, by adding a note about the class you took with them. Professors rarely refuse.
  4. Look for classmates and alumni who work in your industry/ company/ area, and invite them to connect. Make sure to add a note about being fellow alumni. Perhaps you could even ask them which courses they liked or found most challenging. Most folks are very generous on LinkedIn.
  5. Add your LinkedIn URL (customize it, please) to your signature, and add the link to all the emails and conversations you have in school. This includes introductions (every class will have this) and posts. Don’t add it to peer responses if you feel hesitant, but definitely include it in emails, group project conversations, etc. Online learning platforms don’t allow this very easily, so I had it pasted as a note on my desktop and manually added it to every introductory conversation. If this sounds creepy and self-pushing to you, let me add that I have completed 8 classes with a 3.6 GPA and to date no one has ever called me out on it. The 50+ classmates who connected with me on their own made it totally worth it, though.
  6. For introductions, read through what others are doing and saying. Even if they are not in your location, do say hi, and request to connect. You never know who will be able to help out whom. For example, I was able to connect 7 of my classmates from different courses (3 pairs basically) because they lived in the same area, and 2 worked for the same company and location, without knowing each other. Many others have helped me understand concepts and with homework when I was struggling, and one motivated me via daily LinkedIn messages when I was feeling completely overwhelmed.
  7. Aim to connect with at least 5 people from each class. Plus, as with all LinkedIn connections, stay connected beyond the class. Send them a hello for New Year or Thanksgiving and congratulate them on role changes, birthdays and so on.
  8. If you work with folks on group projects, then do send them a request to connect beyond the class. These folks are at least a good source for skill endorsements and recommendations.

4. Tell your manager.

  1. Irrespective of whether your employer is reimbursing your course or not, do tell them about the course. Tell them your hopes and expectations from the course. Most of us do hope for a promotion, and salary hike, so having your manager in the loop helps.
  2. Do not tell your manager about your MBA as a threat; and definitely do not use it as a hostile negotiation tactic. Instead tell them that you are looking to improve your skills and how you hope your new skills will help you increase your value in the team. You may be surprised how happy your manager is with your proactive nature, and may even offer you additional projects to help you apply your skills.
  3. If possible, tell your manager before or during the application process itself, as they may be able to tell you about partner universities, or help you connect with others who have taken similar courses. My manager (at Nasdaq) did tell me about a great subsidized program, at a college right across the street, although I finally ended up joining a totally different program. However, he did introduce me to 2 amazing colleagues in unrelated departments who were also pursuing executive MBAs (at a different university). Their tips helped me navigate my program more efficiently.
  4. Keep your manager updated about courses, so they know how you are doing and all the fantastic skills you are picking up! You don’t really need to tell them exact grades if you don’t want to, but if you got good ones, TELL. Believe me, it will come up in conversations with their peers and seniors, and you will be glad to have a positive note worth sharing.
  5. Obviously, your work should take priority over the degree, since your pay depends on it. But should a problem arise, your manager at least knows you have other deadlines, and may help you when you are crunched during an important final. Don’t make a habit of it, or your manager may question if you really should be pursuing the course, or if you really have the ability to take on additional responsibilities.

5. Showcase your skills.

This is crucial, yet I am amazed how many people never bother doing it.

  1. LinkedIn allows you to add courses, so add ones that will probably work as keywords. This is separate from your educational qualifications. For example, my profile lists data analysis, financial accounting for managers, and strategy management, in addition to the MBA listed in my education section. As a strategy/risk analyst, these are quite relevant.
  2. You will be doing projects for courses, so add a summary in the projects section. If your classmates are on LinkedIn, tag them as well. This helps to boost your profile.
  3. This article from Udacity graduate Nirupama has excellent tips on how to make the most of project-based courses. It was written for MOOCs, but translates very well for courses from online degree programs.
  4. Add skills. LinkedIn allows you to add up to 50 skills, so make sure yours are the most relevant and important, from the perspective of your current role, and the role you want to get into. Plus, you can ask your classmates to endorse you for these skills, as they know firsthand that you worked to learn them during the course. Remember to return the favor to them as well. I normally endorse the top 5 skills, and ask them if there are specific ones that they want to bubble up to the top. (LinkedIn has some sort of algorithm, so the ones with the most votes generally show up on top. You can also re-order them manually.)
  5. Ask classmates to add recommendations.
  6. Ask professors/ teaching assistants to recommend you, if you did particularly well in class. I did have one professor who said he doesn’t recommend people on LinkedIn, since he has thousands of students, but he did send me a very nice email note, and agreed to serve as a reference source should I need one. Not ideal, but quite helpful.

Those were the tips that I found most helpful to ace my MBA, and hopefully should make the journey easier for you as well. So study well, and enjoy your program!
