Journey of Analytics

Deep dive into data analysis tools, theory and projects

Category: Data visualization (page 1 of 2)

Mapping Anthony Bourdain’s Travels

Travel maps tutorial

Anthony Bourdain was an amazing personality – chef, author, world traveler, TV showhost. I loved his shows as much for the exotic locations as for the yummilicious local cuisine. So I was delighted to find a dataset that included all travel location data, from all episodes of his 3 hit TV shows. Dataset attributed to Christine Zhang for publishing the dataset on Github.

In today’s tutorial, we are going to plot this extraordinary person’s world travels in R. So our code will cover geospatial data mapping using 2 methods:

  • Leaflets package to create zoomable maps with markers
  • Airplane route style maps to see the paths traveled.

Step 1 – Prepare the Workspace

Here we will load all the required library packages, and import the dataset.

places <- data.frame(fread(‘bourdain_travel_places.csv’), stringsAsFactors = F)

Step 2 – Basic Exploration

Our dataset has data for 3 of Bourdain’s shows:

  • No Reservations
  • Parts Unknown – which I personally loved.
  • The Layover

Let us take a sneak peak into the data:

dataset preview

How many countries did Bourdain visit? We can calculate this for the whole dataset or by show:

numshow <- sqldf(“select show , count(distinct(country)) as num_ctry from places group by show”) # Num countries by show.

numctry <- nrow(table(places$country)) # Total countries visited
numstates <- nrow(table(places$state[places$country == ‘United States’])) ## Total states visited in the US.

Wow! Bourdain visited 93 countries overall, and 68 countries for his show “No Reservations”. Talk about world travel.

I did notice some records have state names as countries, for example California, Washington and Massachussets. But these are exceptions, and overall the dataset is extremely clean. Even disregarding those records, 80+ countries is nothing to be scoffed at, and I had never even heard of some of these exotic locations.

P.S.: You know who else gets to travel a lot? Data scientists earning $100k+ per year. Here’s my new book which will help you how to land such a dream job.

Step 3 – Create a Leaflet to View Sites on World Map

Thankfully, the data already has geographical coordinates, so we don’t need to add any processing steps. However, if you have cities which are missing coordinates then use the “worldcities” file from the Projects page under “Rent Analysis”.

We do have some duplicates, where Bourdain visited the same location in 2 or more shows. So we will de-duplicate before plotting.

Next we will add an info column to list the city and state name that we can use on the marker icons.

places4$info <- paste0(places4$city_or_area, “, “, places4$country) # marker icons

mapcity <- leaflet(places4) %>%
setView(2.35, 48.85, zoom = 3) %>%
addTiles() %>%
addMarkers(~long, ~lat, popup = ~info,
options = popupOptions(closeButton = T),
clusterOptions = markerClusterOptions())
mapcity # Show the leaflet

leaflet view – the markers are interactive in R

Step 4 – Flight Route View

Can we plot the cities in flight view style? Yes, we can as long as we transform the dataframe where each record has a departure and arrival city. We do have the show and episode number so this is quite easy.

Once we do that we will use a custom function which basically plots a circle marker at the two cities and a curved line between the two.

plot_my_connection=function( dep_lon, dep_lat, arr_lon, arr_lat, …){
inter <- gcIntermediate(c(dep_lon, dep_lat), c(arr_lon, arr_lat), n=50, addStartEnd=TRUE, breakAtDateLine=F) inter=data.frame(inter) diff_of_lon=abs(dep_lon) + abs(arr_lon) if(diff_of_lon > 180){
lines(subset(inter, lon>=0), …)
lines(subset(inter, lon<0), …)
}else{
lines(inter, …)
}
} # custom function

For the actual map view, we will create a background world map image, then use the custom function in a loop to plot each step of Bourdain’s travels. Depending on how we create the transformed dataframe, we can plot Bourdain’s travels for a single show, single season or all travels.

Here are two maps separately for the show “Parts Unknown” and “The Layover” respectively. Since the former had more seasons, the map is a lot more congested.

Parts Unknown seasons – travel maps

par(mar=c(0,0,0,0)) # background map
map(‘world’,col=”#262626″, fill=TRUE, bg=”white”, lwd=0.05,mar=rep(0,4),border=0, ylim=c(-80,80) ) # other cols = #262626; #f2f2f2; #727272
for(i in 1:nrow(citydf3)){
plot_my_connection(citydf3$Deplong[i], citydf3$Deplat[i], citydf3$Arrlong[i], citydf3$Arrlat[i],
col=”gold”, lwd=1)
} # add every connections:
points(x=citydf$long, y=citydf$lat, col=”blue”, cex=1, pch=20) # add points and names of cities
text(citydf$city_or_area, x=citydf$long, y=citydf$lat, col=”blue”, cex=0.7, pos=2) # plot city names

As always, the code files are available on the Projects Page. Happy Coding!

Call to Action:

If you read this far and also want a job or promotion in the DataScience field, then please do take a look at my new book “Data Science Jobs“. It will teach you how to optimize your profile to land great jobs with high salary; 100+ interview Qs and niche job sites which everybody else overlooks.

Email Automation for Google Trends

This blogpost will teach you set up automated email reports to view how search volumes i.e. Google Trends vary over time.

Email automation for Google Trends over time

The email report will also include important search terms that are “rising” or near a “breakout point”. This can be really useful as the breakout keywords indicate users across the globe have recently started paying great attention to these search terms. So you could create original content to attract people using these phrases in their Google searches. This is an amazing way to boost traffic/ sales leads, etc. by taking advantage of new trends.

Note, that the results are relative, not absolute search volume terms. But it is still quite useful, as you could narrow down your keywords list, and login to Adsense/ keyword planner to get the exact search volume. Export the list from this script to a .csv and simply upload in Adsense. It also gives you a clear indication of how search volumes compare across similar terms – classic SEO !

We will perform all our data pull requests and manipulations in R. The best part is that you do not need any API keys or logins for this tutorial!

[ Do you know what else is trending? My book on how to get hired as a datascientist. Check it out on Amazon, where it is the #1 NEW RELEASE and top 10 in its categories. Whether you are currently enrolled in a data science course or actively job searching, this book is sure to help you attract multiple job offers in this lucrative niche. ]

Without further interruptions – let’s dive right in to the trend analysis.

Step 1 – Load Libraries

Start by loading the libraries – “gtrends” is the main library package which will help us pull data for search results.

library (gtrendsR)
library(ggplot2)
library(plotly)
library(dplyr)
library(sqldf)
library(reshape2)

Step 2 – Search terms and timeframe

We need to specify the variables we need to pass to the search query:

  • search terms of interest. Unfortunately, you can only pass 5 terms at one time. Theoretically you could store the value and repeat the searches, but since numbers are relative, make sure that you keep at least one term in common and then normalize.
  • Remember, we get relative search volumes (max = 100); NOT absolute volumes!
  • timeframe under consideration, has to be in the format “YYYY-mm-dd” ONLY. No exceptions. End date = today, whereas start date will be the the first of the month, six months ago.
  • country codes – I’ve only used “US”, but you can search for multiple countries using a string array like c(“US”, “CA”, “DE”) which stands for US, Canada and Germany.
  • Channels – default is “web”, but you can search for “new” or “images”. Images would be good for those promoting content in Pinterest or Instagram.

date6mthpast <- Sys.Date() – 180
startdate <- paste0( substr(as.character(date6mthpast), 1,7), “-01”, sep = “”)
currdate = as.character(Sys.Date())
timeframe = paste( startdate, currdate)
keywords <- c(“Data Science jobs”, “MS Analytics Jobs”, “Analytics jobs”,
“Data Scientist course”, “job search”)
country=c(‘US’)
channel=’web’

We will use the gtrends() function to pull the data. This function returns a list of variables and dataframes. Of main interest to us is the “interest_over_time” data frame, “interest_by_region” (state names mostly). There is also a dataframe named “interest_by_city”. The first one does hold data with date value, while the others do not, so it will be aggregated over the entire data frame. I worked around this by pulling once for 6 months, and again for the last 45 days.

trends = gtrends(keywords, gprop =channel,geo=country, time = timeframe )

Step 3 – Search Volume over time

We will use the ggplot function to get a graphical representation of the search volume.

ggplot(data=time_trend, aes(x=date, y=hitval, group=keyword, col=keyword))+
geom_line()+xlab(‘Time’)+ylab(‘Relative Interest’)+ theme_bw() +
theme(legend.title = element_blank(), legend.position=”bottom”, legend.text=element_text(size=12)) +
ggtitle(“Google Search Volume over time”)

Google search volume over time
Google trends over time

Notice that the interest in term “job search” is significantly higher than any other keyword and spike in April and May (people graduating maybe?) This makes sense as it is a more generic term, and applies to larger number of users. On a similar vein, look at the volumes for “MS analytics jobs” . It is almost nil, so clearly a non-starter in terms of targeting.

Step 4 – Interest by Regions

Let us look at interest by region (state names in the US ) . This might be useful to target folks by region, or even location-based ads.

locsearch = trends$interest_by_region
plot2 <- ggplot(subset(locsearch, locsearch$keyword != “job search”),
aes(x=as.factor(location), y= hits, fill=as.factor(keyword) )) +
geom_bar(position=”dodge”, stat=”identity”) +
coord_flip() +
theme(legend.title = element_blank(), legend.position=”bottom”, legend.text=element_text(size=9)) +
xlab(‘Region’)+ ylab(‘Relative Interest’) +
ggtitle(“Google Search Volumes by Region/State”)

Note, that “data scientist course” has almost zero interest compared to broader terms like “analytics jobs” or “datascience Jobs”. Interestingly, “analytics jobs” seems to be a preferred term over “data science” only in NY, MA and Washington DC.

Step 5 – Trending Terms

The original list also returns a dataframe titled “related_queries”. Some of these can have tags as “rising” or “breakout”. These are the terms we would like to know as they occur.

breakoutdf = reltopicdf[reltopicdf$related_queries == ‘rising’,]
volbrkdf = breakoutdf[1:10,]
row.names(volbrkdf) = NULL
volbrkdf

trending seach terms related to keywords of interest
trending seach terms related to keywords of interest

Step 6 – Create pdf for email attachments

We will use the pdf() command to create an attachment with all the graphs and add to the email we want to send/receive on a daily basis.

filename2 = paste0(“Google Search Trends – “, Sys.Date(), “.pdf”)
pdf(paste0(“Google Search Trends – “, Sys.Date(), “.pdf”))
plot1
plot2
print(volbrkdf)
dev.off()
dev.off()

If you want to embed one of these figures in the email itself, instead of attachment then please see my post on automated reports.

Step 7 – Send the email

We use the RDCOM library to send out emails. You do need to have Outlook mail client on your PC.

library(RDCOMClient)
OutApp <- COMCreate(“Outlook.Application”)
outMail = OutApp$CreateItem(0)
outMail[[“To”]] = “an**@gmail.com”
outMail[[“subject”]] = paste(“Google Search Trend Report -“, Sys.Date())
email_body1 <- “Write email body content with correct html tags”
outMail[[“HTMLBody”]] = email_body1
file_location = paste0(paste0(abs_path,”/”),filename2)
outMail[[“Attachments”]]$Add(file_location)
outMail
outMail$Send()

The first time you run this, you might need to “allow” RStudio to access your email account. Just add this script to a task scheduler, and choose frequency of delivery, and you are all set! Note, a dummy email address is used, so remember to change the recipient address.

Try it out and let me know if you have any questions, or run into errors. The script is available on the Projects page – link here

Last but not the least, if you are interested in getting hired in a data science field, then please do take at my job-search book. Here is the Amazon link.

Happy Coding!

DataScience Professionals : India vs US | Men vs Women

Introduction

This is an analysis of the Kaggle 2018 survey dataset. In my analysis I am trying to understand the similarities and differences between men and women users from US and India, since these are the two biggest segments of the respondent population. The number of respondents who chose something other than Male/Female is quite low, so I excluded that subset as well.

The complete code is available as a kernel on the Kaggle website. If you like this post, do login and upvote! 🙂  This post is a slightly truncated version of the Kernel available on Kaggle.

You can also use the link to go to the dataset and perform your own explorations. Please do feel free to use my code as a starter script.

 

Kaggle users - India vs US

Kaggle users – India vs US

Couple of disclaimers:

NOT intending to say one country is better than the other. Instead just trying to explore the profiles based on what this specific dataset shows.
It is very much possible that there is a response bias and that the differences are due to the nature of the people who are on the Kaggle site, and who answered the survey.
With that out of the way, let us get started. If you like the analysis, please feel to fork the script and extend it further. And do not forget to Upvote! 🙂

Analysis

Some questions that the analysis tries to answer are given below:
a. What is the respondent demographic profile for users from the 2 countries – men vs women, age bucket?
b. What is their educational background and major?
c. What are the job roles and coding experience?
d. What is the most popular language of use?
e. What is the programming language people recommend for an aspiring data scientist?

I deliberately did not compare salary because:
a. 16% of the population did not answer and 20% chose “do not wish to disclose”.
b. the lowest bracket is 0-10k USD, so the max limit of 10k translates to about INR 7,00,000 (7 lakhs) which is quite high. A software engineer, entering the IT industry probably makes around 4-5 lakhs per annum, and they earn much more than others in India. So comparing against US salaries feels like comparing apples to oranges. [Assuming an exchange rate of 1 USD = 70 INR].

Calculations / Data Wrangling:

  1. I’ve aggregated the age buckets into lesser number of segments, because the number of respondents tapers off in the higher age groups. They are quite self-explanatory, as you will see from the ifesle clause below:
  2. Similarly, cleaned up the special characters in the educational qualifications. Also added a tag to the empty values in the following variables – jobrole (Q6), exp_group (Q8), proj(Q40), years in coding (Q24), major (Q5).
  3. I also created some frequency using the sqldf() function. You could use the summarise() from the dplyr package. It really is a matter of choice.

Observations

Gender composition:

As seen in the chart below, many more males (~80%) responded to the survey than women (~20%).
Among women, almost 2/3rd are from US, and only ~38% from India.
The men are almost split 50/50 among US and India.

 

Age composition:

There is a definite trend showing that the Indian respondents are quite young, with both men and women showing >54% in the youngest ge bucket (18-24), and another ~28% falling in the (25-29) category. So almost 82% of the population is under 30 years of age.
Among US respondents, the women seem a bit more younger, with 68% under 30 years, compared to ~57% men of women. However, both men and women had a larger segment in the 55+ category (~20% for women, and 25% for men. Compare it with Indians, where the 55+ group is barely 12%.

 

Educational background:

Overall, this is an educated lot, and most had a bachelors degree or more.
US women were the most educated of the lot, with a whopping 55% with masters degrees and 16% with doctorates.
Among Indians, women had higher levels of education – 10% with Ph.D, 43% masters degree, compared with men where ~34% had a masters degree and only 4% had a doctorate.
Among US men, ~47% had a masters degree, and 19% had doctorates.
This is interesting because Indians are younger compared to US respondents, so many more Indians seem to be pursuing advanced degrees.

Undergrad major:

Among Indians, the majority of respondents added Computer Science as their major.
Maybe because :
(a) Indians have to declare a major when they join, and the choice of majors is not as wide as in the US. ,

  1. Parents tend to force kids towards majors which are known to translate into a decent paying job, which is engineering or medicine.
  2. A case of response bias? The survey came from Kaggle, so not sure if non-coding majors would have even bothered to respond.Among US respondents, the major is also computer science, but followed by maths & stats for women.
    For men, the second category was a tie between non-compsci Engg , followed by maths&stats.

 

Job Roles:

Among Indians, the biggest segment are predominantly students (30%). Among Indian men, the second category is software engineer.
Among US women, the biggest category was also “student” but followed quite closely by “data scientist”. Among US men , the biggest category was “data scientist” followed by “student”.
Note, “other” category is something we created now, so not considering those. They are not the biggest category for any sub-group anyway.
CEOs, not surprisingly are male, 45+ years from the US, with a masters degree.

 

Coding Experience:

Among Indians, most answered <1 year of coding experience , which correlates well with the fact that most of them are under 30, with a huge population of students.
Among US respondents, the split is even between 1-2 years of coding and 3-5 years of coding.
Men seem to have a bit more coding experience than women, again explained by the fact that women were slightly younger overall, compared to US men.

 

Most popular programming language:

Python is the most popular language, discounting the number of people who did not answer. However, among US women, R is also popular (16% favoring it).

I found this quite interesting because I’ve always used R at work, at multiple big-name employers. (Nasdaq, Td bank, etc.) Plus, a lot of companies that used SAS seem to have found it easier to move code to R. Again this is personal opinion.
Maybe it is also because many colleges teach Python as a starting programming language?

 

Conclusions:

  1. Overall, Indians tended to be younger with more people pursuing masters degrees.
  2. US respondents tended to older with stronger coding experience, and many more are practicing data scientists.
    This seems like a great opportunity for Kaggle, if they could match the Indian students with the US data scientists, in a sort of mentor-matching service. 🙂

Automated Email Reports with R

R is an amazing tool to perform advanced statistical analysis and create stunning visualizations. However, data scientists and analytics practitioners do not work in silos, so these analysis have to be copied and emailed to senior managers and partners teams. Cut-copy-paste sounds great, but if it  is a daily or periodic task, it is more useful to automate the reports. So in this blogpost, we are going to learn how to do exactly that.

The R-code uses specific library packages to do this:

  • RDCOMClient – to connect to Outlook and send emails. In most offices, Outlook is still the defacto email client, so this is fine. However, if you are using Slack or something different it may not work.
  • r2excel – To create an excel output file.

The screenshot below shows the final email view:

email screenshot

email screenshot

As seen in the screenshot, the email contains the following:

  • Custom subject with current date
  • Embedded image
  • Attachments – 1 Excel and 1 pdf report

Code Explanation:

The code and supporting input files are available here, under the Projects page under Report automation. The code has 4 parts:

  • Prepare the work space.
  • Pull the data from source.
  • Cleaning and calculations
  • Create pdf.
  • Create Excel file.
  • Send email.

The workflow should look something like this. For really simple reports, where you do not need much formatting, you can do everything within the first script itself.

Prepare the work space

I always set the relative paths and working directories at the very beginning, so it is easier to change paths later. You can replace the link with a shared network drive path as well.

Load library packages and custom functions. My code uses the r2excel package which is not directly available as an R-cran package. So you need to install using devtools using the code below.

It is possible to do something similar using the “xlsx” package, but r2excel is easier.

Some other notes:

  • you need the first 2 lines of code only for the first time you installation. From the second time onwards, you only need to load the library.
  • r2excel seems to work only with 64-bit installations of R and Rstudio.
  • you do need Java installed on your computer. If you see an error about java namespace, then check the path variables. There is a very useful thread on Stackoverflow, so take a look.
  • As always, if you see errors Google it and use the Stack Overflow conversations. In 99% of cases, you will find an answer.

Pull the data from source

This is where we connect to an Excel CSV (or text) file. In practice, most people connect to a database of some kind. The R-script I am using connects to a .csv file, but I have added the code to a connect to a SQL database.

That code snippet is commented out, so feel free to substitute your own sql database links. The code will also work for Amazon EC2 cluster.

Some points to keep in mind:

  • If you are using sqlquery() then please note that if your query has an error then R sadly shows only a standard error message. So test your query on SQL server to ensure that you are not missing anything.
  • Some queries do take a long time, if you are pulling from a huge dataset. Also the time taken will be longer in R compared to SQL server direct connection. Using the  Sys.time() command before and after the query is helpful to know how long the query took to complete.
  • If you are only planning to pull the data randomly, it may make sense to pull from SQL server and store locally. Use the fread() function to read those files.
  • If you are using R desktop instead of R-server, the amount of data you can pull may be limited to what your system configuration.
  • ALWAYS optimize your query. Even if you have unlimited memory and computation power, only pull the data you absolutely need. Otherwise you end up unnecessarily sorting through irrelevant data.

Cleaning and calculations

For the current data, there are no NAs, so we don’t need to account for those. However, the read.csv() command creates factors, which I personally do not like, as they sometimes cause issues while merging.

Some of the column names have “.” where R converted the space in the names. So we will manually replace those with an underscore using the gsub() function.

We will also rank the apps based on categories of interest, namely:

  • Most Popular Apps – by number of Reviews
  • Most Popular Apps – by number Downloads and Reviews
  • Most Popular Categories – Paid Apps only
  • Most popular apps with 1 billion installations.

Create pdf

We are going to use the pdf() function to paste all graphs to a pdf document. Basically what this function does is write the graphs to a file rather than show on the console. So the only thing to remember is that if you are testing graphs or make an incorrect graph, everything will get posted to the pdf until you hit the “dev.off()” function. Sometimes if the graph throws an error you may end up with a blank page, or worse, with a corrupt file that cannot be opened.

Currently, the code I am only printing 2 simple graphs using ggplot() and barplot() functions, but you can include many other plots as well.

Create Excel file.

The Excel is created in the sequence below:

  • Specify the filename and create an object of type .xlsx This will create an empty Excel placeholder. It is only complete when you save the Workbook using the saveWorkbook() at the end of the section.
  • Use the sheets() to create different worksheets within the Excel.
  • The  xlsx.addHeader() adds a bold Header to each sheet which will help readers understand the content on the page. The r2excel package has other functions to add more informative text in smaller (non-header) font as well, if you need to give some context to readers. Obviously, this is optional if you don’t want to add them.
  • xlsx.addTable() – this is the crucial function that adds the content to Excel, the main “meat” of what you need to show.
  • saveWorkbook() – this function will save the Excel to the folder.
  • xlsx.openFile() – this function opens the file so you can view contents. I typically have the script running on automated mode, so when the Excel opens I am notified that the script completed.

Send email

The email is sent using the following functions:

  • OutApp() – creates an Outlook object. As I mentioned earlier, you do need Outlook and need to be signed in for this to work. I use Outlook for work and at home, so I have not explored options for Slack or other email clients.
  • outmail[[“To”]] – specify the people in the “to” field. You could also read email addresses from a file and pass the values here.
  • outmail[[“cc’]] – similar concept, for the cc field.
  • outmail[[“Subject”]] – I have used the paste0() function to add the current date to the subject, so recipients know it is the latest report.
  • outMail[[“HTMLBody”]] – I used the HTML body so that I can embed the image. If you don’t know HTML programming, no worries! The code is pretty intuitive, you should be able to follow what I’ve done. The image basically is an attachment which the HTML code is forcing to be viewed within the body of the email. If you are sending the email to people outside the organization, they may see a small box instead of the image with a cross on the top left (or right) of the box. Usually, when you hover your mouse near box and right click, it will ask them to download images. You may have seen similar messages in gmail, along with a link to “show images” or ‘always show images from this sender’. You obviously cannot control what the recipient selects, but testing by sending to yourself first helps smoothing out potential aesthetic issues.
  • outMail[[“Attachments”]] – function to add attachments.
  • outMail$Send() – until you run this command, the mail will not be send. If you are using this in office, you may get a popup asking you to do one of the following. Most  of these will generally go away after the first use, but if they don’t, please look up the issue on StackOverflow or contact your IT support for firewall and other security settings.
    • popup to hit “send”
    • popup asking you to “classify” the attachments (internal / public/ confidential) Select as appropriate. For me, this selection is usually  “internal”
    • popup asking you to accept “trust” settings
    • popup blocker notifying you to allow backend app to access Outlook.

That is it – and you are done! You have successfully learned how to send an automated email via R.

Tutorials – Dashboards with R programming

In this blogpost we are going to implement dashboards using R programming, using the latest R library package “flexdashboard”.

R programming already offers some good features for graphs and charts (packages like ggplot, leaflet, etc). Plus there is always the option to create web applications using the Shiny library and presentations with RMarkdown documents.

However, this new library leverages these libraries and allows us to create some stunning dashboards, using interactive graphs and text. What I loved the most, was the “storyboard” feature that allows me to present content in Tableau-style frames. Please note that for this you need to create RMarkdown (.Rmd) files and insert the code using the R chunks as needed.

Do I think it will replace Tableau or any other enterprise BI dashboard tool? Not really (at least in the near future). But I do think it offers the following advantages:

  1. great alternative to static presentations since your audience can interact with the data.
  2. RMarkdown allows both programming and regular text content to be pooled together in a single document.
  3. Open source (a very big plus, in my opinion!)
  4. Storyboard format allows you to logically move the audience through the analysis : problem statement, raw data and exploration, different parts of the models/simulations/ number crunching, patterns in data, final summary and recommendations. Presenting the patterns that allow you to accept or reject a hypothesis has never been easier.

So, without further ado, let us look at the dashboard implementation with two examples:

  1. Storyboard dashboard.
  2. Simple dashboard with Shiny elements.

Library Installation instructions:

To start off, please install the “flexdashboard” package in your RStudio IDE. If installation is completed correctly, you will see the flex-dashboard feature when you create a new RMarkdown document, as shown in images below:

Step1 image:

Step 2 image:

Storyboard Dashboard:

Instead of analyzing a single dataset, I have chosen to present different interactive graph types using the storyboard feature. This will allow you to experience the range of options possible with this package.

An image of the storyboard is shown below, but you can also view the live document here (without source code or data files) .  The complete data and source code files are available for download here, simply search for “Dashboard Codefiles” on the Projects page.

The storyboard elements are described below:

  • Element 1 – Click on each frame to see the graph and explanation associated with that story point. (click element 5 to see Facebook stock trends)
  • Element 2 – This is the location for your graphs, tables, etc. One below each story point.
  • Element 3 – This explanation column in the right can be omitted, if required. However, my personal opinion is that this is a good way to highlight certain facts about the graph or place instructions, hyperlinks, etc. Like a webpage sidebar.
  • Element 4 – My tutorial only has 4 story elements, but if you have more flexdashboard automatically provides left-right arrow for navigation. (Just like Tableau).
  • Element 6 – Title bar for the project. Notice the social sharing button on the far right.

Shiny Style Simple Dashboard:

Here I have used Shiny elements to allow user to select the variables to plot on a graph. This also shows how you can divide your page into columns and rows, to show different content in different panes.

See image below for some of the features incorporated in this dashboard:

Feature explanation:

  • Element 1 – Title pane. The red color comes from the “cerulean” theme I used. You can change colors to match your business logo or personal preferences.
  • Element 2 – Shiny inputs. The dropdown is populated automatically from the dataset, so you don’t have to specify values separately.
  • Element 3 – Output graph, based on choices from element 1.
  • Element 4 – notice how this is separated vertically from element3, and horizontally from element 5.
  • Element 5 – Another graph. You can render text, image or pull in web content using the appropriate Shiny commands.
  • Element 6 – you can embed social share button, and also the source code. Click the code icon “</>” to view the code. You can also download the data and program files here, from my  Projects page.

Hope you found these dashboard implementations useful. Please add your valuable comments and feedback in the comments section.

Until next time, happy coding! 😊

Older posts
Facebook
LinkedIn