Creating maps in R and analysing traffic patterns with geolocation

Here is how we can use the maps, mapdata and ggplot2 libraries to create maps in R.

Creating a map and plotting coordinates

In this example, we create a world map and mark Beijing and Shanghai, two cities in China. The map is restricted to the part of the Northern Hemisphere that stretches from Europe to Asia.

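# maps and mapdata supply the map outlines, ggplot2 draws them, ggrepel spaces out labels
library(maps)
library(mapdata)
library(ggplot2)
library(ggrepel)

# Labels for the points we will plot
cities <- c("Beijing", "Shanghai")

# Convert the "world" map database into a data frame of polygon vertices
global <- map_data("world")

# Default rendering: filled polygons
ggplot() + geom_polygon(data = global, aes(x = long, y = lat, group = group)) +
  coord_fixed(1.3)

# Outlines only
ggplot() +
  geom_polygon(data = global, aes(x = long, y = lat, group = group), fill = NA, color = "red") +
  coord_fixed(1.3)

# Base map used for the city plots below: green fill, blue borders
gg1 <- ggplot() +
  geom_polygon(data = global, aes(x = long, y = lat, group = group), fill = "green", color = "blue") +
  coord_fixed(1.3)
gg1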

# Longitude and latitude of each city, in the same order as "cities"
coors <- data.frame(
  long = c(116.4074, 121.4580600),
  lat = c(39.9042, 31.2222200),
  stringsAsFactors = FALSE
)
coors$cities <- cities

# xlim and ylim can be manipulated to zoom in or out of the map
gg1 +
  geom_point(data = coors, aes(long, lat), colour = "red", size = 1) +
  ggtitle("World Map") +
  geom_text_repel(data = coors, aes(long, lat, label = cities)) + xlim(0, 150) + ylim(0, 100)

Upon running this code, here is our map:

china maps

A few points to note:

  • The "cities" variable is used to specify the labels for the cities.
  • The "coors" data frame is used to define the latitude and longitude for each city.
  • xlim and ylim are used to zoom in or out of the map, depending on the coordinate ranges we set.

Note that we also use the ggrepel library to space out the labels for each city. Without it, the labels of nearby points can overlap and become hard to read.

Zooming in on a particular region

cities <- c("Paris", "Berlin")
coors <- data.frame(
  lat = c(48.864716, 52.520008),
  long = c(2.349014, 13.404954),
  stringsAsFactors = FALSE
)
coors$cities <- cities
# xlim and ylim can be manipulated to zoom in or out of the map
gg1 + 
  geom_point(data=coors, aes(long, lat), colour="red", size=1) + 
  ggtitle("World Map") +
  geom_text_repel(data=coors, aes(long, lat, label=cities)) + xlim(-10,40) + ylim(35,60)

europe maps

As mentioned, xlim and ylim now cover a narrower range: xlim is set to (-10,40) and ylim to (35,60), whereas the previous map used an xlim of (0,150) and a ylim of (0,100).

Note that the countries surrounding the ones we want (in this case, France and Germany) can appear somewhat "broken up". This happens because xlim() and ylim() discard every vertex of the world map that falls outside the limits, so any polygon that crosses the edge of the window is drawn from an incomplete outline. This may not be an issue if you are simply looking to represent a particular country, and you could also choose to plot one country in isolation, e.g. by specifying map_data("usa") instead of map_data("world").
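One way to avoid the broken outlines is to zoom with coordinate limits instead of scale limits. The sketch below assumes the maps and ggplot2 packages are installed. It passes the same ranges to coord_fixed() through its xlim and ylim arguments, which zooms the view without dropping any polygon vertices. The object names zoomed and fr_de are chosen here for illustration. The second plot shows the alternative of requesting only the regions of interest from map_data().

```r
library(ggplot2)  # map_data() also needs the maps package to be installed

global <- map_data("world")

# Zoom through the coordinate system: every polygon keeps all of its
# vertices and is simply clipped at the edge of the plotting window
zoomed <- ggplot() +
  geom_polygon(data = global, aes(x = long, y = lat, group = group),
               fill = "green", color = "blue") +
  coord_fixed(1.3, xlim = c(-10, 40), ylim = c(35, 60))
zoomed

# Alternative: keep only the countries of interest
fr_de <- map_data("world", region = c("France", "Germany"))
ggplot() +
  geom_polygon(data = fr_de, aes(x = long, y = lat, group = group),
               fill = "green", color = "blue") +
  coord_fixed(1.3)
```

The city points and labels can be added to either plot in the same way as before, as long as the xlim() and ylim() calls are left out.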

Here is a particularly good example.

If you're interested, you can also see how we can generate an interactive Shiny Web App, whereby the latitude and longitude are set by means of sliders.

Calculating Distance

Now, suppose that we wish to calculate the distance between two points.

For instance, we already know the latitude and longitude of Paris and Berlin respectively:

  • Paris: 48.8566, 2.3522
  • Berlin: 52.5200, 13.4050
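
# Reuses the Paris (row 1) and Berlin (row 2) entries of the coors data frame defined earlier
p <- 0.017453292519943295   # pi/180, converts degrees to radians
a <- 0.5 - cos((coors$lat[2] - coors$lat[1]) * p)/2 + cos(coors$lat[1] * p) * cos(coors$lat[2] * p) * (1 - cos((coors$long[2] - coors$long[1]) * p)) / 2
distance <- 12742 * asin(sqrt(a))   # 12742 km is twice the Earth's mean radius of 6371 km
distance # Distance in kilometers

Here, we take our latitude and longitude coordinates and apply the haversine formula, which gives the great-circle distance between two points on a sphere. The final step takes the arcsine of sqrt(a) and scales it by the Earth's diameter.

With the coordinates stored in coors, we end up with a distance of roughly 877km between Paris and Berlin.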
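
For repeated use, the same calculation can be wrapped in a function. The following is a minimal sketch rather than a definitive implementation: haversine_km is a hypothetical helper name, and the default radius of 6371 km is an assumed mean Earth radius. It writes the haversine term as sin(x/2)^2, which equals (1 - cos(x))/2, and uses asin() for the final step as the standard formula requires.

```r
# Great-circle distance in kilometres via the haversine formula.
# haversine_km is a hypothetical helper; radius_km assumes a mean Earth radius.
haversine_km <- function(lat1, long1, lat2, long2, radius_km = 6371) {
  to_rad <- pi / 180                      # degrees to radians
  dlat <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  2 * radius_km * asin(sqrt(a))
}

# Paris to Berlin, using the coordinates listed above
haversine_km(48.8566, 2.3522, 52.5200, 13.4050)
```

Because the arithmetic is vectorised, the function also accepts vectors of coordinates, for example to measure the distance from one city to several others in a single call.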

Practical Application - Traffic monitoring with social media

So, what are some practical applications of being able to plot maps in R?

Well, suppose that we wish to use geolocation data sourced from social media. Let's take an example.

We wish to use Twitter data to analyse traffic patterns across the United Kingdom. In this regard, let's search Twitter for 10,000 tweets with the term "UK traffic".

We will do this and then plot the map with the geolocation, i.e. latitude and longitude of each tweet for which geolocation data is available.

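library(maps)
library(mapdata)
library(ggplot2)
library(ggrepel)
library(devtools)
library(twitteR)
library(plyr)
library(dplyr)

# Twitter
# consumerkey, consumersecret, accessToken and accessSecret must already hold your own API credentials
setup_twitter_oauth(consumerkey, consumersecret, accessToken, accessSecret)

# Search for up to 10,000 tweets and convert the result to a data frame
tweets <- searchTwitter("uk traffic", n = 10000, since = '2018-10-25')
df <- twListToDF(tweets)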

# Maps

global <- map_data("world")
ggplot() + geom_polygon(data = global, aes(x=long, y = lat, group = group)) + 
  coord_fixed(1.3)

ggplot() + 
  geom_polygon(data = global, aes(x=long, y = lat, group = group), fill = NA, color = "red") + 
  coord_fixed(1.3)

gg1 <- ggplot() + 
  geom_polygon(data = global, aes(x=long, y = lat, group = group), fill = "green", color = "blue") + 
  coord_fixed(1.3) + xlim(-15,10) + ylim(40,60)
gg1

# Keep the coordinates of the tweets that carry geolocation data, as numeric columns
tweetlocations <- data.frame(
  longitude = as.numeric(as.character(df$longitude)),
  latitude = as.numeric(as.character(df$latitude))
)
tweetlocations <- na.omit(tweetlocations)

# Overlay the tweet locations on the base map; the zoom is set by xlim and ylim in gg1 above
gg1 +
  geom_point(data = tweetlocations, aes(x = longitude, y = latitude),
             colour = 'red', alpha = .5)

Now, when we plot the map, we can see the locations from which tweets concerning "UK traffic" were sent in the past month (I ran this on 29th November 2018).

uk traffic map

It is noteworthy that many of the tweets appear to be clustered around the M1 and M40 motorways, two major motorways connecting London with the Midlands. In this regard, using social media in conjunction with mapping visualizations appears to allow us to identify areas of high traffic congestion.
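
To put a number on clustering of this kind, one option is to snap each point to a grid cell and count the points per cell. The sketch below uses base R only. The tweet_points data frame contains made-up coordinates that merely stand in for the geotagged tweets, and the 0.5 degree cell size is an arbitrary assumption. Points outside the map window used above (longitude -15 to 10, latitude 40 to 60) are dropped first.

```r
# Hypothetical coordinates standing in for the geotagged tweets
tweet_points <- data.frame(
  longitude = c(-0.45, -0.41, -0.48, -1.25, -1.31, -2.24, -74.00),
  latitude  = c(51.70, 51.74, 51.66, 52.05, 52.10, 53.48, 40.70)
)

# Keep only the points inside the map window
in_window <- with(tweet_points,
                  longitude >= -15 & longitude <= 10 &
                  latitude >= 40 & latitude <= 60)
tweet_points <- tweet_points[in_window, ]

# Snap each point to the south-west corner of its grid cell
cell_size <- 0.5  # cell width and height in degrees (arbitrary choice)
tweet_points$cell_long <- floor(tweet_points$longitude / cell_size) * cell_size
tweet_points$cell_lat <- floor(tweet_points$latitude / cell_size) * cell_size

# Count the points that fall into each cell
cell_counts <- aggregate(n ~ cell_long + cell_lat,
                         data = transform(tweet_points, n = 1), FUN = sum)

# Busiest cells first
cell_counts[order(-cell_counts$n), ]
```

With real data, tweet_points would be replaced by the data frame of tweet coordinates, and the cells with the highest counts would indicate where the tweets concentrate.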

Author: Michael Grogan

Michael Grogan is a machine learning consultant and educator, with a profound passion for statistics and data science.