A Tale of Two Cities

Gursimar Singh
7 min read · Dec 16, 2020

Clustering the Neighborhoods of London and Paris

Introduction

A Tale of Two Cities, the novel by Charles Dickens, is set in London and Paris around the time of the French Revolution. Both cities were lively then and remain so now. A lot has changed over the years, and here we take a look at how the two cities have grown.

London and Paris are popular tourist and vacation destinations for people from all around the world. They are diverse and multicultural, and they offer a wide variety of sought-after experiences. We group the neighbourhoods of London and of Paris separately and draw insights into what they look like now.

Business Problem

The aim is to help tourists choose a destination based on the experiences each neighbourhood offers and what they are looking for. It also helps people who are thinking about migrating to London or Paris, or relocating between neighbourhoods within either city. Our findings will help stakeholders make informed decisions and address concerns such as the kinds of cuisine and provision stores available and what each city has to offer.

Data Description

We require geolocation data for both London and Paris. Postal codes in each city serve as a starting point. Using postal codes, we can find the neighbourhoods, boroughs, venues and their most popular venue categories.

London

To derive our solution, we scrape our data from https://en.wikipedia.org/wiki/List_of_areas_of_London

This Wikipedia page has information about all the neighbourhoods; we limit it to London.

  1. borough : Name of Neighbourhood
  2. town : Name of borough
  3. post_code : Postal codes for London.

This Wikipedia page lacks information about geographical locations. To solve this problem, we use the ArcGIS API.

ArcGIS API

ArcGIS Online enables you to connect people, locations, and data using interactive maps. Work with smart, data-driven styles and intuitive analysis tools that deliver location intelligence. Share your insights with the world or specific groups.

More specifically, we use ArcGIS to get the geolocations of the neighbourhoods of London. The following columns are added to our initial dataset to complete it:

  1. latitude : Latitude for Neighbourhood
  2. longitude : Longitude for Neighbourhood

Paris

To derive our solution, we leverage JSON data available at https://www.data.gouv.fr/fr/datasets/r/e88c6fda-1d09-42a0-a069-606d3259114e

The JSON file has data about all the neighbourhoods in France; we limit it to Paris.

  1. postal_code : Postal codes for France
  2. nom_comm : Name of Neighbourhoods in France
  3. nom_dept : Name of the boroughs, equivalent to towns in France
  4. geo_point_2d : Tuple containing the latitude and longitude of the Neighbourhoods.

Foursquare API Data

We need data about the venues in each neighbourhood of a given borough. To get that information, we use Foursquare location data. Foursquare is a location data provider with information about all manner of venues and events within an area of interest, including venue names, locations, menus and even photos. The Foursquare location platform is therefore our sole source of venue data, since all the required venue information can be obtained through its API.

After finding the list of neighbourhoods, we then connect to the Foursquare API to gather information about venues inside each and every neighbourhood. For each neighbourhood, we have chosen the radius to be 500 meters.

The data retrieved from Foursquare contains information about venues within a specified distance of the latitude and longitude of each postcode. The information obtained per venue is as follows:

  1. Neighbourhood : Name of the Neighbourhood
  2. Neighbourhood Latitude : Latitude of the Neighbourhood
  3. Neighbourhood Longitude : Longitude of the Neighbourhood
  4. Venue : Name of the Venue
  5. Venue Latitude : Latitude of Venue
  6. Venue Longitude : Longitude of Venue
  7. Venue Category : Category of Venue

Based on all the information collected for both London and Paris, we have sufficient data to build our model. We cluster the neighbourhoods together based on similar venue categories, then present our observations and findings. Using this data, our stakeholders can make the necessary decisions.

Methodology

The approach taken here is to explore each city individually. We plot a map showing the neighbourhoods being considered, build our model by clustering similar neighbourhoods together, and finally plot a new map with the clustered neighbourhoods. We then draw insights, compare the two cities and discuss our findings.

Exploring London:

Neighborhoods of London

We begin by collecting and refining the data needed for our business solution.

Data Collection

To get the neighbourhoods in London, we start by scraping the "List of areas of London" Wikipedia page.
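The scraping step might look roughly like the sketch below. It assumes pandas is installed together with an HTML parser such as lxml, that the table columns have already been renamed to the borough, town and post_code names used in this article, and that the town column holds the post town. The helper names fetch_tables and tidy_areas are hypothetical, and filtering on town is only one plausible way to limit the data to London.

```python
import io
import urllib.request

import pandas as pd

URL = "https://en.wikipedia.org/wiki/List_of_areas_of_London"


def fetch_tables(url=URL):
    """Download the page and return every HTML table on it as a DataFrame."""
    # A browser-like User-Agent is an assumption; some sites reject the default one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request) as response:
        html = response.read().decode("utf-8")
    return pd.read_html(io.StringIO(html))


def tidy_areas(raw):
    """Strip footnote markers such as [1] and keep rows whose town mentions London."""
    df = raw.copy()
    for column in ["borough", "town", "post_code"]:
        cleaned = df[column].astype(str).str.replace(r"\[\d+\]", "", regex=True)
        df[column] = cleaned.str.strip()
    is_london = df["town"].str.contains("London", case=False)
    return df[is_london].reset_index(drop=True)


# Hypothetical sample rows standing in for the scraped table (no network needed).
sample = pd.DataFrame({
    "borough": ["Abbey Wood", "Acton[1]", "Addington"],
    "town": ["LONDON", "LONDON", "CROYDON"],
    "post_code": ["SE2", "W3, W4", "CR0"],
})
london = tidy_areas(sample)
print(london)
```

With live data, one would call fetch_tables(), pick the table that lists the areas, rename its columns, and pass it to tidy_areas.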

Geolocations of the London Neighbourhoods

ArcGIS API

We need the geographical coordinates of the neighbourhoods to plot our map. We will use the arcgis package to get them.

ArcGIS doesn’t limit the number of API calls we can make, so it fits our use case perfectly.
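As a rough illustration of this lookup, the sketch below wraps a geocoding call in a small function. It assumes the third-party geocoder package and its ArcGIS provider, which may differ from the exact package used here. The function name get_latlng, the query format and the stand-in coordinates are hypothetical. A fake geocoder is passed in so the example runs without network access.

```python
def get_latlng(area, geocode=None):
    """Return (latitude, longitude) for a London area name, or None if not found."""
    if geocode is None:
        # Assumption: the third-party `geocoder` package with its ArcGIS provider.
        import geocoder

        def geocode(query):
            return geocoder.arcgis(query).latlng

    coords = geocode(f"{area}, London, United Kingdom")
    return tuple(coords) if coords else None


def fake_geocode(query):
    """Offline stand-in for the real service; the coordinates are illustrative."""
    known = {"Abbey Wood, London, United Kingdom": [51.4925, 0.1213]}
    return known.get(query)


print(get_latlng("Abbey Wood", geocode=fake_geocode))
print(get_latlng("Nowhere", geocode=fake_geocode))
```

Calling get_latlng("Abbey Wood") without the geocode argument would query the live service instead.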

Latitude

We extract the latitude from our previously collected coordinates.

Longitude

We extract the longitude from our previously collected coordinates.
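A minimal sketch of these two extraction steps, assuming the coordinates are stored as (latitude, longitude) pairs in a pandas column. The column name coordinates and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical frame: each row holds a (latitude, longitude) pair from the geocoder.
df = pd.DataFrame({
    "borough": ["Abbey Wood", "Acton"],
    "coordinates": [(51.4925, 0.1213), (51.5081, -0.2735)],
})

# Pull each element of the pair into its own column, then drop the pair.
df["latitude"] = df["coordinates"].apply(lambda pair: pair[0])
df["longitude"] = df["coordinates"].apply(lambda pair: pair[1])
df = df.drop(columns="coordinates")
print(df)
```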

Co-ordinates for London

We get the geocode for London itself to help centre the map.

Visualize the Map of London

To help visualize the Map of London and the neighbourhoods in London, we make use of the folium package.

Venues in London

To proceed with the next part, we need to define Foursquare API credentials.

Using the Foursquare API, we get the venues and venue categories around each neighbourhood in London.
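The sketch below shows one way such a request could be built and its response flattened into the seven fields listed earlier. It assumes the legacy v2 venues/explore endpoint and its response layout as they were at the time of writing, which may have changed since. The function names, the default version date and the sample payload are hypothetical, and the credentials are placeholders.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20201216", radius=500, limit=100):
    """Build the request URL for venues within `radius` metres of a point."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def parse_venues(neighbourhood, lat, lng, payload):
    """Flatten one explore response into one row per venue."""
    rows = []
    for item in payload["response"]["groups"][0]["items"]:
        venue = item["venue"]
        rows.append((
            neighbourhood, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            venue["categories"][0]["name"],
        ))
    return rows


# Hypothetical response fragment, used here instead of a live API call.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Corner Cafe",
               "location": {"lat": 51.49, "lng": 0.12},
               "categories": [{"name": "Coffee Shop"}]}},
]}]}}

url = build_explore_url(51.5, -0.12, "YOUR_ID", "YOUR_SECRET")
rows = parse_venues("Abbey Wood", 51.4925, 0.1213, sample_payload)
print(rows)
```

Keeping URL construction and response parsing separate makes the parsing step easy to check against a saved response without spending API calls.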

Model Building

K Means

Let’s group the neighbourhoods of London into roughly 5 clusters to make them easier to analyze.

We use the K Means clustering technique to do so.
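A small sketch of this clustering step follows, assuming pandas and scikit-learn are installed. The feature construction (one-hot encoding of venue categories, averaged per neighbourhood) is a common approach and an assumption here, and the venue rows are invented for illustration. The toy example uses 2 clusters because it has only four neighbourhoods, whereas the article uses roughly 5.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical venue rows; real data would come from the Foursquare step.
venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "Venue Category": ["Pub", "Park", "Pub", "Pub", "Cafe", "Bakery", "Cafe", "Cafe"],
})

# One-hot encode the categories, then average per neighbourhood to get
# the share of each category among that neighbourhood's venues.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot["Neighbourhood"] = venues["Neighbourhood"]
profile = onehot.groupby("Neighbourhood").mean()

# Fit K-Means on the category shares and attach the cluster label to each neighbourhood.
k = 2
model = KMeans(n_clusters=k, random_state=0, n_init=10).fit(profile)
clustered = profile.assign(Cluster=model.labels_)
print(clustered)
```

Averaging the one-hot rows means each neighbourhood is described by proportions rather than raw counts, so areas with many venues do not dominate the distance calculation.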

Visualizing the clustered neighbourhood

Let’s plot the clusters

Exploring Paris

Neighbourhoods of Paris

Data Collection

We read the JSON data with pandas.
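The sketch below illustrates loading and filtering on a tiny stand-in for the file. The record layout (a list of objects with a "fields" entry) is an assumption about the download, and the sample values are illustrative. Only the four column names and the df_paris name come from this article.

```python
import pandas as pd

# Hypothetical records imitating the layout assumed for the downloaded file.
# For the real file, the records would first be loaded from the URL or a local copy.
records = [
    {"fields": {"postal_code": "75001", "nom_comm": "PARIS-1ER-ARRONDISSEMENT",
                "nom_dept": "PARIS", "geo_point_2d": [48.8625, 2.3364]}},
    {"fields": {"postal_code": "69001", "nom_comm": "LYON-1ER-ARRONDISSEMENT",
                "nom_dept": "RHONE", "geo_point_2d": [45.7677, 4.8358]}},
]

# Keep only the nested fields, then limit the rows to the Paris department.
fields = pd.DataFrame([record["fields"] for record in records])
df_paris = fields[fields["nom_dept"] == "PARIS"].reset_index(drop=True)
df_paris = df_paris[["postal_code", "nom_comm", "nom_dept", "geo_point_2d"]]
print(df_paris)
```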

Geolocations of the Neighbourhoods of Paris

We don’t need an external data source or an ArcGIS API call to get the geo coordinates, since they are already stored as a tuple in the geo_point_2d column of the df_paris dataframe.

Checking one of the geo coordinates.

Visualize the Map of Paris

Venues in Paris

Using our previously defined function, let’s get the nearby venues in each neighbourhood of Paris.

Model Building

K Means

Let’s group the neighbourhoods of Paris into roughly 5 clusters to make them easier to analyze.

We use the K Means clustering technique to do so.

Results and Discussion

The neighbourhoods of London are very multicultural. There are many different cuisines, including Indian, Italian, Turkish and Chinese. London goes a step further with a lot of restaurants, bars, juice bars, coffee shops, fish and chips shops and breakfast spots. It has plenty of shopping options too, with flea markets, flower shops, fish markets, fishing stores and clothing stores. The main modes of transport seem to be buses and trains. For leisure, the neighbourhoods have lots of parks, golf courses, zoos, gyms and historic sites.

Overall, the city of London offers a multicultural, diverse and certainly entertaining experience.

Paris is relatively small geographically. It has a wide variety of cuisines and eateries, including French, Thai, Cambodian, Asian and Chinese. There are a lot of hangout spots, including many restaurants and bars, and Paris has a lot of bistros. Public transport in Paris includes buses, bikes, and boats or ferries. For leisure and sightseeing, there are a lot of plazas, trails, parks, historic sites, clothing shops, art galleries and museums. Overall, Paris seems like a relaxing vacation spot with a mix of lakes, historic spots and a wide variety of cuisines to try out.

Conclusion

The purpose of this project was to explore the cities of London and Paris and see how attractive they are to potential tourists and migrants. We explored both cities based on their postal codes, then extracted the common venues present in each neighbourhood, and finally clustered similar neighbourhoods together.

We could see that the neighbourhoods in both cities have a wide variety of experiences to offer, each unique in its own way. The cultural diversity is quite evident, which also gives a sense of inclusion.

Both Paris and London seem to offer a vacation stay or a romantic getaway with a lot of places to explore, beautiful landscapes and a wide variety of cultures. Overall, it’s up to the stakeholders to decide which experience they would prefer and which would be more to their liking.
