I discovered Folium about two months ago and decided to use it to map the Primitive Way (Camino Primitivo). The coordinate data comes from Strava GPX files, cleaned up so that only latitude and longitude remain, as shown below.
head Camin_prim_stage1.csv
lat,lon
43.3111770,-5.6941620
43.3113360,-5.6943420
43.3114370,-5.6944600
43.3115000,-5.6945420
43.3116970,-5.6948090
43.3119110,-5.6950900
43.3122360,-5.6956830
43.3123220,-5.6958090
43.3126840,-5.6963740
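The cleanup step itself is not shown here. As a rough sketch of how such a file could be produced, the snippet below pulls the `lat` and `lon` attributes from every `trkpt` element of a GPX document using only the Python standard library. The helper names `gpx_to_rows` and `write_csv` and the tiny inline sample are assumptions for illustration, not the script actually used for this post.

```python
import csv
import io
import xml.etree.ElementTree as ET


def gpx_to_rows(gpx_text):
    """Return (lat, lon) string pairs for every track point in a GPX document."""
    root = ET.fromstring(gpx_text)
    rows = []
    for element in root.iter():
        # Tags carry the GPX namespace, e.g. "{http://www.topografix.com/GPX/1/1}trkpt",
        # so compare only the local part of the tag name.
        if element.tag.rsplit('}', 1)[-1] == 'trkpt':
            rows.append((element.get('lat'), element.get('lon')))
    return rows


def write_csv(rows, out):
    """Write the pairs to a file-like object with a lat,lon header."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['lat', 'lon'])
    writer.writerows(rows)


# Hypothetical two-point sample standing in for a real Strava export
sample_gpx = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <trk><trkseg>
    <trkpt lat="43.3111770" lon="-5.6941620"><ele>230.0</ele></trkpt>
    <trkpt lat="43.3113360" lon="-5.6943420"><ele>231.0</ele></trkpt>
  </trkseg></trk>
</gpx>"""

rows = gpx_to_rows(sample_gpx)
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

For a real export, the GPX text would be read from the downloaded file and the output written to something like `Camin_prim_stage1.csv` instead of an in-memory buffer.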
Below is the Python script that loads the data and creates the map with the routes.
import folium
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local").getOrCreate()
# Change Spark log level
spark.sparkContext.setLogLevel('FATAL')
# Load the stage coordinates from the local filesystem instead of HDFS
position1 = spark.read.load("/home/user/Camin_prim_stage1.csv", format="csv", sep=",", inferSchema="true", header="true")
position2 = spark.read.load("/home/user/Camin_prim_stage2.csv", format="csv", sep=",", inferSchema="true", header="true")
position3 = spark.read.load("/home/user/Camin_prim_stage3.csv", format="csv", sep=",", inferSchema="true", header="true")
position = [position1, position2, position3]
m = folium.Map()
colArray = ['red', 'blue', 'green']
all_coordinates = []
# Draw each stage in its own color
for x, color in zip(position, colArray):
    # Uncomment to check the file was correctly loaded
    # x.printSchema()
    # x.show(2)
    # Collect the rows on the driver as [lat, lon] pairs
    coordinates = [[float(i.lat), float(i.lon)] for i in x.collect()]
    all_coordinates.extend(coordinates)
    folium.PolyLine(locations=coordinates, weight=5, color=color).add_to(m)
    # The first point (index 0) is the origin, the last one the destination
    folium.Marker(coordinates[0], popup="Origin").add_to(m)
    folium.Marker(coordinates[-1], popup="Destination").add_to(m)
# Fit the view to every stage; inside the loop only the last stage would be fitted
m.fit_bounds(all_coordinates, padding=(25, 25))
# Save to an html file
m.save('chamin_prim.html')
# Cleanup
spark.stop()
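For three small CSV files, the same coordinate lists can also be built without Spark. The snippet below is only a sketch of that alternative, not the approach used in the script above: `load_coordinates` and `bounding_box` are hypothetical helper names, and the inline sample stands in for one of the stage files. It reads the `lat` and `lon` columns with the standard `csv` module and computes a southwest/northeast corner pair, which is the form of bounds that `fit_bounds` is documented to take.

```python
import csv
import io


def load_coordinates(csv_file):
    """Read a lat,lon CSV (with header) into a list of [lat, lon] float pairs."""
    reader = csv.DictReader(csv_file)
    return [[float(row['lat']), float(row['lon'])] for row in reader]


def bounding_box(coordinates):
    """Return [[min_lat, min_lon], [max_lat, max_lon]] for a list of [lat, lon] pairs."""
    lats = [lat for lat, _ in coordinates]
    lons = [lon for _, lon in coordinates]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# In-memory sample using the first rows of the stage 1 file
sample = io.StringIO(
    "lat,lon\n"
    "43.3111770,-5.6941620\n"
    "43.3113360,-5.6943420\n"
    "43.3114370,-5.6944600\n"
)
coordinates = load_coordinates(sample)
bounds = bounding_box(coordinates)
print(coordinates[0], coordinates[-1])
print(bounds)
```

With a real file, `load_coordinates(open(path, newline=''))` would take the place of the sample, and the resulting lists could be handed to `folium.PolyLine` and `m.fit_bounds` just like the lists collected from Spark.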