So I discovered Folium about two months ago and decided to map the Primitive Way with it. The coordinate data comes from Strava GPX files, cleaned up so that only latitude and longitude remain, as shown below.
    head Camin_prim_stage1.csv
    lat,lon
    43.3111770,-5.6941620
    43.3113360,-5.6943420
    43.3114370,-5.6944600
    43.3115000,-5.6945420
    43.3116970,-5.6948090
    43.3119110,-5.6950900
    43.3122360,-5.6956830
    43.3123220,-5.6958090
    43.3126840,-5.6963740
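The cleanup step itself is not shown in this post. As a rough sketch of how it could be done, the snippet below pulls `lat` and `lon` out of every track point in a GPX document and writes them as CSV, using only the Python standard library. It assumes the export follows the GPX 1.1 schema (namespace `http://www.topografix.com/GPX/1/1`) and stores its points as `trkpt` elements. The function names and the inline sample are made up for illustration; the two coordinates are the first two rows above, while the elevation values are placeholders.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumption: the export uses the GPX 1.1 namespace.
GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def gpx_to_rows(gpx_text):
    """Return (lat, lon) string pairs, one per track point, in file order."""
    root = ET.fromstring(gpx_text)
    return [
        (pt.get("lat"), pt.get("lon"))
        for pt in root.iterfind(".//gpx:trkpt", GPX_NS)
    ]


def write_csv(rows, out):
    """Write a lat,lon header followed by one row per point."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon"])
    writer.writerows(rows)


# Illustrative sample only: elevations are placeholders, not real data.
SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1" creator="sample">
  <trk><trkseg>
    <trkpt lat="43.3111770" lon="-5.6941620"><ele>0.0</ele></trkpt>
    <trkpt lat="43.3113360" lon="-5.6943420"><ele>0.0</ele></trkpt>
  </trkseg></trk>
</gpx>"""

buffer = io.StringIO()
write_csv(gpx_to_rows(SAMPLE_GPX), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the coordinates as strings avoids any float rounding, so the CSV holds exactly the digits found in the GPX file. For a real file you would read the GPX text from disk and pass an open file handle to `write_csv` instead of the in-memory buffer.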
Below is the Python script that loads the data and draws the routes on the map.
    import folium
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local").getOrCreate()

    # Change Spark loglevel
    spark.sparkContext.setLogLevel('FATAL')

    # Load the stage coordinates from local CSV files instead of HDFS
    position1 = spark.read.load("/home/user/Camin_prim_stage1.csv", format="csv", sep=",", inferSchema="true", header="true")
    position2 = spark.read.load("/home/user/Camin_prim_stage2.csv", format="csv", sep=",", inferSchema="true", header="true")
    position3 = spark.read.load("/home/user/Camin_prim_stage3.csv", format="csv", sep=",", inferSchema="true", header="true")
    position = [position1, position2, position3]

    # Make a single Folium map shared by all stages
    m = folium.Map()
    colors = ['red', 'blue', 'green']  # one color per stage
    all_coordinates = []

    for x, color in zip(position, colors):
        # Uncomment to check the file was correctly loaded
        # x.printSchema()
        # x.show(2)

        # Collect the rows to the driver as [lat, lon] pairs
        coordinates = [[float(i.lat), float(i.lon)] for i in x.collect()]
        all_coordinates.extend(coordinates)

        # Draw the stage and mark its first and last points
        folium.PolyLine(locations=coordinates, weight=5, color=color).add_to(m)
        folium.Marker(coordinates[0], popup="Origin").add_to(m)
        folium.Marker(coordinates[-1], popup="Destination").add_to(m)

    # Fit the view to all stages, not just the last one drawn
    m.fit_bounds(all_coordinates, padding=(25, 25))

    # Save to an html file
    m.save('chamin_prim.html')

    # Cleanup
    spark.stop()
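The script hands the track points straight to `fit_bounds`. Folium documents that argument as a bounding box given by two corners, `[southwest, northeast]`, so one alternative is to compute the corners yourself and pass only those. Here is a small sketch of that calculation in plain Python. The helper name `bounding_box` is hypothetical and not part of Folium, and the three points are rows from the sample CSV above.

```python
def bounding_box(points):
    """Return [[min_lat, min_lon], [max_lat, max_lon]] for [lat, lon] pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    # Southwest corner first, northeast corner second.
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# Three rows taken from the sample CSV above.
stage = [
    [43.3111770, -5.6941620],
    [43.3113360, -5.6943420],
    [43.3126840, -5.6963740],
]

bounds = bounding_box(stage)
print(bounds)
```

The result could then be passed as `m.fit_bounds(bounds, padding=(25, 25))`. This simple min/max approach assumes the route does not cross the 180° meridian, which is safely the case here.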