Primitive Way with Folium

So I discovered Folium about two months ago and decided to map the Primitive Way (the Camino Primitivo) with it. The coordinate data is retrieved from Strava GPX files and cleaned up so that only latitude and longitude remain, as shown below.

head Camin_prim_stage1.csv
lat,lon
43.3111770,-5.6941620
43.3113360,-5.6943420
43.3114370,-5.6944600
43.3115000,-5.6945420
43.3116970,-5.6948090
43.3119110,-5.6950900
43.3122360,-5.6956830
43.3123220,-5.6958090
43.3126840,-5.6963740
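
The clean-up step itself isn't shown here; as a minimal sketch, a GPX file could be flattened to a lat/lon CSV like the one above with the gpxpy package (an assumption, not necessarily the tooling actually used):

import csv
import gpxpy  # assumption: the gpxpy package; any GPX parser would do

# Parse one stage's GPX export from Strava
with open('Camin_prim_stage1.gpx') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

# Write every track point out as a lat,lon row
with open('Camin_prim_stage1.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow(['lat', 'lon'])
    for track in gpx.tracks:
        for segment in track.segments:
            for point in segment.points:
                writer.writerow(['%.7f' % point.latitude, '%.7f' % point.longitude])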

Below is the Python file we will use to load the data and create the map with the routes.

import folium
from pyspark.sql import SparkSession
spark = SparkSession.builder.master("local").getOrCreate()

# Change Spark loglevel
spark.sparkContext.setLogLevel('FATAL')

# Load the three stage CSV files from the local filesystem instead of HDFS
position1 = spark.read.load("/home/user/Camin_prim_stage1.csv", format="csv", sep=",", inferSchema="true", header="true")
position2 = spark.read.load("/home/user/Camin_prim_stage2.csv", format="csv", sep=",", inferSchema="true", header="true")
position3 = spark.read.load("/home/user/Camin_prim_stage3.csv", format="csv", sep=",", inferSchema="true", header="true")

position = [position1, position2, position3]

m = folium.Map()
col = 0
colArray = ['red', 'blue', 'green']

for x in position:
    # Uncomment to check each file was loaded correctly
    # x.printSchema()
    # x.show(2)

    # Collect the rows to the driver and build a [lat, lon] list for this stage
    coordinates = [[float(i.lat), float(i.lon)] for i in x.collect()]

    # Zoom the map to the route and draw it in this stage's color
    m.fit_bounds(coordinates, padding=(25, 25))
    folium.PolyLine(locations=coordinates, weight=5, color=colArray[col]).add_to(m)
    folium.Marker(coordinates[0], popup="Origin").add_to(m)
    folium.Marker(coordinates[-1], popup="Destination").add_to(m)
    col = col + 1
# Save to an html file
m.save('chamin_prim.html')

# Cleanup
spark.stop()
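
Note that fit_bounds() runs once per stage inside the loop, so the final view is zoomed to the last stage only. A small hypothetical variant that fits the view to all three stages at once:

all_coordinates = []
for x in position:
    # Gather every stage's points first, then fit the view a single time
    all_coordinates.extend([[float(i.lat), float(i.lon)] for i in x.collect()])
m.fit_bounds(all_coordinates, padding=(25, 25))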

We execute the script as below and the result gets saved into a file called chamin_prim.html:

spark-submit camin_prim.py; echo $?; ls -ltr | tail -1
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
18/08/19 20:05:47 WARN Utils: Your hostname, server resolves to a loopback address: 127.0.0.1; using 192.168.0.99 instead (on interface eth0)
18/08/19 20:05:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/08/19 20:05:48 INFO SparkContext: Running Spark version 2.2.0
18/08/19 20:05:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/08/19 20:05:49 INFO SparkContext: Submitted application: camin_prim.py
18/08/19 20:05:49 INFO SecurityManager: Changing view acls to: user
18/08/19 20:05:49 INFO SecurityManager: Changing modify acls to: user
18/08/19 20:05:49 INFO SecurityManager: Changing view acls groups to: 
18/08/19 20:05:49 INFO SecurityManager: Changing modify acls groups to: 
18/08/19 20:05:49 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(user); groups with view permissions: Set(); users  with modify permissions: Set(user); groups with modify permissions: Set()
18/08/19 20:05:49 INFO Utils: Successfully started service 'sparkDriver' on port 41115.
18/08/19 20:05:49 INFO SparkEnv: Registering MapOutputTracker
18/08/19 20:05:49 INFO SparkEnv: Registering BlockManagerMaster
18/08/19 20:05:49 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/08/19 20:05:49 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/08/19 20:05:49 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-aedd3da1-84c0-4e3c-b094-30590422f0ca
18/08/19 20:05:49 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
18/08/19 20:05:49 INFO SparkEnv: Registering OutputCommitCoordinator
18/08/19 20:05:49 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
18/08/19 20:05:49 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
18/08/19 20:05:49 INFO Utils: Successfully started service 'SparkUI' on port 4042.
18/08/19 20:05:50 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.0.99:4042
18/08/19 20:05:50 INFO SparkContext: Added file file:/home/user/camin_prim.py at file:/home/user/camin_prim.py with timestamp 1534701950407
18/08/19 20:05:50 INFO Utils: Copying /home/user/camin_prim.py to /tmp/spark-47458e06-2b68-4e19-b2a1-3172bf40e4e5/userFiles-5730b68b-f428-406a-9fb3-19ed8385f6ea/camin_prim.py
18/08/19 20:05:50 INFO Executor: Starting executor ID driver on host localhost
18/08/19 20:05:50 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39599.
18/08/19 20:05:50 INFO NettyBlockTransferService: Server created on 192.168.0.99:39599
18/08/19 20:05:50 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/08/19 20:05:50 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.0.99, 39599, None)
18/08/19 20:05:50 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.99:39599 with 366.3 MB RAM, BlockManagerId(driver, 192.168.0.99, 39599, None)
18/08/19 20:05:50 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.0.99, 39599, None)
18/08/19 20:05:50 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.0.99, 39599, None)
18/08/19 20:05:51 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/user/spark-warehouse/').
18/08/19 20:05:51 INFO SharedState: Warehouse path is 'file:/home/user/spark-warehouse/'.
18/08/19 20:05:51 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
0
-rw-r--r--  1 user user     866998 Aug 19 20:05 chamin_prim.html
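
The INFO lines above appear even though the script sets the log level to FATAL: they are printed while the SparkSession starts up, before setLogLevel() takes effect. To silence them entirely, the standard Spark approach is to copy conf/log4j.properties.template to conf/log4j.properties and lower the root category, for example:

log4j.rootCategory=ERROR, console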

The resulting HTML file can be seen here.
