{"id":1559,"date":"2018-08-20T16:08:02","date_gmt":"2018-08-20T20:08:02","guid":{"rendered":"http:\/\/www.xavignu.com\/?p=1559"},"modified":"2018-08-20T16:08:41","modified_gmt":"2018-08-20T20:08:41","slug":"primitive-way-with-folium","status":"publish","type":"post","link":"https:\/\/www.xavignu.com\/?p=1559","title":{"rendered":"Primitive way with Folium"},"content":{"rendered":"<p>So I discovered <a href=\"https:\/\/pypi.org\/project\/folium\/\" target=\"_blank\" rel=\"noopener\">Folium<\/a> about two months ago and decided to map the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Camino_Primitivo\" target=\"_blank\" rel=\"noopener\">primitive way<\/a> with it. Coordinate data comes from <a href=\"https:\/\/www.strava.com\/\" target=\"_blank\" rel=\"noopener\">Strava<\/a> GPX files, cleaned up so that only latitude and longitude remain, as shown below.<\/p>\n<pre id=\"terminal\">head Camin_prim_stage1.csv\r\nlat,lon\r\n43.3111770,-5.6941620\r\n43.3113360,-5.6943420\r\n43.3114370,-5.6944600\r\n43.3115000,-5.6945420\r\n43.3116970,-5.6948090\r\n43.3119110,-5.6950900\r\n43.3122360,-5.6956830\r\n43.3123220,-5.6958090\r\n43.3126840,-5.6963740\r\n<\/pre>\n<p>Below is the Python file we will use to load the data and create the map with the routes.<br \/>\n[python]<br \/>\nimport folium<br \/>\nfrom pyspark.sql import SparkSession<br \/>\nspark = SparkSession.builder.master(\"local\").getOrCreate()<\/p>\n<p># Change Spark loglevel<br \/>\nspark.sparkContext.setLogLevel('FATAL')<\/p>\n<p># Load the stage CSV files from the local filesystem instead of HDFS<br \/>\nposition1 = spark.read.load(\"\/home\/user\/Camin_prim_stage1.csv\", format=\"csv\", sep=\",\", inferSchema=\"true\", header=\"true\")<br \/>\nposition2 = spark.read.load(\"\/home\/user\/Camin_prim_stage2.csv\", format=\"csv\", sep=\",\", inferSchema=\"true\", 
header=\"true\")<br \/>\nposition3 = spark.read.load(\"\/home\/user\/Camin_prim_stage3.csv\", format=\"csv\", sep=\",\", inferSchema=\"true\", header=\"true\")<\/p>\n<p>position = [position1, position2, position3]<\/p>\n<p>m = folium.Map()<br \/>\ncol=0<br \/>\ncolArray=['red','blue','green']<\/p>\n<p># Check file was correctly loaded<br \/>\nfor x in position:<br \/>\n    # x.printSchema()<br \/>\n    # x.show(2)<\/p>\n<p>    # Map position<br \/>\n    coordinates = [[float(i.lat), float(i.lon)] for i in x.collect()]<\/p>\n<p>    # Make a Folium map<br \/>\n    #m = folium.Map()<br \/>\n    m.fit_bounds(coordinates, padding=(25, 25))<br \/>\n    folium.PolyLine(locations=coordinates, weight=5, color=colArray[col]).add_to(m)<br \/>\n    # Mark the first and last points of the stage<br \/>\n    folium.Marker(coordinates[0], popup=\"Origin\").add_to(m)<br \/>\n    folium.Marker(coordinates[-1], popup=\"Destination\").add_to(m)<br \/>\n    col = col + 1<br \/>\n# Save to an html file<br \/>\nm.save('chamin_prim.html')<\/p>\n<p># Cleanup<br \/>\nspark.stop()<br \/>\n[\/python]<\/p>\n<p><!--more--><\/p>\n<p>We run the script as shown below, and the result is saved to a file called chamin_prim.html:<\/p>\n<pre id=\"terminal\">spark-submit camin_prim.py; echo $?; ls -ltr | tail -1\r\nUsing Spark's default log4j profile: org\/apache\/spark\/log4j-defaults.properties\r\n18\/08\/19 20:05:47 WARN Utils: Your hostname, server resolves to a loopback address: 127.0.0.1; using 192.168.0.99 instead (on interface eth0)\r\n18\/08\/19 20:05:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address\r\n18\/08\/19 20:05:48 INFO SparkContext: Running Spark version 2.2.0\r\n18\/08\/19 20:05:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable\r\n18\/08\/19 20:05:49 INFO SparkContext: Submitted application: camin_prim.py\r\n18\/08\/19 20:05:49 INFO SecurityManager: Changing view acls to: user\r\n18\/08\/19 20:05:49 INFO SecurityManager: Changing modify acls to: user\r\n18\/08\/19 20:05:49 INFO SecurityManager: Changing view acls groups to: \r\n18\/08\/19 20:05:49 INFO SecurityManager: Changing modify acls groups to: \r\n18\/08\/19 20:05:49 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(user); groups with view permissions: Set(); users  with modify permissions: Set(user); groups with modify permissions: Set()\r\n18\/08\/19 20:05:49 INFO Utils: Successfully started service 'sparkDriver' on port 41115.\r\n18\/08\/19 20:05:49 INFO SparkEnv: Registering MapOutputTracker\r\n18\/08\/19 20:05:49 INFO SparkEnv: Registering BlockManagerMaster\r\n18\/08\/19 20:05:49 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information\r\n18\/08\/19 20:05:49 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up\r\n18\/08\/19 20:05:49 INFO DiskBlockManager: Created local directory at \/tmp\/blockmgr-aedd3da1-84c0-4e3c-b094-30590422f0ca\r\n18\/08\/19 20:05:49 INFO MemoryStore: MemoryStore started with capacity 366.3 MB\r\n18\/08\/19 20:05:49 INFO SparkEnv: Registering OutputCommitCoordinator\r\n18\/08\/19 20:05:49 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.\r\n18\/08\/19 20:05:49 WARN Utils: Service 'SparkUI' could not bind on port 4041. 
Attempting port 4042.\r\n18\/08\/19 20:05:49 INFO Utils: Successfully started service 'SparkUI' on port 4042.\r\n18\/08\/19 20:05:50 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http:\/\/192.168.0.99:4042\r\n18\/08\/19 20:05:50 INFO SparkContext: Added file file:\/home\/user\/camin_prim.py at file:\/home\/user\/camin_prim.py with timestamp 1534701950407\r\n18\/08\/19 20:05:50 INFO Utils: Copying \/home\/user\/camin_prim.py to \/tmp\/spark-47458e06-2b68-4e19-b2a1-3172bf40e4e5\/userFiles-5730b68b-f428-406a-9fb3-19ed8385f6ea\/camin_prim.py\r\n18\/08\/19 20:05:50 INFO Executor: Starting executor ID driver on host localhost\r\n18\/08\/19 20:05:50 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39599.\r\n18\/08\/19 20:05:50 INFO NettyBlockTransferService: Server created on 192.168.0.99:39599\r\n18\/08\/19 20:05:50 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy\r\n18\/08\/19 20:05:50 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.0.99, 39599, None)\r\n18\/08\/19 20:05:50 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.0.99:39599 with 366.3 MB RAM, BlockManagerId(driver, 192.168.0.99, 39599, None)\r\n18\/08\/19 20:05:50 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.0.99, 39599, None)\r\n18\/08\/19 20:05:50 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.0.99, 39599, None)\r\n18\/08\/19 20:05:51 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:\/home\/user\/spark-warehouse\/').\r\n18\/08\/19 20:05:51 INFO SharedState: Warehouse path is 'file:\/home\/user\/spark-warehouse\/'.\r\n18\/08\/19 20:05:51 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint\r\n0\r\n-rw-r--r--  1 user user     866998 Aug 19 20:05 chamin_prim.html\r\n<\/pre>\n<p>HTML file result 
can be seen <a href=\"http:\/\/www.xavignu.com\/wp-content\/uploads\/2018\/08\/chamin_prim.html\" target=\"_blank\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>So I discovered Folium about two months ago and decided to map the primitive way with it. Coordinates data is retrieved from Strava gpx files and cleaned up leaving only latitude and longitude as below. head Camin_prim_stage1.csv lat,lon 43.3111770,-5.6941620 43.3113360,-5.6943420 43.3114370,-5.6944600 43.3115000,-5.6945420 43.3116970,-5.6948090 43.3119110,-5.6950900 43.3122360,-5.6956830 43.3123220,-5.6958090 43.3126840,-5.6963740 Below is the python file we will use [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[74,92],"tags":[20,97,6,23,99,67,98],"jetpack_featured_media_url":"","jetpack_publicize_connections":[],"jetpack_shortlink":"https:\/\/wp.me\/pTQgt-p9","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/posts\/1559"}],"collection":[{"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1559"}],"version-history":[{"count":52,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/posts\/1559\/revisions"}],"predecessor-version":[{"id":1615,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=\/wp\/v2\/posts\/1559\/revisions\/1615"}],"wp:attachment":[{"href":"https:\/\/www.xavignu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&paren
t=1559"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1559"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.xavignu.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1559"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}