Category Archives: UNIX

Posts regarding UNIX and Linux systems

Run httpd with docker

Below is the script:

#!/bin/bash

echo "Running httpd with docker."

docker run --rm -v "$PWD":/usr/local/apache2/htdocs httpd

We use the following options:
-v, --volume list    Bind mount a volume
--rm                 Automatically remove the container when it exits

Quite simple, right? Execution below.

user@computer:$ bash ~/docker/run_httpd.sh
Running httpd with docker.
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 172.17.0.6. Set the 'ServerName' directive globally to suppress this message
[Thu Aug 17 22:22:55.249981 2017] [mpm_event:notice] [pid 1:tid 140029488904064] AH00489: Apache/2.4.27 (Unix) configured -- resuming normal operations
[Thu Aug 17 22:22:55.250079 2017] [core:notice] [pid 1:tid 140029488904064] AH00094: Command line: 'httpd -D FOREGROUND'
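
The AH00558 lines are only a warning about the missing ServerName directive. One way to silence them is to set ServerName on the command line by overriding the image's default command (a sketch, assuming ServerName localhost is acceptable for your setup):

docker run --rm -v "$PWD":/usr/local/apache2/htdocs httpd \
    httpd -D FOREGROUND -C "ServerName localhost"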

And we proceed to test.

user@computer:$ curl http://172.17.0.6
<HTML>
<HEAD>
First page
</HEAD>
<BODY>
Testing docker httpd
</BODY>
</HTML>
We get this index.html because we are mapping $PWD (here /tmp/httpd) to /usr/local/apache2/htdocs inside the container. In /tmp/httpd we created the example index.html shown above.
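
To reproduce the setup, something along these lines creates the example page (the content matches the curl output above):

user@computer:$ mkdir -p /tmp/httpd && cd /tmp/httpd
user@computer:$ cat > index.html <<'EOF'
<HTML>
<HEAD>
First page
</HEAD>
<BODY>
Testing docker httpd
</BODY>
</HTML>
EOF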
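
If the container IP is not directly reachable from your host (Docker Desktop on Mac or Windows, for instance), a variation that publishes the port on the host also works (a sketch; host port 8080 is an arbitrary choice):

docker run --rm -p 8080:80 -v "$PWD":/usr/local/apache2/htdocs httpd

The page is then available at http://localhost:8080.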

Run python script with docker

So I started playing with docker and asked myself whether it would be possible to run a python script with docker. Well, the answer is yes. An example script is below.

#!/usr/bin/python

import sys
print "Running script!!"
print sys.version_info

Execution below:

user@computer:$ docker run -it --rm --name pythonscript -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:2 python script.py
Running script!!
sys.version_info(major=2, minor=7, micro=13, releaselevel='final', serial=0)
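
The same approach works with Python 3: swap the image tag and turn the print statements into function calls (a sketch, assuming the official python:3 image):

#!/usr/bin/env python3

import sys

print("Running script!!")
print(sys.version_info)

user@computer:$ docker run -it --rm --name pythonscript -v "$PWD":/usr/src/myapp -w /usr/src/myapp python:3 python script.py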

References:
Docker documentation

Using sqoop to import a DB table into HDFS

In the world of Big Data, you need Apache Sqoop to import data from a relational database into HDFS.

user@computer:$ sqoop import --connect jdbc:mysql://localhost/mysql --username training -P --warehouse-dir /home/training/db --table user
17/02/23 10:38:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.2.0
Enter password:
17/02/23 10:38:24 INFO manager.SqlManager: Using default fetchSize of 1000
17/02/23 10:38:24 INFO tool.CodeGenTool: Beginning code generation
17/02/23 10:38:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user` AS t LIMIT 1
17/02/23 10:38:24 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user` AS t LIMIT 1
17/02/23 10:38:24 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-0.20-mapreduce
Note: /tmp/sqoop-training/compile/7f3a9709c50f58c2c6bb24de91922c6b/user.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/02/23 10:38:29 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-training/compile/7f3a9709c50f58c2c6bb24de91922c6b/user.jar
17/02/23 10:38:29 WARN manager.MySQLManager: It looks like you are importing from mysql.
17/02/23 10:38:29 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
17/02/23 10:38:29 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
17/02/23 10:38:29 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
17/02/23 10:38:29 WARN manager.CatalogQueryManager: The table user contains a multi-column primary key. Sqoop will default to the column Host only for this job.
17/02/23 10:38:29 WARN manager.CatalogQueryManager: The table user contains a multi-column primary key. Sqoop will default to the column Host only for this job.
17/02/23 10:38:29 INFO mapreduce.ImportJobBase: Beginning import of user
17/02/23 10:38:31 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
17/02/23 10:38:33 INFO db.DBInputFormat: Using read commited transaction isolation
17/02/23 10:38:33 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`Host`), MAX(`Host`) FROM `user`
17/02/23 10:38:33 WARN db.TextSplitter: Generating splits for a textual index column.
17/02/23 10:38:33 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
17/02/23 10:38:33 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
17/02/23 10:38:33 INFO mapred.JobClient: Running job: job_201702221239_0003
17/02/23 10:38:34 INFO mapred.JobClient: map 0% reduce 0%
17/02/23 10:38:55 INFO mapred.JobClient: map 17% reduce 0%
17/02/23 10:38:56 INFO mapred.JobClient: map 33% reduce 0%
17/02/23 10:39:08 INFO mapred.JobClient: map 50% reduce 0%
17/02/23 10:39:09 INFO mapred.JobClient: map 67% reduce 0%
17/02/23 10:39:21 INFO mapred.JobClient: map 83% reduce 0%
17/02/23 10:39:22 INFO mapred.JobClient: map 100% reduce 0%
17/02/23 10:39:26 INFO mapred.JobClient: Job complete: job_201702221239_0003
17/02/23 10:39:26 INFO mapred.JobClient: Counters: 23
17/02/23 10:39:26 INFO mapred.JobClient: File System Counters
17/02/23 10:39:26 INFO mapred.JobClient: FILE: Number of bytes read=0
17/02/23 10:39:26 INFO mapred.JobClient: FILE: Number of bytes written=1778658
17/02/23 10:39:26 INFO mapred.JobClient: FILE: Number of read operations=0
17/02/23 10:39:26 INFO mapred.JobClient: FILE: Number of large read operations=0
17/02/23 10:39:26 INFO mapred.JobClient: FILE: Number of write operations=0
17/02/23 10:39:26 INFO mapred.JobClient: HDFS: Number of bytes read=791
17/02/23 10:39:26 INFO mapred.JobClient: HDFS: Number of bytes written=818
17/02/23 10:39:26 INFO mapred.JobClient: HDFS: Number of read operations=6
17/02/23 10:39:26 INFO mapred.JobClient: HDFS: Number of large read operations=0
17/02/23 10:39:26 INFO mapred.JobClient: HDFS: Number of write operations=6
17/02/23 10:39:26 INFO mapred.JobClient: Job Counters
17/02/23 10:39:26 INFO mapred.JobClient: Launched map tasks=6
17/02/23 10:39:26 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=89702
17/02/23 10:39:26 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
17/02/23 10:39:26 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
17/02/23 10:39:26 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
17/02/23 10:39:26 INFO mapred.JobClient: Map-Reduce Framework
17/02/23 10:39:26 INFO mapred.JobClient: Map input records=8
17/02/23 10:39:26 INFO mapred.JobClient: Map output records=8
17/02/23 10:39:26 INFO mapred.JobClient: Input split bytes=791
17/02/23 10:39:26 INFO mapred.JobClient: Spilled Records=0
17/02/23 10:39:26 INFO mapred.JobClient: CPU time spent (ms)=5490
17/02/23 10:39:26 INFO mapred.JobClient: Physical memory (bytes) snapshot=666267648
17/02/23 10:39:26 INFO mapred.JobClient: Virtual memory (bytes) snapshot=4423995392
17/02/23 10:39:26 INFO mapred.JobClient: Total committed heap usage (bytes)=191102976
17/02/23 10:39:26 INFO mapreduce.ImportJobBase: Transferred 818 bytes in 56.7255 seconds (14.4203 bytes/sec)
17/02/23 10:39:26 INFO mapreduce.ImportJobBase: Retrieved 8 records.

The example above dumps the user table from the mysql DB into Hadoop. We use the following options:
--connect          JDBC string used to connect to the DB
--username         Authentication username
-P                 Ask for the password at the prompt
--warehouse-dir    HDFS parent directory for the table destination
--table            Table to import
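
The warnings in the log also suggest two optional tweaks: --direct enables the MySQL-specific fast path, and -m 1 runs a single mapper, which sidesteps the split-column warnings entirely (a sketch; both flags are optional):

user@computer:$ sqoop import --connect jdbc:mysql://localhost/mysql --username training -P --warehouse-dir /home/training/db --table user --direct -m 1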

The dumped content is shown below.

user@computer:$ hdfs dfs -cat /home/training/db/user/part-m-0000*
127.0.0.1,root,,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,,,,,0,0,0,0
localhost,root,,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,,,,,0,0,0,0
localhost.localdomain,root,,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,,,,,0,0,0,0
localhost,,,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,,,,,0,0,0,0
localhost.localdomain,,,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,,,,,0,0,0,0
localhost,training,*27CF0BD18BDADD517165824F8C1FFF667B47D04B,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,N,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,Y,,,,,0,0,0,0
localhost,hiveuser,*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,,,,,0,0,0,0
localhost,hue,*15221DE9A04689C4D312DEAC3B87DDF542AF439E,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,N,,,,,0,0,0,0

Python script to download all NASDAQ and NYSE ticker symbols

Below is a short python script to get all NASDAQ and NYSE common stock tickers. You can then use the resulting file to get a lot of info using the yahoofinance library.

#!/usr/bin/env python

import ftplib
import re

# Connect to ftp.nasdaqtrader.com
ftp = ftplib.FTP('ftp.nasdaqtrader.com', 'anonymous', 'anonymous@debian.org')

# Download nasdaqlisted.txt and otherlisted.txt from /SymbolDirectory
ftp.cwd("/SymbolDirectory")
for ficheiro in ["nasdaqlisted.txt", "otherlisted.txt"]:
    with open(ficheiro, 'wb') as localfile:
        ftp.retrbinary('RETR ' + ficheiro, localfile.write)
ftp.quit()

# Grep for common stock in nasdaqlisted.txt and otherlisted.txt
# and append the tickers to tickers.txt
with open("tickers.txt", "a+") as tickers:
    for ficheiro in ["nasdaqlisted.txt", "otherlisted.txt"]:
        with open(ficheiro, 'r') as localfile:
            for line in localfile:
                if re.search("Common Stock", line):
                    # The first pipe-separated field is the ticker symbol
                    tickers.write(line.split("|")[0] + "\n")
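
To consume the resulting file, here is a minimal sketch, assuming the Python 3 yfinance package (a maintained alternative to the yahoofinance library mentioned above) is installed and tickers.txt exists:

#!/usr/bin/env python3

import yfinance as yf

# Read the tickers produced by the script above
with open("tickers.txt") as f:
    tickers = [line.strip() for line in f if line.strip()]

# Print the latest closing price for the first few symbols
for symbol in tickers[:5]:
    history = yf.Ticker(symbol).history(period="1d")
    if not history.empty:
        print(symbol, history["Close"].iloc[-1])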

Small script to calculate future value

Here's a small script I wrote to calculate the future value.

#!/usr/bin/env python

# This script calculates the future value: fv = iv * (1 + rate/100) ** times

iv = float(raw_input("Enter initial value: "))
rate = float(raw_input("Enter interest rate: "))
times = int(raw_input("Enter the amount of years: "))

fv = iv * ((1 + (rate / 100)) ** times)

print "Final amount is: %.2f." % fv
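
A sample run, assuming the script is saved as future_value.py (the name is arbitrary); with these inputs the result is 1000 * 1.05**10:

user@computer:$ python future_value.py
Enter initial value: 1000
Enter interest rate: 5
Enter the amount of years: 10
Final amount is: 1628.89.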