
Friday, December 29, 2017

Machine Learning by Andrew Ng on Coursera

This is the best machine learning online resource I've ever seen! Thank you, Andrew Ng. This will be great course material for my year-end and new-year holidays. Everyone who learns machine learning should take this course.
https://www.coursera.org/learn/machine-learning

  • Machine Learning
    • Supervised Learning (correct answers / labels are given by humans)
      • Regression (e.g., predict continuous valued output like estimated sales figures or prices)
      • Classification (discrete valued output, e.g., 0 or 1, benign or malignant, etc.)
    • Unsupervised Learning
      • e.g., given a set of news articles found on the web, group them into sets of articles about the same story
      • e.g., automatically discover market segments (see the code sketch after this list)
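Here is a minimal sketch of these three categories using scikit-learn (not part of the course, which uses Octave/MATLAB; the toy numbers below are made up purely for illustration):

# Illustrative sketch only; scikit-learn is not used in the course itself.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.cluster import KMeans

X = np.array([[1.0], [2.0], [3.0], [4.0]])       # one feature per example (made-up data)

# Supervised learning / regression: continuous target (e.g., a price)
y_price = np.array([1.1, 2.0, 2.9, 4.2])
print(LinearRegression().fit(X, y_price).predict([[2.5]]))

# Supervised learning / classification: discrete target (e.g., 0 = benign, 1 = malignant)
y_class = np.array([0, 0, 1, 1])
print(LogisticRegression().fit(X, y_class).predict([[2.5]]))

# Unsupervised learning: no labels; discover groups (e.g., market segments)
print(KMeans(n_clusters=2, n_init=10).fit_predict(X))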



Python: Number Guessing Game

# number guessing game

import random

# pick a random number from 1 to 100
secret_number = random.randint(1, 100)

# your guess
guess = 0

# number of guesses
tries = 0
max_tries = 7

print("Welcome to the number guessing game.")
print("I am thinking of a number between 1 and 100.")
print("I will give " + str(max_tries) + " tries to get it. Good luck!")

while (guess != secret_number) and (tries < max_tries):

    guess = int(input("What is your guess? "))

    if guess > secret_number:
        print("That is too HIGH.")

    elif guess < secret_number:
        print("That is too LOW.")

    tries = tries + 1


if guess == secret_number:

    print("You guessed my number!")
    print("##############################")
    print("       Congratulations!       ")
    print("##############################")

else:

    print("Better luck next time")
    print("The secret number was", secret_number)

Reference:
https://youtu.be/7CTONNO6YMc

Thursday, December 28, 2017

Python: Simple Image Recognition System

The goal of the image recognition system explained here is to find parameter values that result in the model’s output being correct as often as possible. This kind of training, in which the correct solution is used together with the input data, is called supervised learning. (Unsupervised learning is out of the scope of this article.)

Once the training phase is done, the model's parameter values no longer change, and the model can be used in the testing phase to classify images that were not part of its training dataset.


Preparation


[1] Python
https://www.python.org/downloads/
Download 3.6.4 (or later) and install it.


[2] TensorFlow
https://www.tensorflow.org/versions/r0.12/get_started/os_setup

Pip installation
Access
http://pip.readthedocs.org/en/stable/installing/

Download
get-pip.py

Then save get-pip.py in your directory.

Run the following command in your directory (e.g., in Terminal on macOS):
python get-pip.py

https://www.tensorflow.org/versions/r0.12/get_started/os_setup#pip_installation

pip install tensorflow


[3] The CIFAR-10 python version dataset:
https://www.cs.toronto.edu/~kriz/cifar.html
Download "CIFAR-10 python version"

Then extract the downloaded file and save the cifar-10-batches-py folder in your directory.

It consists of 60,000 images (10 different categories * 6,000 images per category). Each image has a size of 32 by 32 pixels. Each pixel is described by three floating point numbers representing the red (R), green (G) and blue (B) values for this pixel. This results in 32 x 32 x 3 = 3,072 values for each image.
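As a quick sanity check of the download, one data batch can be loaded with pickle. This is just a minimal sketch, assuming the cifar-10-batches-py folder sits in the current directory:

# Minimal sketch: load one CIFAR-10 batch and check its shape.
import pickle
import numpy as np

with open('cifar-10-batches-py/data_batch_1', 'rb') as f:
    batch = pickle.load(f, encoding='bytes')   # Python 3 needs encoding='bytes'

data = np.array(batch[b'data'])      # shape (10000, 3072): 32 * 32 * 3 values per image
labels = np.array(batch[b'labels'])  # 10,000 integer labels in the range 0..9

print(data.shape)                    # (10000, 3072)
print(labels[:10])                   # first ten labels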


[4] Python scripts
Download all the py scripts on
https://github.com/wolfib/image-classification-CIFAR10-tf
then save them on your directory.


1. Softmax (not a neural network)

Run the following command in your directory (e.g., in Terminal on macOS):
python softmax.py

Step     0: training accuracy 0.08
Step   100: training accuracy 0.3
Step   200: training accuracy 0.26
Step   300: training accuracy 0.24
Step   400: training accuracy 0.34
Step   500: training accuracy 0.28
Step   600: training accuracy 0.35
Step   700: training accuracy 0.27
Step   800: training accuracy 0.37
Step   900: training accuracy 0.37
Test accuracy 0.266
Total time:  3.95s

The accuracy of the trained model on the test set is about 27% (this might vary in your environment), so our model is able to pick the correct label for an image it has never seen before around 27% of the time. There are 10 different labels, so random guessing would result in an accuracy of 10%.
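For reference, the core of a softmax classifier in TensorFlow 1.x looks roughly like the following. This is not the repository's softmax.py, just a condensed sketch of the same idea with illustrative names and a random-data smoke test:

# Condensed sketch of softmax regression in TensorFlow 1.x (illustrative only;
# see softmax.py in the repository for the actual training script).
import numpy as np
import tensorflow as tf

images = tf.placeholder(tf.float32, shape=[None, 3072])  # flattened 32*32*3 images
labels = tf.placeholder(tf.int64, shape=[None])          # class indices 0..9

# One weight per (input value, class) pair plus one bias per class
weights = tf.Variable(tf.zeros([3072, 10]))
biases = tf.Variable(tf.zeros([10]))
logits = tf.matmul(images, weights) + biases

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.005).minimize(loss)

correct = tf.equal(tf.argmax(logits, 1), labels)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

# Smoke test with random data (real training feeds CIFAR-10 batches instead)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    x = np.random.rand(4, 3072).astype(np.float32)
    y = np.random.randint(0, 10, size=4)
    _, acc = sess.run([train_step, accuracy], feed_dict={images: x, labels: y})
    print(acc)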


2. Neural Network

Let's build a neural network that performs the same task.

two_layer_fc.py
This defines the model.

run_fc_model.py
This runs the model (‘fc’ stands for fully connected).
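For orientation, a two-layer fully connected model looks roughly like the sketch below. This is illustrative only; the actual definition lives in two_layer_fc.py, and the hidden layer size of 120 mirrors the hidden1 parameter printed further down.

# Sketch of a two-layer fully connected network for CIFAR-10 (illustrative only).
import tensorflow as tf

def two_layer_sketch(images, hidden_units=120, num_classes=10):
    # Layer 1: fully connected with ReLU activation
    w1 = tf.Variable(tf.truncated_normal([3072, hidden_units], stddev=0.05))
    b1 = tf.Variable(tf.zeros([hidden_units]))
    hidden = tf.nn.relu(tf.matmul(images, w1) + b1)

    # Layer 2: fully connected, one score (logit) per class
    w2 = tf.Variable(tf.truncated_normal([hidden_units, num_classes], stddev=0.05))
    b2 = tf.Variable(tf.zeros([num_classes]))
    return tf.matmul(hidden, w2) + b2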

Run the following command in your directory (e.g., in Terminal on macOS):
python run_fc_model.py


Parameters:
batch_size = 400
hidden1 = 120
learning_rate = 0.001
max_steps = 2000
reg_constant = 0.1
train_dir = tf_logs

2017-12-29 12:50:00.619290: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Step 0, training accuracy 0.1
Step 100, training accuracy 0.3225
Step 200, training accuracy 0.365
Step 300, training accuracy 0.3925
Step 400, training accuracy 0.4125
Step 500, training accuracy 0.4575
Step 600, training accuracy 0.3925
Step 700, training accuracy 0.4475
Step 800, training accuracy 0.4875
Step 900, training accuracy 0.475
Saved checkpoint
Step 1000, training accuracy 0.5025
Step 1100, training accuracy 0.5125
Step 1200, training accuracy 0.5025
Step 1300, training accuracy 0.45
Step 1400, training accuracy 0.515
Step 1500, training accuracy 0.515
Step 1600, training accuracy 0.5625
Step 1700, training accuracy 0.5575
Step 1800, training accuracy 0.5525
Step 1900, training accuracy 0.5375
Saved checkpoint
Test accuracy 0.4628
Total time: 34.08s


The test accuracy (about 46%) is not far below the final training accuracy (about 54%), which indicates that our model is not significantly overfitted. The softmax classifier above reached about 27% on the test set, so 46% is a substantial improvement.


Run the following command in your directory (e.g., in Terminal on macOS):
tensorboard --logdir=tf_logs

Then access the following URL via your browser.
(your hostname).local:6006




Source:
http://www.wolfib.com

Saturday, December 2, 2017

Black-Litterman Portfolio Optimization with Python



This is a very basic introduction to Black-Litterman portfolio optimization, with Python code samples.

[0] Traditional Optimization: Mean-Variance Approach by Markowitz

In the mean-variance approach, we have to estimate both expected returns and the variance-covariance matrix (risks), and then optimize our portfolios by maximizing return and minimizing risk (the variance or standard deviation, i.e., volatility, of the return). This is a very straightforward approach, but there are some practical issues (see the sketch after this list):
  • Risks are relatively stable and easier to estimate, while returns are unstable and harder to estimate.
  • You have to estimate the return of every single asset, and if the estimated returns vary even slightly, the optimization result (the optimized portfolio) changes a lot.
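To make the second point concrete, here is a minimal sketch of the unconstrained mean-variance solution w = (δΣ)^(-1) μ, using the same utility function that reappears in the Black-Litterman code below. All numbers are made up for illustration:

# Minimal sketch of unconstrained mean-variance optimization (made-up numbers):
# maximize U = w'mu - (delta/2) w'Sigma w, whose solution is w = (delta * Sigma)^(-1) mu.
import numpy as np

mu = np.array([[0.05], [0.07], [0.03]])        # estimated expected returns (made up)
Sigma = np.array([[0.040, 0.006, 0.002],
                  [0.006, 0.090, 0.004],
                  [0.002, 0.004, 0.010]])      # estimated variance-covariance matrix (made up)
delta = 2.5                                    # risk aversion

w = np.dot(np.linalg.inv(delta * Sigma), mu)
print(w)                                       # unconstrained optimal weights

# Sensitivity issue: bump one expected return by 1% and the weights shift noticeably
mu_bumped = mu + np.array([[0.01], [0.0], [0.0]])
print(np.dot(np.linalg.inv(delta * Sigma), mu_bumped))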


[1] Reverse Optimization and Bayesian Inference: Black-Litterman Approach

In the Black-Litterman approach, expected returns are not estimated directly; instead, the approach hypothesizes that the entire market itself is optimally allocated. Namely, the current market portfolio with market-cap weights is assumed to be the optimal portfolio given the market's estimated risks and returns. The expected (or implied) returns are then backed out from the current market-cap weights and the estimated risks.

Furthermore, an investor's views are blended into the estimated equilibrium returns above by using Bayesian inference. For instance, a view might be that asset A outperforms asset B by 5%, or that asset C's return is 5%. For each view, the investor can specify a confidence as a parameter; a more confident view has a bigger impact on the expected portfolio returns.

Finally, based on the updated expected returns (implied returns blended with views and their confidences) and risks, an optimal portfolio is computed.


[2] Black-Litterman Portfolio Optimization with Python

# Intro
#
# On macOS, open Terminal and start the Python interpreter with the following command before running the scripts below:
#
# python


# Let’s reproduce the results in the following paper:
# He & Litterman (1999)
# https://faculty.fuqua.duke.edu/~charvey/Teaching/IntesaBci_2001/GS_The_intuition_behind.pdf

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Equilibrium Portfolio Weights
# p.16 Appendix A Table 1
# AUL, CAN, FRA, GER, JAP, UKG, USA
w = np.array([[0.016, 0.022, 0.052, 0.055, 0.116, 0.124, 0.615]]).T

# Correlation Matrix
# p.16 Appendix A Table 2
# AUL, CAN, FRA, GER, JAP, UKG, USA
correlation = np.array([
        [1, 0.488, 0.478, 0.515, 0.439, 0.512, 0.491],
        [0.488, 1, 0.664, 0.655, 0.310, 0.608, 0.779],
        [0.478, 0.664, 1, 0.861, 0.355, 0.783, 0.668],
        [0.515, 0.655, 0.861, 1, 0.354, 0.777, 0.653],
        [0.439, 0.310, 0.355, 0.354, 1, 0.405, 0.306],
        [0.512, 0.608, 0.783, 0.777, 0.405, 1, 0.652],
        [0.491, 0.779, 0.668, 0.653, 0.306, 0.652, 1]])

# Standard Deviation (Volatility)
# p.16 Appendix A Table 2
# AUL, CAN, FRA, GER, JAP, UKG, USA
std = np.array([[0.16, 0.203, 0.248, 0.271, 0.21, 0.2, 0.187]])

# Variance Covariance Matrix (which can be calculated with the correlation and volatility above)
Sigma = correlation * np.dot(std.T, std)

# delta (δ): risk aversion parameter (scalar)
# p.4 footnote
delta = 2.5

# tau (τ): a scalar that measures the uncertainty of the CAPM prior
tau = 0.05


### (1) Reverse Optimization
### Inputs: Equilibrium Portfolio Weights, Standard Deviation (Volatility)
### Output: Equilibrium Expected Returns (p.16 Appendix A Table 1)
#
# U: the utility function of the optimization - the w that maximizes U is the set of weights of the optimized portfolio
# U = w^T Π − (δ/2) w^T Σ w
#
# When w maximizes U, the following first-order condition must hold:
# dU/dw = Π − δΣw = 0
# Π = δΣw
#
# Π: Equilibrium Expected Returns (nx1 vector) (p.16 Appendix A Table 1) —> r_eq
# Π′: Black-Litterman expected returns updated by an investor’s view (nx1 vector) (p.7 Chart 2A) —> r_posterior
# w: Equilibrium Portfolio Weights (nx1 vector) (p.16 Appendix A Table 1)
# w′: Optimal Portfolio Weights (nx1 vector) (p.9 Chart 3B)
# Σ: Variance Covariance Matrix
# P, Q: Investor's view matrices
# Ω: Investor's view confidence matrix
# delta (δ): investor's risk aversion parameter (scalar)
# tau (τ): a scalar that measures the uncertainty of the CAPM prior (it scales the variance-covariance matrix Σ)
#
# reverse optimization
r_eq = delta * np.dot(Sigma, w)
# >>> r_eq
#array([[ 0.03937555],
#       [ 0.0691519 ],
#       [ 0.08358087],
#       [ 0.0902724 ],
#       [ 0.0430281 ],
#       [ 0.06767693],
#       [ 0.07560047]])



### (2) Blending Equilibrium Portfolio Weights with Investor’s Views
### Inputs: Π Equilibrium Expected Returns, P & Q Investor’s view matrix
### Output: Π′ Black-Litterman expected returns updated by an investor’s view —> r_posterior, Σ′ —> Sigma_posterior
#
# Investor’s View
# AUL, CAN, FRA, GER, JAP, UKG, USA
#
# P = [ 0  0  -0.295  1  0  -0.705   0
#       0  1   0      0  0   0      -1 ]
# In the first row,
# -0.295: FRA
# 1:GER
# -0.705: UKG
# In the second row,
# 1: CAN
# -1: USA
#
# Q = [ 0.05
#       0.03 ]
#
# These mean that:
# GER outperforms FRA and UKG by 5%
# CAN outperforms USA by 3%
#
#
# Ω: Investor’s view confidence matrix
# Ω = [ 0.001065  0
#       0         0.000852 ]

P = np.array([
        [0,0,-0.295,1,0,-0.705,0],
        [0,1,0,0,0,0,-1]]) # 2x7 matrix (2: number of views, 7: number of assets)
Q = np.array([[0.05],[0.03]]) # 2-vector
Omega = np.array([
        [0.001065383332,0],
        [0,0.0008517381]])

# Black-Litterman master equation
# Π′ = Π + τΣP^T (PτΣP^T + Ω)^(-1) (Q − PΠ)
#
# Blending Investor’s View with the Equilibrium Returns
r_posterior = r_eq + np.dot( np.dot( tau*np.dot(Sigma,P.T), np.linalg.inv(tau*np.dot(np.dot(P,Sigma),P.T)+Omega)), (Q-np.dot(P,r_eq)))
#
# AUL, CAN, FRA, GER, JAP, UKG, USA
#
# >>> r_posterior
#array([[ 0.04422145],
#       [ 0.08729864],
#       [ 0.09479745],
#       [ 0.11209947],
#       [ 0.04616347],
#       [ 0.0697166 ],
#       [ 0.0748156 ]])
#
#
# On top of the returns, the variance-covariance matrix can be updated as follows:
# Σ′ = Σ + τΣ − τΣP^T (PτΣP^T + Ω)^(-1) PτΣ
Sigma_posterior = Sigma + tau*Sigma - tau*np.dot( np.dot( np.dot(Sigma,P.T), np.linalg.inv(tau*np.dot(np.dot(P,Sigma),P.T)+Omega)), tau*np.dot(P,Sigma))
#
#>>> Sigma_posterior
#array([[ 0.02684723,  0.01657429,  0.0198378 ,  0.02328236,  0.01547005,
#         0.01718801,  0.01539182],
#       [ 0.01657429,  0.04299381,  0.03494199,  0.03753285,  0.01383059,
#         0.02589072,  0.03107504],
#       [ 0.0198378 ,  0.03494199,  0.06439533,  0.06036839,  0.01937078,
#         0.04074256,  0.03244592],
#       [ 0.02328236,  0.03753285,  0.06036839,  0.07627367,  0.02106627,
#         0.04414171,  0.03454918],
#       [ 0.01547005,  0.01383059,  0.01937078,  0.02106627,  0.04629476,
#         0.01785243,  0.01260504],
#       [ 0.01718801,  0.02589072,  0.04074256,  0.04414171,  0.01785243,
#         0.04199287,  0.0255861 ],
#       [ 0.01539182,  0.03107504,  0.03244592,  0.03454918,  0.01260504,
#         0.0255861 ,  0.03661572]])


### (3) Optimization - finding optimal weights
### Inputs: Π′, Σ′
### Output: w′ Optimal Portfolio Weights —> w_posterior
#
# w′ = (δΣ′)^(-1) Π′
#
# Forward optimization: finding the optimal weights
w_posterior = np.dot(np.linalg.inv(delta*Sigma_posterior), r_posterior)
#
# AUL, CAN, FRA, GER, JAP, UKG, USA
# >>> w_posterior
#array([[ 0.0152381 ],
#       [ 0.41863571],
#       [-0.03409321],
#       [ 0.33582847],
#       [ 0.11047619],
#       [-0.08173526],
#       [ 0.18803095]])
#
# Plot w′ by using pandas
df = pd.DataFrame([w.reshape(7),w_posterior.reshape(7)],
                  columns=['AUL','CAN','FRA','GER','JAP','UKG','USA'],
                  index=['Equilibrium Weights','Constrained Optimal Weights'])
df.T.plot(kind='bar', color='br')




Deep Learning (Regression, Multiple Features/Explanatory Variables, Supervised Learning): Implementation and Showing Biases and Weights
