Installing and Using Google TensorFlow

2016.1.13.

softgear@etri.re.kr

Google TensorFlow

https://www.tensorflow.org/
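A computation library based on data flow graphs
Nodes in the graph represent mathematical operations,
Edges in the graph represent the multidimensional data arrays (“tensors”) that flow between them
Supports CPU and GPU (GPU on Linux, cuDNN 6.5 v2)
Originally developed by the Google Brain Team for its own machine learning research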

What is a Tensor?

Tensors in physics: http://ghebook.blogspot.kr/2011/06/tensor.html
Here, a tensor simply means a multidimensional data set (an n-dimensional array)

TensorFlow Features

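Deep Flexibility: everything from high-level libraries down to low-level functions can be composed in Python or C/C++
True Portability: the same code runs unmodified on CPUs, GPUs, desktops, servers, and mobile platforms
Connect Research and Production: a new algorithm implemented in TensorFlow can move directly into a production product
Auto-Differentiation: derivatives (gradients) are computed automatically
Language Options: Python by default, plus C++
Maximize Performance: makes as much use as possible of the available CPU cores and GPUs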

Introduction

import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype("float32")
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but Tensorflow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables. We will 'run' this first.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
# range() works on both Python 2.7 and Python 3 (xrange exists only on Python 2).
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]
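
To make clearer what the optimizer is doing in the program above, the following is a minimal sketch of the same line fit written with plain Python lists instead of TensorFlow. It assumes the same learning rate (0.5) and step count (201); the gradient formulas are derived by hand for the mean squared error, and names such as grad_w and grad_b are illustrative only. It is not how TensorFlow computes gradients internally.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# 100 synthetic points on the line y = 0.1 * x + 0.3, as in the example above.
xs = [random.random() for _ in range(100)]
ys = [0.1 * x + 0.3 for x in xs]

# Same starting ranges as W and b in the TensorFlow version.
w = random.uniform(-1.0, 1.0)
b = 0.0
learning_rate = 0.5

for step in range(201):
    # Hand-derived gradients of the loss mean((w * x + b - y) ** 2).
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2.0 * sum(errors) / len(xs)
    # One gradient descent step.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print("w = %.4f, b = %.4f" % (w, b))
```

With TensorFlow, the two gradient lines do not have to be written by hand: optimizer.minimize(loss) derives them from the graph.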

Download and Setup

TensorFlow can be installed from the source on GitHub or from a binary package.

Requirements

Python 2.7 and Python 3.3+ from source
GPU version: CUDA Toolkit 7.0, cuDNN 6.5 v2 (CUDA installation: https://www.tensorflow.org/versions/master/get_started/os_setup.html#optional-install-cuda-gpus-on-linux )

Install Overview

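Pip install: installs directly on the system, with no isolation
Virtualenv install: installs into a dedicated directory (* this document uses this method, on a Mac)
Docker install: installs in a container (appears to be Linux only)
Building from source is also possible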

Virtualenv installation

Install pip and Virtualenv, and upgrade six

$ sudo easy_install pip

$ sudo easy_install -U six
$ sudo pip install --upgrade virtualenv

Create a Virtualenv environment in the ~/tensorflow directory

$ virtualenv --system-site-packages ~/tensorflow

Activate the Virtualenv and install TensorFlow inside it

$ source ~/tensorflow/bin/activate

(tensorflow)$

(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py2-none-any.whl
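
If it is unclear whether the activated environment is really the one running Python, a short check like the following can help. This is a sketch, not part of the TensorFlow instructions: the helper name in_virtualenv is hypothetical, and it relies on the fact that older virtualenv releases set sys.real_prefix while venv and newer virtualenv releases make sys.base_prefix differ from sys.prefix.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases instead make sys.base_prefix differ from sys.prefix.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


print("interpreter prefix: %s" % sys.prefix)
print("inside a virtualenv: %s" % in_virtualenv())
```

Run inside the activated environment, the prefix should point into ~/tensorflow.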

Do the same with pip3 (Python 3)

$ source ~/tensorflow/bin/activate # If using bash

(tensorflow)$

(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py3-none-any.whl

This prints the error “tensorflow-0.6.0-py3-none-any.whl is not a supported wheel on this platform.”; it did not appear to matter here (needs verification).

Deactivating the Virtualenv

When you are done with the virtual environment:

(tensorflow)$ deactivate

imac:~ user$ # back to the original prompt

Verifying the installation

Run the following inside the virtual environment.

(tensorflow):~ user$ python

Python 2.7.10 (default, Oct 23 2015, 18:05:06)

[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin

Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow as tf

>>> hello = tf.constant('Hello, TensorFlow!')

>>> sess = tf.Session()

I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 8

I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 8

>>> print(sess.run(hello))

Hello, TensorFlow!

>>> a = tf.constant(10)

>>> b = tf.constant(32)

>>> print(sess.run(a+b))

42

>>> quit()

Running a demo model

Find the directory where the demo models are installed

(tensorflow)imac:~ user$ python -c 'import os; import inspect; import tensorflow; print(os.path.dirname(inspect.getfile(tensorflow)))'

/Users/user/tensorflow/lib/python2.7/site-packages/tensorflow

(tensorflow)imac:~ user$

Run the CNN model for MNIST handwritten digit recognition

(tensorflow)imac:~ user$ python -m tensorflow.models.image.mnist.convolutional

Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.

Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.

Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.

Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.

Extracting data/train-images-idx3-ubyte.gz

Extracting data/train-labels-idx1-ubyte.gz

... (remaining output omitted)

Basic Usage

To use TensorFlow you need to know how to:

Represent computations as graphs
Execute graphs within Sessions
Represent data as tensors
Maintain state with Variables
Use feeds and fetches to get data into and out of operations

Overview

Nodes in the graph are called ops (short for operations). An op takes zero or more Tensors as input, performs some computation, and produces zero or more Tensors. A Tensor is a typed multidimensional array. For example, a mini-batch of images can be represented as a 4-D array of floating-point numbers with dimensions [batch, height, width, channels].

A TensorFlow graph is a description of computations. To compute anything, a graph must be launched in a “Session”. A Session places the graph ops onto “Devices”, such as CPUs or GPUs, and provides methods to execute them. These methods return the tensors produced by ops as numpy ndarray objects in Python, and as tensorflow::Tensor instances in C and C++.

Computation graph

A TensorFlow program consists of a construction phase and an execution phase. The construction phase assembles a graph, and the execution phase uses a session to execute the ops in the graph.

TensorFlow can be used from C, C++, and Python programs; Python is the most convenient.
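
The split between construction and execution can be illustrated with a toy graph in plain Python. This is only a sketch of the idea of deferred evaluation: the Node class and the constant, add, mul, and run helpers are hypothetical and are not the TensorFlow API.

```python
class Node(object):
    """A toy graph node: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op                # "const", "add", or "mul"
        self.inputs = list(inputs)
        self.value = value          # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", [a, b])


def mul(a, b):
    return Node("mul", [a, b])


def run(node):
    """Execution phase: evaluate a node by first evaluating its inputs."""
    if node.op == "const":
        return node.value
    args = [run(i) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: %s" % node.op)


# Construction phase: nothing is computed yet, the nodes are only wired together.
total = add(constant(2.0), constant(5.0))
product = mul(constant(3.0), total)

# Execution phase: the arithmetic happens only now.
result = run(product)
print(result)  # 21.0
```

In TensorFlow the role of run() is played by a Session, which can additionally place ops on devices and run independent ops in parallel.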

Building Graph

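To build a graph, start with ops that need no input, such as “Constant”, and pass their output to other ops that do computation. The ops constructors in the Python library return objects that stand for the output of the constructed ops, so you can pass these objects to other ops constructors as inputs.

The TensorFlow Python library has a default graph to which ops constructors add nodes. The default graph is sufficient for many applications. To manage multiple graphs explicitly, see the Graph class documentation ( https://www.tensorflow.org/versions/master/api_docs/python/framework.html#Graph ).

Consider the following Python code. The default graph ends up with three nodes: two constant() ops and one matmul() op. To actually multiply the matrices and get the result, you must launch the graph in a session.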

import tensorflow as tf

# Create a Constant op that produces a 1x2 matrix. The op is
# added as a node to the default graph.
#
# The value returned by the constructor represents the output
# of the Constant op.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])

# Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix
# multiplication.
product = tf.matmul(matrix1, matrix2)
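
As a sanity check on what the matmul op will compute once the graph is run, here is the same 1x2 by 2x1 product worked out in plain Python. The matmul helper below is a hand-written illustration operating on lists of rows, not the TensorFlow op.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


matrix1 = [[3.0, 3.0]]      # 1x2
matrix2 = [[2.0], [2.0]]    # 2x1
product = matmul(matrix1, matrix2)
print(product)  # [[12.0]]
```

The single entry is 3 * 2 + 3 * 2 = 12, so the result is a 1x1 matrix.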

Launching the graph in a session

To launch a graph, create a Session object. Without arguments, the Session constructor launches the default graph. For the full session API, see the Session class documentation ( https://www.tensorflow.org/versions/master/api_docs/python/client.html#session-management ).

# Launch the default graph.
sess = tf.Session()

# To run the matmul op we call the session 'run()' method, passing 'product'
# which represents the output of the matmul op. This indicates to the call
# that we want to get the output of the matmul op back.
#
# All inputs needed by the op are run automatically by the session. They
# typically are run in parallel.
#
# The call 'run(product)' thus causes the execution of three ops in the
# graph: the two constants and matmul.
#
# The output of the op is returned in 'result' as a numpy `ndarray` object.
result = sess.run(product)
print(result)
# ==> [[ 12.]]

# Close the Session when we're done.
sess.close()

Close the session with close() when you are done with it. A session can also be used in a “with” block, in which case it is closed automatically when the block ends.

with tf.Session() as sess:
    result = sess.run([product])
    print(result)
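
The automatic close comes from Python's context manager protocol. The sketch below uses a hypothetical ToySession class, which only tracks whether it is open, to show the mechanism; it makes no claim about how tf.Session is implemented internally.

```python
class ToySession(object):
    """Hypothetical stand-in for a session; it only tracks open/closed state."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block ends, even if it ends with an exception.
        self.close()
        return False  # do not swallow exceptions


with ToySession() as sess:
    inside = sess.closed    # still open inside the block
after = sess.closed         # closed automatically on exit

print("closed inside block: %s, closed after block: %s" % (inside, after))
```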

If your machine has more than one GPU, you must assign ops explicitly to any GPU beyond the first. Use a with tf.device(...) block to do so.

with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        ...

Devices are specified with strings. The following are available:

"/cpu:0": The CPU of your machine.
"/gpu:0": The GPU of your machine, if you have one.
"/gpu:1": The second GPU of your machine, etc.

For more information about GPUs and TensorFlow, see ( https://www.tensorflow.org/versions/master/how_tos/using_gpu/index.html ).

Interactive Usage

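In interactive Python environments such as IPython, use the InteractiveSession class together with the Tensor.eval() and Operation.run() methods, so that you do not have to pass a session object around explicitly.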

# Enter an interactive TensorFlow Session.
import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op to subtract 'a' from 'x'. Run it and print the result
sub = tf.sub(x, a)
print(sub.eval())
# ==> [-2. -1.]

# Close the Session when we're done.
sess.close()

Tensors

TensorFlow represents all data with the tensor data structure; only tensors are passed between operations in the computation graph. You can think of a tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape. To learn how TensorFlow handles these concepts, see ( https://www.tensorflow.org/versions/master/resources/dims_types.html ).
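
Rank and shape can be illustrated on ordinary nested Python lists. The shape_of helper below is a hypothetical illustration that assumes rectangular (non-ragged) nesting; here rank is simply the number of dimensions, that is, the length of the shape.

```python
def shape_of(value):
    """Return the shape of a rectangular nested list as a list of dimension sizes."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return shape


scalar = 7.0                                    # rank 0
vector = [1.0, 2.0, 3.0]                        # rank 1
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]     # rank 2

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    shape = shape_of(value)
    print("%s: rank %d, shape %s" % (name, len(shape), shape))
```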

Variables

Variables maintain state across executions of the graph. The following example shows a variable serving as a simple counter. For more details on Variables, see ( https://www.tensorflow.org/versions/master/how_tos/variables/index.html ).

# Create a Variable, that will be initialized to the scalar value 0.
state = tf.Variable(0, name="counter")

# Create an Op to add one to `state`.

one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

# Variables must be initialized by running an `init` Op after having
# launched the graph. We first have to add the `init` Op to the graph.
init_op = tf.initialize_all_variables()

# Launch the graph and run the ops.
with tf.Session() as sess:
    # Run the 'init' op
    sess.run(init_op)
    # Print the initial value of 'state'
    print(sess.run(state))
    # Run the op that updates 'state' and print 'state'.
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))

# output:

# 0
# 1
# 2
# 3

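In this code, the assign() operation, like the add() operation, is only part of the graph description; it does not actually perform the assignment until run() executes it.

You typically represent the parameters of a statistical model as a set of Variables. For example, you can store the weights of a neural network as a tensor in a Variable. During training, this tensor is updated repeatedly.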
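
The point that describing an update is different from performing it can be mimicked in plain Python. In this sketch the ToyVariable class and the make_update helper are hypothetical names; the update is a function that changes nothing until it is called, which plays the role of run().

```python
class ToyVariable(object):
    """Hypothetical stand-in for a variable: holds state between runs."""

    def __init__(self, initial):
        self.value = initial


def make_update(variable, increment):
    """Build (but do not run) an op that adds `increment` to `variable`."""
    def update_op():
        variable.value = variable.value + increment
        return variable.value
    return update_op


state = ToyVariable(0)
update = make_update(state, 1)   # describing the update changes nothing yet
before = state.value             # still 0

history = [state.value]
for _ in range(3):
    update()                     # running the op performs the assignment
    history.append(state.value)

print(history)  # [0, 1, 2, 3]
```

The printed history matches the 0, 1, 2, 3 sequence of the counter example.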

Fetches

To fetch the outputs of operations, execute the graph with a run() call on the Session object and pass in the tensors to retrieve. The previous example fetched the single node state, but you can also fetch multiple tensors:

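# The constants are 1-element vectors so that the fetched values
# are the 1-element arrays shown in the output below.
input1 = tf.constant([3.0])
input2 = tf.constant([2.0])
input3 = tf.constant([5.0])
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)

with tf.Session() as sess:
    result = sess.run([mul, intermed])
    print(result)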

# output:
# [array([ 21.], dtype=float32), array([ 7.], dtype=float32)]

All the ops needed to produce the values of the requested tensors are run once (not once per requested tensor).
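
One way to picture the "run once" behaviour is a cache that is shared across all fetches of a single run call. The sketch below is a toy model in plain Python with hypothetical helpers (make_op, run, run_counts); it illustrates the idea and does not describe TensorFlow's actual scheduler.

```python
run_counts = {}


def make_op(name, func, *inputs):
    """Describe an op as a (name, function, input ops) tuple."""
    return (name, func, inputs)


def run(fetches):
    """Evaluate several ops in one call, computing each needed op only once."""
    cache = {}

    def evaluate(op):
        name, func, inputs = op
        if name not in cache:
            args = [evaluate(i) for i in inputs]
            run_counts[name] = run_counts.get(name, 0) + 1
            cache[name] = func(*args)
        return cache[name]

    return [evaluate(op) for op in fetches]


input1 = make_op("input1", lambda: 3.0)
input2 = make_op("input2", lambda: 2.0)
input3 = make_op("input3", lambda: 5.0)
intermed = make_op("intermed", lambda a, b: a + b, input2, input3)
mul = make_op("mul", lambda a, b: a * b, input1, intermed)

result = run([mul, intermed])
print(result)                   # [21.0, 7.0]
print(run_counts["intermed"])   # 1
```

Although intermed is needed both as a fetch and as an input to mul, it is evaluated only once within the call.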

Feeds

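The examples above introduce tensors into the computation graph by storing them in Constants and Variables. TensorFlow also provides a feed mechanism for patching a tensor directly into any operation in the graph.

A feed temporarily replaces the output of an operation with a tensor value. You supply the feed data as an argument to a run() call, and the feed is used only for the run call to which it is passed. The most common approach is to create dedicated “feed” operations with tf.placeholder().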

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.mul(input1, input2)

with tf.Session() as sess:
    print(sess.run([output], feed_dict={input1:[7.], input2:[2.]}))

# output:
# [array([ 14.], dtype=float32)]
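
The substitution a feed performs can be sketched with a toy evaluator in plain Python. The Placeholder and Mul classes and the run function below are hypothetical; the only point is that a value found in feed_dict is used in place of the node's own output, for that one call.

```python
class Placeholder(object):
    """Hypothetical node that has no value until one is fed."""

    def __init__(self, name):
        self.name = name


class Mul(object):
    """Hypothetical op node multiplying two inputs."""

    def __init__(self, a, b):
        self.a, self.b = a, b


def run(node, feed_dict=None):
    """Evaluate `node`; values in feed_dict replace the output of the keyed nodes."""
    feed_dict = feed_dict or {}
    if node in feed_dict:
        return feed_dict[node]
    if isinstance(node, Placeholder):
        raise ValueError("no value fed for placeholder %r" % node.name)
    return run(node.a, feed_dict) * run(node.b, feed_dict)


input1 = Placeholder("input1")
input2 = Placeholder("input2")
output = Mul(input1, input2)

result = run(output, feed_dict={input1: 7.0, input2: 2.0})
print(result)  # 14.0
```

Calling run(output) without feeding both placeholders raises an error in this sketch, since a placeholder has no value of its own.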

For a larger-scale example of feeds, see ( https://www.tensorflow.org/versions/master/tutorials/mnist/tf/index.html ).

Tutorials

https://www.tensorflow.org/versions/master/tutorials/index.html

How-To

https://www.tensorflow.org/versions/master/how_tos/index.html

API Documentation

Python API : https://www.tensorflow.org/versions/master/api_docs/python/index.html

C++ API: https://www.tensorflow.org/versions/master/api_docs/cc/index.html

White Paper

http://download.tensorflow.org/paper/whitepaper2015.pdf

Additional Resources

https://www.tensorflow.org/versions/master/resources/index.html

Apache-2.0 License

Optimizers

Gradient Descent Optimizer
AdagradOptimizer : “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization”, http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf
Momentum Optimizer
Adam Optimizer : “ADAM: A METHOD FOR STOCHASTIC OPTIMIZATION” , http://arxiv.org/pdf/1412.6980v7.pdf
FTRL Optimizer : Follow-The-(Proximally)-Regularized Leader, “Ad Click Prediction: a View from the Trenches”, https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf
RMSProp Optimizer : Divide the gradient by a running average of its recent magnitude, http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
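
To give a feel for how some of these optimizers differ, the sketch below applies textbook forms of three update rules (plain gradient descent, momentum, and Adagrad) to the toy objective f(x) = x**2. The function names and hyperparameter values are illustrative assumptions, and the code is not taken from TensorFlow's implementations, which may differ in detail.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = x**2."""
    return 2.0 * x


def gradient_descent(x, steps, lr=0.1):
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def momentum(x, steps, lr=0.1, mu=0.9):
    velocity = 0.0
    for _ in range(steps):
        velocity = mu * velocity + grad(x)   # accumulate past gradients
        x -= lr * velocity
    return x


def adagrad(x, steps, lr=0.5, eps=1e-8):
    accum = 0.0
    for _ in range(steps):
        g = grad(x)
        accum += g * g                       # running sum of squared gradients
        x -= lr * g / (math.sqrt(accum) + eps)
    return x


start = 5.0
for name, optimizer in [("gradient descent", gradient_descent),
                        ("momentum", momentum),
                        ("adagrad", adagrad)]:
    print("%s: x = %.6f" % (name, optimizer(start, 200)))
```

All three move x toward the minimum at 0; they differ in how the step size is shaped by the history of gradients.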

Python API hierarchy

Building Graph

Core graph data structures
Utility functions
Graph collections
Defining new operations
For libraries building on TensorFlow

Constants, Sequences, and Random Values

Constant Value Tensors
Sequences
Random Tensors
Examples

Variables

Variables
Variable helper functions
Saving and Restoring Variables
Sharing Variables
Sparse Variable Updates

Tensor Transformations

Casting
Shapes and Shaping
Slicing and Joining

Math

Arithmetic Operations
Basic Math Functions
Matrix Math Functions
Complex Number Functions
Reduction
Segmentation
Sequence Comparison and Indexing

Control Flow

Control Flow Operations
Logical Operators
Comparison Operators
Debugging Operations

Images

Encoding and Decoding
Resizing
Cropping
Flipping and Transposing
Converting Between Colorspaces
Image Adjustments

Sparse Tensors

Sparse Tensor Representation
Sparse to Dense Conversion
Manipulation

Input and Readers

Placeholders
Readers
Converting
Example protocol buffer
Queues
Dealing with the filesystem
Input pipeline
Beginning of an input pipeline
Batching at the end of an input pipeline

Data IO

Data IO
TFRecords Format Details

Neural Network

Activation Functions
Convolution
Pooling
Normalization
Losses
Classification
Embeddings
Evaluation
Candidate Sampling
Sampled Loss Functions
Candidate Samplers
Miscellaneous candidate sampling utilities

Running Graphs

Session management
Error classes

Training

Optimizers
Usage
Processing gradients before applying them
Gating Gradients
Slots
Gradient Computation
Gradient Clipping
Decaying the learning rate
Moving Average
Coordinator and QueueRunner
Summary Operations
Adding Summaries to Event Files
Training utilities
Other Functions and Classes

-End of document-
