Docker, Docker, Docker! What the heck..?

Let us consider a scenario where you are planning to deploy an application or service. The most sensible thing to do is to fire up an instance (VM) and deploy it there.

Here, a hypervisor sitting on top of the host OS launches a virtual operating system by sharing the hardware resources.

So, option 1:

(V)App/services -> (V)bins, libs -> (V)Guest OS -> HYPERVISOR(KVM, Xen) -> Host OS -> Server

Option 2:

(V)App/services -> bins, libs -> Docker engine -> Host OS -> Server

(V) - Virtualized component

Containers, instead of virtualizing hardware, sit on a single Linux operating system instance. Hence, for use cases where VMs are too much overhead, containers are the right solution.

Megam’s strategy is to deploy with both VMs and Docker containers to maximize resource usage efficiency. We support bare-metal Docker deployments too.

In short, imagine containers as small capsules that contain your application in a well-bundled manner, with defined resource usage, and that can be read by the Docker engine on any Linux operating system. Not to forget, they are highly portable - containers give you instant application portability.

Now you can run multiple applications independently on a single operating system running on your server, instead of creating a separate OS for each application/service.

The key difference between containers and VMs is that while the hypervisor abstracts an entire device, containers just abstract the operating system kernel. Smart, right?

So Docker figured out the whole containerization technology?

Nope! Containers have been in existence for almost 15 years now. Google’s open source container technology lmctfy (let me contain that for you) effectively uses containers for most of their infrastructure needs: when you search, use Gmail, or read news, you spin up a container. LXC and Solaris’s Zones are container technologies that have been around for more than a decade.

Docker, on the other hand, made containers easier and safer to deploy. Developers can use Docker to pack, ship and run any application as a lightweight, portable LXC container that can run virtually anywhere.

Trying Docker out (Ubuntu Trusty 14.04)

The Docker community is huge and it supports almost all platforms, so there is nothing to worry about.

First things first,

 $ sudo apt-get update

Now, let us install Docker:

 $ wget -qO- https://get.docker.com/ | sh

Containerize your application:

Once you have installed Docker, we are good to go. First, we will use the basic Docker commands:

docker ps - shows running containers
docker build - builds an image from a Dockerfile
docker pull/push - pulls/pushes an image from/to Docker Hub
docker run - runs a container from an image

Image? Docker Hub? What?

It is simple: you build your containers and push them to Docker Hub, and later you can just use docker run anywhere to deploy them. Easy, right? An image is your packaged container pushed to Docker Hub. docker run automatically pulls the image if it is not found locally, as shown below.

Let us run a postgres container:

$ sudo docker run postgres

It first searches the local Docker images and, if the image is not found there, pulls it from Docker Hub and runs it automatically.

Now, see if the container is running:

$ sudo docker ps

This should list the containers currently running; in our case, it should list postgres.

It is a breeze to simply build and run Docker containers on any Linux environment. That’s it for now; next, I will write about Docker clustering using Kubernetes, using the Docker API and geard, and a deeper look at Docker containers and how they work internally.

Stay tuned!

Introduction

When we launched a VM using the CentOS 7.0 image and tried to install Git in the VM with yum install git, a keyboard interrupt error was raised on the screen. To solve this error, we first install the deltarpm package.

Delta RPM packages contain the difference between an old and a new version of an RPM package. This means the whole new RPM does not have to be downloaded, saving bandwidth.

To use delta RPMs, install the deltarpm package:

        yum install deltarpm
        yum provides '*/applydeltarpm'

Next, we update the packages:

        yum update

Finally, install Git:

        yum install git

Now Git is installed successfully, and the keyboard interrupt error is solved.

Application configuration

A config file is a reasonable way to maintain different environments such as development and production.

For most applications configuration includes logging levels, port bindings, and database settings. These settings are typically stored in environment variables, for example:

 export MEGAM_HOME="/var/lib/megam"
 export MEGAM_HOST="megam.io"
 export MEGAM_PASSWORD="password"

Once set, I can read configuration values from the command line like this:

$ echo $MEGAM_HOME
/var/lib/megam

It’s equally easy to access environment variables in Go using the os package:

package main

import (
	"fmt"
	"os"
)

func main() {
	home := os.Getenv("MEGAM_HOME")
	fmt.Printf("Megam home is set to: %v\n", home)
}

The biggest drawback to storing configuration in the environment is that you can only store string values; it’s up to you to convert these strings into values that can be used by your application.

For structured settings, it is highly recommended to go for configuration files instead of storing everything in environment variables.

Sample YAML conf file:

MEGAM_HOME: /var/lib/megam
riak:
  url: localhost:8087
  bucket: accounts

How to use this conf file in our Go application? tsuru/config is one way to parse and use a conf file in a Go application.

tsuru/config Introduction

Config is a Go package to manage yaml configuration files.

For usage information, read tsuru package documentation: https://godoc.org/github.com/tsuru/config.

Usage

import "github.com/tsuru/config"

Package config provides configuration facilities, handling configuration files in YAML format.

Get values

To get values from the conf file, use the following code in your application:

s, err := config.GetString("conf_string")

It gets the string value for the given key from the conf file.

Some examples:

conf file:

MEGAM_HOME: /var/lib/megam
riak:
  url: localhost:8087
  bucket: accounts

First, we get MEGAM_HOME:

home, err := config.GetString("MEGAM_HOME")
if err != nil {
	fmt.Printf("Megam home getting error: %v\n", err)
}
fmt.Printf("Megam home is set to: %v\n", home)

Second, we get the Riak URL from the file:

url, err := config.GetString("riak:url")
if err != nil {
	fmt.Printf("getting error: %v\n", err)
}
fmt.Printf("Riak url is set to: %v\n", url)

To access a nested key, use ”:” between the keys.

Supported formats

  • func GetBool(key string) (bool, error)
  • func GetDuration(key string) (time.Duration, error)
  • func GetFloat(key string) (float64, error)
  • func GetInt(key string) (int, error)
  • func GetList(key string) ([]string, error)
  • func GetString(key string) (string, error)
  • func GetUint(key string) (uint, error)

Conclusion

Hopefully this article has highlighted the steps for using configuration files in a Go project.

What is concurrency?

Concurrency is the execution of multiple processes independently; they may or may not be executing at the same instant. Imagine a web server that gets multiple smaller requests and executes them all concurrently. It is basically multitasking, even on a single-core processor.

Oh, wait! Then concurrency is parallelism?

Parallelism is about running multiple processes at the same time, but concurrency is about dealing with a lot of processes that may or may not be executing at the same time. Confusing?

Parallelism

  • Simultaneous Execution
  • Running multiple threads on different processors at the same time.
  • Requires multiple cores.

Concurrency

  • Independent Execution
  • Running multiple lightweight concurrent processes (goroutines!! hold on, we’ll see what goroutines are)
  • Possible even on a single core

Note: Using GOMAXPROCS, it is possible to run goroutines on multiple logical processors by configuring the runtime.

So, ‘Concurrency is not parallelism’ but,

‘concurrency can achieve parallelism’.

Why Go?

Go’s concurrency primitives are very good and make it easy to write concurrent programs. Go uses goroutines to achieve concurrency and, importantly, it makes communication between goroutines a lot easier.

What are goroutines?

Fundamentally, a goroutine is a function which runs concurrently with other functions.

So, here is an example:

package main

import (
	"fmt"
)

func print(s string) {
	for i := 0; i < 5; i++ {
		fmt.Println(s)
	}
}

func main() {
	print("Current thread")

	go print("bye")

	var input string
	fmt.Scanln(&input)
}

Compile and run the above code to understand how goroutines work. We’ll look into more details in a bit.

What are channels?

Channels are basically pipelines used to communicate between two goroutines. Channels can also be used for synchronization of goroutines.

package main

import (
	"fmt"
	"time"
)

func put(c chan string) {
	for {
		c <- "Tadaa!"
	}
}

func print(c chan string) {
	for {
		message := <-c
		fmt.Println(message)
		time.Sleep(2 * time.Second)
	}
}

func main() {
	c := make(chan string)

	go put(c)
	go print(c)

	var input string
	fmt.Scanln(&input)
}

How awesome, right? That’s how cool goroutines and channels are. They make programming a lot easier and more interesting.

Why are goroutines super cool?

Concurrency was possible in other languages too, but what made concurrency a cherry pie in Go?

Goroutines are lightweight; other languages, like Java, use threads.

Creating a goroutine hardly takes 2 KB of stack space (it was 8 KB till Go 1.3), whereas a thread consumes 1 MB of stack space.

I recommend everyone reading this to watch the Go Concurrency Patterns talk by Rob Pike.

A few links:

Fundamentals of concurrent programming

Concurrency, goroutines and GOMAXPROCS

Ceph is one of the most interesting distributed storage systems available, with very active development and a complete set of features that make it a valuable candidate for cloud storage services.

Assumptions

  • Ceph version: 0.87
  • Installation with ceph-deploy
  • Operating system for the Ceph nodes: Ubuntu 14.04

Preparing the storage

WARNING: preparing the storage for Ceph means to delete a disk’s partition table and lose all its data. Proceed only if you know exactly what you are doing!

Ceph will need some physical storage to be used as Object Storage Devices (OSDs) and journals. Ceph supports ext4, btrfs and xfs. I tried setting up clusters with ext4.

I have three storage partitions:

$ df -h
/dev/sdb1       115G   /storage1
/dev/sdb2       115G   /storage2
/dev/sda3       115G   /storage3

Install Ceph

The ceph-deploy tool must only be installed on the admin node. Access to the other nodes for configuration purposes will be handled by ceph-deploy over SSH (with keys).

Add the Ceph repository to your apt configuration:

echo deb https://ceph.com/debian-giant/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list

Install the trusted key with

wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -

Install ceph-deploy

sudo apt-get -y update
sudo apt-get -y install ceph-deploy ceph-common ceph-mds

Setup the admin node

Each Ceph node will be set up with a user having passwordless sudo permissions, and each node will store the public key of the admin node to allow passwordless SSH access. With this configuration, ceph-deploy will be able to install and configure every node of the cluster.

NOTE: the hostnames (i.e., the output of hostname -s) must match the Ceph node names!

Add a ceph user on each Ceph cluster node (even if a cluster node is also an admin node) and give it passwordless sudo permissions:

$ sudo useradd -d /home/ceph -m ceph -s /bin/bash
$ sudo passwd ceph
<Enter password>
$ echo "ceph ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
$ sudo chmod 0440 /etc/sudoers.d/ceph

Edit the /etc/hosts file to add mappings to the cluster nodes. Example:

$ cat /etc/hosts
127.0.0.1       localhost
192.168.1.100 cephserver

To enable DNS resolution with the hosts file, install dnsmasq:

$ sudo apt-get install dnsmasq

Generate a public key for the admin user and install it on every Ceph node:

$ ssh-keygen

Setup an SSH access configuration by editing the .ssh/config file. Example:

Host cephserver
  Hostname cephserver
  User ceph

Setup the cluster

Administration of the cluster is done entirely from the admin node.

step1: Move to a dedicated directory to collect the files that ceph-deploy will generate. This will be the working directory for any further use of ceph-deploy.

$ mkdir ceph-cluster
$ cd ceph-cluster

step2: Deploy the monitor node(s) – replace cephmaster with the list of hostnames of your initial monitor nodes

$ ceph-deploy new cephmaster
[ceph_deploy.cli][INFO  ] Invoked (1.4.0): /usr/bin/ceph-deploy new cephmaster
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][DEBUG ] Resolving host cephmaster
[ceph_deploy.new][DEBUG ] Monitor cephmaster at 192.168.1.100
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[ceph_deploy.new][DEBUG ] Monitor initial members are ['cephmaster']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.1.100']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...

Tip: Assuming only one node for your Ceph Storage Cluster, you will need to modify the default osd crush chooseleaf type setting (it defaults to 1 for node) to 0 for device, so that it will peer with OSDs on the local node. Add the following line to your Ceph configuration file:

osd crush chooseleaf type = 0

Tip: If you deploy without executing the foregoing step on a single-node cluster, your Ceph Storage Cluster will not achieve an active + clean state. To remedy this situation, you must modify your CRUSH map.

step3: Install Ceph on the node

$ ceph-deploy install  cephmaster

step4: Create the monitor and gather keys

$ ceph-deploy mon create-initial

Note: the content of the working directory after this step should look like

cadm@mon0:~/my-cluster$ ls
ceph.bootstrap-mds.keyring  ceph.bootstrap-osd.keyring  ceph.client.admin.keyring  ceph.conf  ceph.log  ceph.mon.keyring  release.asc

step5: Prepare and activate the disks (ceph-deploy also has a create command that should combine these two operations, but for some reason it was not working for me)

ceph-deploy osd prepare cephmaster:/storage1 cephmaster:/storage2 cephmaster:/storage3
ceph-deploy osd activate cephmaster:/storage1 cephmaster:/storage2 cephmaster:/storage3

step6: Copy the keys and configuration files

$ ceph-deploy admin cephmaster

step7: Ensure proper permissions for the admin keyring

$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Check the Ceph status and health

$ ceph health
$ ceph status
$ ceph osd tree

The Ceph setup is OK when the health is HEALTH_OK. We software engineers generally don’t care about warnings, but in Ceph, HEALTH_WARN is like an error.

Revert installation

There are useful commands to purge the Ceph installation and configuration from every node so that one can start over from a clean state.

This will remove the Ceph configuration and keys:

ceph-deploy purgedata cephmaster
ceph-deploy forgetkeys

This will also remove the Ceph packages:

ceph-deploy purge cephmaster

Before getting a healthy Ceph cluster, I had to purge and reinstall many times, cycling between the cluster setup and OSD preparation steps, while resolving every warning that ceph-deploy reported.