OpenStack Swift

by HEIG-Cloud

Posted on Thu, Dec 17, 2015


What is OpenStack Swift

Swift is OpenStack’s object storage service. In cloud computing, we often distinguish between block storage and object storage. OpenStack provides a separate component for each:

  • Cinder: Block storage
  • Swift: Object storage

Note: We won’t be covering Cinder in this article.

Object storage VS Block storage

What is the difference between these two types of storage? The names say most of it. Block storage works at the block level: you can put a filesystem on it, install an operating system and use it as you would any physical drive.

Object storage deals with objects: you store data (images, music, videos, archives, any kind of data) and retrieve it by name. So you guessed right, the “VS” in the title is misleading, because the two don’t provide the same kind of service.

More about Swift

Here are a few cool things about Swift:

  • The project is considered “mature” by the community.
  • Quite easy to use.
  • Manages the replication of your objects (3 copies by default)
  • Works well with OpenStack (integrated in the dashboard)
  • And so on…

How do I access data

Swift, like all the components of the stack, communicates through a REST API, and users go through that same API. There are 3 ways to interact with Swift:

  • CLI: using the command-line interface. To do so, you need to install the Python client that provides the command. The OpenStack client packages are usually named python-*name*client, which in the case of Swift is:

    python-swiftclient

    To use it, you only need to install the following packages:

    $ sudo apt-get install python-swiftclient python-keystoneclient
    
    
    OR
    
    
    $ sudo pip install python-swiftclient python-keystoneclient
    

    Note that we also install the Keystone client because, by default, Swift uses Keystone for authentication. It’s possible to change this, since Swift has its own auth methods, but that would only add needless complications, so we’ll stick with Keystone.

    Here are a few examples of what you can do with the CLI (you need to source an openrc.sh script first or set the environment variables yourself):

    $ swift -V 3 upload container-name file
    $ swift -V 3 download container-name file
    $ swift -V 3 post container-name 
    

    So, in order:

    • Uploads a file (or a directory) to the given container
    • Downloads a file (or a directory) from the given container
    • Creates a new container with the given name (if it does not exist)

  • Python API

    You can also write your own client on top of the Python library. We don’t do that, so this is just for reference; more information can be found in the python-swiftclient documentation.

  • REST API

    Usually, the Python CLI clients do nothing more than provide friendly commands you can use in your terminal. All these commands have an equivalent in the REST API (the basic rule is that anything you can do with the CLI, you can also do with the REST API). As for Swift, we never really had to work with the API directly, but once again, the Swift API reference is there if you need it.
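
To make the link between the CLI and the REST API more concrete, here is a minimal sketch of the HTTP requests that roughly correspond to the three CLI commands shown earlier. It only builds the requests with Python's standard library and does not send them. The storage URL and the token are made-up placeholders: in a real deployment both come from Keystone, and the account part of the path depends on your installation.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical values: in practice Keystone hands you both the storage URL
# (from its service catalog) and the token.
STORAGE_URL = "https://swift.example.org/v1/AUTH_demo"
TOKEN = "replace-with-a-keystone-token"


def swift_url(container, name=None, storage_url=STORAGE_URL):
    """Return the URL of a container, or of an object inside it."""
    url = storage_url.rstrip("/") + "/" + quote(container, safe="")
    if name is not None:
        url += "/" + quote(name)  # "/" is kept, so "dir/file" style names work
    return url


def create_container(container, token=TOKEN):
    """PUT on the container URL creates the container."""
    return Request(swift_url(container), method="PUT",
                   headers={"X-Auth-Token": token})


def upload_object(container, name, data, token=TOKEN):
    """PUT on the object URL stores `data` (bytes) as the object content."""
    return Request(swift_url(container, name), data=data, method="PUT",
                   headers={"X-Auth-Token": token})


def download_object(container, name, token=TOKEN):
    """GET on the object URL returns the object content."""
    return Request(swift_url(container, name), method="GET",
                   headers={"X-Auth-Token": token})


if __name__ == "__main__":
    req = upload_object("container-name", "file", b"hello")
    print(req.get_method(), req.full_url)
```

Passing one of these requests to `urllib.request.urlopen` would actually send it, which of course only works with a real storage URL and a valid token.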

What can I do with an object storage

A fair question indeed. Plain storage is probably the primary use, but there are other things one can do with an object store. This image will give you a good hint:

Hadoop, Spark and Scala

If you have ever worked with Hadoop or Spark, chances are pretty good that you are familiar with HDFS. If not, HDFS is simply a distributed filesystem that the Hadoop workers use to share data (read and write). Hadoop needs HDFS; Spark can use it as well but does not entirely depend on it. According to Spark’s website, it should be possible to replace HDFS with Swift. Why would we do that, though? The first reason is that we already have a working Swift cluster. Another reason is that HDFS has to be installed on all the workers, so even a “virtual” cluster needs a lot of block storage for its instances. Also, deleting the instances deletes the HDFS cluster, which is not the case with Swift.

We are still working on this, but the biggest part of the work is adapting our code to work with Swift (login, access, etc.). We have high expectations for this, so we’ll keep at it until it works :)
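
As a rough idea of what that adaptation involves, the sketch below builds the kind of `swift://` path and Hadoop properties that Spark's Swift documentation of that era describes. This is not our working setup. The property names follow the hadoop-openstack module as we understand it, and the provider name, Keystone URL and credentials are all placeholders, so check everything against the documentation of the versions you run.

```python
def swift_uri(container, provider, path=""):
    """Build a swift://container.provider/path location."""
    return "swift://{}.{}/{}".format(container, provider, path.lstrip("/"))


def swift_hadoop_conf(provider, auth_url, tenant, username, password):
    """Return the Hadoop properties for one Swift provider (assumed names)."""
    prefix = "fs.swift.service.{}.".format(provider)
    return {
        # Filesystem class shipped by the hadoop-openstack module (assumption).
        "fs.swift.impl":
            "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
        prefix + "auth.url": auth_url,
        prefix + "tenant": tenant,
        prefix + "username": username,
        prefix + "password": password,
    }


if __name__ == "__main__":
    # Hypothetical provider name, Keystone endpoint and credentials.
    conf = swift_hadoop_conf(
        provider="mycloud",
        auth_url="https://keystone.example.org:5000/v2.0/tokens",
        tenant="demo",
        username="demo-user",
        password="secret",
    )
    for key in sorted(conf):
        print(key, "=", conf[key])
    print(swift_uri("container-name", "mycloud", "data/input.txt"))
```

With PySpark, such properties would typically be passed as Spark settings prefixed with `spark.hadoop.`, and the resulting path handed to something like `sc.textFile(...)`. We have not validated this end to end.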

Problems with Swift

We’ve had a few problems while installing Swift. First, Swift does not create its own log files; it uses syslog for logging. This means you need to look at this file to see what’s going on: /var/log/syslog

Is this really a problem? No, but it’s worth knowing :)

Also, Swift produces a loooooot of logs, because it’s very verbose. Once again, having an ELK Stack is very useful :P (you can check our post about it if you are interested).
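
If you don't have an ELK Stack at hand, a few lines of Python are enough to pull the Swift entries out of /var/log/syslog. This is only a sketch: the program names below are the ones Swift daemons commonly log under, but they depend on the `log_name` set in your configuration files, so treat them as an assumption and adjust the list.

```python
# Assumed syslog tags; they depend on the log_name option in your Swift
# configuration files.
SWIFT_TAGS = (
    "proxy-server",
    "account-server",
    "container-server",
    "object-server",
)


def swift_lines(lines, tags=SWIFT_TAGS):
    """Keep only the lines that mention one of the given Swift tags."""
    return [line for line in lines if any(tag in line for tag in tags)]


def print_swift_lines(path="/var/log/syslog"):
    """Print the Swift entries found in the given syslog file."""
    # errors="replace" avoids crashing on the odd undecodable byte.
    with open(path, errors="replace") as log:
        for line in swift_lines(log):
            print(line, end="")
```

Calling `print_swift_lines()` on a Swift node (with enough permissions to read the file) prints only the matching entries. For anything beyond a quick look, a real log pipeline remains the better option.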

The installation itself is pretty straightforward: once the configuration files are downloaded, adapt them to your setup, create the rings, distribute them across your Swift storage nodes and check that everything works.
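
To give an idea of what "create the rings" means in practice, the sketch below generates the `swift-ring-builder` commands for the three rings. It only prints them. The IP addresses, device names, ports, partition power and weight are illustrative assumptions (the 3 replicas match the default mentioned above), so adapt them to your own storage nodes before running anything.

```python
# Assumed ports for the three storage services; check your configuration.
RINGS = {"account": 6002, "container": 6001, "object": 6000}

# Hypothetical storage nodes: (IP address, device name).
DEVICES = [("10.0.0.51", "sdb1"), ("10.0.0.52", "sdb1")]


def ring_commands(ring, port, devices,
                  part_power=10, replicas=3, min_part_hours=1):
    """Return the swift-ring-builder commands for one ring, as argv lists."""
    builder = ring + ".builder"
    commands = [["swift-ring-builder", builder, "create",
                 str(part_power), str(replicas), str(min_part_hours)]]
    for ip, device in devices:
        commands.append(["swift-ring-builder", builder, "add",
                         "--region", "1", "--zone", "1",
                         "--ip", ip, "--port", str(port),
                         "--device", device, "--weight", "100"])
    # Rebalancing writes the <ring>.ring.gz file that gets distributed.
    commands.append(["swift-ring-builder", builder, "rebalance"])
    return commands


if __name__ == "__main__":
    for ring, port in sorted(RINGS.items()):
        for command in ring_commands(ring, port, DEVICES):
            print(" ".join(command))
```

Each rebalance produces a `.ring.gz` file (account, container and object), and those are the files to copy to every storage node.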