Downloads

The Kiji project is organized as a framework consisting of multiple modules.

You can download individual modules, or grab the BentoBox, which contains all Kiji framework components, assembled in a self-contained download. The BentoBox edition also comes with a standalone Hadoop & HBase cluster, for a zero-configuration way to get off the ground. Both KijiSchema and the BentoBox come with command-line tools for interacting with Kiji tables and data. After downloading, scroll down for the Quick Start Guide or check out our documentation and get hacking with Kiji!

Starting a Maven project that uses Kiji? Read the Maven setup instructions.
Component, version, and download format:

Kiji BentoBox “Ebi” (Standalone edition) 2.0.1 (tar.gz)
    Includes KijiSchema, KijiMR, and all tools, examples, Hadoop, and HBase.
    Older versions: (2.0.1) (1.4.3) (1.4.2) (1.4.1) (1.4.0) (1.3.0) (1.2.6) (1.2.5) (1.2.4) (1.2.3) (1.2.1) (1.1.6) (1.1.5) (1.1.4) (1.1.3) (1.1.2) (1.1.1) (1.1.0) (1.0.5) (1.0.4) (1.0.3) (1.0.2) (1.0.1) (1.0.0)

KijiSchema 1.4.0 (tar.gz)
    The main KijiSchema libraries and tools; requires separate Hadoop and HBase.

KijiSchema DDL Shell 1.3.5 (tar.gz)
    An interactive shell for defining and working with table schemas.

KijiMR 1.2.6 (tar.gz)
    The main KijiMR libraries and tools; requires KijiSchema, Hadoop, and HBase.

KijiMR Library 1.1.5 (tar.gz)
    Additional KijiMR libraries, bulk importers, etc.; requires KijiMR.

Kiji Hive Adapter 0.10.0 (tar.gz)
    A SerDe for Hive to use Kiji tables as external Hive tables.

KijiREST server 1.3.0 (tar.gz)
    A REST API for interacting with KijiSchema. You may also be interested in the Ruby client gem.

KijiScoring 0.13.0 (tar.gz)
    A module for applying trained models to score Kiji entities in real time.

Kiji Model Repository 0.7.1 (tar.gz)
    A module to track and deploy Kiji models to production.

Kiji Modeling 0.8.1 (tar.gz)
    A modularization of the Kiji model development cycle.

Kiji Express 2.0.1 (tar.gz)
    A Scala DSL for analyzing and modeling data in Kiji tables.

Kiji Express Examples 2.0.1 (tar.gz)
    Example projects for Kiji Express.

Kiji Express Music Tutorial 1.0.1 (tar.gz)
    An “Express” version of the Kiji Music Tutorial.

KijiSchema Phonebook Example 1.1.2 (tar.gz)
    Code example of a KijiSchema application.

KijiMR Music Example 1.1.5 (tar.gz)
    Code example of a KijiMR application.

All downloads are tar.gz archives; you should expand them with tar xzf <filename>.
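
As a minimal sketch of the unpacking step, the bash function below expands an archive into a directory of your choice and lists what it created. The function name extract_kiji_archive and its arguments are illustrative; they are not part of any Kiji tool.

```shell
# Hypothetical helper (not part of Kiji): expand a downloaded tar.gz archive
# into a destination directory and print the top-level entries found there.
extract_kiji_archive() {
  local archive="$1"
  local dest="${2:-.}"

  # Refuse to continue if the archive is missing.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  mkdir -p "$dest" || return 1
  # tar xzf expands the gzip-compressed archive; -C selects the target directory.
  tar xzf "$archive" -C "$dest" || return 1
  ls "$dest"
}
```

You would call it with the file you downloaded and a target directory, for example extract_kiji_archive <filename> "$HOME/kiji".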

Looking for the source? Get it at github.com/kijiproject.

Installation

Prerequisites

Kiji has been tested on GNU/Linux (Ubuntu 12.04, 12.10 and CentOS 6.3) and Mac OS X (10.8.x, 10.7.x, 10.6.x).

Kiji is a Java-based system. To run Kiji applications, you will need to download the Oracle Java JRE. To develop Kiji applications, you will need the Oracle Java JDK. We have tested this system with the Oracle Java “Hotspot” JVM, version 6. Other JVMs are not supported at this time. If you are running OS X, this is installed by default.

Kiji is built on top of Hadoop and HBase. Our system depends on Cloudera’s Distribution including Apache Hadoop, version 4 (CDH4). If you downloaded the BentoBox, a zero-configuration development/test cluster is included in the package. If you downloaded KijiSchema, you will need to install and configure Hadoop and HBase separately.

Configuring Your Environment

After downloading either the BentoBox (kiji-bento-version-release.tar.gz) or KijiSchema (kiji-schema-version-release.tar.gz), expand the archive with the command tar xzf filename. This will create a directory named kiji-bento-version/ or kiji-schema-version/.

For convenience, $KIJI_HOME should be set to this directory. For example:

export KIJI_HOME=/path/to/kiji-bento-(version)

You should edit your .bashrc file to contain this line so that it’s incorporated in your environment for future bash sessions.
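
If you would rather script that edit, here is one possible sketch in bash. It appends the export line to a startup file only when the same line is not already present. The function name persist_kiji_home is our own invention, and the default file of $HOME/.bashrc is an assumption about your setup.

```shell
# Hypothetical helper (not part of Kiji): append an export line for KIJI_HOME
# to a shell startup file, unless an identical line is already there.
persist_kiji_home() {
  local kiji_home="$1"
  local rc_file="${2:-$HOME/.bashrc}"
  local line="export KIJI_HOME=\"$kiji_home\""

  # grep -qxF matches the whole line literally, so reruns do not add duplicates.
  if [ -f "$rc_file" ] && grep -qxF -- "$line" "$rc_file"; then
    return 0
  fi
  printf '%s\n' "$line" >> "$rc_file"
}
```

Call it with the directory created when you expanded the archive. Running it a second time with the same directory leaves the file unchanged.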

If you’re using the BentoBox edition, you can set up your environment as follows:

source $KIJI_HOME/bin/kiji-env.sh
bento start

This starts the bento minicluster and updates your environment variables to use it. After a few seconds you should be able to view the status page of your own mini HBase cluster at http://localhost:60010/. (If you don’t see it right away, wait 10 seconds and reload the page.) You’re now ready to proceed with installing Kiji onto your cluster.
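
Rather than reloading the page by hand, you could poll the status page from the shell. The following is a rough sketch, assuming curl is installed; the function name wait_for_url and its attempt and delay arguments are illustrative and are not BentoBox features.

```shell
# Hypothetical helper (not part of the BentoBox): poll a URL with curl until
# it answers, giving up after a fixed number of attempts.
wait_for_url() {
  local url="$1"
  local attempts="${2:-6}"
  local delay="${3:-10}"
  local i

  for ((i = 1; i <= attempts; i++)); do
    # -s: quiet, -f: treat HTTP errors as failures, -o /dev/null: discard the body.
    if curl -sf -o /dev/null "$url"; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  echo "no response from $url after $attempts attempt(s)" >&2
  return 1
}
```

After bento start, wait_for_url http://localhost:60010/ would retry every 10 seconds, up to six times, before giving up.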

If you installed only KijiSchema (not the BentoBox), you should instead set $HADOOP_HOME and $HBASE_HOME and make sure the Hadoop HDFS and HBase services are running.
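
A quick sanity check for those two variables might look like the bash sketch below. The function name check_home_dirs is hypothetical. It only confirms that each variable is set and names an existing directory; it does not check whether the HDFS or HBase services are running.

```shell
# Hypothetical helper (not part of Kiji): verify that each named environment
# variable is set and points at an existing directory.
check_home_dirs() {
  local name value status=0

  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    value="${!name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name does not point to a directory: $value" >&2
      status=1
    fi
  done
  return "$status"
}
```

For the setup described here you would run check_home_dirs HADOOP_HOME HBASE_HOME and fix anything it reports before continuing.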

For help configuring CDH, see Cloudera’s CDH4 Installation Guide.

Installing Kiji System Tables

Kiji will manage tables for you on top of your HBase cluster. Each Kiji table corresponds to a physical HBase table. There are also a number of system tables that hold metadata maintained by Kiji itself. Before you can use Kiji, you must run its install command and create these system tables.

Issue the following command from your Kiji release directory to install a Kiji instance with the name default:

kiji install

You should see output like the following:

Creating kiji instance: kiji://localhost:2181/default/
Creating meta tables for kiji instance in hbase...
13/02/22 21:01:20 INFO org.kiji.schema.KijiInstaller: Installing kiji instance 'kiji://localhost:2181/default/'.
13/02/22 21:01:25 INFO org.kiji.schema.KijiInstaller: Installed kiji instance 'kiji://localhost:2181/default/'.
Successfully created kiji instance: kiji://localhost:2181/default/

Now you can create some tables in the DDL schema-shell, explore the Phonebook example, and get started building a Maven project with Kiji. See the quickstart section next to get acquainted with the tools.

Quickstart Guide

These instructions assume you downloaded the BentoBox distribution and sourced the environment setup script by running source bin/kiji-env.sh from the root directory of your unzipped BentoBox distribution. Please refer to the section describing how to configure your environment for more details.

Creating a Table

Tables in Kiji can be created through the command line with a JSON-encoded layout file or interactively through Kiji’s schema shell tool. We’ll create a table of usernames and email addresses using the schema shell tool.

Issue the following command from the Kiji directory to open the schema shell:

kiji-schema-shell

You’ll be presented with a schema> prompt. This shell permits you to create and alter Kiji tables with an SQL-like data description language. Issue the following command to create a user table:

CREATE TABLE users WITH DESCRIPTION 'A table for user names and email addresses'
    ROW KEY FORMAT HASH PREFIXED(2)
    WITH LOCALITY GROUP default WITH DESCRIPTION 'main storage' (
      MAXVERSIONS = INFINITY,
      TTL = FOREVER,
      INMEMORY = false,
      COMPRESSED WITH GZIP,
      FAMILY info WITH DESCRIPTION 'basic information' (
        name "string" WITH DESCRIPTION 'the user\'s name',
        email "string"));

The table is immediately created in HBase. Type quit to exit the schema shell.

quit
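
If you expect to recreate the table often, it may be convenient to keep the CREATE TABLE statement above in a file. The bash sketch below writes that same statement to a file; the function name write_users_ddl and the file name users.ddl are arbitrary. Feeding the file to the shell with a --file option is an assumption you should check against your version of kiji-schema-shell.

```shell
# Sketch: store the users table definition in a file so it can be replayed.
# write_users_ddl and users.ddl are illustrative names, not Kiji conventions.
write_users_ddl() {
  local ddl_file="${1:-users.ddl}"
  # The quoted 'EOF' delimiter stops the shell from expanding anything in the DDL.
  cat > "$ddl_file" <<'EOF'
CREATE TABLE users WITH DESCRIPTION 'A table for user names and email addresses'
    ROW KEY FORMAT HASH PREFIXED(2)
    WITH LOCALITY GROUP default WITH DESCRIPTION 'main storage' (
      MAXVERSIONS = INFINITY,
      TTL = FOREVER,
      INMEMORY = false,
      COMPRESSED WITH GZIP,
      FAMILY info WITH DESCRIPTION 'basic information' (
        name "string" WITH DESCRIPTION 'the user\'s name',
        email "string"));
EOF
}

# Possible use (needs a running cluster; the --file option is an assumption):
#   write_users_ddl users.ddl
#   kiji-schema-shell --file=users.ddl
```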

If you now issue the command:

kiji ls kiji://localhost:2181/default

You will see your default instance and the users table. Your output should look like:

kiji://localhost:2181/default/users

For more about the schema shell, consult the user guide.

Inserting Some Data

Now that we have a users table, let’s insert some data into it. Kiji has a put tool to insert data into individual cells. Issue the following two commands to insert a name and email address for a user named Ophelia Phelps:

kiji put \
    --target=kiji://.env/default/users/info:name \
    --entity-id='kiji="ophie@kiji.org"' \
    --value='"Ophelia Phelps"' \
    --schema='"string"'
kiji put \
    --target=kiji://.env/default/users/info:email \
    --entity-id='kiji="ophie@kiji.org"' \
    --value='"ophie@kiji.org"' \
    --schema='"string"'
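
The quoting in these commands is easy to get wrong, so here is a small bash demonstration of what a program actually receives. The show_args function is a stand-in we made up for illustration; it simply prints each argument between angle brackets and does not call kiji.

```shell
# Print each argument on its own line, wrapped in angle brackets, to show
# exactly what a program receives after the shell has processed the quotes.
show_args() {
  printf '<%s>\n' "$@"
}

# Single quotes around double quotes: the inner double quotes survive.
show_args --value='"Ophelia Phelps"' --schema='"string"'
# prints <--value="Ophelia Phelps">
#        <--schema="string">

# Double quotes alone: the shell strips them before the program runs.
show_args --value="Ophelia Phelps" --schema="string"
# prints <--value=Ophelia Phelps>
#        <--schema=string>
```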

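Note that the entity-id, value, and schema parameters each wrap a double-quoted string in single quotes. The single quotes stop the shell from stripping the double quotes, so they reach the kiji tool intact. You can view the result of these commands by issuing the command: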

kiji scan default/users

You should see output like the following:

Scanning kiji table: kiji://.env/default/users/
entity-id=['ophie@kiji.org'] [1361596447940] info:name
                                 Ophelia Phelps
entity-id=['ophie@kiji.org'] [1361596457342] info:email
                                 ophie@kiji.org

The first line of each pair shows the entity id of the row (hash-prefixed in this table), the timestamp of the data, and the column name. The second line is the value we inserted.
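
If you want to post-process scan output in a script, the first line of each pair can be split into its three parts. The bash sketch below does this for lines in exactly the format shown above. The function name parse_scan_header is hypothetical, and treating the timestamp as milliseconds since the Unix epoch is an assumption.

```shell
# Hypothetical helper: split the first line of a scan output pair, in the
# format shown above, into entity id, timestamp, and column name.
parse_scan_header() {
  local line="$1"
  local pattern='^entity-id=\[(.*)\] \[([0-9]+)\] ([^ ]+)$'

  if [[ "$line" =~ $pattern ]]; then
    local entity="${BASH_REMATCH[1]}"
    local millis="${BASH_REMATCH[2]}"
    local column="${BASH_REMATCH[3]}"
    # Assumption: the timestamp is milliseconds since the Unix epoch,
    # so integer division by 1000 gives whole seconds.
    printf '%s\t%s\t%s\n' "$entity" "$((millis / 1000))" "$column"
  else
    # Value lines and anything else that does not match are rejected.
    return 1
  fi
}
```

For the first line of the sample output, this prints 'ophie@kiji.org', 1361596447, and info:name separated by tabs.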

The kiji tools can use either absolute or relative URIs to reference tables; default/users is a synonym for kiji://.env/default/users/.
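
To make the equivalence concrete, the bash sketch below rewrites a relative table reference into the absolute form. It only mimics the single example given here and is not Kiji's own URI resolution; the function name to_absolute_kiji_uri is made up for illustration.

```shell
# Illustrative only: mimic the one equivalence stated above, where the relative
# form default/users stands for kiji://.env/default/users/.
to_absolute_kiji_uri() {
  local uri="$1"
  case "$uri" in
    kiji://*) ;;                     # already absolute: keep as given
    *) uri="kiji://.env/$uri" ;;     # relative: add the kiji://.env/ prefix
  esac
  # Emit exactly one trailing slash, as in the example above.
  printf '%s/\n' "${uri%/}"
}

to_absolute_kiji_uri default/users
# prints kiji://.env/default/users/
```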

Inserting data from the command line is fine for a few values, but if you need to generate a lot of data, it’s faster to write a program to perform the inserts. Kiji includes an example tool named synthesize-user-data for the users table format. Let’s use it to generate 10 rows of data:

kiji synthesize-user-data \
    --table=kiji://.env/default/users \
    --num-users=10

You can now view the new users with:

kiji scan default/users
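
For a handful of hand-picked users, the two put commands shown earlier can be wrapped in a small bash function. This is a sketch only: the function name put_user and the KIJI_CMD override are our own inventions, while the put options are the ones used above. It assumes names and email addresses contain no double quotes or backslashes.

```shell
# Hypothetical wrapper around the two put commands shown earlier.
# KIJI_CMD is an illustrative override, e.g. KIJI_CMD=echo for a dry run.
put_user() {
  local email="$1"
  local name="$2"
  local kiji_cmd="${KIJI_CMD:-kiji}"
  local table="kiji://.env/default/users"

  # Write the name cell; the escaped double quotes become part of the argument.
  "$kiji_cmd" put \
      --target="$table/info:name" \
      --entity-id="kiji=\"$email\"" \
      --value="\"$name\"" \
      --schema='"string"' || return 1
  # Write the email cell for the same entity id.
  "$kiji_cmd" put \
      --target="$table/info:email" \
      --entity-id="kiji=\"$email\"" \
      --value="\"$email\"" \
      --schema='"string"'
}
```

Running KIJI_CMD=echo put_user ophie@kiji.org "Ophelia Phelps" prints the two commands without touching the table, which is a cheap way to check the quoting before doing a real insert.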

The kiji command has many other tools. Issue the command kiji help to learn more. You can also learn about individual tools. For example:

kiji ls --help

Additional information about the command line tools is also available in the user guide.

This concludes the quickstart guide. If you’re done, you can shut down the cluster by issuing the command:

bento stop

Next Steps

To get started using Kiji, consult the phonebook tutorial. The phonebook tutorial shows how to write programs that store, access, and analyze data in Kiji tables.

Once you are familiar with the fundamental operations on Kiji tables, you can move on to learning how to use KijiExpress with the music recommendation tutorial.

Training & Services

WibiData periodically offers public training courses for data scientists and developers with Java experience. Prior experience with Hadoop and HBase is helpful but not necessary. If you have questions regarding training, or are interested in consulting & private training around developing applications on HBase and Hadoop, please contact darcy@wibidata.com.