The Kiji project is organized as a framework consisting of multiple modules.
You can download individual modules, or grab the BentoBox, which contains all Kiji framework components, assembled in a self-contained download. The BentoBox edition also comes with a standalone Hadoop & HBase cluster, for a zero-configuration way to get off the ground. Both KijiSchema and the BentoBox come with command-line tools for interacting with Kiji tables and data. After downloading, scroll down for the Quick Start Guide or check out our documentation and get hacking with Kiji!
|Kiji BentoBox “Ebi” (Standalone edition)||2.0.1||(tar.gz)|
|Includes KijiSchema, KijiMR, and all tools, examples, Hadoop, and HBase|
|The main KijiSchema libraries and tools; requires separate Hadoop & HBase|
|KijiSchema DDL Shell||1.3.5||(tar.gz)|
|An interactive shell for defining and working with table schemas|
|The main KijiMR libraries and tools; requires KijiSchema, Hadoop, and HBase.|
|Additional KijiMR libraries, bulk importers, etc.; requires KijiMR|
|Kiji Hive Adapter||0.10.0||(tar.gz)|
|A SerDe for Hive to use Kiji tables as external Hive tables.|
|A REST API for interacting with KijiSchema. You may also be interested in the Ruby client gem.|
|A module for applying trained models to score Kiji entities in real-time.|
|Kiji Model Repository||0.7.1||(tar.gz)|
|A module to track and deploy Kiji models to production.|
|A modularization of the Kiji model development cycle.|
|A Scala DSL for analyzing and modeling data in Kiji tables|
|Kiji Express Examples||2.0.1||(tar.gz)|
|Example projects for Kiji Express|
|Kiji Express Music Tutorial||1.0.1||(tar.gz)|
|An “Express” version of the Kiji Music Tutorial|
|KijiSchema Phonebook Example||1.1.2||(tar.gz)|
|Code example of a KijiSchema application|
|KijiMR Music Example||1.1.5||(tar.gz)|
|Code example of a KijiMR application.|
All downloads are tar.gz archives; you should expand them with tar xzf <filename>.
Kiji has been tested on GNU/Linux (Ubuntu 12.04, 12.10 and CentOS-6.3) and
Mac OS X (10.8.x, 10.7.x, 10.6.x).
Kiji is a Java-based system. To run Kiji applications, you will need the Oracle Java JRE; to develop Kiji applications, you will need the Oracle Java JDK. We have tested Kiji with the Oracle Java “HotSpot” JVM, version 6. Other JVMs are not supported at this time. On OS X, Java is installed by default.
Kiji is built on top of Hadoop and HBase. Our system depends on Cloudera’s Distribution including Apache Hadoop, version 4 (CDH4). If you downloaded the BentoBox, a zero-configuration development/test cluster is included in the package. If you downloaded KijiSchema, you will need to install and configure Hadoop and HBase separately.
Configuring Your Environment
After downloading either the BentoBox (kiji-bento-version-release.tar.gz) or KijiSchema (kiji-schema-version-release.tar.gz), extract the archive with the command tar xzf filename. This creates a new directory for the release. Set $KIJI_HOME to this directory, and add the corresponding export line to your .bashrc file so that it is incorporated into your environment for future bash sessions.
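A minimal sketch of this step, assuming the archive was extracted to a directory named kiji-bento under your home directory. That path and the KIJI_DIR and RC_FILE variable names are hypothetical; substitute the directory that tar actually created.

```shell
# Hypothetical install location; replace with the directory created by tar.
KIJI_DIR="${KIJI_DIR:-$HOME/kiji-bento}"

# Make KIJI_HOME available in the current session.
export KIJI_HOME="$KIJI_DIR"

# Persist the setting for future bash sessions, adding the line only once.
RC_FILE="${RC_FILE:-$HOME/.bashrc}"
LINE="export KIJI_HOME=\"$KIJI_DIR\""
touch "$RC_FILE"
grep -qxF "$LINE" "$RC_FILE" || echo "$LINE" >> "$RC_FILE"
```

The grep guard keeps the script safe to run more than once without piling up duplicate lines in .bashrc.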
If you’re using the BentoBox edition, you can set up your environment as follows:
source $KIJI_HOME/bin/kiji-env.sh
bento start
This starts the bento minicluster and updates your environment variables to use it. After a few seconds you should be able to view the status page of your own mini HBase cluster at http://localhost:60010/. (If you don’t see it right away, wait 10 seconds and reload the page.) You’re now ready to proceed with installing Kiji onto your cluster.
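If you would rather not reload the page by hand, the wait can be scripted. The helper below is an illustrative sketch, not part of Kiji: wait_for_url is a hypothetical function name, and it assumes curl is installed.

```shell
# Poll a URL until it responds or the attempts run out.
# usage: wait_for_url <url> <attempts> <delay-seconds>
wait_for_url() {
  i=0
  while [ "$i" -lt "$2" ]; do
    # -f treats HTTP errors as failures, -s silences progress output.
    if curl -fs -o /dev/null "$1"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$3"
  done
  return 1
}
```

For example, wait_for_url http://localhost:60010/ 6 10 checks the HBase status page up to six times, ten seconds apart, and returns a non-zero status if the page never comes up.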
If you installed only KijiSchema (not the BentoBox), you should instead set $HADOOP_HOME and $HBASE_HOME and make sure the Hadoop HDFS and HBase services are running.
For help configuring CDH, see Cloudera’s CDH4 Installation Guide.
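As a rough sketch of the KijiSchema-only setup, the snippet below sets both variables and warns if either directory is missing. The /usr/lib/hadoop and /usr/lib/hbase paths are assumptions based on typical CDH4 package installs; adjust them to your system.

```shell
# Assumed CDH4 package locations; override by exporting the variables first.
export HADOOP_HOME="${HADOOP_HOME:-/usr/lib/hadoop}"
export HBASE_HOME="${HBASE_HOME:-/usr/lib/hbase}"

# Warn early if either directory is missing.
for dir in "$HADOOP_HOME" "$HBASE_HOME"; do
  if [ ! -d "$dir" ]; then
    echo "warning: $dir does not exist; check your Hadoop/HBase installation" >&2
  fi
done
```

This only checks that the directories exist; it does not confirm that the HDFS and HBase services are running.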
Installing Kiji System Tables
Kiji will manage tables for you on top of your HBase cluster. Each Kiji table corresponds to a physical HBase table. There are also a number of system tables that hold metadata maintained by Kiji itself. Before you can use Kiji, you must run its install command and create these system tables.
Issue the following command from your Kiji release directory to install a Kiji instance with the name default:
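A minimal sketch of the install step, assuming the kiji launcher is on your PATH (sourcing kiji-env.sh normally arranges this) and that the install tool accepts a --kiji instance URI. The KIJI_INSTANCE variable name is hypothetical.

```shell
# Instance URI; "default" matches the instance name used in this guide.
KIJI_INSTANCE="${KIJI_INSTANCE:-kiji://.env/default}"

if command -v kiji >/dev/null 2>&1; then
  # Creates the Kiji system tables for the instance in HBase.
  kiji install --kiji="$KIJI_INSTANCE"
else
  echo "kiji not found on PATH; source \$KIJI_HOME/bin/kiji-env.sh first" >&2
fi
```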
You should see output like the following:
Creating kiji instance: kiji://localhost:2181/default/
Creating meta tables for kiji instance in hbase...
13/02/22 21:01:20 INFO org.kiji.schema.KijiInstaller: Installing kiji instance 'kiji://localhost:2181/default/'.
13/02/22 21:01:25 INFO org.kiji.schema.KijiInstaller: Installed kiji instance 'kiji://localhost:2181/default/'.
Successfully created kiji instance: kiji://localhost:2181/default/
Now you can create some tables in the DDL schema-shell, explore the Phonebook example, and get started building a Maven project with Kiji. See the quickstart section next to get acquainted with the tools.
Quick Start Guide
These instructions assume you downloaded the BentoBox distribution and sourced the environment setup script by running source bin/kiji-env.sh from the root directory of your unzipped BentoBox distribution. Please refer to the section describing how to configure your environment for more details.
Creating a Table
Tables in Kiji can be created through the command line with a JSON-encoded layout file or interactively through Kiji’s schema shell tool. We’ll create a table of usernames and email addresses using the schema shell tool.
Issue the following command from the Kiji directory to open the schema shell:
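A minimal sketch of launching the shell, assuming the launcher is named kiji-schema-shell and is on your PATH once the BentoBox environment script has been sourced:

```shell
if command -v kiji-schema-shell >/dev/null 2>&1; then
  # Starts the interactive DDL shell.
  kiji-schema-shell
else
  echo "kiji-schema-shell not found on PATH; source bin/kiji-env.sh first" >&2
fi
```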
You’ll be presented with a schema> prompt. This shell permits you to create and alter Kiji tables with an SQL-like data description language. Issue the following command to create a user table:
CREATE TABLE users WITH DESCRIPTION 'A table for user names and email addresses'
ROW KEY FORMAT HASH PREFIXED(2)
WITH LOCALITY GROUP default WITH DESCRIPTION 'main storage' (
  MAXVERSIONS = INFINITY,
  TTL = FOREVER,
  INMEMORY = false,
  COMPRESSED WITH GZIP,
  FAMILY info WITH DESCRIPTION 'basic information' (
    name "string" WITH DESCRIPTION 'the user\'s name',
    email "string"));
The table is immediately created in HBase. Type quit to exit the schema shell.
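As an alternative to typing the statement interactively, the same DDL can be kept in a file and run in one step. This is a sketch under two assumptions: that the shell accepts a --file option, and that users.ddl (a hypothetical file name) is an acceptable place to write the script. Run it instead of, not after, the interactive CREATE TABLE above, since the table can only be created once.

```shell
# Hypothetical file name for the DDL script.
DDL_FILE="${DDL_FILE:-users.ddl}"

# Write the CREATE TABLE statement from this guide to the file, verbatim.
cat > "$DDL_FILE" <<'EOF'
CREATE TABLE users WITH DESCRIPTION 'A table for user names and email addresses'
ROW KEY FORMAT HASH PREFIXED(2)
WITH LOCALITY GROUP default WITH DESCRIPTION 'main storage' (
  MAXVERSIONS = INFINITY,
  TTL = FOREVER,
  INMEMORY = false,
  COMPRESSED WITH GZIP,
  FAMILY info WITH DESCRIPTION 'basic information' (
    name "string" WITH DESCRIPTION 'the user\'s name',
    email "string"));
EOF

if command -v kiji-schema-shell >/dev/null 2>&1; then
  # Run the script non-interactively.
  kiji-schema-shell --file="$DDL_FILE"
else
  echo "kiji-schema-shell not found on PATH; wrote $DDL_FILE only" >&2
fi
```

Keeping table definitions in a file makes them easy to review and to check into version control alongside application code.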
If you now issue the command:
kiji ls kiji://localhost:2181/default
you will see your default instance and the users table.
For more about the schema shell, consult the user guide.
Inserting Some Data
Now that we have a users table, let’s insert some data into it. Kiji has a put tool to insert data into individual cells. Issue the following two commands to insert a name and email address for a user named Ophelia Phelps:
kiji put \
  --target=kiji://.env/default/users/info:name \
  --entity-id='kiji="firstname.lastname@example.org"' \
  --value='"Ophelia Phelps"' \
  --schema='"string"'

kiji put \
  --target=kiji://.env/default/users/info:email \
  --entity-id='kiji="email@example.com"' \
  --value='"firstname.lastname@example.org"' \
  --schema='"string"'
Note that the parameters entity-id and value use a pair of single quotes around a pair of double quotes. You can view the result of these commands by issuing the command:
kiji scan default/users
You should see output like the following:
Scanning kiji table: kiji://.env/default/users/
entity-id=['email@example.com']  info:name
                                 Ophelia Phelps
entity-id=['firstname.lastname@example.org']  info:email
                                 email@example.com
The first of each pair of lines is the entity id of the row (hash-prefixed in this table), the timestamp for the data, and the column name. The second line is the value we inserted.
The kiji tools can use either absolute or relative URIs to reference tables; default/users is a synonym for kiji://.env/default/users/.
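The equivalence can be illustrated with a small helper. expand_kiji_uri is a hypothetical function written for this guide; it mirrors only the one rule stated above and is not Kiji's actual URI parser.

```shell
# Expand a relative table reference into its absolute kiji:// form.
expand_kiji_uri() {
  case "$1" in
    kiji://*) printf '%s\n' "$1" ;;                    # already absolute
    *)        printf 'kiji://.env/%s/\n' "${1%/}" ;;   # prepend the default cluster
  esac
}

expand_kiji_uri default/users   # prints kiji://.env/default/users/
```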
Inserting data from the command line is fine for a few values, but if you need to generate a lot of data, it’s faster to write a program to perform the inserts. Kiji includes an example tool named synthesize-user-data for the users table format. Let’s use it to generate 10 rows of data:
kiji synthesize-user-data \
  --table=kiji://.env/default/users \
  --num-users=10
You can now view the new users with:
kiji scan default/users
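For a handful of real users, the put commands shown earlier can also be scripted. The sketch below prints the commands by default rather than running them. put_cell, KIJI_CMD, and the sample users are hypothetical, and using the email address as the entity id is an assumption about how you want to key the rows.

```shell
# Set KIJI_CMD=kiji to run the commands for real; the default only prints them.
KIJI_CMD="${KIJI_CMD:-echo kiji}"

# Write one cell of the info family.
# usage: put_cell <column> <entity-id> <value>
put_cell() {
  $KIJI_CMD put \
    --target="kiji://.env/default/users/info:$1" \
    --entity-id="\"$2\"" \
    --value="\"$3\"" \
    --schema='"string"'
}

# Hypothetical sample data: one "name,email" pair per line.
while IFS=, read -r name email; do
  put_cell name  "$email" "$name"
  put_cell email "$email" "$email"
done <<'EOF'
Ada Example,ada@example.com
Bob Example,bob@example.com
EOF
```

Each put starts a new JVM, so this approach suits tens of rows; for larger volumes a program or a bulk importer is the better fit.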
The kiji command has many other tools. Issue the command kiji help to learn more. You can also learn about individual tools. For example:
kiji ls --help
Additional information about the command line tools is also available in the user guide.
This concludes the quickstart guide. If you’re done, you can shut down the cluster by issuing the command:
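A minimal sketch of the shutdown step, assuming the bento script that started the cluster also provides a matching stop subcommand and is still on your PATH:

```shell
if command -v bento >/dev/null 2>&1; then
  # Stops the BentoBox minicluster started earlier with "bento start".
  bento stop
else
  echo "bento not found on PATH; source \$KIJI_HOME/bin/kiji-env.sh first" >&2
fi
```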
To get started using Kiji, consult the phonebook tutorial. The phonebook tutorial shows how to write programs that store, access, and analyze data in Kiji tables.
Once you are familiar with the fundamental operations on Kiji tables, you can move on to learning how to use KijiExpress with the music recommendation tutorial.
Training & Services
WibiData periodically offers public training courses for data scientists and developers with Java experience. Prior experience with Hadoop and HBase is helpful but not necessary. If you have questions regarding training, or are interested in consulting & private training around developing applications on HBase and Hadoop, please contact firstname.lastname@example.org.