Announcing the “Buri” BentoBox v1.1.0

Kiji fans,

We are pleased to announce that you can now download a new version of the BentoBox SDK for Kiji, a framework for developing Big Data Applications. BentoBox is powered by Hadoop and HBase via Cloudera’s Distribution for Hadoop (CDH) version 4.2.

On June 14th, WibiData hosted the first KijiCon, co-sponsored by Cloudera and Opower. Over 60 developers from forward-thinking companies attended training and hack sessions with the goal of building Big Data Applications using the Kiji framework. We are excited to welcome many more Kiji users to the community and encourage everyone to voice their feedback. You can get up and running with Kiji in just a few minutes by downloading the BentoBox from http://www.kiji.org. If you’re one of our existing users, use the command bento upgrade to upgrade your BentoBox.

This marks the first release of BentoBox 1.1, called “Buri”. Buri is compatible with CDH 4.1 or CDH 4.2 and includes updates to all components in the Kiji framework.

Buri includes the following software:

  • KijiSchema 1.1.0
  • KijiSchema shell 1.1.0
  • KijiMR 1.0.0
  • KijiMR Library 1.0.0
  • Kiji Hive adapter 0.4.0
  • KijiREST 0.2.0
  • KijiScoring 0.3.0
  • KijiExpress 0.5.0

BentoBox also includes an examples directory with tutorials and examples that use the Kiji framework.

Below is an overview of updates made to each component. For full details on changes made, see the RELEASE_NOTES.txt files included with BentoBox.

Updates to KijiSchema
KijiSchema 1.1.0 includes several bug fixes, as well as changes for compatibility with HBase 0.94. New features include:

  • A new table layout version (“layout-1.2.0”) that allows users to control more HBase storage properties (like maximum HFile size and Bloom filters).
  • A new class, KijiRowKeyComponents, that can be used to specify keys to a KijiTableKeyValueStore without knowing the Kiji table’s row key format.

Updates to KijiMR and the KijiMR Library
This release includes KijiMR 1.0.0 and the KijiMR Library 1.0.0, the first stable API releases of these components. Future 1.x versions of KijiMR and the KijiMR Library will maintain compatibility with this version.

While developing the release candidates for version 1.0.0, we made several incompatible changes as we defined and focused the API. Please check the release notes for changes that are incompatible with previous “rc” releases. Email user@kiji.org if you need help upgrading your code to use version 1.0.0.

KijiMR includes several improvements:

  • A new InMemoryMapKeyValueStore that can be used to pass small amounts of data to producers, gatherers, etc.
  • The ability to bulk load entire directories of HFiles after running an import.
  • Improved buffering when writing to Kiji tables from MapReduce jobs.

Updates to KijiSchema shell
KijiSchema shell 1.1.0 includes several changes for compatibility with other components of the Kiji system. It also includes several new features:

  • The ability to create or remove Kiji instances using ‘CREATE INSTANCE’ and ‘DROP INSTANCE’ statements.
  • The ability to specify a number of initial regions for a table when creating it using a ‘CREATE TABLE’ statement.

Updates to KijiExpress
This release of KijiExpress includes several bug fixes and changes made for compatibility with other components of the Kiji system. It also includes the first features being developed for the KijiExpress Model Lifecycle, a system that can be used to define and execute the phases used in the development and deployment of real-time models. The code for the KijiExpress music tutorial, included with BentoBox, contains an example of using the Model Lifecycle to extract features from a Kiji table and apply them to a model.

Other new features include:

  • Users can pass Hadoop -D configuration flags from the command line when running jobs.
  • A tuple field can now be used to specify the qualifier of a map-type column family to be written to by a KijiExpress flow. Specify the tuple field when creating a KijiOutput.
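
For readers unfamiliar with Hadoop -D flags: they take the form -Dkey=value (or -D key=value) and set a Hadoop configuration property for the job. The sketch below is only an illustration of that flag format in plain Java. It is not KijiExpress or Hadoop code; the class name DFlagSketch, its parse method, and the --input argument are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: separates Hadoop-style -D flags from other arguments. */
public class DFlagSketch {

    /** Parsed result: configuration properties plus the untouched job arguments. */
    public static final class Parsed {
        public final Map<String, String> properties = new LinkedHashMap<>();
        public final List<String> remaining = new ArrayList<>();
    }

    public static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[i + 1];          // separated form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);     // joined form: -Dkey=value
            }
            int eq = (pair == null) ? -1 : pair.indexOf('=');
            if (eq > 0) {
                parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
                if (arg.equals("-D")) {
                    i++;                     // skip the value we just consumed
                }
            } else {
                parsed.remaining.add(arg);   // not a well-formed -D flag; pass through
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Hypothetical command line; --input is an assumed job-specific argument.
        Parsed p = parse(new String[] {"-Dmapred.reduce.tasks=4", "--input", "songs"});
        System.out.println(p.properties);
        System.out.println(p.remaining);
    }
}
```

In this sketch, the -D pairs end up in the properties map while everything else is passed through unchanged, which mirrors the usual split between configuration flags and job-specific arguments.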

As always, thank you to our users–especially those who have provided feedback or reported bugs to user@kiji.org–and to contributors who submitted patches. We are excited to develop software with and for you, and look forward to providing more useful improvements and tools in the future.