Friday, October 24, 2014

Kylin - Next Gen Open Source OLAP Engine for Big Data

I just came across the news that eBay has released its distributed analytics engine, Kylin (http://kylin.io), to the open-source community. The release covers not only the core code base but also the shell client, RPC server, job scheduler and related tools. This means a complete toolset for SQL access and multi-dimensional analysis (OLAP) on Hadoop is now freely available for extremely large datasets.

Regarding query latency, Kylin claims to bring queries over 10+ billion rows of data on Hadoop down to sub-second level, better than Hive queries on the same dataset. By comparison, mainstream open-source distributed database management systems (OSDDMS) such as Cassandra, with its SEDA-based architecture, have to let a request hop between multiple thread pools during processing, which increases latency. That overhead could still be reduced with lightweight threads such as Kilim, a more efficient executor service, or an entirely new approach.

For standards compatibility, Kylin supports most ANSI SQL query functions through its ANSI SQL on Hadoop interface.

Kylin also integrates seamlessly with BI tools such as Tableau and with other third-party applications.

With features such as MOLAP cube queries, Kylin lets users define a data model and pre-build cubes from it, supporting more than 10 billion raw data records.
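
To make the pre-build idea concrete, here is a minimal Python sketch of MOLAP-style pre-aggregation. It is not Kylin's code or API: the function names (build_cube, query), the sample rows and the "sales" measure are all made up for illustration, and everything runs in memory instead of on Hadoop. The point is only that totals for every combination of dimensions are computed once up front, so a later query becomes a lookup.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data; a real cube would be built from billions of rows.
ROWS = [
    {"site": "US", "category": "Toys", "sales": 10},
    {"site": "US", "category": "Books", "sales": 5},
    {"site": "DE", "category": "Toys", "sales": 7},
]
DIMENSIONS = ("site", "category")


def build_cube(rows, dimensions, measure):
    """Pre-aggregate `measure` for every subset of `dimensions`."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query(cube, dimensions, filters):
    """Answer an equality-filter query with a lookup, no scan of the raw rows."""
    dims = tuple(d for d in dimensions if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0)


cube = build_cube(ROWS, DIMENSIONS, "sales")
us_total = query(cube, DIMENSIONS, {"site": "US"})
toys_total = query(cube, DIMENSIONS, {"category": "Toys"})
grand_total = query(cube, DIMENSIONS, {})
```

With two dimensions the sketch stores four aggregated tables (one per subset of dimensions). The trade-off is the usual one for MOLAP: more storage and build time in exchange for fast reads.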

Apart from MOLAP, the next generation of Kylin will provide hybrid OLAP (HOLAP), which combines real-time or near-real-time results with historical results behind a single entry point for front-end queries, to support business decisions.

Furthermore, Kylin provides compression and encoding to reduce storage.
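
The announcement does not say which encodings are used, so the following is only a generic illustration of how encoding can cut storage. It sketches dictionary encoding, a common columnar technique, under the assumption that a dimension column has few distinct values; the function names and the sample column are hypothetical.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code."""
    dictionary = {}
    encoded = []
    for value in values:
        if value not in dictionary:
            # Codes are assigned in order of first appearance.
            dictionary[value] = len(dictionary)
        encoded.append(dictionary[value])
    return dictionary, encoded


def dictionary_decode(dictionary, encoded):
    """Recover the original values from the codes."""
    reverse = {code: value for value, code in dictionary.items()}
    return [reverse[code] for code in encoded]


# Hypothetical dimension column with many repeats.
column = ["US", "DE", "US", "US", "FR", "DE"]
dictionary, encoded = dictionary_encode(column)
decoded = dictionary_decode(dictionary, encoded)
```

Each repeated string is stored once in the dictionary, and the column itself shrinks to a list of small integers that compress well.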

Business units at eBay have been running Kylin in production for some time. They have analyzed 12+ billion source records, generating 14+ TB of cubes. 90% of queries return in less than 5 seconds, without resorting to Hive queries or shell commands.
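
For readers unfamiliar with this kind of figure, a 90% latency number is usually read as a 90th percentile: the value that 90% of queries stay at or below. The sketch below computes it with the nearest-rank method. The latency samples are invented for illustration and are not eBay's measurements.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical query latencies in seconds.
latencies = [0.4, 0.8, 1.2, 1.9, 2.3, 2.7, 3.1, 3.8, 4.6, 9.5]
p90 = percentile(latencies, 90)
p50 = percentile(latencies, 50)
```

In this made-up sample the slowest query takes 9.5 seconds, yet the 90th percentile is 4.6 seconds, which shows why a percentile says more about typical experience than a maximum does.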

Slideshow: http://slidesha.re/1wilDxC

News release: http://www.ebaytechblog.com/2014/10/20/announcing-kylin-extreme-olap-engine-for-big-data/#.VEnOb4uUcQ7


