Imply, the company founded by the original creators of Apache Druid®, today unveiled at a virtual event the first milestone in Project Shapeshift, the 12-month initiative designed to solve the most pressing issues developers face when building analytics applications. The announcement includes a cloud database service built from Apache Druid and the private preview of a multi-stage query engine for Druid. Together, these innovations show how Imply delivers the most developer-friendly and capable database for analytics applications.
Developers are increasingly at the forefront of analytics innovation, driving an evolution in analytics beyond traditional BI and reporting to modern analytics applications. These applications, fueled by the digitization of businesses, are being built for real-time observability at scale for cloud products and services, next-gen operational visibility for security and IT, revenue-impacting insights and recommendations, and the extension of analytics to external customers. Apache Druid has been the database of choice for analytics applications, trusted by developers at more than 1,000 companies, including Netflix, Confluent and Salesforce.
“Today, we are at an inflection point with the adoption of Apache Druid as every organization now needs to build modern analytics applications,” said Fangjin Yang, CEO and co-founder, Imply. “This is why it’s now time to take Druid to the next level. Project Shapeshift is all about making things easier for developers, so they can drive the analytics evolution inside their companies.”
As developers turned to Apache Druid to power interactive data experiences on streaming and batch data with limitless scale, Imply saw a tremendous opportunity to simplify the end-to-end developer experience and extend the Druid architecture to power more analytics use cases for applications from a single database.
Real-Time Database as a Service Built from Apache Druid
Building analytics applications involves operational work for software development and engineering teams across deployment, database operations, lifecycle management and ecosystem integration. For databases, cloud database services have become the norm because they remove the burden of infrastructure, from cluster sizing to scaling, and shift the consumption model to pay-as-you-use.
Imply Polaris, however, is a cloud database service reimagined from the ground up to simplify the developer experience for analytics applications end-to-end. Much more than a cloud-hosted version of Apache Druid, Polaris adds automation and intelligence that deliver the performance of Druid without requiring Druid expertise, and it provides a complete, integrated experience that simplifies everything from streaming to visualization. Specifically, Polaris introduces:
- Fully-Managed Cloud Service – Developers can build modern analytics applications without needing to think about the underlying infrastructure. No more sizing and planning required to deploy and scale the database. Developers can start ingesting data and building applications in just a few minutes.
- Database Optimization – Developers get all the performance of Druid they need without turning knobs. The service automates configurations and tuning parameters and includes built-in performance monitoring that ensures the database is optimized for every query in the application.
- Single Development Experience – Developers get a seamless, integrated experience to build analytics applications. A built-in, push-based streaming service via Confluent Cloud and a visualization engine, both integrated into a single UI, make it simple to connect to data sources and build rich, interactive applications.
“Polaris is built on the core tenets of Apache Druid (flexibility, efficiency and resiliency) and packages them into a cloud service that deploys instantly, scales effortlessly and doesn’t require any Druid expertise, enabling any developer to build modern analytics applications,” said Jad Naous, chief product officer, Imply.
“We chose Apache Druid to power our analytics application to get real-time traffic visibility across one of the world’s largest global tier-1 IP backbones,” said Paolo Lucente, big data architect at the global IP network division of NTT Ltd. “We are looking forward to deploying Imply Polaris to continue to get the interactivity we need on a simple cloud-based service without having to worry about maintenance.”
Evolving the Druid Architecture
From its inception, Druid has uniquely enabled developers to build highly interactive and concurrent applications at scale, powered by a query engine built for always-on applications with sub-second performance at TB to PB+ scale. Increasingly, however, developers need data exports, reporting and advanced alerting included with their applications, requiring additional data processing systems to deploy and manage.
Today, Imply introduces a private preview of a multi-stage query engine, a technical evolution for Druid that reinforces its leadership as the most capable database for analytics applications. The multi-stage query engine—in conjunction with the core Druid query engine—will extend Druid beyond interactivity to support the following new use cases in a single database platform:
- Druid for Reporting – Improved ability to handle long-running, heavyweight queries to give developers a single database for powering applications that require both interactivity and complex reports or data exports. Cost-control capabilities make these heavyweight queries affordable.
- Druid for Alerting – Building on Druid’s longstanding capability to combine streaming and historical data, the multi-stage query engine enables alerting across a large number of entities with complex conditions at scale.
- Simplified and More Capable Ingestion – Druid has always provided very high concurrency and very fast queries across large data sets. Using the same SQL that Druid already supports for queries, the new multi-stage query engine enables simplified ingestion from external storage, including HDFS, Amazon S3, Azure Blob Storage and Google Cloud Storage, with in-database transformation. This makes data ingestion easy without giving up any of Druid’s power to enable interactive conversations in modern data analytics applications.
“The multi-stage query engine represents the most significant evolution of Druid, an expansion of the architecture that makes it unparalleled in the industry,” said Gian Merlino, co-founder/CTO of Imply and Apache Druid PMC chair. “It brings both flexibility and ease to the developer experience. I’m excited that the entire open source community will be able to take full advantage of it.”