Pilosa Launches Breakthrough Open Source Software to Dramatically Accelerate Data Queries

by Angela Guess

According to a new press release, “Pilosa, an open distributed bitmap index, today launched into public beta. Pilosa decouples the index from data storage and optimizes it for massive scale. The result is dramatically accelerated query speeds across multiple, massive data sets. Pilosa is available today on GitHub. Pilosa solves a fundamental problem in data science. The volume of enterprise data has grown faster than Moore’s Law, yet the speed at which we can read it has stagnated. Despite several years of major advances in databases, the technology that retrieves data has gone untouched and read speeds have lagged far behind write speeds. Pilosa’s technology addresses this problem head-on, dramatically speeding up both queries to existing databases and the process of joining data from multiple stores. ‘The next wave of scientific breakthroughs will come from research projects that work with datasets of a terabyte or more,’ said Higinio (H.O.) Maycotte, CEO of Pilosa. ‘We know how to store that data, but nobody has focused on accelerating access to that data. That changes today. Our commitment to open source ensures that this fundamental problem is solved once and for all’.”

The release goes on, “Because Pilosa is a bitmap index, it is relatively small in volume and runs in-memory rather than on disk. The first version includes production-tested features including single and multi-node index support, replication, algorithm plugins, a data importer, and basic cluster management. There are eight patents in the first version alone. The software helps data scientists and engineers make sense of multiple, massive data sets without purchasing more hardware and without hours-long batch job wait times. Benchmark tests indicate Pilosa queries [are] consistently fast even at high volumes and without increasing complexity or processing rigor. No test exceeded 1.8 seconds and most queries were returned in fractions of a second. A simple query can traverse more than 2 billion edges in one second on commodity cloud hardware, approximating speeds only seen when leveraging expensive hardware such as GPUs.”
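For readers unfamiliar with the term, the general idea of a bitmap index can be sketched in a few lines of Python. This is a toy illustration only, not Pilosa's implementation or API: the `BitmapIndex` class, its method names, and the attribute labels are all hypothetical, and a Python integer stands in for each bitmap. It shows why such an index stays compact (one bit per record per attribute) and why combining conditions is cheap (a single bitwise AND).

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index: one bitmap (stored as a Python int) per attribute.

    Hypothetical illustration; not Pilosa's data model or API.
    """

    def __init__(self):
        # Maps an attribute label to an integer whose set bits are record IDs.
        self.bitmaps = defaultdict(int)

    def set_bit(self, attribute, record_id):
        """Mark record_id as having the given attribute."""
        self.bitmaps[attribute] |= 1 << record_id

    def intersect(self, *attributes):
        """Return sorted record IDs that have every listed attribute."""
        if not attributes:
            return []
        result = self.bitmaps.get(attributes[0], 0)
        for attribute in attributes[1:]:
            result &= self.bitmaps.get(attribute, 0)  # bitwise AND of bitmaps
        return self._to_ids(result)

    @staticmethod
    def _to_ids(bitmap):
        """Convert a bitmap back into a list of record IDs."""
        ids = []
        position = 0
        while bitmap:
            if bitmap & 1:
                ids.append(position)
            bitmap >>= 1
            position += 1
        return ids


# Usage: the index holds only bits; the records themselves live elsewhere.
index = BitmapIndex()
for record_id in (0, 2):
    index.set_bit("plan:pro", record_id)
for record_id in (2, 3):
    index.set_bit("region:eu", record_id)

matches = index.intersect("plan:pro", "region:eu")
print(matches)  # [2]
```

Because the index stores only record IDs as bits, it can sit apart from the underlying data store, which is the decoupling the release describes. Production systems typically use compressed bitmap formats rather than plain integers.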

Read more at PRweb.

Photo credit: Pilosa
