In 2002, internet researchers wanted a better search engine, preferably an open source one. Doug Cutting and Mike Cafarella set out to give them what they wanted, and they called their project “Nutch.” Hadoop was originally designed as part of the Nutch infrastructure and was presented in 2005. The […]
Case Study: Deriving Spark Encoders and Schemas Using Implicits
Click to learn more about author Dávid Szakallas. In recent years, the size and complexity of our Identity Graph, a data lake containing identity information about people and businesses around the world, called for the addition of Big Data technologies in the ingestion process. We used Apache Pig initially, and then migrated to Apache Spark a […]
Why You Should Use Kubernetes to Deploy Monolithic Apps
Click to learn more about author Pete Johnson. A Linux shell is a Linux shell is a Linux shell. If you take that attitude, it opens up the possibility of running monolithic applications on Kubernetes. As more and more greenfield development shifts to a microservices, cloud-native architecture hosted on top of container clusters, every […]
Four Open Source Tools to Start Your Analytics Journey
Click to learn more about author Mark Hensley. As an adjunct instructor at the University of Redlands, I teach several courses in the fields of business and technology. In that capacity, I get to meet a lot of bright young people with a keen interest in data and Analytics. Unfortunately, the limited financial resources most students […]
The Knowledge Representation Corner: Procedural vs. Declarative
Click to learn more about author Adam Pease. Programmers who are new to ontology may be prone to think that any tool or language can be used to represent terms, definitions, and facts about the world. After all, as programmers, we’re used to solving problems in code and know that whether we use Perl, C++, Java, […]