NoSQL databases (aka non-relational databases) come with both advantages and disadvantages. On the plus side, they are more scalable than traditional relational databases and can store a variety of formats. Additionally, they are easy to use, and their flexibility can speed up development, especially in a cloud computing environment. NoSQL databases were developed as a […]
Case Study: Deriving Spark Encoders and Schemas Using Implicits
Click to learn more about author Dávid Szakallas. In recent years, the size and complexity of our Identity Graph, a data lake containing identity information about people and businesses around the world, called for adding Big Data technologies to the ingestion process. We used Apache Pig initially, and then migrated to Apache Spark a […]
A Database Reverse Engineering Case Study
Click here to learn more about author Michael Blaha. In a blog last year we discussed database archaeology, which is another name for database reverse engineering. Database reverse engineering is the inverse of normal development. We start with an application and work backwards to understand the software and infer its content. This month we’ll take a […]
Ten Reasons Why Developers Ignore Data Models — Revisited
Click here to learn more about author Michael Blaha. Here is a follow-up on my blog from several months ago on “Ten Reasons Why Developers Ignore Data Models.” Paraphrasing, reader Lawrence Hecht asked via Twitter if there is any way to tell from a schema if the developers had used a data model. A […]