JSON pattern matching with sed, perl and regular expressions

Why VIM? Sooner or later there comes the day when your easy-to-use IDE becomes useless for handling huge files. There aren’t many editors capable of working with very large files, like production logs for instance. I’ve recently had to analyze a 100 MB one-line JSON file and once more VIM saved the day. VIM, like many other Unix utilities, is both tough and brilliant. Git interactive rebase uses VIM by default, and if you’re still not convinced, maybe this great article will make you change your mind.

MongoDB and the fine art of data modeling

Introduction This is the third part of our MongoDB time series tutorial, and this post emphasizes the importance of data modeling. You might want to check the first part of this series to get familiar with our virtual project requirements, and the second part, which covers common optimization techniques. When you first start using MongoDB, you’ll immediately notice its schema-less data model. But schema-less doesn’t mean skipping proper data modeling (satisfying your application’s business and performance requirements). As opposed to a SQL database, a NoSQL document model is more focused towards…

A beginner’s guide to MongoDB performance turbocharging

Introduction This is the second part of our MongoDB time series tutorial, and this post is dedicated to performance tuning. In my previous post, I introduced you to our virtual project requirements. In short, we have 50M time events, spanning from the 1st of January 2012 to the 1st of January 2013, with the following structure:

We’d like to aggregate the minimum, the maximum, and the average value, as well as the entry count, for the following discrete time samples:

- all seconds in a minute
- all minutes in an hour
- all…

MongoDB time series: Introducing the aggregation framework

In my previous posts I talked about batch importing and the out-of-the-box MongoDB performance. Meanwhile, MongoDB was awarded DBMS of the year 2013, so I decided to offer a more thorough analysis of its real-life usage. Because a theory is better understood in a pragmatic context, I will first present our virtual project requirements.

Introduction Our virtual project has the following requirements:

- it must store valued time events represented as v=f(t)
- it must aggregate the minimum, maximum, average and count records by:
  - seconds in a minute
  - minutes in an hour…
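To make the aggregation requirement concrete, here is a minimal sketch in plain Python rather than the MongoDB aggregation framework the post goes on to introduce. It assumes each event is a (timestamp, value) pair, which is only one reading of v=f(t); the function name `aggregate` and the `bucket_seconds` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def aggregate(events, bucket_seconds):
    """Group (timestamp, value) pairs into fixed-size time buckets and
    return the min, max, average and count of the values in each bucket."""
    buckets = defaultdict(list)
    for ts, value in events:
        epoch = int(ts.timestamp())
        # Round the timestamp down to the start of its bucket.
        start = epoch - epoch % bucket_seconds
        buckets[start].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "count": len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical sample data: three events spread over two minutes.
events = [
    (datetime(2012, 1, 1, 0, 0, 0, tzinfo=timezone.utc), 1.0),
    (datetime(2012, 1, 1, 0, 0, 30, tzinfo=timezone.utc), 3.0),
    (datetime(2012, 1, 1, 0, 1, 5, tzinfo=timezone.utc), 5.0),
]
per_minute = aggregate(events, 60)
```

Passing `bucket_seconds=1` would give one bucket per second and `bucket_seconds=60` one bucket per minute. A database-side aggregation avoids pulling every event into application memory, which is why this sketch is only useful for reasoning about the expected results.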

A beginner’s guide to ACID and database transactions

Introduction Transactions are omnipresent in today’s enterprise systems, providing data integrity even in highly concurrent environments. So let’s get started by first defining the term and the context where you might usually employ it. A transaction is a collection of read/write operations succeeding only if all contained operations succeed. Inherently, a transaction is characterized by four properties (commonly referred to as ACID):

- Atomicity
- Consistency
- Isolation
- Durability
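The "all or nothing" part of that definition can be illustrated with a small sketch using Python's built-in `sqlite3` module. The `account` table, the `transfer` function and the sample balances are hypothetical and exist only for this example; the sketch demonstrates atomicity alone, not the other three properties.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from one account to another as a single transaction.
    Returns True if the transfer was committed, False if it was rolled back."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Failing here undoes both updates above.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as an {id: balance} dictionary."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


# Hypothetical in-memory database with two accounts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

With this setup, a transfer the source account can cover is committed as a whole, while one that would overdraw it leaves both balances exactly as they were, because the debit and the credit are rolled back together.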