February 2023 – The Dataist

The Apache Drill PMC is pleased to announce a milestone release, Apache Drill 1.21. Since the last release, the team has been hard at work quashing bugs and improving overall functionality. The TL;DR includes the following:

New connectors, including Apache Iceberg, Delta Lake, Microsoft Access, Google Sheets, and Box
Efficient cross-cloud query capability
Greatly improved access controls, including user translation support for all storage plugins
Greatly improved query planning and implicit casting
New BI-focused SQL operators, including PIVOT, UNPIVOT, EXCEPT, and INTERSECT
New functions for computing regression lines and trends
New and updated date manipulation functions

Overall, Drill 1.21 is much more capable and stable than previous versions.
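
For readers who haven't used the set operators mentioned above, here is a minimal sketch of what EXCEPT and INTERSECT do. It is only an illustration: it runs against Python's built-in SQLite rather than Drill, and the table names (`q1_customers`, `q2_customers`) and their contents are made up for the example. Drill's own syntax and behavior may differ in the details.

```python
import sqlite3

# SQLite stands in for Drill here; the tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE q1_customers (name TEXT);
CREATE TABLE q2_customers (name TEXT);
INSERT INTO q1_customers VALUES ('acme'), ('globex'), ('initech');
INSERT INTO q2_customers VALUES ('globex'), ('initech'), ('umbrella');
""")


def names(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


# INTERSECT keeps only the rows that appear in both result sets.
retained = names(
    "SELECT name FROM q1_customers "
    "INTERSECT SELECT name FROM q2_customers ORDER BY name"
)

# EXCEPT keeps the rows from the first result set that are absent from the second.
churned = names(
    "SELECT name FROM q1_customers "
    "EXCEPT SELECT name FROM q2_customers ORDER BY name"
)

print(retained)
print(churned)
```

Both operators compare whole rows and remove duplicates, which is why they are handy for "who stayed" and "who left" style BI questions.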


I’ve had a number of conversations recently that highlighted how not understanding people’s assumptions can really hamper a conversation. I’m going to highlight questions from two of them: one with a VC and the other from the feedback on a grant for which we applied. My biggest frustration in all this is how wrong assumptions on both sides prevented deals from moving forward. I assumed that the other parties understood what I understood about SQL, which was wrong. The other parties’ assumptions about what you couldn’t do with SQL led them to assume other things about DataDistillr that were also wrong. In any event, it was the assumptions that bit us both.

Today’s SQL is not What You Learned in 1996

We applied for a US Government grant, which unfortunately we did not win. The feedback seemed to center on whether SQL was capable of dealing with multi-dimensional data. The reviewers seemed to think that this was either not possible or would be extremely difficult. Here’s where the assumptions hurt. I assumed that the reviewer would know that modern SQL tools already support multi-dimensional data structures. The reviewer assumed the opposite, based on their own understanding of SQL. The lesson for me was that I should have included more language explaining the current state of SQL. I wish, however, that we could have spoken with the reviewer and explained this.

Yes, SQL Supports Nested Data

Unfortunately, SQL hasn’t coalesced around a solid standard for this, but many SQL-based systems, including Drill, Spark, Presto, Postgres, and MySQL, support querying nested data using SQL.
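
To make that concrete, here is a small sketch of querying nested records with plain SQL. It uses the JSON functions in Python's built-in SQLite (`json_extract` and `json_each`) purely because they run anywhere. SQLite is not one of the systems listed above, the `events` table and its records are invented for the example, and each engine spells these operations differently, so treat this as an illustration of the idea rather than Drill syntax. It also assumes your SQLite build includes the JSON functions, which most recent ones do.

```python
import json
import sqlite3

# Hypothetical nested records: an object inside an object, plus an array.
records = [
    {"user": {"name": "ana", "city": "Lisbon"}, "tags": ["sql", "drill"]},
    {"user": {"name": "bo", "city": "Oslo"}, "tags": ["json"]},
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (doc TEXT)")
conn.executemany(
    "INSERT INTO events (doc) VALUES (?)",
    [(json.dumps(r),) for r in records],
)

# Reach into a nested field with a path expression.
names = conn.execute(
    "SELECT json_extract(doc, '$.user.name') AS name FROM events ORDER BY name"
).fetchall()

# Flatten the nested array so each tag becomes its own row.
flattened = conn.execute(
    "SELECT json_extract(e.doc, '$.user.name') AS name, t.value AS tag "
    "FROM events AS e, json_each(e.doc, '$.tags') AS t "
    "ORDER BY name, tag"
).fetchall()

print(names)
print(flattened)
```

The second query is the important one: the nested array is turned into ordinary rows, after which the usual joins, filters, and aggregates apply to it like any other table.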

Announcing Drill 1.21: New Connectors, Functions and Much Better Stability

It’s The Assumptions That Get You