Category Archives: Programming

On Not Understanding Data Abstraction, Revisited

We have a technical reading group (aka ‘Nerd Club’) at work. Last week we tackled the paper On Understanding Data Abstraction, Revisited by William Cook. We learned about it through a talk by Kevlin Henney that takes its title from a clause in the paper: It Is Possible to Do Object-Oriented Programming in Java. I really enjoyed reading the paper, but fear I’ve failed to understand important parts of it. Failing to understand isn’t so unusual for me, particularly for papers that have more than a tiny bit of theory, but this one has me frustrated because I’m sure I’m close.

The paper shows that there are two common forms of data abstraction, abstract data types (ADTs) and objects, and that they are different and complementary. The final sentence reads:

Understanding the fundamental differences between objects and ADTs can help in choosing to use them wisely.

and so my goal is to understand those fundamental differences. The paper first characterises the two kinds of abstraction somewhat theoretically, and then talks about their implementation in practical programming languages: Java, Haskell and Smalltalk.

What I understood

I haven’t totally failed to understand. It’s a good paper and there’s loads of great stuff that I can say I got:

Continue reading On Not Understanding Data Abstraction, Revisited

SQL Set Intersection – harder than I thought?

I’ve heard it said that SQL is a set-based language, and that thinking in sets is the way to make proper use of it. So I thought I was on solid ground when I decided to represent a bag of sets as a relation:

CREATE TABLE bag_of_sets (
  set_id INTEGER NOT NULL,
  member CHAR(1) NOT NULL,
  CONSTRAINT enforce_set PRIMARY KEY (set_id, member)
);

As far as I know that’s a good, fully-normalized representation. As intersection is such a fundamental operation on sets, I expected to find a natural way to express it in SQL. But this is the best I can manage:

-- Intersection
SELECT
    member
  FROM bag_of_sets
  GROUP BY member
  HAVING COUNT(*) = (SELECT COUNT(DISTINCT set_id) FROM bag_of_sets);

Ugly, isn’t it? How long does it take you to realise what it’s trying to do? And then, how long does it take you to be sure it’s correct? The intention is hidden amongst lots of implementation detail (a thought I often have about the SQL I write). And having to mention bag_of_sets twice seems so wrong. Have I missed something better? Continue reading SQL Set Intersection – harder than I thought?

Pure SQL put-if-absent (or insert-if-not-exist)

I learned a little trick from a colleague this week: a pure SQL put-if-absent operation. I needed a database patch to insert a couple of rows into a database table. Sounds dull, but gets a little more interesting because I was trying to pay off a small technical debt: these rows had already been added to the production database. I needed an idempotent patch, one that would add the rows to any development database which doesn’t have them, but that wouldn’t give an error on the production database, which does.

It would have been reasonably simple to write in a procedural language (we use PostgreSQL, so it would have been PL/pgSQL), but for that I’d need to write a function, then call that function from a SELECT statement, then delete the function. But that all seems a bit messy.

To illustrate the SQL alternative we found, we’ll need a table to update. Here’s a playpen we can experiment in:
Continue reading Pure SQL put-if-absent (or insert-if-not-exist)