Tag Archives: SQL

SQL Set Intersection – harder than I thought?

I’ve heard it said that SQL is a set-based language, and that thinking in sets is the way to make proper use of it. So I thought I was on solid ground when I decided to represent a bag of sets as a relation:

CREATE TABLE bag_of_sets (
  set_id INTEGER NOT NULL,
  member CHAR(1) NOT NULL,
  CONSTRAINT enforce_set PRIMARY KEY (set_id, member)
);

As far as I know that’s a good, fully-normalized representation. As intersection is such a fundamental operation on sets, I expected to find a natural way to express it in SQL. But this is the best I can manage:

-- Intersection
SELECT
    member
  FROM bag_of_sets
  GROUP BY member
  HAVING COUNT(*) = (SELECT COUNT(DISTINCT set_id) FROM bag_of_sets);

Ugly, isn’t it? How long does it take you to realise what it’s trying to do? And then, how long does it take you to be sure it’s correct? The intention is hidden amongst lots of implementation detail (a thought I often have about the SQL I write). And having to mention bag_of_sets twice seems so wrong. Have I missed something better? Continue reading SQL Set Intersection – harder than I thought?

Pure SQL put-if-absent (or insert-if-not-exist)

I learned a little trick from a colleague this week: a pure SQL put-if-absent operation. I needed a database patch to insert a couple of rows into a database table. Sounds dull, but gets a little more interesting because I was trying to pay off a small technical debt: these rows had already been added to the production database. I needed an idempotent patch, one that would add the rows to any development database which doesn’t have them, but that wouldn’t give an error on the production database, which does.

It would have been reasonably simple to write in a procedural language (we use PostgreSQL, so it would have been PL/pgSQL), but for that I’d need to write a function, then call that function from a SELECT statement, then delete the function. But that all seems a bit messy.

To illustrate the SQL alternative we found, we’ll need a table to update. Here’s a playpen we can experiment in:
Continue reading Pure SQL put-if-absent (or insert-if-not-exist)