How to Insert and Update in a Single Query with Postgres?
There are various DB operations that can require multiple SQL queries. Suppose you have two tables called banned_users
and users
. The banned_users
table has an AUTOINCREMENT
column called id
, which is referenced in the users
table via banned_id
. So whenever a user is banned, an entry is created in the banned_users
table and the users
table is also updated. Why create a separate table for bans? The schema ensures we store all records of bans (like, when it was done and by whom). And also make sure they are reversible (if the user is unbanned simply set banned_id
to NULL
).
Here’s what a program in Node.js would look like to do the bans:
function banUser(userId) {
const banUserTableEntry = await db.queryOne(`
INSERT INTO banned_users (user_id)
VALUES($1)
RETURNING id
`, [userId])
return db.queryOne(`
UPDATE users SET banned_id = $1 WHERE id = $2
`, [banUserTableEntry.id, userId])
}
You need a query to do the insert, then use the value of AUTOINCREMENT
column to update the users
table. What if we want to do this in a single query? We can eliminate the side of code to store the incremented id’s value.
Fortunately, CTEs makes it possible. They are a powerful tool to simplify queries or do a bunch of operations in a single query like we want to do now. Here’s how:
function banUser(userId) {
return db.queryOne(`
WITH banned_record AS (
INSERT INTO banned_users (user_id)
VALUES ($1)
RETURNING id
)
UPDATE users
SET deletion_id = (SELECT id FROM banned_record)
WHERE id = $1;
`, [userId]);
}
A similar query can be used to do multiple inserts, which can come in handy to initialize a newly created user. For eg,
async function initUser(email, password) {
return db.queryOne(`
WITH user AS (
INSERT INTO users (email, crypt($2, gen_salt('md5')))
VALUES ($1)
RETURNING id
), email_settings AS (
INSERT INTO email_settings (user_id, marketing_emails)
VALUES ((SELECT id FROM user), true)
)
INSERT INTO users_meta (user_id, value_json)
VALUES ((SELECT id FROM user), '{}')
`, [userId]);
}
If your requirement is simple though and you don’t need to use sequence values, a simpler way to run multiple operations would be to use transactions.
BEGIN;
statement_1;
statement_2;
...
COMMIT;
But is it really better versus storing results in a variable at the application level? Performance-wise, it won’t really make a significant difference. Code-readability wise, if you’re programming language allows something like async-await, then two queries might be easier to read. In some cases though having a single query makes the code more succinct and could be a good choice.