DataBase Transactions

A collection of queries executed in a single run (Unit of work)

But why not a single query ? It’s impossible to have a single query doing everything for you, data are stored in separated tables even databases. So you will need to execute many different sql queries but treating them as if they were a single query.

example: Bank deposite

select balance of the giver
check balance >= amount to give
take money from the giver
give money to the beneficiary You can’t do all that things within a single transaction, right ?

Long transactions are not recommended in general, okay.

Transaction Life span

BEGIN
COMMIT OR ROLLBACK

typically in every database you need to mention that you are starting a brand new transaction, in postgres it’s simply BEGIN

the sql commands continues executing until reaching end of the transaction and COMMIT the changes unless something wrong happens in the middle so ROLLBACK.

Committing changes has tradeoffs: should I persist changes directly into desk or make the changes on the memory and then commit at the end when the COMMIT command executes.

Both are implemented in different database and both has pros and cons for example persisting changes into desk directly make the final commit much faster than writing things into memory and by the end moving everything to the desk in a single shot. Persistence into dest >>>faster>>> wiring into memory and commit later.

but there’s a catch what if a transaction has 10000000 queries and by the last query sth wrong happens, if the data are persisting into the desk after each query then the ROLLBACK will be very very expensive although it’s faster in COMMIT, but if the data are written into the memory then flush the memory that’s it.

So everything comes with tradeoffs, nothing wrong nothing right but choices.

Another critical case: what if when COMMIT sth wrong happened, that’s the worst thing as all queries already executed and finished. In case of persisting each query result the COMMIT is fast so this case can rarely happened but in case of memory the commit takes long to persist things into memory so chances that sth wrong happens os higher.

what if the database itself crashes, the database when back should notice the transaction isn’t committed and should ROLLBACK.

NOTE

For each case the database does lots of work for you, but each database implements things differently one can optimize for COMMITs like postgres and others for crashes.

TIP

Postgres persist every query result in desk before COMMIT, so COMMITs are fastest. lots of IO but the commits are fast.

Read only transaction

When you inform the database that’s a readonly transaction, it can make lots of optimizations. Generating reports is a very common case on readonly-transactions. You need consistency. You need a snapshot of the data at the time of transaction start so no other transaction modifies anything during reading data. (Snapshot isolation level)

WARNING

We are always in a transaction, you execute a single normal update query if you don’t start a transaction explicitly the database starts one for you and immediately commits it. (so transaction always exists even for a single simple query by the database if not by yourself)

M.K Hassan

Areas

Technical

Leetcode Problems

Projects

MLOPS Zoomcamp

LLM Zoomcamp

Inbox

Archive

DataBase Transactions

Transaction Life span

Read only transaction

Graph View

Table of Contents

Areas

Technical

Leetcode Problems

Projects

MLOPS Zoomcamp

LLM Zoomcamp

Inbox

Archive