# Distribution, commutation and association in database queries

Standard

Image via Wikipedia

Another reminder note.

The RDBMS query engine relies on these laws to build efficient queries:

Commutation:

Numbers may travel (commute)

$a + b = b + a$ or in relational algebra $A \Join B = B \Join A$

Association:

Numbers may freely associate

$(a + b) + c = a + (b + c)$ or in relational algebra $(A \Join B) \Join C = A \Join (B \Join C)$

Distribution:

Operators can be distributed

$3 \times (4 + 5) = 3 \times 4 + 3 \times 5$ or, in relational algebra $\sigma(A \Join B) = \sigma(A) \Join \sigma(B)$

A more general statement of these laws:

Commutative law: $f(a, b) = f(b, a)$
Associative law: $f(a, f(b, c)) = f(f(a, b), c)$
Distributive law: $f(g(a,b)) = g(f(a,b))$

Date has an excellent discussion of all this on pp 124 – 127 of SQL and Relational Theory: How to Write Accurate SQL Code

# Solving a relational query: part 5

Standard

Found: the relational algebra to tackle the problem outlined in part 3.

This is based on a solution to a similar problem found in Database Management Systems, Third Edition, though, unfortunately, that book prints a solution that is not relational algebra at all, as it relies on the order in which results are returned (relational algebra is a form of set algebra and sets have no order).

$\rho (TEMP1, \rho (BIDDER\_ID \rightarrow USER\_ID, \sigma _{amount > 100} (BID)) \Join USER \Join LOT)$

(Rename the attributes in TEMP1 – this is the step that is broken in the book)

$\rho (name \rightarrow TEMP1\_name, TEMP1)$

$\rho(TEMP2, TEMP1 \times ( \rho(BIDDER\_ID \rightarrow USER\_ID, BID) \Join USER \Join LOT ))$

$\Pi _{NAME, EMAIL} ( \sigma _{TEMP1\_LOT\_ID=LOT\_ID \wedge TEMP1\_USER\_ID=USER\_ID \wedge TEMP1\_BID\_ID \neq BID\_ID} (TEMP2))$

# Solving a relational query: part 4

Standard

As Paul Rubin pointed out the SQL in the last post was broken – here’s some that, I think (please do not let it be wrong a second time!), by relying on SQL’s coercion of a tuple to a scalar, would work (Paul proposed an even more ambitions way of doing this in a single join):

SELECT DISTINCT U.NAME, U.EMAIL
FROM USER U, BID B, LOT L
WHERE U.USER_ID = B.BIDDER_ID AND
B.AMOUNT > 100 AND B.LOT_ID = L.LOT_ID
AND (SELECT COUNT(*) FROM
BID, LOT WHERE
BID.LOT_ID = B.LOT_ID AND BID.BIDDER_ID = U.USER_ID
AND BID.LOT_ID = LOT.LOT_ID) > 1

Still no clue as to how to do this with relational algebra though.

# Solving a relational query: part 3

Standard

Image via Wikipedia

This is starting to depress me, now, because I thought I was full on top of all this material until I tried this question.

So, going back to the database specified in part 1 – I now need to find the relational algebra (not the SQL, I have done that) for: “find the name and email of any user who has made multiple bids for a lot including a bid exceeding £100”.

Various ideas have crossed my mind and some bits are easy eg $USER \Join \sigma _{AMOUNT > 100} \rho (BIDDER\_ID -> USER\_ID,\ BID)$ gets a relation from which we coul;d extract users who bid more than £100 for anything, but how can I get it all to string together.

I think the SQL that works is:

SELECT DISTINCT U.NAME, U.EMAIL
FROM USER U, BID B, LOT L
WHERE U.USER_ID = B.BIDDER_ID AND
B.AMOUNT > 100 AND B.LOT_ID = L.LOT_ID
AND L.LOT_ID IN (
SELECT COUNT(*) > 1 FROM
LOT, BID WHERE
LOT.LOT_ID = L.LOT_ID AND BID.LOT_ID = LOT.LOT_ID
AND BID.BIDDER_ID = U.USER_ID)

# Solving a relational query: part 2

Standard

Image via Wikipedia

With thanks to Professor Paul A. Rubin, I think this may be the way to solve this:

This joins LOT to a relation with one attribute (LOT_ID) made up from tuples where LOT_ID should only be returned where no bids have been made that are equal or greater than the reserve price.

Very difficult part question for an exam, though😦

# Solving a relational query: part 1

Standard

I have not found the answer to this, so thought I’d go through the steps.

Assume we have a relational database with these tables:

USER (USER_ID, NAME, EMAIL)
LOT (LOT_ID, SLLER_ID, TYPE, DESCRIPTION, RESERVE_PRICE, START_DATE, END_DATE)
BID (BID_ID, LOT_ID, BIDDER_ID, BID_DATE, AMOUNT)
FEE (LOT_ID, AMOUNT)

What is the relational algebra that finds “the description and reserve price of any lot where all bids have been less than the reserve price”.

I have been struggling with this and came up with this:

$\Pi_{(DESCRIPTION,RESERVE\_PRICE)}\ \sigma_{(\Pi_{(LOT\_ID=(\Pi_{LOT\_ID}(LOT\Join BID)\ -\ \Pi_{LOT\_ID}(LOT\Join BID-\sigma_{(AMOUNT

But that will not work as will return any lot where a bid has been made for less than the reserve price.

I am sure this is a question of relational division but have not yet been able to get my mind round the right way to formulate it.

# Looks like a great book

Standard

Image via Wikipedia

My copy of C. J. Date‘s SQL and Relational Theory: How to Write Accurate SQL Code turned up today – I have only read one chapter so far but already I have a feeling that it is going to have been worth every penny.

Date is concerned not to write another SQL primer but to give those with some SQL experience a good grounding in the relational model and the set theory on which it is based. It just feels more like a proper scientific/mathematical text book and it also seems to be well written.

Very pleased with the purchase and will keep you all posted on how it goes: at the moment I am just wishing I had bought it months ago.