The University - Computer and Information
Science
CIS Database Management Systems Fall 201
Assignment
Total points: 30
Implement the hash-join algorithm for a DBMS. Use C++ or Java programming language to write
the program. The program should include comments to make it readable. Turn in a brief report/
description about the program design and implementation (e.g., high-level program diagram and
data/file structures) and program usage; (2) the program source code; (3) proof of compilation (e.g.,
the screen snapshot of a successful compilation); and (4) sample execution outputs. Assemble the
above required contents in a Word or PDF file for submission.
The program specificiation is given as follows. Let R1(a1, a2, a3) and R2(b1, b2, b3, b4) be two realtions with all i nteger attributes. Tuples i n these two relations are sequentially stored in two data
files, respectively (use .txt files).
Use the hash-join algorithm to implement a join (equijoin) of R1 and R2. Assume that the
a hash function is f(k) = k mod N, where N is the number of buckets allowed in your hash
structure/table.
Your program should allow a user to choose the joining attributes from the two relations, i.e.,
performing
R1 ./R1.ai=R2.bj R2
for any chosen pair of ai and bj, where ai
are the i-th attribute in R1 (1 ≤ i ≤ 3) and bj is the j-
th attribute in R2 (1 ≤ j ≤ 4). For example, a user may want to perform R1 ./R1.a2=R2.b3 R2.
Your program should display the join result and output the selectivity of the join.
You may request a user to interactively input the necessary parameters, such as the data file
names for R1 and R2, the number of tuples in each relation, and the joining attributes (e.g.,
1 for the 1st attribute, 3 for the 3rd attribute).
Use your program to perform several joins for different relation instances of R1 and R2.
Get Free Quote!
348 Experts Online