Stage 9 : Selection and Projection on Relations (12 hours)
- Complete the implementation of SELECT and PROJECT operations in NITCbase
Introduction
In previous stages, you had implemented linear search on relations and a rudimentary version of the SELECT command to select records from a relation. In this stage, we will complete the implementation of the select operation and the project operation.
As discussed earlier, a SELECT operation in relational algebra involves fetching all the records that satisfy a given condition. Our previous implementation would select records from a relation and print them to the console. The actual NITCbase specification defines the SELECT operation as selecting records from a relation that satisfy a specific condition and writing those records into a newly created relation of a specified name. Since we have now implemented relation creation, we can finish our implementation of the SELECT * FROM TABLE WHERE command.
A PROJECT operation on a relation is used to pick a subset of columns from the relation. In NITCbase, doing a PROJECT operation on a relation would result in the creation of a new relation with a subset of the attributes of the source relation. The required attributes will be picked from each record and inserted into the new relation.
Once you implement the select()
and project()
operations in the Algebra Layer, you will be able to add the following commands to your NITCbase using a combination of the two functions.
Frontend User Interface Command | Operation |
---|---|
SELECT * FROM RelName INTO TargetName WHERE Attribute op value | selection |
SELECT * FROM RelName INTO TargetName | projection (clone relation) |
SELECT Attr1,Attr2 FROM RelName INTO TargetName | projection |
SELECT Attr1,Attr2 FROM RelName INTO TargetName WHERE Attr op value | selection and projection |
Implementation
A sequence diagram documenting the call sequence involved in a call to the SELECT AttrList FROM TABLE WHERE command is shown below. The calls to the Cache Layer and Buffer Layer are omitted for the sake of clarity.
NOTE: The functions are denoted with circles as follows.
🔵 -> methods that are already in their final state
🟢 -> methods that will attain their final state in this stage
🟠 -> methods that we will modify in this stage, and in subsequent stages
🟤 -> methods that we built earlier and require more work later, but will leave as is in this stage
A class diagram highlighting the methods relevant to this stage is shown below.
In the Block Access Layer, we implement the project()
function. This function is used to fetch one record of the relation. Each subsequent call would return the next record until there are no more records to be returned. Similar to the linearSearch()
function you implemented earlier, project()
makes use of the searchIndex
in the relation cache to keep track of the last
NOTE: Even though this function is named
project()
, it does not do a PROJECT operation. It effectively serves as a way to access the next record from a relation. The actual projection of the record to a subset of its columns is implemented in the Algebra Layer. This function is namedproject()
only due to the fact that it serves as a helper function for theAlgebra::project()
function.
BlockAccess/BlockAccess.cpp
Implement the BlockAccess::project()
function by looking at the algorithm given in the design docs.
In the Algebra Layer, you had already implemented part of the select()
function in previous stages. In this stage, you will modify the function to create a new relation and insert the selected records into the new relation.
We also add two new overloaded functions project(srcRel, targetRel)
and project(srcRel, targetRel, numAttrs, attrs)
which are responsible for the SELECT * FROM TABLE and SELECT AttrList FROM TABLE commands respectively. The SELECT AttrList FROM TABLE WHERE also makes use of the project(srcRel, targetRel, numAttrs, attrs)
function. Note that the project(srcRel, targetRel)
function is used to create a copy of the source relation into the target relation.
Algebra/Algebra.cpp
Implement the following functions looking at their respective design docs
Algebra::select()
.
NOTE: The definition of the above function involves a call to theAttrCacheTable::resetSearchIndex()
function before the call to theBlockAccess::search()
function. This call is only relevant when we are doing a B+ tree search. Since we have not yet implemented indexing, this call can be omitted. The rest of the design remains the same.Algebra::project(srcRel, targetRel)
Algebra::project(srcRel, targetRel, numAttrs, attrs)
Finally, in the Frontend Programming Interface, we update the handlers of the functions to call the respective Algebra Layer methods.
Contrary to what we are used to, the implementation of the Frontend::select_attrlist_from_table_where()
function involves more than just a call to a lower layer method. Since this operation is a combination of both selection and projection, it requires calls to both the corresponding methods.
The function implementation involves creating an intermediate relation which holds the result of one of the operations. The second operation is done on this intermediate relation, and the result gives us the required target relation. The intermediate relation is then deleted. NITCbase reserves the name of this intermediate relation as .temp
(available to you as the constant TEMP).
Frontend/Frontend.cpp
Implement the following functions looking at their respective design docs
Exercises
Q1. Create a relation Toys(id NUM, name STR, colour STR, stock NUM)
and insert the values from the file toys.csv into the relation (using the INSERT FROM FILE command). Then, run the following commands in your NITCbase.
SELECT name,colour FROM Toys INTO ToyColours;
SELECT * FROM Toys INTO ToysForSale WHERE stock > 3000;
SELECT * FROM Toys INTO ToysToReorder WHERE stock <= 1000;
SELECT * FROM Toys INTO ToysCopy;
SELECT id,name FROM Toys INTO BlueToys WHERE colour = blue;
Verify the contents of the new relations in the XFS Interface using either the PRINT TABLE command or EXPORT command.
Q2. This exercise will test the error conditions for the select command. Run the following in your NITCbase and ensure that you get the corresponding output.
SELECT * FROM Toys INTO ToysForSale WHERE stock > ten; # Error: Mismatch in attribute type
SELECT * FROM Toys INTO Toys WHERE stock > 10; # Error: Relation already exists
SELECT id,name,size FROM Toys INTO ToySizes; # Error: Attribute does not exist
Q3. Create a relation QuizMarks(rollNo STR, name STR, cgpa NUM, quizMarks NUM, location STR)
which stores the data of the results of a scholarship quiz in a college. Insert the records from the file quizmarks.csv into it. All students who have a CGPA greater than 8 and scored more than 85 marks for the quiz are eligible for the scholarship. The organisers need the roll number and name of all the eligible students separated by location. Write the required queries and fetch this data for the locations Kerala
, Gujarat
and Telangana
.