Stage 11: Index Creation and Deletion (26 hours)
- Implement the creation and insertion operations on a B+ tree on the XFS disk
- Implement the deletion of an index from the XFS disk
Introduction
You must now already be familiar with the usage of indexes to speed up search operations. An index can be created on any attribute of a relation using the CREATE INDEX command. Once an index is created on the attribute, all search operations involving that attribute will proceed through the index instead of a linear search through all the records. An index can be deleted with the DROP INDEX command. Note that NITCbase does not allow you to create/delete indexes for the relation catalog and the attribute catalog.
Thus far, index creation and deletion could only be done through the XFS Interface. In this stage, we will implement this functionality in NITCbase.
Implementation
When an index is created for an attribute of a relation, the RootBlock
field in the corresponding attribute catalog entry will have to be updated with the block number of the root block of the B+ tree. In practice, this value is updated in the attribute cache and then written back to the buffer (and subsequently the disk) when the relation is closed. Thus, we will need to implement attribute cache update and write back in the Cache Layer.
In the Buffer Layer, we will need to update the IndLeaf and IndInternal classes to implement the setEntry()
function we discussed in the previous stage. This function allows us to write to an index block.
A sequence diagram showing the call sequence involved in the implementation of index creation and deletion are shown below.
NOTE: The functions are denoted with circles as follows.
🔵 -> methods that are already in their final state
🟢 -> methods that will attain their final state in this stage
Cache Update and Write-back
An index can only be created for an open relation. When an index is created for a relation on an attribute, the RootBlock
field is set for the corresponding attribute catalog entry in the attribute cache entry of the relation. Similar to how we had implemented the updation of the relation cache in previous stages, this updated value will be written to the buffer when the relation is closed (or at system exit, when all open relations are closed.).
Thus, we will need to implement methods to write to the attribute cache. Additionally, we will also modify the OpenRelTable::closeRel()
function to write-back any dirty attribute cache entries on relation closing. Note that we do not need to update the destructor of the OpenRelTable class to handle write-back for the attribute cache entries of the relation catalog and the attribute catalog (why?).
A class diagram of the Cache Layer highlighting the methods relevant to this functionality is shown below.
In class AttrCacheTable
, we implement the method setAttrCatEntry()
which is overloaded to find the entry in the attribute cache using the attribute's name or offset.
We also implement the attrCatEntryToRecord()
function which converts from a struct AttrCatEntry to a record of the attribute catalog (a union Attribute array). This function will be useful when writing the dirty cache entry back to the buffer in record form.
Cache/AttrCacheTable.cpp
Implement the following functions looking at their respective design docs
In class OpenRelTable
, we update closeRel()
to check for dirty attribute cache entries and write them back to the buffer using attrCatEntryToRecord()
and the Buffer Layer functions we are familiar with.
Cache/OpenRelTable.cpp
Implement the OpenRelTable::closeRel()
function by looking at the design docs.
Writing to Index Blocks
In the previous stage, we had talked about the abstract class IndBuffer and it's children IndLeaf and IndInternal representing leaf index blocks and internal index blocks respectively. We had implemented the getEntry()
function in both the classes to read an entry from an index block. In this stage, we will implement the setEntry()
function which allows us to write an entry to an index block.
A class diagram of the Buffer Layer highlighting these functions is shown below.
Implement these functions as shown in the links below.
Buffer/BlockBuffer.cpp
Implement the following functions looking at their respective design docs
Creating and Deleting B+ Trees
In the B+ Tree Layer, we implement methods to create a B+ tree, insert into a B+ tree and free all the index blocks associated with a B+ tree. Note that we do not need to implement the functionality to delete an individual entry from a B+ tree because NITCbase does not support individual deletion of records.
A class diagram highlighting all the functions that will be modified to implement this functionality is shown below.
As shown in the sequence diagram above, the Frontend User Interface will parse the CREATE INDEX
command and call the Frontend::create_index()
function in the Frontend Programming Interface. This call is then transferred along to the Schema Layer. Hence, the implementation of the Frontend::create_index()
function only involves a call to the Schema::createIndex()
function. Similarly, the DROP INDEX
command leads to the Frontend::drop_index()
function which in turn transfers control to Schema::dropIndex()
.
Frontend/Frontend.cpp
Implement the following functions looking at their respective design docs
The Schema::createIndex()
function verifies that the relation is open and passes the rel-id and attribute name along to the BPlusTree::bPlusCreate()
function to create an index (to be implemented later in this stage).
The Schema::dropIndex()
function fetches the root block of the index on a specified attribute from the attribute cache and then calls the BPlusTree::bPlusDestroy()
function to free the index blocks. The corresponding attribute cache entry is then updated to indicate that there no longer exists a B+ tree on the attribute.
Although the Schema Layer function Schema::dropIndex()
is responsible for removing the RootBlock
field during index deletion, during index creation, the attribute cache entry is updated with the value of the root block in the B+ Tree Layer function BPlusTree::bPlusCreate()
(and not in the Schema::createIndex()
function).
Schema/Schema.cpp
Implement the following functions looking at their respective design docs
We implement the core functionality of this stage in the B+ Tree Layer.
The BPlusTree::bPlusCreate()
is used to create an index on attribute for a relation. It allocates a new index block and sets the RootBlock
field in the corresponding attribute cache entry. It then reads every record of the relation and inserts the attribute value into the index using BPlusTree::bPlusInsert()
.
The BPlusTree::bPlusDestroy()
function recursively traverses all the blocks of the index and frees them using BlockBuffer::releaseBlock()
.
The BPlusTree::bPlusInsert()
function is used to insert an entry into the B+ tree of an attribute. This is quite possibly the most complicated function in the implementation of NITCbase. Hence, the functionality involved in this task has been split among many private helper functions in the BPlusTree
class.
BPlusTree/BPlusTree.cpp
Implement the following functions looking at their respective design docs
Lastly, in the Block Access Layer, we update the insert()
method to insert the new record into any existing indexes of the relation using BPlusTree::bPlusInsert()
. The deleteRelation()
method is updated to free up any indexes associated with the relation by calling BPlusTree::bPlusDestroy()
. You have already implemented a major part of both of the above functions. The only modification required to the above functions is the part corresponding to B+ trees.
BlockAccess/BlockAccess.cpp
Implement the following functions looking at their respective design docs
And with that, your NITCbase now supports the creation and deletion of indexes! Verify your implementation with the exercises below.
You can use the print B+ tree
and export B+ blocks
commands in the XFS Interface to view the status of your B+ tree.
Exercises
Q1. In your NITCbase, run the file s11test.txt (using the run command) to test your implementation. Place the file s11students.csv in the Files/Input_Files
directory and s11test.txt in the Files/Batch_Execution_Files
directory before running the script. The script should result in the creation of the following relations
S11_Students(name STR, cgpa NUM)
: Stores the name and CGPA of all studentsS11_c_Students(name STR)
: Stores the name of all students that have their name beginning with the letter cS11_9_Students(name STR, cgpa NUM)
: Stores the name and CGPA of all students who have a CGPA >= 9
Now, verify the contents of the above relations by running the following commands in the XFS Interface.
NOTE: Don't forget to exit NITCbase before running commands in the XFS Interface (refer to the WARNING in the documentation of runtime disk).
schema S11_Students
export S11_c_Students c_students.csv
export S11_9_Students 9_students.csv
You should see the following.
Relation: S11_Students
Attribute Type Index
---------------- ---- -----
name STR no
cgpa NUM yes
name
caa
caaapi
caaara
caaarter
.
.
naveen,9.000000
rohith,9.000000
ato,9.010000
.
.
Q2. This exercise will test the error conditions of the index creation and deletion functionality. Run the following in your NITCbase and ensure that you get the corresponding output.
create index on RELATIONCAT.RelName; # Error: This operation is not permitted
drop index on S11_Students.name; # Error: Relation is not open
open table S11_Students; # Relation S11_Students opened successfully
drop index on S11_Students.name; # Error: No index