What is Data Structure
As the word says, the way we structure ,organize or store data is data structure.
In computer world, we can find lots and lots of data in the form of text, images, videos etc., so it is necessary to structure this data to ease the job of fetching and using it.
We can see data structure in 2 ways:
- Abstract data types/Mathematical and logical model: are the entities which generally talk about the data items and associated operations but not the implementations.
For example, let’s say we want to model a reservation system:
Seats –> Seats are available or not
Operations -> Determine availability of the seat
–> Reserve the seat.
–>Cancel the reservation.
–>Find the block of available seats.
In another example if we want to define a List in which we can:
store number of elements of any data type.
should be able to read the elements by position and
should be able to modify the element at particular position.
So here data is –> list of elements of same data type
And operations are –>Store,retrieve and modify.
So, the List becomes a set of Abstract data types.
2. Concrete Implementations:
For example Array.
Array is one data type which talks about the concrete implementations .
In C,C++ we can achieve List Abstract data type(ADT) using a data structure type called Linked List.
So while working on the data structure we will first check the Abstract view of it and then study about the implementation.
The first Data structure is
List as Abstract data type:
It’s a real world entity.
It can be collection of words,names e.t.c.
One instance of List
Let’s suppose we have to define a List where can store element of same data type, retrieve or modify an element at particular location.
So this we can achieve the same using Array data type.
We can define an array of same data type and can define the size as well for the array.
We can read an element at particular location
And can also modify an element at particular position
Second instance of List
Let’s suppose this time we want to define a dynamic list where
- we don’t want to fix the size of the list.
- Insert item in the list.
- Remove an item in the list.
- Count number of element in the list.
- Read/Modify and item in the list.
This operation can done implementing the array but there can be lot of disadvantages such as costly operations and time complexity .Following are the disadvantages:
- If we want to insert an element in an array,it generally gets inserted at the end of the list.But if we want to insert an element at particular location then first we need to move all the element from that particular position to the next position and then insert the element at that particular location which is costly operation.
For example : if we want to insert an element at a ,then we need to first shift element at a,a and a position to right and then insert an element at a position.
2. Similar with the removal of an element at particular position .In that case we need to shift all elements towards the right of the element to the left making it costly operation.
3. One more disadvantage is we can’t define the size of the array as dynamic .We can always add elements to fill the array .then to accommodate other elements we need to create another array larger than previous size and the copy the elements of previous to new array and then add another element .But this operation is costly and time consuming.
Learning point:study of Data structure is not only about studying operation and its implementation but also cost of operations.
Lets say in computer memory ,each segment of memory represent 1 byte of data and each byte of memory has an address.
Every application running in the system needs memory,so we have a memory manager to manage those memories.It takes care of which part of memory is free and which part occupied and also allocate the memory when requested.
Let’s say we declare an array say
Int a; //4 bytes of memory will be allocated.
If we declared:
Then memory manager allocates 16 bytes of memory for the variable A(4 block for each integer in the array).
Suppose we have to add the 5th element in the array ,then we have follow following steps:
create another array double the size of previous array
Copy the element of previous array into new array
Which is a costly operation.
To solve the above problem we can use Linked list.In Linked list we have something called nodes which consist of 2 parts:
- One part which contains the data.
- Second part which contains the address of the next node.
The block of memory allocated are disjoint and non-continuous.
Suppose we want to create a dynamic list where we can insert,modify,delete and read the element,then in case of linked list :
- First we create a node for the first element in the list which contain element as well as the address of next node (in this case it will be null as this is the first node).
- For second element ,another node is created containing data and address of next node(which will be null in this case)
Now another challenge is how to link these 2 nodes.
To link these two node we update the address of 2nd node in the first node
So next time if we have to create a node between these nodes then we have to create a new node and then update the address of first node and new node
No shifting for this operation .Thats y it is not a costly operation.
No memory wastage .
Can free node as we want.
To access a particular node we have to start with the first node also called as head node and then traverse through the nodes.
3.Doubly linked list:
In Doubly linked list the node contains three parts:
Link to next node
Link to previous node
For example : there are 3 node containing data ,it can be represented as below:
Advantage of doubly linked list:
With just pointer we can point to current node,next node and previous node as well.
So it’s easy to traverse as compared to singly linked list where we always have to traverse from head node to access a particular node.
It’s easy to do modification ,insertion and deletion operation on nodes.
Disadvantage of doubly linked list:
For a node extra memory location to store address of previous node.
It is a list with the restriction that insertion and deletion can only be performed only from one end called top.It is also called as last in first out structure.
There are basically 4 operation we can do in case of stack:
Insertion termed as push.
Removing of most recent item from the stack called pop.
Top is the term for operation to top element in the stack.
IsEmpty is to check whether stack is empty or not.
Representation of stack:
Real time scenario where we can use stack:
A good use for a stack would be the undo/redo feature. Every time the user enters something, save the state (in this case, the text) on a stack and if you need to reverse something, just pop it off the top of the undo stack and push it onto the redo stack.
It can also be used for recursion /function calls.
A queue is a data ss data structure where whatever goes in first come out first.That’s why it is called FIFO structure.
Here insertion happens from one end called as rear end and removal is performed from other end called front end.
Operation available in the queue:
Enqueue : insertion of the element from the queue.
Dequeue: deletion of the element from the queue.
Front/Peek: this operation is used to look at the top element in the queue.
IsEmpty: to check whether the queue is empty or not.
Real scenario where queue can be used:
Suppose there is a resource which serve the request and it can serve only one request at a time,then we can use queue in this case.The request that comes first get served first.
Tree is the data structure where data is stored in the hierarchical structure rather than linear or sequential way where root at top and branching out in downward direction.
It can be defined as collection of entities called nodes linked together to simulate a hierarchy.Here topmost node of the tree is called root of the tree .
Each node can contain data of any type or reference of other nodes called its children and are linked to each other
Representation of tree:
Here for example:
1 is the parent of 2 and 3 nodes.
Node 2 and 3 are the children of node 1.
Node 2 and 3 are sibling as they are pointing to same parent.
Node 4,5 ,6,7 and 8 are the leaves as they are last elements in the tree.
Node 1 and 2 are the ancestors of node 4 as it is following the same hierarchy.
Node 4 is the descendant of node 1 and 2.
The most common way of implementing tree is by creating nodes dynamically and then linking these nodes using pointers just like in the linked list.
Real time scenario of using Tree data structure:
We want to represent employees in an organization as well as its position.