DATA STRUCTURES-UNIT IV

 

UNIT IV

 

INTRODUCTION TO ALGORITHMS:

Definition of Algorithms – Overview and importance of algorithms – pseudocode conventions, Asymptotic notations, practical complexities.

Graph Applications: Breadth first traversal – Depth first traversal – Single source shortest path – Minimal spanning trees – Prim's and Kruskal's algorithms

 

INTRODUCTION TO ALGORITHMS

 

4.1) Definition of Algorithms

 

      An algorithm is a well-defined, step-by-step procedure or set of rules used to solve a particular problem or perform a specific task.

      Algorithms are widely applied in various fields like mathematics, computer science, and data processing.

4.2) Overview and Importance of Algorithms

      In data structures, algorithms are essential for efficiently manipulating, storing, and retrieving data.

      Algorithms operate on data structures (e.g., arrays, lists, stacks, queues, trees, graphs) to perform tasks such as searching, sorting, inserting, deleting, and traversing data.

      Efficient algorithms help optimize resource utilization such as memory and computation time, making systems more responsive and scalable.

An algorithm to add two numbers and display the result:

Step 1: Start ADD

Step 2: Get the values of a and b

Step 3: C = a + b

Step 4: Display C

Step 5: Stop
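A direct translation of these steps into Python might look like this (a minimal sketch; the function name ADD is kept from the steps above, and the inputs are supplied as arguments):

def ADD(a, b):
    c = a + b       # Step 3: C = a + b
    print(c)        # Step 4: Display C

ADD(3, 4)           # Step 2's values of a and b; prints 7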

Key characteristics of algorithms:

        Input: An algorithm takes zero or more inputs.

        Output: It produces at least one output.

        Definiteness: Every step of the algorithm is precisely defined.

        Finiteness: The algorithm must terminate after a finite number of steps.

        Effectiveness: Every operation in the algorithm should be basic enough to be carried out, in principle, by a human using pen and paper.

1. Modularity

Modularity means breaking the problem into smaller subproblems/modules. A well-designed algorithm is always modular.

2. Correctness

If for a given input, the actual output and the desired output are exactly the same, then the algorithm is correct.

3. Maintainability

It means structuring an algorithm in such a way that it is easily comprehensible. Maintainability also includes being easily able to redefine an algorithm.

4. Functionality

Functionality defines the various logical steps that an algorithm uses to solve computational problems.

5. User-friendliness

An algorithm is user-friendly if the designer is easily able to explain it to the programmer.

6. Robustness

Robustness refers to an algorithm's ability to handle unexpected or invalid inputs gracefully and still behave predictably.

7.  Extensibility

An algorithm is extensible if other designers or programmers can easily build on it or adapt it to new requirements.

8. Simplicity

An algorithm should be simple and easy to understand.

Importance of Algorithms in Data Structures (Brief)

  1. Efficiency: Algorithms optimize time and space complexity, ensuring faster data access and manipulation.
  2. Data Management: They enable efficient operations like searching, sorting, insertion, and deletion on data structures.
  3. Scalability: Algorithms handle large datasets efficiently, allowing operations to scale with minimal performance loss.
  4. Optimization: Algorithms help optimize tasks, making data operations more effective and reducing resource usage.
  5. Problem Solving: They offer structured approaches for solving complex problems (e.g., shortest path, balancing trees).
  6. Automation: Algorithms automate repetitive tasks, streamlining processes in dynamic and static data structures.

Examples of Algorithms in Data Structures

      Searching Algorithms: Linear Search, Binary Search

      Sorting Algorithms: Quick Sort, Merge Sort, Bubble Sort

      Tree Traversal Algorithms: In-order, Pre-order, Post-order Traversal

      Graph Algorithms: Depth-First Search (DFS), Breadth-First Search (BFS), Dijkstra's Algorithm

Binary Search Algorithm on a Sorted Array

      Array: [10, 20, 30, 40, 50, 60]

      Target: 40

Binary Search Process:

      Start in the middle (30).

      Compare: 40 > 30, move right.

      Middle is now 50.

      Compare: 40 < 50, move left.

      Middle is now 40 (target found).
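A minimal Python sketch of this binary search (the function name binary_search is illustrative):

def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2      # index of the middle element
        if arr[mid] == target:
            return mid               # target found
        elif arr[mid] < target:
            low = mid + 1            # target is larger: move right
        else:
            high = mid - 1           # target is smaller: move left
    return -1

print(binary_search([10, 20, 30, 40, 50, 60], 40))   # prints 3

The three loop iterations correspond exactly to the comparisons with 30, 50, and 40 in the walkthrough above.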

4.3) Pseudocode Conventions

 

Pseudocode is a high-level, informal language used to describe the steps in an algorithm without requiring the precision or syntax of actual programming languages.

While there is no strict standard for pseudocode, there are commonly accepted conventions to make it clear and easy to understand.

1. Basic Structure

Indentation: Use indentation to define blocks of code, such as loops, conditionals, or functions.
if condition:

    statement1

    statement2

Statements: Use simple, clear statements that represent the flow of logic. Statements should be concise but descriptive.
x = 5

2. Control Flow

Conditional Statements: Use if, else if, and else to describe decisions.
if condition:

    // actions if true

else:

    // actions if false

Loops: Use for, while, or repeat to represent iterations.
for i = 1 to N:

    // actions

while condition:

    // actions

End of Block: Some pseudocode styles use end if, end for, end while, etc., to indicate the end of a block.
            end if

3. Functions and Procedures

Function Definition: Use function or procedure to describe reusable blocks of code.
function name(parameters):

    // actions

end function

Function Call: Call a function by name, passing in arguments as needed.

result = functionName(arg1, arg2)

4. Variables and Assignment

Assignment: Use = for assigning values to variables.
x = 10

Operations: Use standard mathematical operations such as +, -, *, /, etc.
            sum = a + b

5. Input and Output

Input: Represent user input using input or read.
            input x

Output: Display output using output or print.
            output "Result is", result

6. Comments

Comments: Use comments to explain sections of code or clarify logic. Typically done using //, #, or some notation to indicate that it's a comment.
            // This is a comment

7. Logical and Comparison Operators

Logical Operators: Use and, or, and not for logical operations.
if x > 0 and y < 5:

    // actions

Comparison Operators: Use common comparison operators like ==, !=, >, <, >=, <=.
            if x == y:

               // actions

8. Data Structures

Arrays and Lists: Use indexing for arrays and lists.
            array[i] = value

Common Structures: Use terms like queue, stack, or list to describe operations on these structures.
            enqueue(queue, item)

9. Return Statement

Return Values: Indicate the return value from a function or process using return.
            return result

10. Complexity Notation (Optional)

Sometimes pseudocode includes an explanation of time and space complexity, typically using Big-O notation.
            // Time complexity: O(n)

      Pseudocode should be written in plain, natural language.

      It focuses on clarity over strict syntax, making it easy to translate into any programming language.

      Avoid language-specific features; stick to general programming concepts.
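Putting these conventions together, a complete pseudocode fragment for a simple linear search (used here purely as an illustration) might read:

function linearSearch(list, n, target):
    // scan each element in turn
    for i = 1 to n:
        if list[i] == target:
            return i        // position of the target
        end if
    end for
    return -1               // target not present
end function

input target
position = linearSearch(data, n, target)
output "Found at position", position
// Time complexity: O(n)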

4.4) Asymptotic Notations

Asymptotic notations in algorithms are used to describe the performance (time complexity or space complexity) of algorithms in relation to the size of the input data, especially as the input size approaches infinity.

They give a way to express how an algorithm's resource consumption grows as the input size grows.

primary asymptotic notations used in algorithms:

1. Big O Notation (O)

Big O notation is the most commonly used asymptotic notation. It represents the upper bound of the algorithm's time complexity or space complexity. It tells us the worst-case scenario, i.e., the maximum time an algorithm can take or the maximum space it can occupy as the input size grows.

      Use: Describes the worst-case complexity.

      Example: If an algorithm takes at most 3n^2 + 2n + 1 steps, it is O(n^2), meaning the growth rate is bounded by n^2 for large n.

Examples:

      O(1): Constant time complexity, the algorithm runs in the same time regardless of input size (e.g., accessing an array element by index).

      O(log n): Logarithmic time complexity, where the time grows logarithmically as input size increases (e.g., binary search).

      O(n): Linear time complexity, where time grows linearly with input size (e.g., iterating over an array).

      O(n^2): Quadratic time complexity, common with algorithms that involve nested loops (e.g., bubble sort).
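To see how these bounds are read directly off code, here is a small Python sketch (the function names are illustrative):

def first_element(arr):          # O(1): one step regardless of input size
    return arr[0]

def total(arr):                  # O(n): a single loop over n elements
    s = 0
    for x in arr:
        s += x
    return s

def count_pairs(arr):            # O(n^2): nested loops, about n * n steps
    pairs = 0
    for x in arr:
        for y in arr:
            pairs += 1
    return pairs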

2. Omega Notation (Ω)

Omega notation gives a lower bound on the time complexity of an algorithm. It tells us the best-case scenario, i.e., the minimum time an algorithm can take to complete as the input size grows.

      Use: Describes the best-case complexity.

      Example: If an algorithm takes at least 3n steps in its best-case scenario, it is Ω(n).

Examples:

      Ω(1): Best-case scenario where the time is constant regardless of input size.

      Ω(n): Best-case where time grows linearly with input size.

3. Theta Notation (Θ)

Theta notation gives a tight bound on the time complexity of an algorithm. It describes both the upper and lower bounds, meaning it tells us the exact asymptotic behavior. The time complexity in the worst-case and best-case scenarios grows at the same rate.

      Use: Describes the exact time complexity.

      Example: If an algorithm takes exactly 2n^2 steps, it is Θ(n^2), because both the upper and lower bounds are n^2.

Examples:

      Θ(n): The algorithm has linear complexity in both the best-case and worst-case.

      Θ(n log n): Merge sort has time complexity Θ(n log n) in all cases (best, average, and worst).

4. Little-o Notation (o)

Little-o notation gives an upper bound that is not tight. It describes the worst-case complexity but suggests that the actual running time grows more slowly than the function used.

      Use: Describes an upper bound that is not tight.

      Example: If an algorithm takes at most n log n time, it is o(n^2), meaning its running time grows slower than n^2.

Examples:

      o(n^2): The algorithm's running time grows strictly slower than n^2 as n grows.

4.5) Practical Complexities

O(1) – Constant Time:

      Example: Accessing an array element by index.

      Very efficient, time doesn't depend on input size.

O(log n) – Logarithmic Time:

      Example: Binary search.

      Efficient for large inputs; reduces problem size in each step.

O(n) – Linear Time:

      Example: Scanning an array.

      Time grows proportionally with input size; common for loops.

O(n log n) – Linearithmic Time:

      Example: Merge sort, Quick sort (average case).

      Efficient for sorting and divide-and-conquer algorithms.

O(n²) – Quadratic Time:

      Example: Bubble sort, Insertion sort.

      Inefficient for large datasets due to nested loops.

O(2ⁿ) – Exponential Time:

      Example: Recursive algorithms like the naive solution for the Traveling Salesman Problem.

      Not practical for large inputs.
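These classes matter in practice. As an illustration, here is the same problem, detecting a duplicate element, solved in O(n^2) and in O(n) time (the function names are illustrative):

def has_duplicate_quadratic(arr):
    """O(n^2): compares every pair of elements; slow for large inputs."""
    for i in range(len(arr)):
        for j in range(i + 1, len(arr)):
            if arr[i] == arr[j]:
                return True
    return False

def has_duplicate_linear(arr):
    """O(n): a set gives average constant-time membership tests."""
    seen = set()
    for x in arr:
        if x in seen:
            return True
        seen.add(x)
    return False

For an input of a million elements, the quadratic version performs roughly 5 × 10^11 comparisons, while the linear version performs about 10^6 set operations.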

GRAPH APPLICATIONS:

 

1) BREADTH FIRST TRAVERSAL

 

·       Breadth First Search (BFS) is a fundamental graph traversal algorithm.

·       It begins with a node, then first traverses all its adjacent nodes. Once all adjacent are visited, then their adjacent are traversed.

·       It's used in many applications, including finding the shortest path between two nodes, detecting cycles, and indexing web pages.

 

How it works

1.     Start at a given source vertex

2.     Explore all adjacent vertices

3.     Explore all adjacent vertices of those vertices

4.     Repeat until all reachable vertices are visited

 

·       The algorithm starts from a given source and explores all vertices reachable from it. It is similar to the breadth-first traversal of a tree: as in a tree, we begin with the given source and traverse vertices level by level using a queue data structure.

Example:

·       Given an undirected graph represented by an adjacency list adj, where each adj[i] represents the list of vertices connected to vertex i.

·       Perform a Breadth First Traversal (BFS) starting from vertex 0, visiting vertices from left to right according to the adjacency list, and return a list containing the BFS traversal of the graph.

Examples:

Input: adj = [[2,3,1], [0], [0,4], [0], [2]]

Output: [0, 2, 3, 1, 4]
Explanation:
Starting from 0, the BFS traversal will follow these steps:
 Visit 0 → Output: 0
 Visit 2 (first neighbor of 0) → Output: 0, 2
 Visit 3 (next neighbor of 0) → Output: 0, 2, 3
 Visit 1 (next neighbor of 0) → Output: 0, 2, 3, 1
 Visit 4 (neighbor of 2) → Final Output: 0, 2, 3, 1, 4

Follow the approach given below (a Python sketch appears after the steps):

1.    Initialization: Enqueue the given source vertex into a queue and mark it as visited.

2.    Exploration: While the queue is not empty:

     Dequeue a node from the queue and visit it (e.g., print its value).

     For each unvisited neighbor of the dequeued node:

     Enqueue the neighbor into the queue.

     Mark the neighbor as visited.

3.    Termination: Repeat step 2 until the queue is empty.
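The following is a minimal Python sketch of this approach (the function name bfs and the use of collections.deque as the queue are illustrative choices):

from collections import deque

def bfs(adj, src):
    """Breadth-first traversal of a graph given as an adjacency list."""
    visited = [False] * len(adj)
    order = []
    queue = deque([src])
    visited[src] = True                  # step 1: enqueue source, mark visited
    while queue:                         # step 2: explore while queue is non-empty
        node = queue.popleft()           # dequeue a node and visit it
        order.append(node)
        for neighbor in adj[node]:       # enqueue each unvisited neighbor
            if not visited[neighbor]:
                visited[neighbor] = True
                queue.append(neighbor)
    return order                         # step 3: queue is empty, traversal done

print(bfs([[2, 3, 1], [0], [0, 4], [0], [2]], 0))   # prints [0, 2, 3, 1, 4]

Running it on the adjacency list from the example above reproduces the traversal [0, 2, 3, 1, 4].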

Applications of BFS in Graphs

     Shortest Path Finding

     Cycle Detection

     Connected Components

     Topological Sorting

     Level Order Traversal of Binary Trees

     Network Routing

2) DEPTH FIRST TRAVERSAL

 

·       In Depth First Search (DFS) for a graph, we traverse the adjacent vertices one by one.

·       When we traverse an adjacent vertex, we completely finish the traversal of all vertices reachable through that adjacent vertex.

·       This is similar to a tree, where we first completely traverse the left subtree and then move to the right subtree.

·       The key difference is that, unlike trees, graphs may contain cycles (a node may be visited more than once).

·       To avoid processing a node multiple times, we use a boolean visited array.

·       The algorithm starts from a given source and explores all reachable vertices from the given source.

·       It is similar to Preorder Tree Traversal where we visit the root, then recur for its children. In a graph, there might be loops. So we use an extra visited array to make sure that we do not process a vertex again.

·       Let us understand the working of Depth First Search with the source vertex 0, using the algorithm and the Python sketch below.

DFS Algorithm:

  1. Initialization:

      Start at a given node (root or source).

      Mark it as visited.

  2. Exploration:

      Visit an unvisited adjacent node, mark it as visited, and recursively repeat this process.

      If no adjacent unvisited node is found, backtrack to the previous node.

  3. Termination:

      The algorithm terminates when all nodes reachable from the starting node have been visited.
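A minimal recursive Python sketch of these steps (the function name dfs is illustrative):

def dfs(adj, src, visited=None, order=None):
    """Recursive depth-first traversal using a boolean visited list."""
    if visited is None:                 # first call: set up bookkeeping
        visited = [False] * len(adj)
        order = []
    visited[src] = True                 # mark the node when we reach it
    order.append(src)
    for neighbor in adj[src]:           # recur into each unvisited neighbor
        if not visited[neighbor]:
            dfs(adj, neighbor, visited, order)
    return order                        # backtracking happens as the calls return

print(dfs([[2, 3, 1], [0], [0, 4], [0], [2]], 0))   # prints [0, 2, 4, 3, 1]

On the same graph used in the BFS example, DFS goes deep along 0 → 2 → 4 before backtracking, giving [0, 2, 4, 3, 1].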

DFS APPLICATIONS:

      Finding connected components in a graph.

      Detecting cycles in a graph.

      Solving puzzles like mazes and games.

      Pathfinding problems in directed/undirected graphs.

3) SINGLE SOURCE SHORTEST PATH ALGORITHM

·       Shortest path algorithms focus on calculating the minimum travelling cost from a source node to a destination node of a graph in optimal time and space.

·       Two algorithms are commonly applied for finding the shortest path:

1)Dijkstra's Algorithm

2) Bellman-Ford Algorithm

 

1)Dijkstra's Algorithm

     Dijkstra's Algorithm  is a greedy algorithm used to solve the single-source shortest path problem for a graph with non-negative edge weights. It finds the shortest path from a starting (source) vertex to all other vertices in the graph.

     It was conceived by Dutch computer scientist Edsger W. Dijkstra in 1956.

     The algorithm keeps track of two sets: one for visited vertices and one for unvisited vertices. It begins at the source vertex and picks the unvisited vertex with the smallest known distance.

     Then, it checks the neighbors of this vertex and updates their distances if a shorter path is found. This process repeats until the destination is reached, or all reachable vertices are visited.

Dijkstra's algorithm can work on both directed and undirected graphs, since it only requires that the graph have non-negative edge weights (and, to reach every vertex, that it be connected).

     In a directed graph, each edge has a direction, indicating the direction of travel between the vertices connected by the edge. In this case, the algorithm follows the direction of the edges when searching for the shortest path.

     In an undirected graph, the edges have no direction, and the algorithm can traverse both forward and backward along the edges when searching for the shortest path.

Algorithm for Dijkstra’s Algorithm:

1.    Mark the source node with a current distance of 0 and the rest with infinity.

2.    Set the non-visited node with the smallest current distance as the current node.

3.    For each unvisited neighbor N of the current node, add the current node's distance to the weight of the edge connecting the current node to N. If this sum is smaller than the current distance of N, set it as the new current distance of N.

4.    Mark the current node as visited.

5.    Go to step 2 if any nodes remain unvisited.
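A minimal Python sketch of these steps, using a min-heap to pick the unvisited node with the smallest current distance (the sample graph at the bottom is hypothetical):

import heapq

def dijkstra(adj, src):
    """adj[u] is a list of (v, weight) pairs; all weights must be non-negative."""
    dist = [float('inf')] * len(adj)
    dist[src] = 0                        # step 1: source is 0, the rest infinity
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)       # step 2: smallest current distance
        if d > dist[u]:
            continue                     # stale heap entry: u was already finalized
        for v, w in adj[u]:              # step 3: relax each neighbor
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical undirected graph: edges 0-1 (weight 2), 0-2 (weight 6), 1-3 (weight 5)
adj = [[(1, 2), (2, 6)], [(0, 2), (3, 5)], [(0, 6)], [(1, 5)]]
print(dijkstra(adj, 0))                  # prints [0, 2, 6, 7]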

Example: Dijkstra's Algorithm will generate the shortest path from Node 0 to all other nodes in the graph.


For this graph, we will assume that the weight of the edges represents the distance between two nodes.

 

 

The algorithm will find the shortest paths from Node 0 to Node 1, Node 2, Node 3, Node 4, and Node 6, as shown step by step below.

 

Step 1: Start from the source Node 0 with distance 0, and mark it as visited.

Step 2: The unvisited adjacent node with the minimum distance is Node 1, so mark it as visited and add up the distance.

Distance: Node 0 -> Node 1 = 2

Step 3: Then move forward and check the adjacent node, which is Node 3; mark it as visited and add up the distance. Now the distance will be:

Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7

Step 4: Again we have two choices of adjacent nodes (we can choose Node 4 with distance 10, or Node 5 with distance 15), so we choose the node with the minimum distance. In this step Node 4 is the minimum-distance adjacent node, so mark it as visited and add up the distance.

 

Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17

Step 5: Again, move forward and check the adjacent node, which is Node 6; mark it as visited and add up the distance. Now the distance will be:

 

 

Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19

 

So, the shortest distance from the source vertex to Node 6 is 19, which is the optimal one.

 

 

2)Bellman-Ford Algorithm:

     Bellman-Ford is a single source shortest path algorithm. It effectively works in the cases of negative edges and is able to detect negative cycles as well.

     It works on the principle of relaxation of the edges.

Principle of Relaxation of Edges

     Relaxation means updating the shortest distance to a node if a shorter path is found through another node. For an edge (u, v) with weight w:

If going through u gives a shorter path to v from the source node (i.e., distance[v] > distance[u] + w), we update the distance[v] as distance[u] + w.

     In the Bellman-Ford algorithm, this process is repeated (V – 1) times for all the edges.

Example: Given a weighted graph with V vertices and E edges, and a source vertex src, find the shortest path from the source vertex to all vertices in the given graph. If a vertex cannot be reached from the source vertex, mark its distance as 10^8.

Note: If a graph contains negative weight cycle, return -1.

Examples:

Input: V = 5, edges = [[1, 3, 2], [4, 3, -1], [2, 4, 1], [1, 2, 1], [0, 1, 5]], src = 0

 

 

 

Output: [0, 5, 6, 6, 7]
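A minimal Python sketch of Bellman-Ford that reproduces this output (the function name bellman_ford is illustrative):

def bellman_ford(V, edges, src):
    """edges is a list of [u, v, w] triples; returns distances from src,
    or -1 if a negative weight cycle is reachable from src."""
    INF = int(1e8)                          # unreachable marker, as in the example
    dist = [INF] * V
    dist[src] = 0
    for _ in range(V - 1):                  # relax all edges (V - 1) times
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                   # detection pass: any further improvement
        if dist[u] != INF and dist[u] + w < dist[v]:
            return -1                       # means a negative weight cycle
    return dist

edges = [[1, 3, 2], [4, 3, -1], [2, 4, 1], [1, 2, 1], [0, 1, 5]]
print(bellman_ford(5, edges, 0))            # prints [0, 5, 6, 6, 7]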

Negative weight cycle: A negative weight cycle is a cycle in a graph whose sum of edge weights is negative. If you traverse the cycle, the total weight accumulated would be less than zero.


In the presence of a negative weight cycle in the graph, the shortest path doesn't exist, because with each traversal of the cycle the path weight keeps decreasing.

4) MINIMAL SPANNING TREES

 

Spanning tree

     A spanning tree is a sub-graph of an undirected connected graph, which includes all the vertices of the graph with a minimum possible number of edges. If a vertex is missed, then it is not a spanning tree.

     The edges may or may not have weights assigned to them.

     The total number of spanning trees that can be created from a complete graph with n vertices is n^(n-2) (Cayley's formula).

     If we have n = 4, the maximum number of possible spanning trees is 4^(4-2) = 4^2 = 16. Thus, 16 spanning trees can be formed from a complete graph with 4 vertices.

Example of a Spanning Tree

Let the original graph be:

Some of the possible spanning trees that can be created from the above graph are:


 

Minimum Spanning Tree

     A minimum spanning tree is a spanning tree in which the sum of the weight of the edges is as minimum as possible.

Example of a Minimum Spanning Tree

Let's understand the above definition with the help of the example below.

The initial graph is:

(Figure: the initial weighted graph.)

The possible spanning trees from the above graph are:

(Figures: possible spanning trees 1–4 of the weighted graph.)

The minimum spanning tree from the above spanning trees is:

(Figure: the minimum spanning tree.)

The minimum spanning tree from a graph is found using the following algorithms:

  1. Prim's Algorithm
  2. Kruskal's Algorithm

Spanning Tree Applications

      Computer Network Routing Protocol

      Cluster Analysis

      Civil Network Planning

Minimum Spanning tree Applications

      To find paths in the map

      To design networks like telecommunication networks, water supply networks, and electrical grids.

5) PRIM’S AND KRUSKAL’S ALGORITHMS

 

Greedy Algorithm

     A greedy algorithm is an approach for solving a problem by selecting the best option available at the moment. It doesn't worry whether the current best result will bring the overall optimal result.

     The algorithm never reverses the earlier decision even if the choice is wrong. It works in a top-down approach.

 

1) KRUSKAL'S ALGORITHM

Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the subset of the edges of that graph which

      form a tree that includes every vertex

      has the minimum sum of weights among all the trees that can be formed from the graph

How Kruskal's algorithm works

     It falls under a class of algorithms called greedy algorithms that find the local optimum in the hopes of finding a global optimum.

     We start from the edges with the lowest weight and keep adding edges until we reach our goal.

The steps for implementing Kruskal's algorithm are as follows:

  1. Sort all the edges from low weight to high
  2. Take the edge with the lowest weight and add it to the spanning tree. If adding the edge created a cycle, then reject this edge.
  3. Keep adding edges until we reach all vertices.

 

Example of Kruskal's algorithm

(Figure sequence for Kruskal's algorithm:)

  1. Start with a weighted graph.
  2. Choose the edge with the least weight; if there are more than one, choose any one.
  3. Choose the next shortest edge and add it.
  4. Choose the next shortest edge that doesn't create a cycle and add it.
  5. Choose the next shortest edge that doesn't create a cycle and add it.
  6. Repeat until you have a spanning tree.

 

Kruskal Algorithm Pseudocode

     Any minimum spanning tree algorithm revolves around checking if adding an edge creates a loop or not.

     The most common way to find this out is an algorithm called Union-Find. The Union-Find algorithm divides the vertices into clusters and allows us to check whether two vertices belong to the same cluster, and hence decide whether adding an edge creates a cycle.

KRUSKAL(G):

A = ∅

For each vertex v ∈ G.V:

    MAKE-SET(v)

For each edge (u, v) ∈ G.E, taken in increasing order of weight(u, v):

    if FIND-SET(u) ≠ FIND-SET(v):

        A = A ∪ {(u, v)}

        UNION(u, v)

return A
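A minimal Python sketch of this pseudocode, with a simple Union-Find using path compression (the example graph at the bottom is hypothetical):

def kruskal(num_vertices, edges):
    """edges is a list of (weight, u, v) triples; returns the MST edges."""
    parent = list(range(num_vertices))       # MAKE-SET for every vertex

    def find(x):                             # FIND-SET with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):            # increasing order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                 # different clusters: no cycle
            parent[root_u] = root_v          # UNION
            mst.append((u, v, w))
    return mst

# Hypothetical graph with 4 vertices and 5 weighted edges
edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 1, 3), (5, 2, 3)]
print(kruskal(4, edges))                     # prints [(0, 1, 1), (1, 2, 2), (1, 3, 4)]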

Kruskal's Algorithm Complexity

The time complexity of Kruskal's algorithm is O(E log E).

Kruskal's Algorithm Applications

      In order to layout electrical wiring

      In computer network (LAN connection)

2) PRIM'S ALGORITHM

Prim's algorithm is a minimum spanning tree algorithm that takes a graph as input and finds the subset of the edges of that graph which

      form a tree that includes every vertex

      has the minimum sum of weights among all the trees that can be formed from the graph

How Prim's algorithm works

It falls under a class of algorithms called greedy algorithms that find the local optimum in the hopes of finding a global optimum.

We start from one vertex and keep adding edges with the lowest weight until we reach our goal.

The steps for implementing Prim's algorithm are as follows:

  1. Initialize the minimum spanning tree with a vertex chosen at random.
  2. Find all the edges that connect the tree to new vertices, pick the minimum-weight one, and add it to the tree.
  3. Keep repeating step 2 until we get a minimum spanning tree.

Example of Prim's algorithm

(Figure sequence for Prim's algorithm: repeatedly add the lowest-weight edge to a new vertex until you have a spanning tree.)


Prim's Algorithm pseudocode

The pseudocode for Prim's algorithm shows how we create two sets of vertices, U and V-U. U contains the vertices that have been visited, and V-U those that haven't. One by one, we move vertices from set V-U to set U by connecting the least-weight edge.

T = ∅;

U = { 1 };

while (U ≠ V)

    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U;

    T = T ∪ {(u, v)}

    U = U ∪ {v}
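A minimal Python sketch of this pseudocode, tracking the set U with a boolean list and choosing the lowest-cost edge leaving U with a min-heap (the example graph is the same hypothetical one used in the Kruskal sketch above):

import heapq

def prim(adj, start=0):
    """adj[u] is a list of (weight, v) pairs; returns the total MST weight."""
    in_tree = [False] * len(adj)            # membership in the set U
    in_tree[start] = True                   # U = { start }
    heap = list(adj[start])                 # candidate edges leaving U
    heapq.heapify(heap)
    total = 0
    while heap:
        w, v = heapq.heappop(heap)          # lowest-cost edge (u, v) with u in U
        if in_tree[v]:
            continue                        # both ends already in U: skip
        in_tree[v] = True                   # move v from V - U into U
        total += w
        for edge in adj[v]:                 # new candidate edges leaving U
            if not in_tree[edge[1]]:
                heapq.heappush(heap, edge)
    return total

# Same hypothetical 4-vertex graph as in the Kruskal sketch
adj = [[(1, 1), (3, 2)], [(1, 0), (2, 2), (4, 3)],
       [(3, 0), (2, 1), (5, 3)], [(4, 1), (5, 2)]]
print(prim(adj))                            # prints 7 (edges of weight 1 + 2 + 4)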

 

Prim's Algorithm Complexity

The time complexity of Prim's algorithm is O(V^2) with an adjacency-matrix implementation; using a binary heap with an adjacency list improves this to O(E log V).


Prim's Algorithm Application

     Laying cables for electrical wiring

     Network design

     Making protocols in network cycles

 

Kruskal's vs Prim's Algorithm

Prim's algorithm is another popular minimum spanning tree algorithm that uses a different logic to find the MST of a graph. Instead of starting from an edge, Prim's algorithm starts from a vertex and keeps adding lowest-weight edges which aren't in the tree, until all vertices have been covered.

 
