**Scrivener Publishing**

100 Cummings Center, Suite 541J

Beverly, MA 01915-6106

*Publishers at Scrivener*

Martin Scrivener (martin@scrivenerpublishing.com)

Phillip Carmical (pcarmical@scrivenerpublishing.com)

Edited by

Sachi Nandan Mohanty

*ICFAI Foundation For Higher Education, Hyderabad, India*

and

Pabitra Kumar Tripathy

*Kalam Institute of Technology, Berhampur, India*

This edition first published 2021 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA © 2021 Scrivener Publishing LLC

For more information about Scrivener publications please visit www.scrivenerpublishing.com.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

**Wiley Global Headquarters**

111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

**Limit of Liability/Disclaimer of Warranty**

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.

*Library of Congress Cataloging-in-Publication Data*

ISBN 978-1-119-75054-3

Cover image: Pixabay.com

Cover design by Russell Richardson

Set in 11pt Minion Pro by Manila Typesetting Company, Makati, Philippines

Printed in the USA

10 9 8 7 6 5 4 3 2 1

Welcome to the first edition of *Data Structures and Algorithms Using C++.* A data structure is the logical or mathematical arrangement of data in memory. To be effective, data has to be organized in a manner that adds to the efficiency of an algorithm and also describes the relationships between the data items and the operations that can be performed on them. The choice of appropriate data structures and algorithms forms the fundamental step in the design of an efficient program. Thus, a deep understanding of data structure concepts is essential for students who wish to work on the design and implementation of system software written in C++, an object-oriented programming language that has gained popularity in both academia and industry. Therefore, this book was developed to provide comprehensive and logical coverage of data structures like stacks, queues, linked lists, trees, and graphs, which makes it an excellent choice for learning data structures. The objective of the book is to introduce the concepts of data structures and apply these concepts to real-life problem solving. Most of the examples presented resulted from student interaction in the classroom. This book utilizes a systematic approach wherein the design of each of the data structures is followed by algorithms for the different operations that can be performed on them and by an analysis of these algorithms in terms of their running times.

This book was designed to serve as a textbook for undergraduate engineering students across all disciplines and postgraduate level courses in computer applications. Young researchers working on efficient data storage and related applications will also find it to be a helpful reference source to guide them in the newly established techniques of this rapidly growing research field.

**Dr. Sachi Nandan Mohanty and Prof. Pabitra Kumar Tripathy**

December 2020

Introduction to Data Structure

Data structure is the representation of the logical relationship existing between individual elements of data. In other words, a data structure is a way of organizing data items that considers not only the elements stored but also their relationships to each other.

A data structure specifies:

- Organization of data
- Accessing methods
- Degree of associativity
- Processing alternatives for information

Data structures are the building blocks of a program, and hence the selection of a particular data structure rests on two requirements:

- The data structures must be rich enough in structure to reflect the relationship existing between the data, and
- The structure should be simple so that we can process data effectively whenever required.

Mathematically, **Algorithm + Data Structure = Program**

Finally we can also define the data structure as the “Logical and mathematical model of a particular organization of data”

Data structure can be broadly classified into two categories as Linear and Non-Linear

**Linear Data Structures**

In linear data structures, values are arranged in linear fashion. Arrays, linked lists, stacks, and queues are the examples of linear data structures in which values are stored in a sequence.

**Non-Linear Data Structure**

This type is the opposite of linear: the data values in such a structure are not arranged in a sequence. Trees, graphs, tables, and sets are examples of non-linear data structures.

**Operations Performed in Data Structure**

On a data structure we can perform operations like:

- Traversing
- Insertion
- Deletion
- Merging
- Sorting
- Searching

The step-by-step procedure to solve a problem is known as an **ALGORITHM.** An algorithm is a well-organized, pre-arranged, and defined computational module that receives some value or set of values as input and provides a single value or set of values as output. These well-defined computational steps are arranged in sequence and process the given input into the output.

An algorithm is said to be accurate and truthful only when it provides the exact desired output.

The efficiency of an algorithm depends on the time and space complexities. The complexity of an algorithm is the function which gives the running time and/or space in terms of the input size.

**Steps Required to Develop an Algorithm**

- Find a method for solving the problem. Every step of the algorithm should be defined in a precise and clear manner. Pseudocode may also be used to describe the algorithm.
- Validate the algorithm. This means tracing through all of its steps manually with representative input, performing each step as written, and confirming that the required output is produced in a finite amount of time.
- Finally, implement the algorithm in a programming language.

**Mathematical Notations and Functions**

- ❖ **Floor and Ceiling Functions:** The floor function returns the greatest integer that does not exceed the number, e.g. ⌊5.34⌋ = 5. The ceiling function returns the least integer that is not less than the number, e.g. ⌈5.34⌉ = 6.
- ❖ **Remainder Function:** To find the remainder, the "mod" function is used, as in 17 mod 5 = 2.
- ❖ **Integer and Absolute Value of a Number:** INT(5.34) = 5 returns the integer part of the number. The absolute value gives the magnitude of a number, e.g. |−6.45| = 6.45, so taking the integer part of the absolute value gives INT(|−6.45|) = 6.
- ❖ **Summation Symbol:** To add a series of numbers a1 + a2 + a3 + … + an, the symbol Σ is used.
- ❖ **Factorial of a Number:** The product of the positive integers from 1 to n is known as the factorial of n, and it is denoted as n!.

**Algorithmic Notations**

While writing an algorithm, comments are provided within [ ].

The assignment should use the symbol ":=" instead of "=".

For input use Read: variable name

For output use Write: message/variable name

Control structures can also be used inside an algorithm, but their notation is somewhat different, as shown below.

**Simple If**

```
If condition, then:
Statements
[end of if structure]
```

**If...else**

```
If condition, then:
Statements
Else :
Statements
[end of if structure]
```

**If...else ladder**

```
If condition1, then:
Statements
Else If condition2, then:
Statements
Else If condition3, then:
Statements
…………………………………………
…………………………………………
…………………………………………
Else If conditionN, then:
Statements
Else:
Statements
[end of if structure]
```

**LOOPING CONSTRUCT**

```
Repeat for var = start_value to end_value by
step_value
Statements
[end of loop]
Repeat while condition:
Statements
[end of loop]
Ex : repeat for i = 1 to 10 by 2
Write: i
[end of loop]
```

**OUTPUT**

`1 3 5 7 9`

The quality of a program can be judged by criteria such as whether it satisfies the original specification and whether the code is readable. These factors affect the computing time and storage requirements of the program.

**Space Complexity**

The space complexity of a program is the amount of memory it needs to run to completion. The space needed by a program is the sum of the following components:

- A fixed part that includes space for the code, space for simple variables and fixed size component variables, space for constants, etc.
- A variable part that consists of the space needed by component variables whose size is dependent on the particular problem instance being solved, and the stack space used by recursive procedures.

**Time Complexity**

The time complexity of a program is the amount of computer time it needs to run to completion. Time complexity is of two types:

- Compilation time
- Runtime

The amount of time taken by the compiler to compile an algorithm is known as compilation time. Compilation time is not calculated for the executable statements; it accounts only for the declaration statements and for checking syntax and semantic errors.

The run time depends on the size of an algorithm. If the number of instructions in an algorithm is large, then the run time is also large, and if the number of instructions in an algorithm is small, then the time for executing the program is also small. The runtime is calculated for executable statements and not for declaration statements.

Suppose space is fixed for an algorithm; then only run time is considered for obtaining the complexity of the algorithm. The cases are:

- Best case
- Worst case
- Average case

**Best Case**

The best case occurs when an algorithm does the least possible amount of work for a given input.

For example, in linear search, if the algorithm finds the element at the very first position, it exhibits its best case. The best case takes the shortest time to execute, as it causes the algorithm to do the least amount of work.

**Worst Case**

In the worst case, we find the element at the end, or the search fails altogether. This could involve comparing the key to each of the N list values, for a total of N comparisons.

For example, in linear search, if the element being sought is the last element of the array, or is not present in the array at all, the algorithm exhibits its worst case.

**Average Case**

Analyzing the average-case behavior of an algorithm is a little more complex than the best and worst cases. Here, we consider the probability distribution over the input data. The average case should be the average number of steps, but since the data can be in any arrangement, finding the exact behavior of the algorithm is difficult. As the volume of data increases, the average case of an algorithm tends to behave like its worst case.

Efficiency of an algorithm can be determined by measuring the time, space, and amount of resources it uses for executing the program. The amount of time taken by an algorithm can be calculated by finding the number of steps the algorithm executes, while the space refers to the number of units it requires for memory storage.

The asymptotic notations are symbols used in analyzing the running time of algorithms. The notations are:

- Big Oh Notation (**O**)
- Little Oh Notation (**o**)
- Omega Notation (**Ω**)
- Theta Notation (**θ**)

**Big Oh (O) Notation**

This notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n0 and C such that, to the right of n0, the value of f(n) always lies on or below C·g(n).

**Omega Notation (Ω)**

This notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n0 and C such that, to the right of n0, the value of f(n) always lies on or above C·g(n).

**Theta Notation (θ)**

This notation bounds a function to within constant factors. We say f(n) = θ(g(n)) if there exist positive constants n0, C1, and C2 such that, to the right of n0, the value of f(n) always lies between C1·g(n) and C2·g(n) inclusive.

**Little Oh Notation (o)**

This notation gives an upper bound that is not asymptotically tight. We write f(n) = o(g(n)) if, for every positive constant C, there is a constant n0 such that, to the right of n0, the value of f(n) always lies strictly below C·g(n).

**Introduction**

An important question is: How efficient is an algorithm or piece of code? Efficiency covers lots of resources, including:

- CPU (time) usage
- Memory usage
- Disk usage
- Network usage

All are important, but we will mostly talk about CPU time.

Be careful to differentiate between:

**Performance:** how much `time/memory/disk/...` is actually used when a program is running. This depends on the machine, compiler, etc., as well as the code.

**Complexity:** how the resource requirements of a program or algorithm scale, i.e., what happens as the size of the problem being solved gets larger. Complexity affects performance, but not the other way around.

The time required by a method is proportional to the number of “basic operations” that it performs. Here are some examples of basic operations:

```
one arithmetic operation (e.g., +, *).
one assignment
one test (e.g., x == 0)
one read
one write (of a primitive type)
```

*Note: As an example,*

O(1) refers to constant time.

O(n) indicates linear time;

O(*n*^{k}) (k fixed) refers to polynomial time;

O(log n) is called logarithmic time;

*O*(2^{n}) refers to exponential time, etc.

*n ^{2} + 3n + 4 is O(n^{2}), since n^{2} + 3n + 4 < 2n^{2} for all n > 10. Strictly speaking, 3n + 4 is O(n^{2}), too, but big-O notation is often misused to mean equal to rather than less than.*

In general, how can you determine the running time of a piece of code? The answer is that it depends on what kinds of statements are used.

**1. Sequence of statements**

```
statement 1;
statement 2;
...
statement k;
```

(Note: this is code that really is exactly k statements; it is **not** an unrolled loop of N identical calls.) The total time is found by adding the times for all statements:

```
total time = time(statement 1) + time
(statement 2) + ... + time(statement k)
```

If each statement is “simple” (only involves basic operations) then the time for each statement is constant and the total time is also constant: O(1). In the following examples, assume the statements are simple unless noted otherwise.

**2. if-then-else statements**

```
if (cond) {
sequence of statements 1
}
else {
sequence of statements 2
}
```

Here, either sequence 1 will execute, or sequence 2 will execute. Therefore, the worst-case time is the slowest of the two possibilities: max(time(sequence 1), time(sequence 2)). For example, if sequence 1 is O(N) and sequence 2 is O(1) the worst-case time for the whole if-then-else statement would be O(N).

**3. for loops**

```
for (i = 0; i < N; i++) {
sequence of statements
}
```

The loop executes N times, so the sequence of statements also executes N times. Since we assume the statements are O(1), the total time for the for loop is N * O(1), which is O(N) overall.

**4. Nested loops**

```
for (i = 0; i < N; i++) {
for (j = 0; j < M; j++) {
sequence of statements
}
}
```

The outer loop executes N times. Every time the outer loop executes, the inner loop executes M times. As a result, the statements in the inner loop execute a total of N * M times. Thus, the complexity is O(N * M). In a common special case where the stopping condition of the inner loop is `j < N` instead of `j < M` (i.e., the inner loop also executes N times), the total complexity for the two loops is O(N^{2}).

**5. Statements with method calls:**

When a statement involves a method call, the complexity of the statement includes the complexity of the method call. Assume that you know that method *f* takes constant time, and that method *g* takes time proportional to (linear in) the value of its parameter k. Then the statements below have the time complexities indicated.

```
f(k); // O(1)
g(k); // O(k)
```

When a loop is involved, the same rule applies. For example:

` for (j = 0; j < N; j++) g(N);`

has complexity O(N^{2}): the loop executes N times, and each method call `g(N)` has complexity O(N).

**Examples**

**Q1. What is the worst-case complexity of each of the following code fragments?**

Two loops in a row:

```
for (i = 0; i < N; i++) {
sequence of statements
}
for (j = 0; j < M; j++) {
sequence of statements
}
```

**Answer:** The first loop is O(N) and the second loop is O(M). Since you do not know which is bigger, you say this is O(N+M). This can also be written as O(max(N,M)).

**Q2. How would the complexity change if the second loop went to N instead of M?**

**Answer:** The complexity would be O(N). You can see this from either expression above: O(N+M) becomes O(2N), and when you drop the constant it is O(N); O(max(N,M)) becomes O(max(N,N)), which is O(N).

A nested loop followed by a non-nested loop:

```
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
sequence of statements
}
}
for (k = 0; k < N; k++) {
sequence of statements
}
```

**Answer:** The first set of nested loops is O(N^{2}) and the second loop is O(N). This is O(max(N^{2},N)) which is O(N^{2}).

**Q3. A nested loop in which the number of times the inner loop executes depends on the value of the outer loop index:**

```
for (i = 0; i < N; i++) {
for (j = i; j < N; j++) {
sequence of statements
}
}
```

**Answer:** When i is 0 the inner loop executes N times. When i is 1 the inner loop executes N-1 times. In the last iteration of the outer loop when i is N-1 the inner loop executes 1 time. The number of times the inner loop statements execute is N + N-1 + ... + 2 + 1. This sum is N(N+1)/2 and gives O(N^{2}).

**Q4. For each of the following loops with a method call, determine the overall complexity. As above, assume that method f takes constant time, and that method g takes time linear in the value of its parameter.**

**a. for (j = 0; j < N; j++) f(j);**
**b. for (j = 0; j < N; j++) g(j);**
**c. for (j = 0; j < N; j++) g(k);**

**Answer:** a. Each call to f(j) is O(1). The loop executes N times so it is N x O(1) or O(N).

b. The first time the loop executes, j is 0 and g(0) takes no operations. The next time, j is 1 and g(1) takes 1 operation. The last time the loop executes, j is N-1 and g(N-1) takes N-1 operations. The total work is the sum of the first N-1 numbers and is O(N^{2}).

c. Each time through the loop g(k) takes k operations and the loop executes N times. Since you do not know the relative size of k and N, the overall complexity is O(N x k).

- What is a data structure?
- What are the types of operations that can be performed on a data structure?
- What is asymptotic notation and why is it used?
- What is complexity and what are its types?
- Find the complexity of 3n^{2} + 5n.
- Distinguish between linear and non-linear data structures.
- Is it necessary to use data structures in every field? Justify your answer.