Data Structures

This is about algorithms but data structures are also important and one cannot do without the other. So first a refresher on data structures.

  • Data structure is the organization of data in the computer
  • The following are the basic data structures in computer programming: Array, Linked List, Stack, Queue, Tree, and Graph.
  • Array is pretty much the most common and basic one of them all that software developers used often. It is used to store multiple items together in contiguous memory locations.
  • Linked List stores multiple items as well but in separate locations and are linked using pointers that tells you where the next item is. Some applications are:
    1. Implement dynamic memory management functions of operating system
    2. Polynomial implementation for mathematical operations
    3. As circular linked list to implement OS or application functions that require round robin execution of tasks
    4. When a user uses the alt+tab key combination to browse through the opened application to select a desired application
    5. As circular queue to maintain the playing sequence of multiple players in a game
  • Stack is a linear data structure that dictates the order in which operations are performed. It’s a last in first out (LIFO) or first in last out (FILO) operation. Some applications are:
    1. Manage function calls including nested operations, recursive operations
    2. Checking, evaluating, converting expressions in various programming languages
    3. In all the problems solutions based on backtracking
    4. Used in depth first search in graph and tree traversal
    5. Operating system functions
    6. Undo and redo functions in an editor
  • Queue is similar to stack but it’s a first in first out (FIFO) operation. Some applications are:
    1. Used in breadth search operation in graphs
    2. Job scheduling, CPU scheduling, disk scheduling, printer buffer, keyboard buffer in operating systems
    3. Data transfer between peripheral devices and CPU
    4. Interrupts generated by the user applications for CPU
    5. Calls handled by the customers
  • Tree is a non-linear data structure whose elements can have more than 1 children. One kind of tree that is commonly referred to as Binary Tree has at most 2 children. Binary tree node contains the data, a pointer to the left child and a pointer to the right child. Some applications of a tree data structure are:
    1. Implementing the hierarchical structures in computer systems like directory and file system
    2. Decision making in gaming applications
    3. Parsing of expressions and statements in programming language compilers
    4. Hash trees
    5. Path-finding algorithm to implement in AI, robotics and video games applications
  • Graph is a non-linear data structure consisting of a finite set of vertices (or nodes) and set of edges which connect a pair of nodes. Some applications are:
    1. Representing networks and routes in communication, transportation and travel applications
    2. Interconnections in social networks and other network based applications
    3. Mapping applications
    4. Resource utilization and availability in an organization
    5. Robotic motion and neural networks
  • There are some more data structures out there but the above is enough to start with. And having listed the applications they can be used for gives them much deserving appreciation and importance.
  • You can check .NET Collections and Data Structures from Microsoft for the data structures available in .NET and their complexities as well.
  • Additionally look at Fundamental Data Structures and Algorithms in C# to get a visual explanation and C# code example of the above data structures.
  • For Javascript, check JavaScript data types and data structures from Mozilla. It has in ECMAScript 6 new data structures such as Set and Map.
  • And for visual explanation and Javascript code example check Data Structures in JavaScript – With Code Examples.

Algorithms

Now that we got the data structures out of the way, let’s get on with algorithms.

  • Algorithm is a set of instructions to solve a class of specific problems or to perform a computation.
  • In Asymptotic Analysis, the performance (or complexity) of an algorithm is evaluated in terms of input size (not the actual running time). It’s calculated by how the time (or space) taken by an algorithm increases with the input size.
  • One of 3 asymptotic anotations is the Big O notation (e.g. $ O(N) $) which defines an upper bound of an algorithm (worst case).
  • The following are the common Order-of-growth classifications, from fastest to slowest:

    name order of growth description example
    Constant $ 1 $ statement add 2 numbers
    Logarithmic $ log \ N $ (barely slower than constant) divide in half binary search
    Linear $ N $ loop find the maximum
    Linearithmic $ N \ log \ N $ divide and conquer mergesort
    Quadratic $ N ^ {2} $ double loop check all pairs
    Cubic $ N ^ {3} $ triple loop check all triples
    Exponential $ 2 ^ {N} $ exhaustive search check all subsets
  • There is also this Amortized Analysis that is used for algorithms where an occasional operation is very slow, but most of the other operations are faster. In Amortized Analysis, a sequence of operations is analyzed and guarantees a worst-case average time that is lower than the worst-case time of a particularly expensive operation.
  • Following are some types of algorithms based on their implementation:
    • Brute force is a direct and straightforward technique of solving problems in which all possible ways to a given problem is enumerated. Best for solving small and simple problems.
    • Recursive involves a function calling itself directly or indirectly, performing same operations multiple times with different inputs, and with a base condition to stop recursion.
    • Divide and Conquer involves dividing the problem into smaller sub-problems, solving them recursively, then combining them to get final solution of the whole problem.
    • Dynamic programming is also known as the memoization technique because in this, the idea is to store the previously calculated result to avoid calculating it again and again.
    • Greedy Algorithm is where the solution is built part by part. The decision to choose the next part is done on the basis that it gives the immediate benefit. It never considers the choices that had taken previously. E.g. Dijkstra shortest path algorithm.
    • Backtracking solves the problem in an incremental way i.e. it is an algorithmic-technique for solving problems recursively by trying to build a solution incrementally, one piece at a time, removing those solutions that fail to satisfy the constraints of the problem at any point of time.
  • And you have other types of algorithms based on their purpose:
    • Sorting Algorithm is used to rearrange a given array or list elements according to a comparison operator on the elements. And following are the known sorting algorithms around:
      • Selection Sort
      • Bubble Sort
      • Insertion Sort
      • Merge Sort
      • Quicksort
      • Heapsort
      • Iterative Heapsort
      • Counting Sort
      • Radix Sort
      • Bucket Sort
    • Searching Algorithm is designed to check for an element or retrieve an element from any data structure where it is stored. They are further classified into:
      • Sequential Search (e.g. Linear Search)
      • Interval Search (e.g. Binary Search)
    • Pattern Search Algorithm also known as String Search Algorithm is designed for searching a string within another string
    • Mathematical Algorithm (e.g. Fibonacci, Greatest Common Denominator)
    • Geometric Algorithm is designed to solve Geometric Problems and require in-depth knowledge of different mathematical subjects like combinatorics, topology, algebra, differential geometry etc. (e.g. Lines, Triangles, 3D Objects, Polygons)
    • Graph Algorithm (e.g. Breadth First Search, Depth First Search, Cycle, Shortest Paths)

What’s Next

There’s a ton more things to know about algorithms (and data structures) and what best way to be familiar with them is to implement them using your favorite programming language. So I started implementing my own solutions to some of the algorithm problems out there, the code you can see and run on My .NET C# Fiddle library.

So far at the time of this posting, I managed to work on these algorithm problems: Robot Bounded in Circle, Maximize Score After N Operations, All Combination of Numbers Sum to Target, and Fractional Knapsack. I will be building that library as I continue implementing different kinds of algorithms. And that I feel is the best way to learn and make them stick longer. I am also planning on implementing them on Javascript and Python, so really I’m excited about this journey.

References