Relationship between algorithms and data structures
Physical memory
Concrete versus Abstract data structure
Contiguous versus Linked
Dynamic arrays – amortized analysis
Abstract Data Type (ADT)
Linear abstract data types: lists, stacks, queues, deques
Dictionaries
Linear implementations of dictionaries
Objectives
Explain the relationship between algorithms and data structures
Distinguish between concrete and abstract data structures
Analyze and explain contiguous and linked structures used to implement ADTs such as stacks, queues and deques
Analyze dictionary operations under different implementations
Relationship between algorithms and data structures
Our algorithms will operate on data. We need a way to store this data.
We want to be able to perform abstract operations on this data:
adding a student to an enrollment database
searching for a student with a certain name
listing all students taking a certain module
Data structures are like the building blocks of algorithms
Using abstract structures such as sets, lists, dictionaries, trees, graphs etc. lets us think algorithmically at a more abstract level
But, using a poor choice of data structure or a poor choice of implementation of a data structure can make your algorithm asymptotically worse
Relationship between algorithms and data structures
Implementations of abstract data structures are now included in standard libraries of almost every programming language
So you may well think:
“I’m never going to have to implement any of these concepts, why should I care about data structures?”
Answer part 1: This is good. Reinventing the wheel is pointless, such libraries will save you time.
Answer part 2: If you don’t know how the data structure is implemented, you won’t know the efficiency of different operations – this can drastically affect the running time of your algorithms
Understanding the mechanics of data structures is crucial to understanding algorithm efficiency and becoming a good designer of new algorithms
Physical memory
Fact: data structures are stored in the memory of a computer
What does the memory of our computer look like?
Organised into banks, rows, columns etc.
We supply a bank number, row number, etc. (= an address), and the memory returns us the contents
Address → contents
Concrete versus Abstract data structure
We therefore have two levels of thinking about data structures:
Concrete: concerned with addresses in physical memory
Abstract: concerned only with abstract operations supported
But our implementations of abstractions must be in terms of the concrete structures with which our computer operates
Contiguous versus Linked
We can subdivide concrete data structures into two classes:
Contiguous: Composed of a single block of memory
Linked: Composed of multiple distinct chunks of memory tied together by pointers
Contiguous data structures -- Arrays and records
Benefits of using contiguous array structures
We can retrieve an array element from its index in constant time, O(1), meaning it costs us asymptotically nothing to look up a record – this is a really big deal
They consist solely of data; no space is wasted on links
Physical continuity/memory locality: if we look up element $i$, there is a high probability we will look up element $i+1$ next – this is exploited by cache memory in modern computer architectures
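As an aside, a minimal Python sketch of why indexed lookup is constant time: the address of element $i$ is pure arithmetic on the base address. (The function name and the example addresses are illustrative, not real memory.)

```python
def element_address(base_address, index, element_size):
    """Address of element `index` in a contiguous array whose elements
    each occupy `element_size` bytes: one multiply and one add,
    regardless of the index or the array length."""
    return base_address + index * element_size

# e.g. an array of 8-byte records starting (hypothetically) at address 1000:
print(element_address(1000, 0, 8))  # 1000 -- first element
print(element_address(1000, 5, 8))  # 1040 -- sixth element, same O(1) cost
```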
Drawbacks of using contiguous array structures
Inflexible: we have to decide in advance how much space we want when the array is allocated
Once the block of memory for the array has been allocated, that’s it – we’re stuck with the size we’ve got
If we try to write past the end of the array (overflow), we'll be intruding on memory allocated for something else, causing a segmentation fault
We can compensate by always allocating arrays larger than we think we’ll need, but this wastes a lot of space
Inflexible: think about removing or inserting sequences of records in the middle of an array
Dynamic arrays
A potential way around the problem of having to decide array size in advance is to use dynamic arrays
We could start with an array of size 1
Each time we run out of space (i.e. want to write to index $m+1$ in an array of size $m$), we find a block of free memory, allocate a new array of size $2m$, and copy all the contents across
Q: If we currently have $n$ items in our dynamic array, how many doubling operations will we have executed so far?
A: $\lceil \log_2 n \rceil$
The expensive part is copying every element into the new larger array when we have to resize
Q: How expensive is this?
A: Linear: $O(n)$
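To make the doubling scheme concrete, here is a minimal sketch of a dynamic array in Python. (Python's built-in list already grows itself in a similar way; the class below is purely illustrative.)

```python
class DynamicArray:
    """Illustrative dynamic array that doubles its capacity when full."""

    def __init__(self):
        self._capacity = 1                    # start with an array of size 1
        self._length = 0
        self._data = [None] * self._capacity  # the underlying fixed-size block

    def append(self, item):
        if self._length == self._capacity:    # full: resize from m to 2m, O(n)
            self._resize(2 * self._capacity)
        self._data[self._length] = item       # not full: just write, O(1)
        self._length += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity      # allocate a new, larger block
        for i in range(self._length):         # the expensive part: copy all
            new_data[i] = self._data[i]       # existing elements across
        self._data = new_data
        self._capacity = new_capacity

    def __getitem__(self, index):             # O(1) lookup, as for any array
        if not 0 <= index < self._length:
            raise IndexError(index)
        return self._data[index]
```

After $n$ appends, `_resize` has run $\lceil \log_2 n \rceil$ times, matching the answer above.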
Dynamic arrays
The trickier question to answer is this
Q: What is the worst case complexity of inserting into a dynamic array?
A: It depends on whether we’ve filled up the array or not:
Not full: Just insert the element = $O(1)$
Full: Allocate new array, copy everything across, add new element = $O(n)$
We can’t give a definitive answer on the worst case complexity – it depends!
Dynamic Arrays
Let's imagine we've just copied our data to a larger array:
We can now make $n$ insertions at cost $O(1)$ before we have to do any more copying
The $(n+1)^{\mathrm{th}}$ insertion will cost us $2n = O(n)$
Total work for $n$ insertions is $3n$.
$n$ insertions into a dynamic array is complexity $O(n)$
$n$ insertions into our standard array is also complexity $O(n)$ ...
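For completeness, a sketch of the same calculation starting from an empty size-1 array: each doubling from size $2^{k-1}$ to $2^k$ copies $2^{k-1}$ elements, and the geometric series of copy costs is bounded by $2n$:
$$\underbrace{n}_{\text{writes}} + \underbrace{\left(1 + 2 + 4 + \cdots + 2^{\lceil \log_2 n \rceil - 1}\right)}_{\text{copying at each doubling}} \;<\; n + 2n \;=\; 3n \;=\; O(n)$$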
Amortized analysis
This sort of analysis is called amortized analysis
Meaning: average cost of an operation over a sequence of operations
Different to average-case analysis (which is averaging over probability distribution of possible inputs)
Key idea of dynamic arrays: insertions will “usually” be fast, accessing elements will always be O(1)
In Big Oh terms, a dynamic array is no more inefficient than a standard array
Linked Structures
The alternative to contiguous structures is the linked structure, e.g. a linked list:
Schematic representation of linked structures
Alternative: keep a pointer to the item before as well as the item after, giving a doubly linked list:
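A minimal Python sketch of both kinds of node (the class and field names are ours, purely for illustration); the `get` function previews the lookup drawback discussed below:

```python
class Node:
    """Singly linked node: one datum plus a pointer to the next node."""
    def __init__(self, datum, next_node=None):
        self.datum = datum
        self.next = next_node

class DNode:
    """Doubly linked node: pointers to the items before and after."""
    def __init__(self, datum, prev_node=None, next_node=None):
        self.datum = datum
        self.prev = prev_node
        self.next = next_node

def get(head, p):
    """Return the p-th node (0-indexed): O(n), since we can only
    reach it by following pointers one step at a time."""
    node = head
    for _ in range(p):
        node = node.next
    return node

# head -> 'a' -> 'b' -> 'c'
head = Node('a', Node('b', Node('c')))
print(get(head, 2).datum)  # 'c'
```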
Benefits of using linked list structures
We don’t need to worry about allocating space in advance; we can use any free blocks of space in memory
We only run out of space when the whole memory is actually full
Very flexible: think about adding sublists or deleting items
More efficient for moving large records (leave data in same place in memory, just change some pointers)
Drawbacks of using linked list structures
Wasted space: we’re storing both pointers and data.
To find the $p^{\mathrm{th}}$ item, we must start at the beginning and follow pointers until we get there.
In the worst case, if there are $n$ items in a list and we want the last one, we have to do $n$ lookups
So retrieving an element from its position in the list is $O(n)$.
This is a real problem.
Abstract Data Type
We’ve seen concrete data structures which dealt with arranging data in memory
Abstract Data Types offer a higher level view of our interactions with data
Composed of: (1) data, and (2) operations that allow us to interact with this data
We describe the behaviour of our data structures in terms of abstract operations
We can therefore use them without thinking:
“Add this item to this list, I don’t care how you do it or how you are storing the list”
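To make that attitude concrete, here is a hedged Python sketch of an ADT as an interface: the client code at the bottom depends only on the abstract operations, never on how the list is stored. (All names here are illustrative, not from any standard library.)

```python
from abc import ABC, abstractmethod

class ListADT(ABC):
    """The abstract view: what a list can do, not how it is stored."""

    @abstractmethod
    def add(self, item): ...

    @abstractmethod
    def get(self, index): ...

class ArrayList(ListADT):
    """One possible implementation, backed by a contiguous array."""

    def __init__(self):
        self._data = []

    def add(self, item):
        self._data.append(item)

    def get(self, index):
        return self._data[index]

def first_item(lst: ListADT):
    """Client code: uses the ADT without caring about the implementation."""
    return lst.get(0)
```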
Abstract Data Type
However, the way these operations are implemented will affect efficiency.
There are different implementations of the same abstract operations.
We want the ones we will use most commonly to be the most efficient.
We will look briefly at 3 ADTs today: stacks, queues and dictionaries
Stacks (Last-In First-Out: LIFO)
Operations: Push(S, x) adds item x to the top of the stack; Pop(S) removes and returns the top item
Stacks crop up in recursive algorithms
Queues (First-In First-Out: FIFO)
Operations: Enqueue(Q, x) adds item x to the back of the queue; Dequeue(Q) removes and returns the item at the front
Queues crop up when we want to process items in the order they arrived.
Later we will see that adding nodes of a tree to a stack or queue and then retrieving them results in different tree traversal strategies.
Deques
Operations: add and remove items at either end (front and back)
More versatile variant of a queue
Short for double-ended queue, pronounced “deck”
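As a side note, Python ships a ready-made deque in its standard library (collections.deque), with roughly constant-time operations at both ends:

```python
from collections import deque

d = deque()
d.append(1)         # add at the back
d.appendleft(0)     # add at the front
print(d.pop())      # 1 -- remove from the back
print(d.popleft())  # 0 -- remove from the front
```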
Stacks and Queues Implemented as Arrays
Stacks as Arrays
We only need to keep track of the length
Queues as Arrays
We keep track of the front and back indices
Exercise: think up similar instructions for linked list implementations (a sketch of the array versions follows below)
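A minimal sketch of both array implementations in Python, using a fixed-capacity list to stand in for the array (class names illustrative): the stack tracks only its length, while the queue tracks a front index and a size, wrapping around the array circularly.

```python
class ArrayStack:
    """Stack in a fixed-size array: track only the length (= top index)."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._length = 0

    def push(self, item):
        if self._length == len(self._data):
            raise OverflowError("stack overflow")   # array is full
        self._data[self._length] = item
        self._length += 1

    def pop(self):
        if self._length == 0:
            raise IndexError("stack underflow")     # nothing to pop
        self._length -= 1
        return self._data[self._length]

class ArrayQueue:
    """Queue in a fixed-size array: track front and size, wrap circularly."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._front = 0
        self._size = 0

    def enqueue(self, item):
        if self._size == len(self._data):
            raise OverflowError("queue overflow")   # array is full
        back = (self._front + self._size) % len(self._data)
        self._data[back] = item
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue underflow")     # nothing to remove
        item = self._data[self._front]
        self._front = (self._front + 1) % len(self._data)
        self._size -= 1
        return item
```

Every operation is a constant number of index updates, which is why they are all $O(1)$.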
Stacks and Queues
All operations on stacks and queues are $O(1)$, whether implemented as arrays or as linked lists
Popping an empty stack or dequeueing an empty queue is called underflow
Trying to add an item when the memory limit of the chosen implementation has been reached is called overflow
Dictionary
Perhaps the most important ADT is the dictionary
An element in a dictionary contains two parts:
A key – used to address an item
A datum – associated with the key
Keys are unique; the dictionary is a function from keys to data
Think of our standard notion of a dictionary: key = word, datum = definition
Dictionaries are of huge practical importance
Google search is effectively a dictionary which pairs keywords with websites
Dictionary Operations
Some common operations: Lookup(D, k), Insert(D, k), Delete(D, k), IsPresent(D, k)
Others might include size(D), modify(D,k,v), IsEmpty(D) and so on
Implementing dictionaries such that the above operations are efficient requires careful choice of ADT implementation
Q: What will the complexity of these operations be if we implement them with an array or a linked list? Will the data being sorted make a difference?
Dictionary Operations
Complexity of dictionary operations implemented with an array for an $n$ entry dictionary:

Operation          Unsorted array    Sorted array
Lookup(D, k)       O($n$)            O($\log n$)
Insert(D, k)       O(1)              O($n$)
Delete(D, k)       O($n$)            O($n$)
IsPresent(D, k)    O($n$)            O($\log n$)
For a sorted array, we can use binary search to find an item
Q: Can you explain the difference in cost for insert and delete?
A: We pay a higher cost for maintaining the sorted order: when we insert or delete, we have to shuffle up the items above. In the worst case this is every entry
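A hedged sketch of the sorted-array trade-off using Python's bisect module: binary search makes lookup $O(\log n)$, but insertion must shuffle items up, which list.insert does in $O(n)$. (The parallel-array layout here is just one illustrative choice.)

```python
import bisect

keys, data = [], []   # parallel sorted arrays: keys[i] maps to data[i]

def insert(k, v):
    """O(log n) to find the slot, O(n) to shuffle items up."""
    i = bisect.bisect_left(keys, k)
    keys.insert(i, k)
    data.insert(i, v)

def lookup(k):
    """Binary search: O(log n)."""
    i = bisect.bisect_left(keys, k)
    if i < len(keys) and keys[i] == k:
        return data[i]
    raise KeyError(k)

insert("cat", "a small feline")
insert("ant", "an industrious insect")
print(lookup("cat"))  # a small feline
```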
Dictionary Operations
Complexity of dictionary operations implemented with a linked list for an $n$ entry dictionary:

Operation          Unsorted list     Sorted list
Lookup(D, k)       O($n$)            O($n$)
Insert(D, k)       O(1)              O($n$)
Delete(D, k)       O($n$)            O($n$)
IsPresent(D, k)    O($n$)            O($n$)
We can no longer use binary search to locate an item in the sorted case
So we trade off the flexibility of a linked structure against reduced efficiency for lookup operations
Conclusion
We’ve seen the difference between concrete and abstract, linked and contiguous
We’ve seen some important examples of ADTs
Linear implementations of dictionaries aren’t very efficient
Using a sorted array makes dictionary lookups fast