JMU logo

Array Set PA

Introduction

The objective of this assignment is to implement a contiguous implementation of a set data structure. In mathematics, a setis an unordered collection of distinct objects. Section 1.2.3 in our textbook also describes the USet interface, which is very similar.

Common uses of sets in computing include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.

IMPORTANT: This is a large programming assignment! Your solution will most likely span several hundred lines of code (the reference solution is ~300 lines). Your final grade for the assignment will be based on the final deliverable. However, you should not wait until the final week to work on this PA! To encourage early engagement, we are providing the opportunity for you to submit your work at two intermediate milestones to receive feedback on your progress. We highly recommend that you take advantage of this to avoid falling irreparably behind.

Getting Started

First download and decompress one of the starter files:

Don't forget to download the appropriate Makefile for your platform from Piazza or the course website.

The Set ADT

For the purposes of this assignment, the set interface (an abstract data type) should support the following operations:

In addition, you will need to implement standard data structure features:

For this assignment, your implementation of this set interface should store elements in an unordered dynamic array.

This is a very simple way to store a set. However, it is not a very efficient approach because determining set membership requires a sequential search through the array. As a result, some set operations may take quadratic time. For example: arrayset_is_disjoint(set1, set2) will require _O(n*m)_ steps in the worst case: we need to search through all melements in set2 for each of the n items in set1.

We will consider more efficient implementations of the Set ADT later in the semester.

Definitions

Because C does not support higher-level polymorphism, we will need to restrict set elements to a single type. However, we would like to parameterize it as much as possible to facilitate changes later on. Thus, we will use a single typedef to create a data_t type, and use that wherever we wish to refer to the type of set elements. For testing purposes, we will only use sets to store integers.

This definition is in the provided array_set.h:



typedef 
int data_t;

We also need a data type definition for the set structure itself. Recall from the class discussion that we need three pieces of information to maintain a dynamic array:

  1. A pointer to the actual storage array.
  2. The current maximum allocated capacity of the array.
  3. The current size (or length) of the array.

We will store this information in a single array_set struct and create an array_set_t type for it. These definitions are also in the provided array_set.h:



typedef 
struct array_set {
    data_t *array;
    size_t capacity;
    size_t size;
} array_set_t;

Finally, we will need an iterator type for our set. Iterators will allow us to traverse the set in a way that is independent of the underlying implementation. For the purposes of this assignment, an iterator will just be a pointer to a data element; i.e., a data_h* value. As before, the definition is in array_set.h:



typedef data_t* array_set_iter_t;

You should not modify these definitions, or any other definitions in array_set.h. Your code (in array_set.c) should compile against the provided array_set.h without errors or warnings.

Part 1: Basic Set Operations

MILESTONE DATE: Friday, October 2, 2015 at 23:59 EDT (11:59pm)

The following operations should all be O(1); i.e., their running time should not depend on the number of elements in the set.

HINT: In the array_set_free function, don't forget to clean up the other members of the array_set_t struct after you free the actual array.

The following operations deal with adding or removing element from the set. The pop operation should run in O(1) time, and the other operations (addition, removal and membership testing) should run in O(n) time.

Part 2: Iterators and Comparisons

MILESTONE DATE: Friday, October 9, 2015 at 23:59 EDT (11:59pm)

The following operations handle iteration over a set. They should all be O(1).

The following operations handle comparisons between two sets. Because the underlying storage structure for this assignment (dynamic arrays) only supports O(n) lookups, these methods will require O(n2) time.

Part 3: Mathematical Set Operations

The last few operations implement standard mathematical set operations. One thing that they all have in common is that they should populate a destination set with elements drawn from other sets. Thus, they all take a "dest" pointer to a set. Each function should ensure that the destination set is empty by clearing it before performing whatever operations are appropriate for that function.

The first function creates a shallow copy of a set, and should run in O(n) time.

The following operations build new sets using two existing sets. Because the underlying storage structure for this assignment (dynamic arrays) only supports O(n) lookups, these methods will require O(n2) time.

Final Submission

DUE DATE: Friday, October 16, 2015 at 23:59 EDT (11:59pm)

Submit ONLY your completed array_set.c file on Canvas using the appropriate assignment submission. Make sure the file contains a comment field at the top with your name and a statement of compliance with the JMU honor code.

Please check your code carefully and make sure that it complies with the project specification. In addition, make sure that your code adheres to general good coding style guidelines. Fix any spelling mistakes, incorrect code formatting, and inconsistent whitespace before submitting. Make sure you include any appropriate documentation.


Contact chaoaj[at]jmu.edu for more information about this page or Twitter: @chaoaj