Lab 1

In this lab, we will implement the perceptron algorithm. Given a linearly separable dataset in which each datapoint has d features and belongs to one of two classes, labelled -1 and 1, the algorithm computes a hyperplane in d dimensions that perfectly separates the two classes, i.e. it acts as a binary classifier.

You have been provided with two code files: lab1_functions.py and lab1_main.py. lab1_functions.py contains the function definitions that are called by the code in lab1_main.py. Your task is to fill in the missing code in the two partially implemented functions in lab1_functions.py. There is no need to modify lab1_main.py.

Four toy datasets are also provided as csv files. The format is as follows: for a dataset with n datapoints, each having d features, the csv file has d+1 columns and n rows (excluding the header). For j from 1 to d, the entry in the ith row and jth column is the jth feature value of the ith datapoint. The (d+1)th column contains the class label (-1 or 1) of the datapoint.
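
As a rough illustration of this format, the sketch below reads such a file into a feature matrix and a label vector. The function name load_dataset, the header names and the sample values are made up for this example; the input functions already provided in lab1_functions.py may work differently.

```python
import io
import numpy as np

def load_dataset(source):
    """Hypothetical loader: returns (X, y) from a csv with d+1 columns and a header row."""
    # skiprows=1 skips the header; ndmin=2 keeps a 2-D array even for a single datapoint.
    data = np.loadtxt(source, delimiter=",", skiprows=1, ndmin=2)
    X = data[:, :-1]             # first d columns: feature values
    y = data[:, -1].astype(int)  # last column: class label (-1 or 1)
    return X, y

# A made-up dataset with n = 3 datapoints and d = 2 features, held in memory.
sample = io.StringIO("x1,x2,label\n1.0,2.0,1\n-1.5,0.5,-1\n0.0,-2.0,-1\n")
X, y = load_dataset(sample)
print(X.shape, y)
```

A file path such as "dataset_1.csv" can be passed in place of the in-memory sample.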

lab1_functions.py contains six functions. The first four handle input, output and, where possible, visualization of the data. The last two are to be completed by you. Read the comments in both code files for further details.

To run the code, simply type the following in the terminal:
python3 lab1_main.py

When the input dataset has 2 features, a plot of the data is saved to the working directory. The computed separating hyperplane is printed to the terminal.


Part 1: Implementing the perceptron algorithm for linearly separable datasets

Here, you can use any of dataset_1.csv, dataset_2.csv, or dataset_3.csv as input. You may also define your own input files in the appropriate format. Complete the 'perceptron_algorithm' function defined in lab1_functions.py. You only need to add your code between the marked lines.
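
For reference, a minimal sketch of the classic perceptron update rule is given below. It is not the skeleton from lab1_functions.py: the name perceptron_sketch, its arguments, the separate bias term b, the max_epochs cap and the toy data are all assumptions made for this example, and the expected signature and return format of 'perceptron_algorithm' are described in the comments of the provided code.

```python
import numpy as np

def perceptron_sketch(X, y, max_epochs=1000):
    """Hypothetical helper: find (w, b) with y_i * (w . x_i + b) > 0 for every datapoint."""
    n, d = X.shape
    w = np.zeros(d)  # normal vector of the hyperplane
    b = 0.0          # offset (bias) of the hyperplane
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # A datapoint is misclassified if it lies on the wrong side of,
            # or exactly on, the current hyperplane.
            if y[i] * (X[i] @ w + b) <= 0:
                w += y[i] * X[i]  # move the hyperplane towards the correct side
                b += y[i]
                mistakes += 1
        if mistakes == 0:
            return w, b  # a full pass without mistakes: the data is separated
    raise RuntimeError("no separating hyperplane found within max_epochs")

# Made-up, linearly separable toy data with d = 2 features.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_sketch(X, y)
print(w, b)
```

On linearly separable data the loop is guaranteed to stop after finitely many updates; the max_epochs cap only guards against input that is not separable.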


Part 2: Dealing with a dataset that is not linearly separable (at first glance...?)

Here, we work with dataset_4.csv. Complete the 'feature_addition' function defined in lab1_functions.py. You only need to add your code between the marked lines.
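
The general idea behind adding a feature can be sketched as follows: data that no hyperplane separates in d dimensions may become linearly separable once a suitable extra feature is appended. The example below uses made-up data in which one class surrounds the other, and appends the squared distance from the origin as a third feature. Which feature suits dataset_4.csv is not stated here and is for you to work out; the function name and the data are assumptions made for illustration only.

```python
import numpy as np

def add_squared_norm_feature(X):
    """Hypothetical example: append x1^2 + ... + xd^2 as an extra column."""
    r2 = np.sum(X ** 2, axis=1, keepdims=True)  # shape (n, 1)
    return np.hstack([X, r2])                   # shape (n, d + 1)

# Made-up data: class -1 lies near the origin, class 1 surrounds it,
# so no straight line in the plane separates the two classes.
X = np.array([[0.5, 0.0], [0.0, -0.5], [-0.3, 0.3],
              [2.0, 0.0], [0.0, -2.0], [-1.5, 1.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

X_lifted = add_squared_norm_feature(X)

# In the lifted 3-dimensional space, the plane "third feature = 1" separates the classes.
w = np.array([0.0, 0.0, 1.0])
b = -1.0
print(np.all(y * (X_lifted @ w + b) > 0))
```

After such a transformation, the perceptron algorithm from Part 1 can be run on the lifted data without modification.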