Note that there are separate sets of assignments for CS 451/651 and CS 431/631. Make sure you work on the correct assignments!
This assignment is a warmup exercise to get you familiar with some of the basic tools you will need for the remaining assignments. In particular, we will be making use of Python (for programming) and Jupyter notebooks.
The general setup is as follows: for each assignment, you will be provided with a "starter" notebook, which will describe what needs to be done for that assignment. You'll complete the assignment in the notebook, and then submit your notebook to a private GitHub repo. Shortly after the assignment deadline, we'll pull your repo for marking.
I'm assuming you already have a GitHub account. If not, create one as soon as possible. Once you've signed up for an account, go and request an educational account. This will allow you to create private repos for free. Please do this as soon as possible since there may be delays in the request verification process.
Create a private repo called bigdata2019w. I'm assuming that you're already familiar with GitHub, but just in case, here is how you create a repo on GitHub. If you've successfully gotten an educational account (per above), you should be able to create private repos for free.
All of the programming required for the assignments will be in Python. If you have never programmed in Python before, you will need to gradually bring yourself up to speed. There are many on-line resources that can help with this. A good place to start is python.org. In particular, you can start with their Python Tutorial. If you don't like that particular tutorial, there are many others to choose from. There are also many Python books to choose from, if you prefer to learn that way. Choose a book that fits your needs. For example, some books target people who are migrating to Python from other languages, while others are directed at novice programmers.
Most Python tutorials expect you to try out examples as you go along, i.e., they expect you to write and run Python code. This kind of active learning is definitely the way to go. The simplest way for you to run Python code is by using a Jupyter notebook running on the CS Jupyter hub (see below). This will allow you to run Python in a web browser, without having to install any software on your machine. If you wish, you can also install Python locally on your own machine. Python is freely available for a variety of platforms. Bear in mind that all assignments for CS431/631 will be done using notebooks, so it is not a bad idea to get used to them.
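If you do install Python locally, it may be worth checking that you are running Python 3, since the notebooks for this course are Python 3 notebooks. The snippet below is a minimal sketch of such a check; the helper name `is_python3` is hypothetical and is not part of any course material.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple describes a Python 3 interpreter."""
    # The first element of the version tuple is the major version number.
    return version_info[0] == 3


# Show the full version string of the interpreter running this code.
print(sys.version)
print("Running Python 3:", is_python3())
```

The same code can be pasted into a notebook cell on the hub or run from a local Python prompt.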
For this course, you will be writing and running Python code in Jupyter notebooks. Each notebook consists of a sequence of cells. A cell can hold (formatted) text, Python code, or graphics. A great thing about notebooks is that you can open and run them in a web browser. This means that you can work on your own machine, using only a web browser, without having to install any additional software.
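To give a sense of what a code cell might hold, here is a small illustrative sketch. The function name `word_counts` and the sample sentence are made up for this example and do not come from any assignment.

```python
from collections import Counter


def word_counts(text):
    """Count how often each lowercase, whitespace-separated word occurs in text."""
    return Counter(text.lower().split())


counts = word_counts("the quick brown fox jumps over the lazy dog")

# Print the single most frequent word together with its count.
print(counts.most_common(1))  # prints [('the', 2)]
```

When a cell like this is run, anything it prints appears directly below the cell in the notebook.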
A Jupyter "hub" is a place to store and use Jupyter notebooks. For the CS431/631 assignments, you'll be using a hub operated by the School of Computer Science. To get started, go to jupyter.student.cs.uwaterloo.ca.
Log in using your userid and password for the
It is a good idea to create a new folder to hold all of your work for this course, if you do not already have one. To do this, use the New dropdown on the top right to create a new folder, and call the folder cs431 (or whatever name you prefer). Then, open your new folder by clicking on it.
Once you are in your cs431 folder, try creating a new Jupyter notebook. To create a notebook, use the New dropdown to create a new Python 3 notebook. You should see something that looks like this:
This represents a notebook with a single, empty cell. Before going any further with your notebook, try out the following three basic things that you will need to be able to do:
The basic workflow for each assignment will be something like this:
Once you are done with the assignment, submit A0 using the following steps: