Commit 6f120c0b authored by Dan Suciu

cleanup

parent a4575b4c
# CSE 544 Homework 1: Data Analytics Pipeline
**Objectives:** To get familiar with the main components of the data analytic pipeline: schema design, data acquisition, data transformation, querying, and visualizing.
**Assignment tools:** postgres, excel (or some other tool for visualization)
**Assigned date:** January 3rd, 2018
**Due date:** January 19, 2018
**Questions:** on the [google discussion board](https://groups.google.com/a/cs.washington.edu/forum/#!forum/cse544-18wi-discussion).
**What to turn in:** These files: `pubER.pdf`, `createPubSchema.sql`, `importPubData.sql`, `solution.sql`, `graph.py`, `graph.pdf`. Your `solution.sql` file should be executable using the command `psql -f solution.sql`
Turn in your solution on [CSE's GitLab](https://gitlab.cs.washington.edu).
See [submission instructions](#submission) below.
**Motivation:** In this homework you will implement a basic data
analysis pipeline: data acquisition, transformation and extraction,
cleaning, analysis, and sharing of results. The data is
[DBLP](http://www.informatik.uni-trier.de/~ley/db/), the reference
citation website created and maintained by Michael Ley. The analysis
will be done in postgres; the visualization in excel, or any other
tool of your choice.
**Resources:**
- [postgres](https://www.postgresql.org/)
- starter code
# Problems
## Problem 1: Conceptual Design
Design and create a database schema about publications. We will refer to this schema as `PubSchema`, and to the data as `PubData`.
- E/R Diagram. Design the E/R diagram, consisting of the entity sets and relationships below. Draw the E/R diagram for this schema, identify all keys in all entity sets, and indicate the correct type of all relationships (many-many or many-one); make sure you use the ISA box where needed.
- `Author` has attributes: `id` (a key; must be unique), `name`, and `homepage` (a URL)
- `Publication` has attributes: `pubid` (the key -- an integer), `pubkey` (an alternative key, text; must be unique), `title`, and `year`. It has the following subclasses:
- `Article` has additional attributes: `journal`, `month`, `volume`, `number`
- `Book` has additional attributes: `publisher`, `isbn`
- `Incollection` has additional attributes: `booktitle`, `publisher`, `isbn`
- `Inproceedings` has additional attributes: `booktitle`, `editor`
- There is a many-many relationship `Authored` from `Author` to `Publication`
- Refer to Chapter 2, "Introduction to Database Design," and Chapter 3.5, "Logical Database Design: ER to Relational" in R&G if you need additional references.
**Turn in** the file `pubER.pdf`
## Problem 2: Schema Design
Here you will create the SQL tables in a database in postgres. First, check that you have installed postgres on your computer. Then, create an empty database by running the following command:
```sh
$ createdb dblp
```
If you need to restart, then delete it by running:
```sh
$ dropdb dblp
```
To run queries in postgres, type:
```sh
$ psql dblp
```
then type in your SQL commands. Remember three special commands:
```sh
\q -- quit psql
\h -- help
\? -- help for meta commands
```
Next, design the SQL tables that implement your conceptual schema (the E/R diagram). We will call this database schema the `PubSchema`. Write `create Table` SQL statements, e.g.:
```sql
create Table Author (...);
...
```
Choose `int` and `text` for all data types. Create keys, foreign
keys, and unique constraints, as needed; you may either do this within
`CREATE TABLE`, or postpone it and use `ALTER TABLE` later. Do
NOT use the `inherit` or `pivot` functionality in postgres; instead,
use the simple design principles discussed in class.
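For instance, a minimal sketch of what the first two statements might look like, using the attributes from Problem 1 (this is not the complete schema, and where you declare the key and foreign-key constraints is up to you):
```sql
create table Author (
    id       int primary key,   -- unique author id
    name     text,
    homepage text
);

create table Publication (
    pubid  int primary key,     -- the integer key
    pubkey text unique,         -- the alternative key, e.g. 'conf/uss/GeambasuKLL09'
    title  text,
    year   int
);
```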
Write all your commands in a file called `createPubSchema.sql`. You can execute them in two ways. Start postgres interactively and copy/paste your commands one by one. Or, from the command line run:
```sh
psql -f createPubSchema.sql dblp
```
Hint: for debugging purposes, insert `drop Table` commands at the beginning of the `createPubSchema.sql` file:
```sql
drop table if exists Author;
...
```
**Turn in** the file `createPubSchema.sql`.
## Problem 3: Data Acquisition
Typically, this step consists of downloading data, extracting it with a
software tool, inputting it manually, or all of the above. Then it involves
writing and running a python script, called a *wrapper*, that
reformats the data into a delimited text format (here, tab-separated) that we can upload to the database.
Download the DBLP data `dblp.dtd` and `dblp.xml.gz` from the dblp [website](http://dblp.uni-trier.de/xml/), then unzip the xml file.
Make sure you understand what data the big xml file contains: look inside by running:
```sh
more dblp.xml
```
If needed, edit the `wrapper.py` and update the correct location of `dblp.xml` and the output files `pubFile.txt` and `fieldFile.txt`, then run:
```sh
python wrapper.py
```
This will take several minutes and produces two large files: `pubFile.txt` and `fieldFile.txt`. Before you proceed, make sure you understand what happened during this step by looking inside these two files: they are tab-separated files, ready to be imported into postgres.
Next, edit the file `createRawSchema.sql` in the starter code to point to the correct path of `pubFile.txt` and `fieldFile.txt`: they must be absolute paths, e.g. `/home/myname/mycourses/544/pubFile.txt`. Then run:
```sh
psql -f createRawSchema.sql dblp
```
This creates two tables, `Pub` and `Field`, then imports the data (which may take a few minutes). We will refer to this schema (the two tables) as `RawSchema`, and to the data they contain as `RawData`.
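For reference, the core of `createRawSchema.sql` looks roughly like the following; the paths shown reuse the example path from above and must be replaced with the absolute paths on your machine:
```sql
create table Pub (k text, p text);
create table Field (k text, i text, p text, v text);

-- replace these example paths with your own absolute paths
copy Pub from '/home/myname/mycourses/544/pubFile.txt';
copy Field from '/home/myname/mycourses/544/fieldFile.txt';
```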
## Problem 4: Querying the Raw Data
During typical data ingestion, you sometimes need to discover the true schema of the data, and for that you need to query the `RawData`.
Start `psql` then type the following commands:
```sql
select * from Pub limit 50;
select * from Field limit 50;
```
For example, go to the dblp [website](http://dblp.uni-trier.de/), search for `Henry M. Levy`, look for the "Vanish" paper, and export that entry in BibTeX format. You should see the following in your browser:
```bibtex
@inproceedings{DBLP:conf/uss/GeambasuKLL09,
author = {Roxana Geambasu and
Tadayoshi Kohno and
Amit A. Levy and
Henry M. Levy},
title = {Vanish: Increasing Data Privacy with Self-Destructing Data},
booktitle = {18th {USENIX} Security Symposium, Montreal, Canada, August 10-14,
2009, Proceedings},
pages = {299--316},
year = {2009},
crossref = {DBLP:conf/uss/2009},
url = {http://www.usenix.org/events/sec09/tech/full_papers/geambasu.pdf},
timestamp = {Thu, 15 May 2014 18:36:21 +0200},
biburl = {http://dblp.org/rec/bib/conf/uss/GeambasuKLL09},
bibsource = {dblp computer science bibliography, http://dblp.org}
}
```
The **key** of this entry is `conf/uss/GeambasuKLL09`. Try using it by running this SQL query:
```sql
select * from Pub p, Field f where p.k='conf/uss/GeambasuKLL09' and f.k='conf/uss/GeambasuKLL09';
```
Write SQL Queries to answer the following questions using `RawSchema`:
- For each type of publication, count the total number of publications of that type. Your query should return a set of (publication-type, count) pairs. For example (article, 20000), (inproceedings, 30000), ... (not the real answer).
- We say that a field *occurs* in a publication type if there exists at least one publication of that type having that field. For example, `publisher` occurs in `incollection`, but `publisher` does not occur in `inproceedings`. Find the fields that occur in *all* publication types. Your query should return a set of field names: for example, it may return `title` if `title` occurs in all publication types (article, inproceedings, etc.; note that `title` does not have to occur in every publication instance, only in some instance of every type), but it should not return `publisher` (since the latter does not occur in any publication of type inproceedings).
- Your two queries above may be slow. Speed them up by creating appropriate indexes, using the `CREATE INDEX` statement. You also need indexes on `Pub` and `Field` for the next question; create all the indexes you need on `RawSchema` (a sketch of some plausible indexes follows below).
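A sketch of the kind of indexes this calls for is shown below; treat these as plausible choices rather than a prescribed set, since the indexes you actually need depend on your queries:
```sql
-- index the key columns used to join Pub and Field
create index idx_pub_k on Pub(k);
create index idx_field_k on Field(k);

-- index the attributes you filter on (publication type, field name)
create index idx_pub_p on Pub(p);
create index idx_field_p on Field(p);
```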
**Turn in** a file `solution.sql` consisting of the SQL queries and all their answers inserted as comments
## Problem 5: Data Transformation
Next, you will transform the DBLP data from `RawSchema` to `PubSchema`. This step is sometimes done using an ETL tool, but we will just use several SQL queries. You need to write queries to populate the tables in `PubSchema`. For example, to populate `Article`, you will likely run a SQL query like this:
```sql
insert into Article (select ... from Pub, Field ... where ...);
```
The `RawSchema` and `PubSchema` are quite different, so you will need to go through some trial and error to get the transformation right. Here are a few hints (but your approach may vary):
- create temporary tables (and indices) to speed up the data transformation. Remember to drop all your temp tables when you are done.
- it is very inefficient to bulk insert into a table that contains a key and/or foreign keys (why?); to speed up, you may drop the key/foreign key constraints, perform the bulk insertion, then `alter Table` to create the constraints.
- `PubSchema` requires an integer key for each author and each publication. Use a `sequence` in postgres. For example, try this and see what happens:
```sql
create table R(a text);
insert into R values ('a');
insert into R values ('b');
insert into R values ('c');
create table S(id int, a text);
create sequence q;
insert into S (select nextval('q') as id, a from R);
drop sequence q;
select * from S;
```
- DBLP knows the homepage of some authors, and you need to store these in the `Author` table. But where do you find homepages in `RawData`? DBLP uses a hack: some entries of type `www` are not publications at all, but instead represent homepages. For example, Hank's official name in DBLP is 'Henry M. Levy'; to find his homepage, run the following query (this should run very fast, 1 second or less, if you created the right indices):
```sql
select z.* from Pub x, Field y, Field z where x.k=y.k and y.k=z.k and x.p='www' and y.p='author' and y.v='Henry M. Levy';
```
Get it? Now you know Hank's homepage. However, you are not there yet. Some www entries are not homepages, but are real publications. Try this:
```sql
select z.* from Pub x, Field y, Field z where x.k=y.k and y.k=z.k and x.p='www' and y.p='author' and y.v='Dan Suciu';
```
Your challenge is to find out how to identify each author's correct homepage. (A small number of authors have two correct but distinct homepages; you may choose either of them to insert into `Author`.)
- What if a publication in `RawData` has two titles? Or two `publisher` fields? Or two `year` fields? (You will encounter duplicate fields, though not necessarily these ones.) You may pick any one of them, but you need to work a little to write this in SQL; one possible approach is sketched below.
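One possible way to keep a single value per publication is sketched below; it assumes a hypothetical temporary table `TmpTitle(pubkey, title)` that may contain several titles for the same `pubkey` (your own table and column names will differ):
```sql
-- keep one (here: the lexicographically smallest) title per publication
select pubkey, min(title) as title
from TmpTitle
group by pubkey;
```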
**Turn in** the file `importPubData.sql` containing several `insert`, `create Table`, `alter Table`, etc. statements.
## Problem 6: Run Data Analytic Queries
Finally, you reached the fun part. Write SQL queries to answer the following questions:
- Find the top 20 authors with the largest number of publications. (Runtime: under 10s)
- Find the top 20 authors with the largest number of publications in STOC. Repeat this for two more conferences, of your choice. Suggestions: top 20 authors in SOSP, or CHI, or SIGMOD, or SIGGRAPH; note that you need to do some digging to find out how DBLP spells the name of your conference. (Runtime: under 10s.)
- The two major database conferences are 'PODS' (theory) and 'SIGMOD Conference' (systems). Find
- (a). all authors who published at least 10 SIGMOD papers but never published a PODS paper, and
- (b). all authors who published at least 5 PODS papers but never published a SIGMOD paper. (Runtime: under 10s)
- A decade is a sequence of ten consecutive years, e.g. 1982, 1983, ..., 1991. For each decade, compute the total number of publications in DBLP in that decade. Hint: for this and the next query you may want to compute a temporary table with all distinct years. (Runtime: under 1 minute.)
- Find the top 20 most collaborative authors. That is, for each author determine their number of collaborators, then find the top 20. Hint: for this and some questions below you may want to compute a temporary table of coauthors. (Runtime: a couple of minutes.)
- For each decade, find the most prolific author in that decade. Hint: you may want to first compute a temporary table storing, for each decade and each author, the number of publications of that author in that decade. (Runtime: a few minutes.)
- Find the institutions that have published the most papers in STOC; return the top 20 institutions. Then repeat this query with your favorite conference (SOSP, or CHI, or ...), and see which top institutions you didn't know about. Hint: where do you get information about institutions? Use the homepage information: convert a homepage like <http://www.cs.washington.edu/homes/levy/> to <http://www.cs.washington.edu>, or even to www.cs.washington.edu; now you have grouped together all authors from our department, and you can use this URL as a surrogate for the institution. Read about substring manipulation in postgres by looking up `substring`, `position`, and `trim`; a small example follows below.
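As a small illustration of the string functions mentioned above (the URL below is just the example homepage from this question, not something read from your tables):
```sql
-- extract 'http://www.cs.washington.edu' from a full homepage URL,
-- using substring with a POSIX regular expression
select substring('http://www.cs.washington.edu/homes/levy/'
                 from '^https?://[^/]*') as institution;
```
If you prefer to avoid regular expressions, you can instead combine `position` (to locate the end of the host name) with the `substring(string from start for count)` form.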
**Turn in** SQL queries in the file called `solution.sql`.
## Problem 7: Data Visualization
Here you are asked to create some histograms (graphs) by writing a python script that first runs a query, then produces a graph from the result of the query.
Construct two histograms: the histogram of the number of collaborators, and the histogram of the number of publications. The first histogram will have these axes:
- the X axis is a number X=1,2,3,...
- the Y axis represents the number of authors with X collaborators: Y(0)= number of authors with 0 collaborators, Y(1) = number of authors with 1 collaborator, etc
Similarly for the second histogram. Try using a log scale, or a log-log scale, and choose the most appropriate. Feel free to produce a very nice graph (not necessarily a histogram).
Resources:
- Accessing postgres from python [tutorial](https://wiki.postgresql.org/wiki/Psycopg2_Tutorial); see also `pythonpsql.py` in the starter code
- [Plotly library](https://plot.ly/python/)
**Turn in** a file `graph.py` and the output it generates in a file `graph.pdf`
# Submission Instructions
<a name="submission"></a>
We will be using `git`, a source code control tool, for distributing and submitting homework assignments in this class.
This will allow you to download the code and instruction for the homework,
and also submit the labs in a standardized format that will streamline grading.
You will also be able to use `git` to commit your progress on the labs
as you go. This is **important**: Use `git` to back up your work. Back
up regularly by both committing and pushing your code as we describe below.
Course git repositories will be hosted as a repository in [CSE's
gitlab](https://gitlab.cs.washington.edu/), that is visible only to
you and the course staff.
## Getting started with Git
There are numerous guides on using `git` that are available. They range from being interactive to just text-based.
Find one that works and experiment -- making mistakes and fixing them is a great way to learn.
Here is a [link to resources](https://help.github.com/articles/what-are-other-good-resources-for-learning-git-and-github)
that GitHub suggests starting with. If you have no experience with `git`, you may find this
[web-based tutorial helpful](https://try.github.io/levels/1/challenges/1).
Git may already be installed in your environment; if it's not, you'll need to install it first.
For `bash`/`Linux` environments, git should be a simple `apt-get` / `yum` / etc. install.
More detailed instructions may be [found here](http://git-scm.com/book/en/Getting-Started-Installing-Git).
Git is already installed on the CSE linux machines.
If you are using Eclipse or IntelliJ, many versions come with git already configured.
The instructions will be slightly different than the command line instructions listed but will work
for any OS. For Eclipse, detailed instructions can be found at
[EGit User Guide](http://wiki.eclipse.org/EGit/User_Guide) or the
[EGit Tutorial](http://eclipsesource.com/blogs/tutorials/egit-tutorial).
## Cloning your repository for homework assignments
We have created a git repository that you will use to commit and submit the homework assignments.
This repository is hosted on [CSE's GitLab](https://gitlab.cs.washington.edu),
and you can view it by visiting the GitLab website at
`https://gitlab.cs.washington.edu/cse544-2018wi/cse544-[your CSE or UW username]`.
You'll be using this **same repository** for each of the homework assignments this quarter,
so if you don't see this repository or are unable to access it, let us know immediately!
The first thing you'll need to do is set up an SSH key to allow communication with GitLab:
1. If you don't already have one, generate a new SSH key. See [these instructions](http://doc.gitlab.com/ce/ssh/README.html) for details on how to do this.
2. Visit the [GitLab SSH key management page](https://gitlab.cs.washington.edu/profile/keys). You'll need to log in using your CSE account.
3. Click "Add SSH Key" and paste in your **public** key into the text area.
While you're logged into the GitLab website, browse around to see which projects you have access to.
You should have access to `cse544-[your CSE or UW username]`.
Spend a few minutes getting familiar with the directory layout and file structure. For now nothing will
be there except for the `hw1` directory with these instructions.
We next want to move the code from the GitLab repository onto your local file system.
To do this, you'll need to clone the 544 repository by issuing the following commands on the command line:
```sh
$ cd [directory that you want to put your 544 assignments]
$ git clone git@gitlab.cs.washington.edu:cse544-2018wi/cse544-[your CSE or UW username].git
$ cd cse544-[your CSE or UW username]
```
This will make a complete replica of the repository locally. If you get an error that looks like:
```sh
Cloning into 'cse544-[your CSE or UW username]'...
Permission denied (publickey).
fatal: Could not read from remote repository.
```
... then there is a problem with your GitLab configuration. Check to make sure that your GitLab username matches the repository suffix, that your private key is in your SSH directory (`~/.ssh`) and has the correct permissions, and that you can view the repository through the website.
Cloning will make a complete replica of the homework repository locally. Any time you `commit` and `push` your local changes, they will appear in the GitLab repository. Since we'll be grading the copy in the GitLab repository, it's important that you remember to push all of your changes!
## Adding an upstream remote
The repository you just cloned is a replica of your own private repository on GitLab.
The copy on your file system is a local copy, and the copy on GitLab is referred to as the `origin` remote copy. You can view a list of these remote links as follows:
```sh
$ git remote -v
```
There is one more level of indirection to consider.
When we created your `cse544-[your CSE or UW username]` repository, we forked a copy of it from another
repository `cse544-2018wi`. In `git` parlance, this "original repository" is referred to as an `upstream` repository.
When we release bug fixes and subsequent homeworks, we will put our changes into the upstream repository, and you will need to be able to pull those changes into your own. See [the documentation](https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes) for more details on working with remotes -- they can be confusing!
In order to be able to pull the changes from the upstream repository, we'll need to record a link to the `upstream` remote in your own local repository:
```sh
$ # Note that this repository does not have your username as a suffix!
$ git remote add upstream git@gitlab.cs.washington.edu:suciu/cse544-2018wi.git
```
For reference, your final remote configuration should read like the following when it's set up correctly:
```sh
$ git remote -v
origin git@gitlab.cs.washington.edu:cse544-2018wi/cse544-[your CSE username].git (fetch)
origin git@gitlab.cs.washington.edu:cse544-2018wi/cse544-[your CSE username].git (push)
upstream git@gitlab.cs.washington.edu:suciu/cse544-2018wi.git (fetch)
upstream git@gitlab.cs.washington.edu:suciu/cse544-2018wi.git (push)
```
In this configuration, the `origin` (default) remote links to **your** repository
where you'll be pushing your individual submission. The `upstream` remote points to **our**
repository where you'll be pulling subsequent homework and bug fixes (more on this below).
Let's test out the origin remote by doing a push of your master branch to GitLab. Do this by issuing the following commands:
```sh
$ touch empty_file
$ git add empty_file
$ git commit empty_file -m 'Testing git'
$ git push # ... to origin by default
```
The `git push` tells git to push all of your **committed** changes to a remote. If none is specified, `origin` is assumed by default (you can be explicit about this by executing `git push origin`). Since the `upstream` remote is read-only, you'll only be able to `pull` from it -- `git push upstream` will fail with a permission error.
After executing these commands, you should see something like the following:
```sh
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 286 bytes | 0 bytes/s, done.
Total 3 (delta 1), reused 0 (delta 0)
To git@gitlab.cs.washington.edu:cse544-2018wi/cse544-[your CSE or UW username].git
cb5be61..9bbce8d master -> master
```
We pushed a blank file to our origin remote, which isn't very interesting. Let's clean up after ourselves:
```sh
$ # Tell git we want to remove this file from our repository
$ git rm empty_file
$ # Now commit all pending changes (-a) with the specified message (-m)
$ git commit -a -m 'Removed test file'
$ # Now, push this change to GitLab
$ git push
```
If you don't know Git that well, this probably seemed very arcane. Just keep using Git and you'll understand more and more. We'll provide explicit instructions below on how to use these commands to actually indicate your final lab solution.
## Pulling from the upstream remote
If we release additional details or bug fixes for this homework,
we'll push them to the repository that you just added as an `upstream` remote. You'll need to `pull` and `merge` them into your own repository. (You'll also do this for subsequent homeworks!) You can do both of these things with the following command:
```sh
$ git pull upstream master
remote: Counting objects: 3, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 2), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From gitlab.cs.washington.edu:cse544-2018wi/cse544-2018wi
* branch master -> FETCH_HEAD
7f81148..b0c4a3e master -> upstream/master
Merge made by the 'recursive' strategy.
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
```
Here we pulled and merged changes to the `README.md` file. Git may open a text editor to allow you to specify a merge commit message; you may leave this as the default. Note that these changes are merged locally, but we will eventually want to push them to the GitLab repository (`git push`).
Note that it's possible that there aren't any pending changes in the upstream repository for you to pull. If so, `git` will tell you that everything is up to date.
## Collaboration
All CSE 544 assignments are to be completed **INDIVIDUALLY**! However, you may discuss your high-level approach to solving each lab with other students in the class.
## Submitting your assignment
You may submit your code multiple times; we will use the latest version you submit that arrives
before the deadline.
Put all your files (`pubER.pdf`, `createPubSchema.sql`, `solution.sql`, `importPubData.sql`, `graph.py`, `graph.pdf`) in `hw1/submission`. Your directory structure should
look like this after you have completed the assignment:
```sh
cse544-[your CSE or UW username]
\-- README.md
\-- turnInHW.sh # script for turning in hw
\-- hw1
\-- hw1.md # this is the file that you are currently reading
\-- submission
\-- pubER.pdf # your solution to question 1
\-- createPubSchema.sql # your solution to question 2
\-- solution.sql # your solution to question 3
...
```
**Important**: In order for your write-up to be added to the git repo, you need to explicitly add it:
```sh
$ cd submission
$ git add pubER.pdf createPubSchema.sql ...
```
Or, if you run:
```sh
$ git add submission
```
then it will add *all* the files inside the `submission` directory to the repo.
The criterion for your homework being submitted on time is that your code must be tagged and
pushed by the due date and time. This means that if one of the TAs or the instructor were to open up GitLab, they would be able to see your solutions on the GitLab web page.
**Just because your code has been committed on your local machine does not mean that it has been submitted -- it needs to be on GitLab!**
There is a bash script `turnInHw.sh` in the root level directory of your repository that commits your changes, deletes any prior tag for the current lab, tags the current commit, and pushes the branch and tag to GitLab. If you are using Linux or Mac OSX, you should be able to run the following:
```sh
$ ./turnInHw.sh hw1
```
You should see something like the following output:
```sh
$ ./turnInHw.sh hw1
[master b155ba0] Homework 1
1 file changed, 1 insertion(+)
Deleted tag 'hw1' (was b26abd0)
To git@gitlab.com:cse544-2018wi/cse544-[your CSE or UW username].git
- [deleted] hw1
Counting objects: 11, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (6/6), 448 bytes | 0 bytes/s, done.
Total 6 (delta 3), reused 0 (delta 0)
To git@gitlab.com:cse544-2018wi/cse544-[your CSE or UW username].git
ae31bce..b155ba0 master -> master
Counting objects: 1, done.
Writing objects: 100% (1/1), 152 bytes | 0 bytes/s, done.
Total 1 (delta 0), reused 0 (delta 0)
To git@gitlab.com:cse544-2018wi/cse544-[your CSE or UW username].git
* [new tag] hw1 -> hw1
```
## Final Word of Caution!
Git is a distributed version control system. This means everything operates offline until you run `git pull` or `git push`. This is a great feature.
The bad thing is that you may **forget to `git push` your changes**. This is why we strongly, strongly suggest that you **check GitLab to be sure that what you want us to see matches up with what you expect**. As a second sanity check, you can re-clone your repository in a different directory to confirm the changes:
```sh
$ git clone git@gitlab.cs.washington.edu:cse544-2018wi/cse544-[your CSE or UW username].git confirmation_directory
$ cd confirmation_directory
$ # ... make sure everything is as you expect ...
```
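-- createRawSchema.sql: defines the two raw tables and bulk-loads the wrapper output
-- Pub(k, p):          k = publication key, p = publication type
-- Field(k, i, p, v):  k = publication key, i = position of the field within the publication, p = field name, v = field value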
create table Pub (k text, p text);
create table Field (k text, i text, p text, v text);
copy Pub from 'pubFile.txt';
copy Field from 'fieldFile.txt';
#!/usr/bin/python
import psycopg2

def main():
    try:
        conn = psycopg2.connect("dbname='dblp' user='<YOUR USER NAME>' host='localhost' password=''")
    except psycopg2.Error, e:
        print "I am unable to connect to the database"
    cur = conn.cursor()
    cur.execute("SELECT * FROM author LIMIT 10")
    rows = cur.fetchall()
    print "Showing first 10 results:\n"
    for row in rows:
        print row[0], row[1]

if __name__ == "__main__":
    main()
import xml.sax
import re

class DBLPContentHandler(xml.sax.ContentHandler):
    """
    Reads the dblp.xml file and produces two output files.
      pubFile.txt   = (key, pubtype) tuples
      fieldFile.txt = (key, fieldCnt, field, value) tuples
    Each file is tab-separated.
    Once the program finishes, load these two files in a relational database; run createSchema.sql
    """
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)

    def startElement(self, name, attrs):
        if name == "dblp":
            DBLPContentHandler.pubFile = open('pubFile.txt', 'w')
            DBLPContentHandler.fieldFile = open('fieldFile.txt', 'w')
            DBLPContentHandler.pubList = ["article", "inproceedings", "proceedings", "book", "incollection", "phdthesis", "mastersthesis", "www"]
            DBLPContentHandler.fieldList = ["author", "editor", "title", "booktitle", "pages", "year", "address", "journal", "volume", "number", "month", "url", "ee", "cdrom", "cite", "publisher", "note", "crossref", "isbn", "series", "school", "chapter"]
            DBLPContentHandler.content = ""
        if name in DBLPContentHandler.pubList:
            DBLPContentHandler.key = attrs.getValue("key")
            DBLPContentHandler.pub = name
            DBLPContentHandler.fieldCount = 0
            DBLPContentHandler.content = ""
        if name in DBLPContentHandler.fieldList:
            DBLPContentHandler.field = name
            DBLPContentHandler.content = ""

    def endElement(self, name):
        if name in DBLPContentHandler.fieldList:
            DBLPContentHandler.fieldFile.write(DBLPContentHandler.key)
            DBLPContentHandler.fieldFile.write("\t")
            DBLPContentHandler.fieldFile.write(str(DBLPContentHandler.fieldCount))
            DBLPContentHandler.fieldFile.write("\t")
            DBLPContentHandler.fieldFile.write(DBLPContentHandler.field)
            DBLPContentHandler.fieldFile.write("\t")
            DBLPContentHandler.fieldFile.write(DBLPContentHandler.content)
            DBLPContentHandler.fieldFile.write("\n")
            DBLPContentHandler.fieldCount += 1
        if name in DBLPContentHandler.pubList:
            DBLPContentHandler.pubFile.write(DBLPContentHandler.key)
            DBLPContentHandler.pubFile.write("\t")
            DBLPContentHandler.pubFile.write(DBLPContentHandler.pub)
            DBLPContentHandler.pubFile.write("\n")

    def characters(self, content):
        DBLPContentHandler.content += content.replace('\\', '\\\\')

def main(sourceFileName):
    source = open(sourceFileName)
    xml.sax.parse(source, DBLPContentHandler())

if __name__ == "__main__":
    main("dblp.xml")
put your .sql files in this directory, one file per question.
# CSE 544 Homework 2: Finding the Mitochondrial Eve
**Objectives:**
To understand how queries are translated into the relational algebra. To master writing relational queries in a logic formalism using datalog.
**Assignment tools:**
Part 1: pen and paper; Part 2: Soufflé
**Assigned date:** January 21st, 2018
**Due date:** February 2nd, 2018
**What to turn in:** Put the following files in the `submission` folder: `hw2-q1.txt`, `hw2-q2.txt`, and `hw2-q3.dl` along with its output files `hw2-q3-1.ans`, `hw2-q3-2.ans`, `hw2-q3-3.ans`, `hw2-q3-4.ans`, `hw2-q3-5.ans` (see details below)
**Resources:**
- Soufflé (https://github.com/souffle-lang/souffle)
- Soufflé [language documentation](http://souffle-lang.org/docs/datalog/)
- [Soufflé tutorial](http://souffle-lang.org/pdf/SoufflePLDITutorial.pdf)
- Starter code in your personal repo for Part 2.
- General information for Part 2:
- The [Mitochondrial Eve](https://en.wikipedia.org/wiki/Mitochondrial_Eve)
- List of [women in the Bible](https://en.wikipedia.org/wiki/List_of_women_in_the_Bible)
- List of [minor biblical figures](https://en.wikipedia.org/wiki/List_of_minor_biblical_figures,_A%E2%80%93K)
- Note that the parent-child relationship is randomly generated and may change.
## Assignment Details
### Part 1: Warm Up with Relational Algebra
1. (10 points) Write the equivalent SQL query to this [relational algebra plan](figs/ra.pdf "Relational Algebra Plan"). Save your answer in `hw2-q1.txt`.
2. (10 points) Write a relational algebra plan for the following SQL query:
```sql
select a.p
from person_living a, male b
where a.p = b.name and
not exists (select *
from parent_child c, female d
where c.p1=d.name and c.p2=a.p)
```
You do not need to draw the query plan as a tree and can use the linear style instead. To make precedence clear, we ask you to break down your query plan by using *at most one* operator on each line. For example, given the query in question 1, you could write it as:
```sh
T1(x,p1,p2) = person_living(x) Join[x=p1] parent_child(p1,p2)
T2(p3,p4) = rename[p3,p4] parent_child(p3,p4)
T3(x,p1,p2,p3,p4) = T1(x,p1,p2) Join[p2=p3] T2(p3,p4)
T4(p1,p2,y) = GroupBy[p1,p2,count(*)->y] T3(x,p1,p2,p3,p4)
T5(p1,z) = GroupBy[p1,max(y)->z] T4(p1,p2,y)
```
where `T1`, `T2`, etc. are temporary relations. Note that each line has at most one relational operator. You do not need to use the Greek symbols if you prefer not to. You also don't need to distinguish among the different flavors of join (just make sure that you write out the full join predicate).
Save your answer in `hw2-q2.txt`.
### Part 2. Finding the Mitochondrial Eve
Every human has a mother, who had her own mother, who in turn had her own mother. The matrilineal ancestors of an individual consist of the mother, the mother's mother, and so on, following only the female lineage. A matrilineal common ancestor, MCA, is a matrilineal ancestor of all living humans. An MCA is very, very likely to exist (why?), and in fact there are many MCAs. The matrilineal most recent common ancestor, or MRCA, is the unique individual (woman) who is an MCA of all living humans and is the most recent such ancestor. Who is she? When did she live? In the 1980s three researchers, Cann, Stoneking, and Wilson, analyzed the mitochondrial DNA of living humans and determined that the MRCA lived about 200,000 years ago. The researchers called her the [Mitochondrial Eve](https://en.wikipedia.org/wiki/Mitochondrial_Eve).
In this homework, you will analyze a database of 800 individuals and compute several things, culminating in the computation of the Mitochondrial Eve. The genealogy database consists of over 800 biblical names, obtained from Wikipedia, with a randomly generated parent-child relationship.
### Getting Started
1. Install Soufflé
1. **Mac user**
* Download the [souffle-1.2.0.pkg](https://github.com/souffle-lang/souffle/releases/tag/1.2.0)
2. **Windows user**
* To ease the installation process, we recommend using the pre-built version of Soufflé on Debian
* Download the [VMPlayer](https://my.vmware.com/en/web/vmware/free#desktop_end_user_computing/vmware_workstation_player/12_0)
* Download the [Debian Image](https://www.debian.org/distrib/netinst). Make sure you install the amd64 version.
* When VMplayer starts running, click on the "Open a Virtual Machine" link. Navigate to the folder where you store the Debian Image. Click "OK". Then click on the left-side tab that appears containing the VM name. Click "Play virtual machine".
* Once Debian is set up, obtain the pre-built package [souffle_1.2.0-1_amd64.deb](https://github.com/souffle-lang/souffle/releases/tag/1.2.0)
* Open a terminal and navigate to the location where you downloaded the package (which is probably `~/Downloads`)
* Then type `sudo apt install ./souffle_1.2.0-1_amd64.deb`
2. Verify Soufflé is working:
```
$ cd hw2/starter-code
$ souffle hw2-q3.dl
```
Congratulations! You just ran your first datalog query.
### Questions
For each question below, write in the file `hw2-q3.dl` a program that computes the answer to that question. See the Example section below.
1. (10 points) Find all descendants of Priscilla and their descriptions. Name your predicate `p1(x,d)`. Write the output to a file called `hw2-q3-1.ans` (123 rows)
2. (10 points) Find the woman/women with the largest number of children and the man/men with the largest number of children. For each such individual, you should return the name of that individual, his/her description, and the number of children. Name your predicate `p2(x,d,n)`. Write the output to a file called `hw2-q3-2.ans` (2 rows)
3. (20 points) For each person x, we call a "complete lineage" any sequence x0=x, x1, x2, … , xn where each person is the parent of the previous person, and the last person has no parents; the length of the sequence is n. If x has a complete lineage of length n, then we also say that "x is in generation n". Compute the minimum and maximum generation of each living person x.
Name your predicate `p3(x,m1,m2)`, where x is a living person, and `m1`, `m2` are the minimal/maximal generation. (Hint: You may want to first compute all generations for all x: think about when can you say that x is in generation 0, and when can you say that x is in generation n+1. Of course x can be in multiple generations, e.g., x's mother is in generation 0 and x's father is in generation 2. Once you know everybody's generations, you can answer the problem easily.) Write the output to a file called `hw2-q3-3.ans` (22 rows)
4. (20 points) Compute all matrilineal common ancestors, MCA. Name your predicate `p4(x)`. Write the output to a file called `hw2-q3-4.ans` (6 rows)
5. (20 points) Find the mitochondrial Eve. Name your predicate `p5(x)`. Remember that you can utilize your predicates defined earlier. Write the output to a file called `hw2-q3-5.ans` (1 row)
#### Example
For example, suppose the question were: find all children of Priscilla; return their names and their descriptions. Then you would write this in the `hw2-q3.dl` file (it's already there):
```c
.output p0(IO=stdout)
p0(x,d) :- parent_child("Priscilla",x), person(x,d). //NOTE the period at the end
```
## Submission Instructions
For Part 1, write your answers in the files `hw2-q1.txt` and `hw2-q2.txt` and put them in the `submission` folder.
For Part 2, write your answers in the provided file `hw2-q3.dl`, name the outputs generated from p1, p2, p3, p4, p5 as `hw2-q3-1.ans`, `hw2-q3-2.ans`, `hw2-q3-3.ans`, `hw2-q3-4.ans`, `hw2-q3-5.ans`, and put them in the `submission` folder.
**Important**: To remind you, in order for your answers to be added to the git repo,
you need to explicitly add each file:
```sh
$ git add *.txt *.ans
```
**Again, just because your code has been committed on your local machine does not mean that it has been
submitted -- it needs to be on GitLab!**
Use the same bash script `turnInHw.sh` in the root level directory of your repository that
commits your changes, deletes any prior tag for the current lab, tags the current commit,
and pushes the branch and tag to GitLab.
If you are using Linux or Mac OSX, you should be able to run the following:
```sh
$ ./turnInHw.sh hw2
```
As with previous assignments, make sure you check GitLab afterwards to confirm that your file(s)
have been committed and pushed.
Leah
Mahlah #2
Mahlah #1
Abital
Milcah #2
Milcah #1
Jehudijah
Matred
Jerusha
Noah
Hammolekheth
Athaliah
Achsah (or Acsah)
Queen Vashti
Mahalath #1
Deborah #2
Meshullemeth
Abishag
Timnah (or Timna)
Zillah #2
Elisabeth
Lois, grandmother of Saint Timothy. II Timothy[101]
Aholibamah (or Oholibamah)
Naomi
Eglah
Abigail #4
Abigail #3
Haggith
Rhoda
Lo-Ruhamah
Zilpah
Jehosheba (or Jehoshebeath/Josaba)
Jemima
Helah
Maacah
Asenath
Jerusha #2
Eve
Zillah
Abihail #2
Abihail #1
Hagar
Michal
Mahalath
Susanna #1
Susanna #2
Shiphrah
Tamar #3
Jemima #2
Rizpah
Zipporah
Jehoaddan
Antiochus
Azubah #1
Azubah #2
Delilah
Deborah #1
Medium of En Dor
Zeresh
Baara
Adah
Mehetabeel
Merab #2
Mehetabel #2
Jezebel #1
Jezebel #2
Esther (also known as Hadassah)
Baara #2
Orpah
Martha
Keziah
Salome #2
Salome #1
Mehetabel
Rebekah
Lo–Ruhamah
Reumah
Bathsheba
Jezebel
Basemeth #1
Basemeth #3
Basemeth #2
Tirzah
Puah
Euodia
Hushim #2
Damaris. Acts[41]
Naamah #2
Hannah
Syntyche
Me-Zahab
Mahalath #2
Diblaim
Miriam #1
Keziah #2
Miriam #2
Shelomith
Ephrath
Jael
Ahlai #1
Ahlai #2
Noadiah
Tabitha (Acts 9:36)
Taphath
Cozbi
Tamar #1
Tamar #2
Elisheba #2
Jehoaddan (or Jehoaddin)
Rahab
Elisheba
Hogla (or Hoglah)
Mary #1
Mary #3
Mary #2
Mary #5
Mary #4
Mary #6
Dinah
Taphath #2
Phoebe
Junia or Junias
Iscah
Priscilla
Ahinoam #2
Ahinoam #1
Naarah
Hodiah's wife
Mahlah
Lydia of Thyatira
Hephziba
Shelomit #1
Basemath
Shelomit #2
Jecholiah (or Jecoliah)
Reumah #2
Rachel
Jerioth #2
Jochebed
Atarah
Persis
Merab
Matred #2
Anah
Jerioth
Claudia #2
Julia
Maacah #2
Iscah #2
Judith
Hazelelponi (or Hazzelelponi)
Eunice
Bithiah
Nehushta
Sarah #2
Jecholiah
Sarah #1
Dorcas, also known as Tabitha. Acts[46]
Julia #2
Anna the Prophetess
Keren–Happuch
Claudia
Candace
Sheerah
Huldah
Keturah
Jedidah
Eglah #2
Adah # 1
Adah #2
Gomer
Ruth
Hodesh
Peninnah
Joanna
Ephah
Hamutal
Bilhah
Sapphira
Zeruiah
Hodesh #2
Naamah #1
Persian 'مهمان signifies a stranger or guest' [17] Melatiah
Ethnan
Ibneiah
Iphdeiah
Bidkar
Elioenai
Ishhod
Hashubah
Joshbekashah
Ebed-melech
Milalai
Malcam
Maon
Ehi
Ishui
Jimnah
Bechorath
Jaareshiah
Raamiah
Dalphon
Ethni
Elzaphan
Muppim
Hiel
Elpaal
Ishiah
Adlai
Dibri
Ophir
Igdaliah
Josiphiah
Jarha
Appaim
Ahimoth
Ishuah
Adbeel
Adalia
Hajehudijah
Shuthelah
Harumaph
Jehizkiah
Ahinadab
Hoham
Amasiah
Amminadib
Ahasbai
Jehallelel
Jokim
Deuel
Hammelech
Eubulus
Helon
Ahiezer
Semachiah
Igal
Gideon
Machbanai
Ithmah
Pul
Rinnah
Shillem
Jeriel
Naharai
Zedekiah
Kelal
Meshillemoth
Jeiel
Alvah
Aiah
Jidlaph
Jehudi
Ithran
Jaanai
Amon
Isui
Haddad
Imla
Ocran
Ribai
Simon Iscariot
Habazziniah
Hashabnah
Elimelech
Amos
Becher
Shemeber
Hathach
Eran
Ahi
Gaddiel
Zephon
Naphtuhim
Anani
Jehiah
Jareb
Aggaba
Gilalai
Narcissus
Obal
Shammah
Jeshohaiah
Jeuel
Dishan
Mehuman
Hashub
Azaniah
Jehush
Harhaiah
Jahleel
Shemida
Evi
Maasiai
Elidad
Phallu
Rephael
Libni
Abdi
Hasadiah
Ziphion
Rabmag
Magpiash
Shimeah
Ephlal
Malchiah
Hagab
Nepheg
Harhas
Joezer
Izhar
Mehujael
Matthan
Uriah ben Shemaiah
Izrahiah
Stachys
Isshiah
Jishui
Hachmoni
Vophsi
Jacan
Ahishar
Parnach
Jeremai
Keren-happuch
Shelomi
Kelita
Diklah
Athlai
Harnepher
Maai
Matthat
Hoshama
Mishmannah
Zeri
Sachar
Jamlech
Joed
Jaziz
Birsha
Jarah
Joel
Malchiel
Moza
Allon
Q
Rekem
Jecamiah
Gemalli
Jahzeel
Zabad
Jekamiah
Abinadab
Pethahiah
Sharar
Sheconiah
Immer
Irijah
Mahazioth
Ben Hesed
Linus
Amzi
Jaresiah
Likhi
Ishpan
Ishpah
Harum
Heldai
Hazo
Tola
Meshelemiah
Mehir
Kemuel
Ilai
Zuar
Putiel
Salu
Helek
Carmi
Jozachar
Meshullam
Machnadebai
Paseah
Piram
Caleb, son of Hezron
Zichri
Michael
Jezer
Vaniah
Nebat
Chenaanah
Hallohesh
Arodi
Eri
Ezrah
Ahishahar
Shedeur
Ahasai
Adna
Abdeel
Joshah
Arnan
Chelal
Elzabad
Rosh
Mahali
Joshibiah
Pelatiah
Romamti-ezer
Jaasau
Jaasai
Ibnijah
Elead
Elionenai
Shaphat
Hezekiah
Meremoth
Shaashgaz
Job
Habaiah
Hakkoz
Melech
Hubbah
Mibsam
Ahab
Machbena
Dodo
Uri
Enoch
Mash
Segub
Sered
Jeshishai
Bukki
Mijamin
Seled
Lahmi
Delaiah
Hori
Ziza
Jesui
Elon
Gazez
Ishvi
Aphiah
Aduel
Guni
Aristobulus
Izri
Eleasah
Izziah
Ashbel
Ahzai
Hazaiah
Mithredath
Ben Hur
Iram
Hattil
Jahmai
Carshena
Levi
Irad
Lo-Ammi
Ittai
Enan #2
Enan #1
Phalti
Adnah
Alexander
Raphu
Homam
Abitub
Azaliah
Jozabad
Harim
Rohgah
Jeush
Jerijah
Mishael
Hagabah
Mattatha
Shechem
Jathniel
Kolaiah
Ahisamach
Malluch
Jobab
Amasa
Melea
Pelaiah
Joiarib
Zebadiah
Ben Deker
Joshua the Bethshemite
Gera
Seba
Ozem
Urijah
Hasupha
Shemaiah
Ikkesh
Darda
Ahiram
Zaccur
Jaasu
Ithai
Zabdi
Maadai
Rephaiah
Arah
Ahuzzam
Nephish
Chalcol
Jephunneh
Admin
Maaziah
Genubath
Shuni
Elnaam
Shinab
Henadad
Shaaph
Melzar
Ishvah
Beno
Haahashtari
Shimi
Hadlai
Jediael
Ibsam
Naggai
Miniamin
Minjamin
Alvan
Shobal
Shammua
Shobab
Jasiel
Ephron
Elishaphat
Sodi
Jogli
Imna
Miamin
Eliathah
Jehoaddah
Nekoda
Nereus
Pelaliah
Shearjashub
Matri
Geber
Hermogenes
Mallothi
Hadadezer
Jehoshaphat
Rehum
Idbash
Zeror
Nemuel
Bigtha
Abida
Moab
Shelumiel
Obadiah
Sabtah
Antothijah
Ozni
Joshaviah
Elihoreph
Machi
Zephaniah
Heber
Hotham
Shimron
Jalon
Ner
Shemer
Kallai
Jaaziel
Meres
Mahath
Gatam
Elizur
Ishod
Jeshaiah
Jekameam
Eliphal
Peresh
Nedabiah
Aedias
Vaizatha
Parmashta
Ginath
Ishmerai
Gaddi
Peleth
Malchi-shua
Regem
Aharhel
Zabud
Hamul
Jesimiel
Ajah
Ishbah
Laadah
Gideoni
Ammizabad
Assir
Ahilud
Matthanias
Gemariah
Hareph
Pethuel
Areli
Meraioth
Chuza
Neariah
Haran
Hezron
Imri
Meraiah
Haggi
Nahath
Zuriel
Admatha
Jehozabad
Hakkatan
Elmadam
Raddai
Beriah
Huzzab
Naboth
Molid
Tahan
Joash
Japhia
On
Elienai
Elpalet
Hammoleketh
Iru
Ithream
Iri
Jehdeiah
Asiel
Shimshai
Rezon
Hanniel
Hashabiah
Maadiah
Akan
Rakem
Hanoch
Huppim
Hananiah
Baanah
Azgad
Jehubbah
Eliada
Pedahzur
Johanan son of Kareah
Chimham
Ben Abinadab
Helkai
Hasrah
Phaltiel
Pedahel
Zaavan
Melchi
Amaziah
Naum
Anan
Anak
Michri
Nahum
Ir
Jahzerah
Asriel
Elizaphan
Elpelet
Hammedatha
Nahbi
Joelah
Dodavahu
Jeezer
Josibiah
Shaharaim
Elishama
Saph
Tryphosa
Azzan
Nobah
Barachel
Laish
Jushab-hesed
Jonathan son of Kareah
Hushim
Zithri
Shephatiah
Aziel
Naphish
Marsena
Elasah
Jezrahiah
Poratha
Tyrannus
Shisha
Imrah
Ishuai
Paruah
Phurah
Eluzai
Mibhar
Ard
Jekuthiel
Pinon
Phuvah
Chelub
Ahitub
Zippor
Harbona
Jibsam
Jerah
Palti
Abijah
Hothir
Ahian
Hemam
Ben-Ammi
Hiram
Eliadah
Geuel
Gamaliel
Nehum
Merib-baal
Zohar
Jahzeiah
Sabtechah
Shiphtan
Jakeh
Naaman
Azariah
Ahlai
Athaiah
Ezbon
Nogah
Sachia
Eliasaph
Parshandatha
Paltiel
Jeatherai
Reba
Eldaah
Jaasiel
Agee
Letushim
Anaiah
Jezoar
Mnason
Lael
Ismaiah
Barkos
Regem-melech
Abimael
Zidkijah
Ephod
Karshena
Jeziah
Mushi
Ramiah
Zurishaddai
Zerah
Jamin
Obil
Ben Geber
Chenaniah
Sethur
Ishbi-benob
Jesher
Rehabiah
Maaseiah
Akkub
Eshek
Hamor
Jekoliah
Abdon
Shemuel
Adina
Nebuzaradan
Shelemiah
Jehiel
Abiel
Zobebah
Isshijah
Pildash
Uel
Jehoiada
Pagiel
Maher-shalal-hash-baz
Zalmon
Ishijah
Bela
Joahaz
Ahuzzath
Susi
Joseph
Joshua the governor of the city
Peulthai
Hod
Ispah
Hon
Mezahab
Chelluh
Shabbethai
Asareel
Barzillai
Judas of Straight Street in Damascus
Jahaziah
Abiasaph
Massa
Elnathan
Sheshan
Hodaviah
Janai
Hillel
Jakim
Jeriah
Eliphelet
Gishpa
Jemuel
Sarsekim
Hepher
Hathath
Ebed
Shagee
Jonathan son of Abiathar
Michaiah
Chislon
Jachin
Ziphah
Mikloth
Hanameel
Ishmaiah
Hobab
Jahath
Methushael
Leummim
Gamul
Mahol
Jarib
Jaaziah
Abishag
Mahalath
Jaziz
Ahab
Salome #2
Irad
Pelaiah
Damaris. Acts[41]
Hannah
Chalcol
Elienai
Nobah
Hushim
Sabtechah
Agee
Ismaiah
Adina
Isshijah
Zalmon
Abiasaph
Shagee
Chislon
/************ data model **************/
.symbol_type PersonType
.symbol_type DescriptionType
.decl person(name:PersonType, description:DescriptionType)
.input person(filename="DATA/person.facts")
.decl female(name:PersonType)
.input female(filename="DATA/female.facts")
.decl male(name:PersonType)
.input male(filename="DATA/male.facts")
.decl parent_child(p1:PersonType, p2:PersonType)
.input parent_child(filename="DATA/parent_child.facts")
.decl person_living(p:PersonType)
.input person_living(filename="DATA/person_living.facts")
/************* problem 0 **************/
/**** Find all children of Priscilla ****/
.decl p0(x:PersonType, d:DescriptionType)
// NOTE: if you want to redirect the output to a file
// you can use the syntax:
// .output p0(filename="hw2-q3-0.ans")
.output p0(IO=stdout)
p0(x,d) :- parent_child("Priscilla",x), person(x,d).
put your .sql and .txt files in this directory, one file per question.
*.iml
.classpath
.project
bin/
out/
.idea/
log
*.dat
dblp_simpledb.schema
<?xml version="1.0" encoding="UTF-8"?>
<project name="simpledb" default="dist" basedir=".">
<property name="src" location="src"/>
<property name="testd" location="test"/>
<property name="build" location="bin"/>
<property name="build.src" location="${build}/src"/>
<property name="build.test" location="${build}/test"/>
<property name="depcache" location="${build}/depcache"/>
<property name="lib" location="lib"/>
<property name="doc" location="javadoc"/>
<property name="dist" location="dist"/>
<property name="jarfile" location="${dist}/${ant.project.name}.jar"/>
<property name="compile.debug" value="true"/>
<property name="test.reports" location="testreport"/>
<property name="sourceversion" value="1.7"/>
<path id="classpath.base">
<pathelement location="${build.src}"/>
<pathelement location="${lib}/zql.jar"/>
<pathelement location="${lib}/jline-0.9.94.jar"/>
<pathelement location="${lib}/mina-core-2.0.4.jar"/>
<pathelement location="${lib}/mina-filter-compression-2.0.4.jar"/>
<pathelement location="${lib}/slf4j-api-1.6.1.jar"/>
<pathelement location="${lib}/slf4j-log4j12-1.6.1.jar"/>
<pathelement location="${lib}/log4j-1.2.17.jar"/>
<pathelement location="${lib}/jzlib-1.0.7.jar"/>
</path>
<path id="classpath.test">
<path refid="classpath.base"/>
<pathelement location="${build.test}"/>
<pathelement location="${lib}/junit-4.5.jar"/>
<pathelement location="${lib}/javassist-3.16.1-GA.jar"/>
</path>
<!-- Common macro for compiling Java source -->
<macrodef name="Compile">
<attribute name="srcdir"/>
<attribute name="destdir"/>
<element name="compileoptions" implicit="true" optional="true"/>
<sequential>
<mkdir dir="@{destdir}"/>
<!-- avoids needing ant clean when changing interfaces -->
<depend srcdir="${srcdir}" destdir="${destdir}" cache="${depcache}"/>
<javac srcdir="@{srcdir}" destdir="@{destdir}" includeAntRuntime="no"
debug="${compile.debug}" source="${sourceversion}">
<compilerarg value="-Xlint:unchecked" />
<!--<compilerarg value="-Xlint:deprecation" />-->
<compileoptions/>
</javac>
</sequential>
</macrodef>
<!-- Common macro for running junit tests in both the test and runtest targets -->
<macrodef name="RunJunit">
<attribute name="haltonfailure" default="yes" />
<element name="testspecification" implicit="yes" />
<sequential>
<!-- timeout at 10.5 minutes, since TransactionTest is limited to 10 minutes. -->
<junit printsummary="on" fork="yes" timeout="630000" haltonfailure="@{haltonfailure}" maxmemory="128M" failureproperty="junit.failed">
<classpath refid="classpath.test" />
<formatter type="plain" usefile="false"/>
<assertions><enable/></assertions>
<testspecification/>
</junit>
</sequential>
</macrodef>
<taskdef resource="net/sf/antcontrib/antlib.xml">
<classpath>
<pathelement location="lib/ant-contrib-1.0b3.jar"/>
</classpath>
</taskdef>
<target name="eclipse" description="Make current directory an eclipse project">
<echo file=".project" append="false">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;projectDescription&gt;
&lt;name&gt;simpledb&lt;/name&gt;
&lt;comment&gt;&lt;/comment&gt;
&lt;projects&gt;
&lt;/projects&gt;
&lt;buildSpec&gt;
&lt;buildCommand&gt;
&lt;name&gt;org.eclipse.jdt.core.javabuilder&lt;/name&gt;
&lt;arguments&gt;
&lt;/arguments&gt;
&lt;/buildCommand&gt;
&lt;/buildSpec&gt;
&lt;natures&gt;
&lt;nature&gt;org.eclipse.jdt.core.javanature&lt;/nature&gt;
&lt;/natures&gt;
&lt;/projectDescription&gt;</echo>
<echo file=".classpath" append="false">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;classpath&gt;
&lt;classpathentry kind=&quot;src&quot; output=&quot;bin/src&quot; path=&quot;src/java&quot;/&gt;
&lt;classpathentry kind=&quot;src&quot; output=&quot;bin/test&quot; path=&quot;test&quot;/&gt;
&lt;classpathentry kind=&quot;con&quot; path=&quot;org.eclipse.jdt.launching.JRE_CONTAINER&quot;/&gt;
&lt;classpathentry kind=&quot;output&quot; path=&quot;bin/src&quot;/&gt;
</echo>
<if> <available file="${lib}/junit-4.5.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/junit-4.5.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/jline-0.9.94.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/jline-0.9.94.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/zql.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/zql.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/mina-core-2.0.4.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/mina-core-2.0.4.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/mina-filter-compression-2.0.4.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/mina-filter-compression-2.0.4.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/jzlib-1.0.7.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/jzlib-1.0.7.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/slf4j-api-1.6.1.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/slf4j-api-1.6.1.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/slf4j-log4j12-1.6.1.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/slf4j-log4j12-1.6.1.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/log4j-1.2.17.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/log4j-1.2.17.jar&quot;/&gt;
</echo>
</then>
</if>
<if> <available file="${lib}/javassist-3.16.1-GA.jar" /> <then>
<echo file=".classpath" append="true">
&lt;classpathentry kind=&quot;lib&quot; path=&quot;lib/javassist-3.16.1-GA.jar&quot;/&gt;
</echo>
</then>
</if>
<echo file=".classpath" append="true">
&lt;/classpath&gt;
</echo>
</target>
<target name="compile" description="Compile code">
<Compile srcdir="${src}/java" destdir="${build.src}">
<classpath refid="classpath.base"/>
</Compile>
<copy todir="${build}" flatten="true">
<fileset dir="${src}">
<include name="bin/*.sh"/>
</fileset>
</copy>
</target>
<target name="javadocs" description="Build javadoc documentation">
<javadoc destdir="${doc}" access="private" failonerror="true" source="${sourceversion}">
<classpath refid="classpath.base" />
<fileset dir="src/java" defaultexcludes="yes">
<include name="simpledb/**/*.java"/>
</fileset>
</javadoc>
</target>
<target name="dist" depends="compile" description="Build jar">
<mkdir dir="${dist}"/>
<jar jarfile="${jarfile}" basedir="${build.src}">
<manifest>
<attribute name="Main-Class" value="simpledb.SimpleDb"/>
<attribute name="Class-Path" value="../lib/zql.jar ../lib/jline-0.9.94.jar ../lib/jzlib-1.0.7.jar ../lib/mina-core-2.0.4.jar ../lib/mina-filter-compression-2.0.4.jar ../lib/slf4j-api-1.6.1.jar ../lib/slf4j-log4j12-1.6.1.jar ../lib/log4j-1.2.17.jar "/>
</manifest>
<!-- Merge library jars into final jar file -->
<!--<zipgroupfileset refid="lib.jars"/>-->
</jar>
</target>
<target name="clean" description="Remove build and dist directories">
<delete dir="${build}"/>
<delete dir="${dist}"/>
<delete dir="${doc}"/>
<delete dir="${test.reports}"/>
</target>
<target name="testcompile" depends="compile" description="Compile all unit and system tests">
<Compile srcdir="${testd}" destdir="${build.test}">
<classpath refid="classpath.test"/>
</Compile>
</target>
<target name="test" depends="testcompile" description="Run all unit tests">
<RunJunit>
<batchtest>
<fileset dir="${build.test}">
<include name="**/*Test.class"/>
<exclude name="**/*$*.class"/>
<exclude name="simpledb/systemtest/*.class"/>
</fileset>
</batchtest>
</RunJunit>
</target>
<target name="systemtest" depends="testcompile" description="Run all system tests">
<RunJunit>
<batchtest>
<fileset dir="${build.test}">
<include name="simpledb/systemtest/*Test.class"/>
</fileset>
</batchtest>
</RunJunit>
</target>
<target name="runtest" depends="testcompile"
description="Runs the test you specify on the command line with -Dtest=">
<!-- Check for -Dtest command line argument -->
<fail unless="test" message="You must run this target with -Dtest=TestName"/>
<!-- Check if the class exists -->
<available property="test.exists" classname="simpledb.${test}">
<classpath refid="classpath.test" />
</available>
<fail unless="test.exists" message="Test ${test} could not be found"/>
<RunJunit>
<test name="simpledb.${test}"/>
</RunJunit>
</target>
<target name="runsystest" depends="testcompile"
description="Runs the system test you specify on the command line with -Dtest=">
<!-- Check for -Dtest command line argument -->
<fail unless="test" message="You must run this target with -Dtest=TestName"/>
<!-- Check if the class exists -->
<available property="test.exists" classname="simpledb.systemtest.${test}">
<classpath refid="classpath.test" />
</available>
<fail unless="test.exists" message="Test ${test} could not be found"/>
<RunJunit>
<test name="simpledb.systemtest.${test}"/>
</RunJunit>
</target>
<!-- The following target is used for automated grading. -->
<target name="test-report" depends="testcompile"
description="Generates HTML test reports in ${test.reports}">
<mkdir dir="${test.reports}"/>
<!-- do not halt on failure so we always produce HTML reports. -->
<RunJunit haltonfailure="no">
<formatter type="xml"/>
<formatter type="plain" usefile="true"/>
<batchtest todir="${test.reports}" >
<fileset dir="${build.test}">
<include name="**/*Test.class"/>
<exclude name="**/*$*.class"/>
</fileset>
</batchtest>
</RunJunit>
<junitreport todir="${test.reports}">
<fileset dir="${test.reports}">
<include name="TEST-*.xml" />
</fileset>
<report todir="${test.reports}" />
</junitreport>
<!-- Fail here if the junit tests failed. -->
<fail if="junit.failed" message="Some JUnit tests failed"/>
</target>
<target name="handin" depends="clean"
description="Create a tarball of your code to hand in">
<tar destfile="lab-handin.tar.bz2" compression="bzip2"
basedir="." />
<echo message="Tarball created! Please submit 'lab-handin.tar.bz2' per the instructions in the lab document." />
<subant target="dist">
<fileset dir="." includes="build.xml"/>
</subant>
</target>
<target name="test-and-handin" depends="test,systemtest,handin"
description="Run all the tests and system tests; if they succeed, create a tarball of the source code to submit" />
</project>
JZlib 0.0.* were released under the GNU LGPL license. Later, we have switched
over to a BSD-style license.
------------------------------------------------------------------------------
Copyright (c) 2000,2001,2002,2003 ymnk, JCraft,Inc. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the distribution.
3. The names of the authors may not be used to endorse or promote products
derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL JCRAFT,
INC. OR ANY CONTRIBUTORS TO THIS SOFTWARE BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.