diff --git a/hw/hw3/hw3.md b/hw/hw3/hw3.md new file mode 100644 index 0000000000000000000000000000000000000000..0f24a8d813898374a80db02dc947e653ab4d32f0 --- /dev/null +++ b/hw/hw3/hw3.md @@ -0,0 +1,954 @@ +CSE 544 Homework 3: SimpleDB +============================ + +**Objectives:** + +To get experience implementing the internals of a DBMS. + +**Assignment tools:** + +apache ant and your favorite code editor + +**Assigned date:** Jan 28, 2018 + +**Due date:** Feb. 16, 2018 + +**What to turn in:** + +See below. + +**Starter code:** + +In your `hw/hw3/starter-code` folder + +Acknowledgement +--------------- + +This assignment comes from Prof. Sam Madden's 6.830 class at MIT. + +The full series of SimpleDB assignments includes what we will do in this +homework, which is to build the basic functionality for query +processing. It also includes transactions and query optimization, which +we will NOT do. + +We also use this series of assignments in [CSE +444](http://courses.cs.washington.edu/courses/cse444/). We have +contributed bug fixes and an extra lab to SimpleDB. The lab that we +added involves building a parallel, shared-nothing version of SimpleDB. +We invite you to take a look at the CSE 444 course webpage to see what +all the SimpleDB labs are about. + + +Assignment goal +--------------- + +In this assignment, you will write a basic database management system +called SimpleDB. First, you will implement the core modules required to +access stored data on disk. You will then write a set of operators for +SimpleDB to implement selections, joins, and aggregates. The end result +is a database system that can perform simple queries over multiple +tables. We will not ask you to add transactions, locking, and concurrent +queries because we do not have time to do the full project in 544. +However, we invite you to think how you would add such functionality +into the system. + +SimpleDB is written in Java. We have provided you with a set of mostly +unimplemented classes and interfaces. You will need to write the code +for these classes. We will grade your code by running a set of system +tests written using [JUnit](http://www.junit.org/). We have also +provided a number of unit tests, which we will not use for grading but +that you may find useful in verifying that your code works. Note that +the unit tests we provide are to help guide your implementation along, +but they are not intended to be comprehensive or to establish +correctness. + +The remainder of this document describes the basic architecture of +SimpleDB, gives some suggestions about how to start coding, and +discusses how to hand in your assignment. + +We **strongly recommend** that you start as early as possible on this +assignment. It requires you to write a fair amount of code! + +0. Find bugs, be patient, earn candy bars +----------------------------------------- + +SimpleDB is a relatively complex piece of code. It is very possible you +are going to find bugs, inconsistencies, and bad, outdated, or incorrect +documentation, etc. + +We ask you, therefore, to do this assignment with an adventurous +mindset. Don't get mad if something is not clear, or even wrong; rather, +try to figure it out yourself or send us a friendly email. We promise to +help out by posting bug fixes, new tarballs, etc., as bugs and issues +are reported. + +1. Getting started +------------------ + +These instructions are written for any Unix-based platform (e.g., Linux, +MacOS, etc.). 
Because the code is written in Java, it should work under
+Windows as well, though the directions in this document may not apply.
+
+We have included Section 1.2 on using the project with Eclipse and Intellij. Using
+those IDEs is recommended, especially if you are on Windows.
+
+Pull the latest changes from upstream to your local master branch
+(the upstream is the one you added in
+[hw1](https://gitlab.cs.washington.edu/suciu/cse544-2018wi/blob/master/hw/hw1/hw1.md))
+
+```bash
+$ git pull upstream master
+```
+
+SimpleDB uses the [Ant build tool](http://ant.apache.org/) to compile
+the code and run tests. Ant is similar to
+[make](http://www.gnu.org/software/make/manual/), but the build file is
+written in XML and is somewhat better suited to Java code. Most modern
+Linux distributions include Ant.
+
+To help you during development, we have provided a set of unit tests in
+addition to the end-to-end tests that we use for grading. These are by
+no means comprehensive, and you should not rely on them exclusively to
+verify the correctness of your project.
+
+To run the unit tests, use the test build target:
+
+```bash
+$ cd hw/hw3/starter-code
+$ # run all unit tests
+$ ant test
+$ # run a specific unit test
+$ ant runtest -Dtest=TupleTest
+```
+
+You should see output similar to:
+
+```bash
+# build output...
+
+test:
+    [junit] Running simpledb.TupleTest
+    [junit] Testsuite: simpledb.TupleTest
+    [junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 0.036 sec
+    [junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 0.036 sec
+
+# ... stack traces and error reports ...
+```
+
+The output above indicates that three errors occurred while running the
+tests; this is because the code we have given you doesn't yet
+work. As you complete parts of the assignment, you will work towards
+passing additional unit tests. If you wish to write new unit tests as
+you code, they should be added to the test/simpledb directory.
+
+For more details about how to use Ant, see the
+[manual](http://ant.apache.org/manual/). The [Running
+Ant](http://ant.apache.org/manual/running.html) section provides details
+about using the ant command. However, the quick reference table below
+should be sufficient for working on the assignments.
+
+ - `ant` Build the default target (for simpledb, this is dist).
+ - `ant eclipse` Make the project an Eclipse project.
+ - `ant -projecthelp` List all the targets in `build.xml` with descriptions.
+ - `ant dist` Compile the code in src and package it in `dist/simpledb.jar`.
+ - `ant test` Compile and run all the unit tests.
+ - `ant runtest -Dtest=testname` Run the unit test named `testname`.
+ - `ant systemtest` Compile and run all the system tests.
+ - `ant runsystest -Dtest=testname` Compile and run the system test named `testname`.
+
+### 1.1. Running end-to-end tests
+
+We have also provided a set of end-to-end tests that will eventually be
+used for grading. These tests are structured as JUnit tests that live in
+the test/simpledb/systemtest directory. To run all the system tests, use
+the systemtest build target:
+
+```bash
+$ ant systemtest
+
+# ... build output ...
+
+systemtest:
+
+[junit] Running simpledb.systemtest.ScanTest
+    [junit] Testsuite: simpledb.systemtest.ScanTest
+    [junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 0.237 sec
+    [junit] Tests run: 3, Failures: 0, Errors: 3, Time elapsed: 0.237 sec
+    [junit]
+    [junit] Testcase: testSmall took 0.017 sec
+    [junit]     Caused an ERROR
+    [junit] implement this
+    [junit] java.lang.UnsupportedOperationException: implement this
+    [junit]     at simpledb.HeapFile.id(HeapFile.java:46)
+    [junit]     at simpledb.systemtest.SystemTestUtil.matchTuples(SystemTestUtil.java:90)
+    [junit]     at simpledb.systemtest.SystemTestUtil.matchTuples(SystemTestUtil.java:83)
+    [junit]     at simpledb.systemtest.ScanTest.validateScan(ScanTest.java:30)
+    [junit]     at simpledb.systemtest.ScanTest.testSmall(ScanTest.java:41)
+
+# ... more error messages ...
+```
+
+This indicates that this test failed, showing the stack trace where the
+error was detected. To debug, start by reading the source code where the
+error occurred. When the tests pass, you will see something like the
+following:
+
+```bash
+$ ant systemtest
+
+# ... build output ...
+
+    [junit] Testsuite: simpledb.systemtest.ScanTest
+    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 7.278 sec
+    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 7.278 sec
+    [junit]
+    [junit] Testcase: testSmall took 0.937 sec
+    [junit] Testcase: testLarge took 5.276 sec
+    [junit] Testcase: testRandom took 1.049 sec
+
+BUILD SUCCESSFUL
+Total time: 52 seconds
+```
+
+### 1.1.1 Creating dummy tables
+
+It is likely you'll want to create your own tests and your own data
+tables to test your own implementation of SimpleDB. You can create any
+.txt file and convert it to a .dat file in SimpleDB's HeapFile format
+using the command:
+
+```bash
+$ ant dist
+
+$ java -jar dist/simpledb.jar convert file.txt N
+```
+
+where file.txt is the name of the file and N is the number of columns in
+the file. Notice that file.txt has to be in the following format:
+
+```
+int1,int2,...,intN
+int1,int2,...,intN
+int1,int2,...,intN
+int1,int2,...,intN
+```
+
+...where each intN is a non-negative integer.
+
+To view the contents of a table, use the print command. Note that this
+command will not work until later in the assignment:
+
+```bash
+$ java -jar dist/simpledb.jar print file.dat N
+```
+
+where file.dat is the name of a table created with the convert command,
+and N is the number of columns in the file.
+
+### 1.1.2 Debugging
+
+You can get some debug messages printed out while running tests if you
+add
+
+    -Dsimpledb.Debug
+
+as one of the arguments to the Java VM.
+
+
+### 1.2. Working in Eclipse and Intellij (Optional)
+
+IDEs like [Eclipse](http://www.eclipse.org/) or [Intellij](https://www.jetbrains.com/idea/)
+are graphical software development
+environments that you might be more comfortable working in.
+
+**Setting the Assignment Up in Eclipse**
+
+- `cd hw/hw3/starter-code` and then run `ant eclipse`
+- With Eclipse running, select File-\>Open Projects from File System
+  (or File-\>Import, select "Existing Projects into Workspace" under "General")
+- Select the starter-code folder. It should contain build.xml.
+- Click Finish, and you should be able to see "simpledb" as a new project
+  in the Project Explorer tab on the left-hand side of your screen (if
+  you just installed Eclipse, make sure to close the "Welcome"
+  window).
  Opening this project reveals the directory structure
+  discussed above - implementation code can be found in "src," and
+  unit tests and system tests found in "test."
+
+**Setting the Assignment Up in Intellij**
+
+- `cd hw/hw3/starter-code` and then run `ant eclipse` (The TA is too lazy to
+  create a separate ant target for Intellij, so we are just going to import
+  the Eclipse project into Intellij)
+- Open Intellij, and select "Import Project" if you are in the startup panel,
+  or go to File-\>New-\>Project from Existing Sources
+- Select the starter-code folder then hit "OK"
+- On the first prompt window, make sure "Eclipse" is selected
+- Hit "Next" until no more prompt windows show up. Leave all the suggested
+  options as-is
+
+**Running Individual Unit and System Tests**
+
+To run a unit test or system test (both are JUnit tests, and can be
+initialized the same way), go to the Package Explorer tab on the left
+side of your screen. In your IDE, open the "test" directory.
+Unit tests are found in the "simpledb" package, and system tests are
+found in the "simpledb.systemtest" package. To run one of these tests,
+select the test (they are all called `*Test.java` - don't select
+`TestUtil.java` or `SystemTestUtil.java`), right click on it, in Eclipse,
+select "Run As," and select "JUnit Test." In Intellij, simply select the
+"Run '...Test.java'" option. This will bring up a JUnit tab, which will
+tell you the status of the individual tests within the JUnit test suite,
+and will show you exceptions and other errors that will help you debug
+problems.
+
+**Running Ant Build Targets**
+
+If you want to run commands such as `ant test` or `ant systemtest`,
+- In Eclipse:
+  - right click on build.xml in the Package Explorer.
+  - Select "Run As" and then "Ant Build..." (note: select the option with the ellipsis (...),
+    otherwise you won't be presented with a set of build targets to run).
+  - Then, in the "Targets" tab of the next screen, check off the targets you
+    want to run (probably "dist" and one of "test" or "systemtest"). This
+    should run the build targets and show you the results in Eclipse's
+    console window.
+- In Intellij:
+  - right click on build.xml in the Package Explorer.
+  - Select "Add as Ant Build File"
+  - Now you should be able to see the available commands on the Ant Build panel
+
+
+### 1.3. Implementation hints
+
+Before beginning to write code, we **strongly encourage** you to read
+through this entire document to get a feel for the high-level design of
+SimpleDB.
+
+You will need to fill in any piece of code that is not implemented. It
+will be obvious where we think you should write code. You may need to
+add private methods and/or helper classes. You may change APIs, but make
+sure our grading tests still run and make sure to mention, explain, and
+defend your decisions in your writeup.
+
+In addition to the methods that you need to fill out for this
+assignment, the class interfaces contain numerous methods that you need
+not implement in this assignment. These will either be indicated per
+class:
+
+```java
+// Not necessary for this assignment
+public class Insert implements DbIterator {
+```
+
+or per method:
+
+```java
+public boolean deleteTuple(Tuple t) throws DbException {
+
+    // Not necessary for this assignment
+    return false;
+}
+```
+
+The code that you submit should compile without having to modify these
+methods.
+
+We suggest exercises throughout this document to guide your implementation,
+but you may find that a different order makes more sense for you. Here's
+a rough outline of one way you might proceed with your SimpleDB
+implementation:
+
+- Implement the classes to manage tuples, namely `Tuple`, `TupleDesc`. We
+  have already implemented `Field`, `IntField`, `StringField`, and `Type` for
+  you. Since you only need to support integer and (fixed length)
+  string fields and fixed length tuples, these are straightforward.
+- Implement the `Catalog` (this should be very simple).
+- Implement the `BufferPool` constructor and the `getPage()` method.
+- Implement the access methods, `HeapPage` and `HeapFile`, and associated
+  ID classes. A good portion of these files has already been written
+  for you.
+- Implement the operator `SeqScan`.
+- At this point, you should be able to pass the `ScanTest` system test.
+- Implement the operators `Filter` and `Join` and verify
+  that their corresponding tests work. The Javadoc comments for these
+  operators contain details about how they should work. We have given
+  you implementations of `Project` and `OrderBy` which
+  may help you understand how other operators work.
+- (Extra credit) Implement `IntegerAggregator` and
+  `StringAggregator`. Here, you will write the logic that
+  actually computes an aggregate over a particular field across
+  multiple groups in a sequence of input tuples. Use integer division
+  for computing the average, since SimpleDB only supports integers.
+  `StringAggregator` only needs to support the `COUNT` aggregate, since the
+  other operations do not make sense for strings.
+- (Extra credit) Implement the `Aggregate` operator. As with
+  other operators, aggregates implement the `DbIterator`
+  interface so that they can be placed in SimpleDB query plans. Note
+  that the output of an `Aggregate` operator is an aggregate
+  value of an entire group for each call to `next()`, and that
+  the aggregate constructor takes the aggregation and grouping fields.
+- (Extra credit) Use the provided parser to run some queries, and
+  report your query execution times.
+
+At this point you should be able to pass all of the tests in the ant
+`systemtest` target, which is the goal of this homework. Section
+2 below walks you through these implementation steps and the unit tests
+corresponding to each one in more detail.
+
+### 1.4. Transactions, locking, and recovery
+
+As you look through the interfaces that we have provided you, you will
+see a number of references to locking, transactions, and recovery. You
+do not need to support these features. We will not be implementing this
+part of SimpleDB in 544. The test code we have provided you with
+generates a fake transaction ID that is passed into the operators of the
+query it runs; you should pass this transaction ID into other operators
+and the buffer pool.
+
+2. SimpleDB Architecture and Implementation Guide
+-------------------------------------------------
+
+SimpleDB consists of:
+
+- Classes that represent fields, tuples, and tuple schemas;
+- Classes that apply predicates and conditions to tuples;
+- One or more access methods (e.g., heap files) that store relations
+  on disk and provide a way to iterate through tuples of those
+  relations;
+- A collection of operator classes (e.g., select, join, insert,
+  delete, etc.)
that process tuples;
+- A buffer pool that caches active tuples and pages in memory and
+  handles concurrency control and transactions (neither of which you
+  need to worry about for this homework); and,
+- A catalog that stores information about available tables and their
+  schemas.
+
+SimpleDB does not include many things that you may think of as being a
+part of a "database." In particular, SimpleDB does not have:
+
+- A SQL front end or parser that allows you to type queries directly
+  into SimpleDB. Instead, queries are built up by chaining a set of
+  operators together into a hand-built query plan (see [Section
+  2.7](#query_walkthrough)). We provide a simple parser for you to use
+  if you would like to work on the extra credit problems (see below).
+- Views.
+- Data types other than integers and fixed length strings.
+- A query optimizer.
+- Indices.
+
+In the rest of this section, we describe each of the main components of
+SimpleDB that you will need to implement in this homework. You should
+use the exercises in this discussion to guide your implementation. This
+document is by no means a complete specification for SimpleDB; you will
+need to make decisions about how to design and implement various parts
+of the system.
+
+### 2.1. The Database Class
+
+The Database class provides access to a collection of static objects
+that are the global state of the database. In particular, this includes
+methods to access the catalog (the list of all the tables in the
+database), the buffer pool (the collection of database file pages that
+are currently resident in memory), and the log file. You will not need
+to worry about the log file in this homework. We have implemented the
+Database class for you. You should take a look at this file as you will
+need to access these objects.
+
+### 2.2. Fields and Tuples
+
+Tuples in SimpleDB are quite basic. They consist of a collection of
+`Field` objects, one per field in the `Tuple`. `Field` is an interface
+that different data types (e.g., integer, string) implement. `Tuple`
+objects are created by the underlying access methods (e.g., heap files,
+or B-trees), as described in the next section. Tuples also have a type
+(or schema), called a *tuple descriptor*, represented by a `TupleDesc`
+object. This object consists of a collection of `Type` objects, one per
+field in the tuple, each of which describes the type of the
+corresponding field.
+
+**Exercise 1.** Implement the skeleton methods in:
+
+- `src/java/simpledb/TupleDesc.java`
+- `src/java/simpledb/Tuple.java`
+
+At this point, your code should pass the unit tests `TupleTest` and
+`TupleDescTest`. The `modifyRecordId()` test should still fail because
+you haven't implemented it yet.
+
+### 2.3. Catalog
+
+The catalog (class `Catalog` in SimpleDB) consists of a list of the
+tables and schemas of the tables that are currently in the database. You
+will need to support the ability to add a new table, as well as to get
+information about a particular table. Associated with each table is a
+`TupleDesc` object that allows operators to determine the types and
+number of fields in a table.
+
+The global catalog is a single instance of `Catalog` that is allocated
+for the entire SimpleDB process. The global catalog can be retrieved via
+the method `Database.getCatalog()`, and the same goes for the global
+buffer pool (using `Database.getBufferPool()`).
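+
+To make this concrete, here is a minimal, hedged sketch of registering a
+table and looking its schema back up through the global catalog. It is
+illustrative only: the constructor and method names shown
+(`TupleDesc`, `Catalog.addTable()`, `Catalog.getTableId()`,
+`Catalog.getTupleDesc()`, `HeapFile`) follow the javadocs in the starter
+code, so verify the exact signatures in your version before relying on
+them.
+
+```java
+import java.io.File;
+import simpledb.*;
+
+public class CatalogSketch {
+    public static void main(String[] args) {
+        // A two-column integer schema, like the tables produced by `convert`.
+        TupleDesc td = new TupleDesc(
+                new Type[] { Type.INT_TYPE, Type.INT_TYPE },
+                new String[] { "f1", "f2" });
+
+        // Wrap an on-disk heap file with that schema and register it
+        // under the name "data" in the global catalog.
+        HeapFile table = new HeapFile(new File("data.dat"), td);
+        Database.getCatalog().addTable(table, "data");
+
+        // Operators can later recover the table and its schema by name.
+        int tableId = Database.getCatalog().getTableId("data");
+        System.out.println(Database.getCatalog().getTupleDesc(tableId));
+    }
+}
+```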
+
+**Exercise 2.** Implement the skeleton methods in:
+
+- `src/java/simpledb/Catalog.java`
+
+At this point, your code should pass the unit tests in `CatalogTest`.
+
+### 2.4. BufferPool
+
+The buffer pool (class `BufferPool` in SimpleDB) is responsible for
+caching pages in memory that have been recently read from disk. All
+operators read and write pages from various files on disk through the
+buffer pool. It consists of a fixed number of pages, defined by the
+`numPages` parameter to the `BufferPool` constructor. For this homework,
+implement the constructor and the `BufferPool.getPage()` method used by
+the `SeqScan` operator. The BufferPool should store up to `numPages`
+pages. If more than `numPages` requests are made for different pages,
+then instead of implementing an eviction policy, you may throw a
+`DbException`.
+
+The `Database` class provides a static method,
+`Database.getBufferPool()`, that returns a reference to the single
+BufferPool instance for the entire SimpleDB process.
+
+**Exercise 3.** Implement the `getPage()` method in:
+
+- `src/java/simpledb/BufferPool.java`
+
+We have not provided unit tests for BufferPool. The functionality you
+implement will be tested in the implementation of HeapFile below. You
+should use the `DbFile.readPage` method to access pages of a DbFile.
+
+### 2.5. HeapFile access method
+
+Access methods provide a way to read or write data from disk that is
+arranged in a specific way. Common access methods include heap files
+(unsorted files of tuples) and B-trees; for this assignment, you will
+only implement a heap file access method, and we have written some of
+the code for you.
+
+A `HeapFile` object is arranged into a set of pages, each of which
+consists of a fixed number of bytes for storing tuples (defined by the
+constant `BufferPool.PAGE_SIZE`), including a header. In SimpleDB, there
+is one `HeapFile` object for each table in the database. Each page in a
+`HeapFile` is arranged as a set of slots, each of which can hold one
+tuple (tuples for a given table in SimpleDB are all of the same size).
+In addition to these slots, each page has a header that consists of a
+bitmap with one bit per tuple slot. If the bit corresponding to a
+particular tuple is 1, it indicates that the tuple is valid; if it is 0,
+the tuple is invalid (e.g., has been deleted or was never initialized.)
+Pages of `HeapFile` objects are of type `HeapPage` which implements the
+`Page` interface. Pages are stored in the buffer pool but are read and
+written by the `HeapFile` class.
+
+SimpleDB stores heap files on disk in more or less the same format they
+are stored in memory. Each file consists of page data arranged
+consecutively on disk. Each page consists of one or more bytes
+representing the header, followed by the
+`BufferPool.PAGE_SIZE - # header bytes ` bytes of actual page content.
+Each tuple requires *tuple size* \* 8 bits for its content and 1 bit for
+the header. Thus, the number of tuples that can fit in a single page is:
+
+` tupsPerPage = floor((BufferPool.PAGE_SIZE * 8) / (tuple size * 8 + 1))`
+
+where *tuple size* is the size of a tuple in the page in bytes. The idea
+here is that each tuple requires one additional bit of storage in the
+header. We compute the number of bits in a page (by multiplying
+`PAGE_SIZE` by 8), and divide this quantity by the number of bits in a
+tuple (including this extra header bit) to get the number of tuples per
+page.
The floor operation rounds down to the nearest integer number of
+tuples (we don't want to store partial tuples on a page!)
+
+Once we know the number of tuples per page, the number of bytes required
+to store the header is simply:
+
+` headerBytes = ceiling(tupsPerPage/8)`
+
+The ceiling operation rounds up to the nearest integer number of bytes
+(we never store less than a full byte of header information.) For
+example, assuming the 4096-byte pages used by the starter code's
+`BufferPool.PAGE_SIZE` and 8-byte tuples (two integers), this gives
+`tupsPerPage = floor(32768 / 65) = 504` and
+`headerBytes = ceiling(504 / 8) = 63`.
+
+The low (least significant) bits of each byte represent the status of
+the slots that are earlier in the page. Hence, the lowest bit of the
+first byte represents whether or not the first slot in the page is in
+use. Also, note that the high-order bits of the last byte may not
+correspond to a slot that is actually in the page, since the number of
+slots may not be a multiple of 8. Also note that all Java virtual
+machines are [big-endian](http://en.wikipedia.org/wiki/Endianness).
+
+**Exercise 4.** Implement the skeleton methods in:
+
+- `src/java/simpledb/HeapPageId.java`
+- `src/java/simpledb/RecordId.java`
+- `src/java/simpledb/HeapPage.java`
+
+Although you will not use them directly in this lab, we ask you to
+implement `getNumEmptySlots()` and `isSlotUsed()` in `HeapPage`. These
+require pushing around bits in the page header. You may find it helpful
+to look at the other methods that have been provided in `HeapPage` or in
+`src/java/simpledb/HeapFileEncoder.java` to understand the layout of
+pages.
+
+You will also need to implement an Iterator over the tuples in the page,
+which may involve an auxiliary class or data structure.
+
+At this point, your code should pass the unit tests in `HeapPageIdTest`,
+`RecordIdTest`, and `HeapPageReadTest`.
+
+After you have implemented `HeapPage`, you will write methods for
+`HeapFile` in this homework to calculate the number of pages in a file
+and to read a page from the file. You will then be able to fetch tuples
+from a file stored on disk.
+
+**Exercise 5.** Implement the skeleton methods in:
+
+- `src/java/simpledb/HeapFile.java`
+
+To read a page from disk, you will first need to calculate the correct
+offset in the file. Hint: you will need random access to the file in
+order to read and write pages at arbitrary offsets. You should not call
+BufferPool methods when reading a page from disk.
+
+You will also need to implement the `HeapFile.iterator()` method, which
+should iterate through the tuples of each page in the HeapFile.
+The iterator must use the `BufferPool.getPage()` method to access pages
+in the `HeapFile`. This method loads the page into the buffer pool. Do
+not load the entire table into memory on the `open()` call -- this will
+cause an out of memory error for very large tables.
+
+At this point, your code should pass the unit tests in `HeapFileReadTest`.
+
+### 2.6. Operators
+
+Operators are responsible for the actual execution of the query plan.
+They implement the operations of the relational algebra. In SimpleDB,
+operators are iterator based; each operator implements the `DbIterator`
+interface.
+
+Operators are connected together into a plan by passing lower-level
+operators into the constructors of higher-level operators, i.e., by
+'chaining them together.' Special access method operators at the leaves
+of the plan are responsible for reading data from the disk (and hence do
+not have any operators below them). A sketch of such a hand-built plan
+appears below.
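+
+As an illustration of this chaining, here is a hedged sketch of a
+two-operator plan (a `Filter` over a `SeqScan`) corresponding to
+`SELECT * FROM data WHERE f1 > 1`. The constructor signatures follow
+the javadocs in the starter code; double-check them in your version.
+The sketch assumes a table named "data" has already been registered in
+the catalog, as in the earlier catalog example.
+
+```java
+import simpledb.*;
+
+public class PlanSketch {
+    public static void main(String[] args) throws Exception {
+        int tableId = Database.getCatalog().getTableId("data");
+
+        TransactionId tid = new TransactionId();       // fake tid; see Section 1.4
+        SeqScan scan = new SeqScan(tid, tableId, "d"); // leaf: reads pages from disk
+        Predicate p = new Predicate(0, Predicate.Op.GREATER_THAN, new IntField(1));
+        Filter filter = new Filter(p, scan);           // root: drops non-matching tuples
+
+        // Pull tuples through the plan by iterating over the root.
+        filter.open();
+        while (filter.hasNext()) {
+            System.out.println(filter.next());
+        }
+        filter.close();
+    }
+}
+```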
+
+At the top of the plan, the program interacting with SimpleDB simply
+calls `getNext` on the root operator; this operator then calls `getNext`
+on its children, and so on, until these leaf operators are called. They
+fetch tuples from disk and pass them up the tree (as return arguments to
+`getNext`); tuples propagate up the plan in this way until they are
+output at the root or combined or rejected by another operator in the
+plan.
+
+#### 2.6.1. Scan
+
+**Exercise 6.** Implement the skeleton methods in:
+
+- `src/java/simpledb/SeqScan.java`
+
+This operator sequentially scans all of the tuples from the pages of the
+table specified by the `tableid` in the constructor. This operator
+should access tuples through the `DbFile.iterator()` method.
+
+At this point, you should be able to complete the `ScanTest` system test.
+Good work!
+
+#### 2.6.2. Filter and Join
+
+Recall that SimpleDB DbIterator classes implement the operations of the
+relational algebra. You will now implement two operators that will
+enable you to perform queries that are slightly more interesting than a
+table scan.
+
+- *Filter*: This operator only returns tuples that satisfy a
+  `Predicate` that is specified as part of its constructor. Hence, it
+  filters out any tuples that do not match the predicate.
+- *Join*: This operator joins tuples from its two children according
+  to a `JoinPredicate` that is passed in as part of its constructor.
+  We only require a simple nested loops join, but you may explore more
+  interesting join implementations. Describe your implementation in
+  your writeup.
+
+**Exercise 7.** Implement the skeleton methods in:
+
+- `src/simpledb/Predicate.java`
+- `src/simpledb/JoinPredicate.java`
+- `src/simpledb/Filter.java`
+- `src/simpledb/Join.java`
+
+At this point, your code should pass the unit tests in `PredicateTest`,
+`JoinPredicateTest`, `FilterTest`, and `JoinTest`. Furthermore, you should be
+able to pass the system tests `FilterTest` and `JoinTest`.
+
+#### 2.6.3. Aggregates (EXTRA CREDIT)
+
+**All the material in this section is optional and will count only as
+extra credit.**
+
+An additional SimpleDB operator implements basic SQL aggregates with a
+`GROUP BY` clause. You should implement the five SQL aggregates
+(`COUNT`, `SUM`, `AVG`, `MIN`, `MAX`) and support grouping. You only
+need to support aggregates over a single field, and grouping by a single
+field.
+
+In order to calculate aggregates, we use an `Aggregator` interface which
+merges a new tuple into the existing calculation of an aggregate. The
+`Aggregator` is told during construction what operation it should use
+for aggregation. Subsequently, the client code should call
+`Aggregator.mergeTupleIntoGroup()` for every tuple in the child
+iterator. After all tuples have been merged, the client can retrieve a
+DbIterator of aggregation results. Each tuple in the result is a pair of
+the form `(groupValue, aggregateValue)`, unless the value of the group
+by field was `Aggregator.NO_GROUPING`, in which case the result is a
+single tuple of the form `(aggregateValue)`.
+
+Note that this implementation requires space linear in the number of
+distinct groups. For the purposes of this homework, you do not need to
+worry about the situation where the number of groups exceeds available
+memory.
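+
+The merge-then-iterate protocol described above might be driven roughly
+as in the following hedged sketch, which computes a count of field 1
+grouped by field 0 over some child iterator. The
+`IntegerAggregator(gbfield, gbfieldtype, afield, op)` constructor shown
+here follows the javadoc in the starter code; double-check the parameter
+order in your version.
+
+```java
+import simpledb.*;
+
+public class AggregationSketch {
+    // Returns an iterator over (groupValue, count) result tuples.
+    static DbIterator countByGroup(DbIterator child) throws Exception {
+        Aggregator agg =
+            new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.COUNT);
+
+        child.open();
+        while (child.hasNext()) {
+            agg.mergeTupleIntoGroup(child.next()); // fold each tuple into its group
+        }
+        child.close();
+
+        return agg.iterator();
+    }
+}
+```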
+
+**Exercise 8.** Implement the skeleton methods in:
+
+- `src/simpledb/IntegerAggregator.java`
+- `src/simpledb/StringAggregator.java`
+- `src/simpledb/Aggregate.java`
+
+At this point, your code should pass the unit tests
+`IntegerAggregatorTest`, `StringAggregatorTest`, and `AggregateTest`.
+Furthermore, you should be able to pass the `AggregateTest` system test.
+
+### 2.7. Query Parser and Contest (EXTRA CREDIT)
+
+**All the material in this section is optional and will count only as
+extra credit.**
+
+We've provided you with a query parser for SimpleDB that you can use to
+write and run SQL queries against your database once you have completed
+the exercises in this homework.
+
+The first step is to create some data tables and a catalog. Suppose you
+have a file `data.txt` with the following contents:
+
+    1,10
+    2,20
+    3,30
+    4,40
+    5,50
+    5,50
+
+You can convert this into a SimpleDB table using the `convert` command
+(make sure to type `ant` first!):
+
+    java -jar dist/simpledb.jar convert data.txt 2 "int,int"
+
+This creates a file `data.dat`. In addition to the table's raw data, the
+two additional parameters specify that each record has two fields and
+that their types are `int` and `int`.
+
+Next, create a catalog file, `catalog.txt`, with the following contents:
+
+    data (f1 int, f2 int)
+
+This tells SimpleDB that there is one table, `data` (stored in
+`data.dat`) with two integer fields named `f1` and `f2`.
+
+Then, invoke the parser. You must run java from the command line (ant
+doesn't work properly with interactive targets.) From the `simpledb/`
+directory, type:
+
+    java -jar dist/simpledb.jar parser catalog.txt
+
+You should see output like:
+
+    Added table : data with schema INT(f1), INT(f2),
+    SimpleDB>
+
+Finally, you can run a query:
+
+    SimpleDB> select d.f1, d.f2 from data d;
+    Started a new transaction tid = 1221852405823
+    ADDING TABLE d(data) TO tableMap
+    TABLE HAS  tupleDesc INT(d.f1), INT(d.f2),
+    1 10
+    2 20
+    3 30
+    4 40
+    5 50
+    5 50
+
+     6 rows.
+    ----------------
+    0.16 seconds
+
+    SimpleDB>
+
+The parser is relatively full featured (including support for SELECTs,
+INSERTs, DELETEs, and transactions), but does have some problems and
+does not necessarily report completely informative error messages. Here
+are some limitations to bear in mind:
+
+- You must preface every field name with its table name, even if the
+  field name is unique (you can use table name aliases, as in the
+  example above, but you cannot use the AS keyword.)
+- Nested queries are supported in the WHERE clause, but not the FROM
+  clause.
+- No arithmetic expressions are supported (for example, you can't take
+  the sum of two fields.)
+- At most one GROUP BY and one aggregate column are allowed.
+- Set-oriented operators like IN, UNION, and EXCEPT are not allowed.
+- Only AND expressions in the WHERE clause are allowed.
+- UPDATE expressions are not supported.
+- The string operator LIKE is allowed, but must be written out fully
+  (that is, the Postgres tilde [\~] shorthand is not allowed.)
+
+**Exercise 9: Please execute the three queries below using your SimpleDB
+prototype and report the times in your homework write-up.**
+
+We have built a SimpleDB-encoded version of the DBLP database; the
+needed files are located at:
+[http://www.cs.washington.edu/education/courses/cse544/15au/hw/hw2/dblp\_data.tar.gz](http://www.cs.washington.edu/education/courses/cse544/15au/hw/hw2/dblp_data.tar.gz)
+
+You should download the file and unpack it.
It will create four files in
+the `dblp_data` directory. Move them into the `simpledb` directory. The
+following commands will accomplish this, if you run them from the
+`simpledb` directory:
+
+```bash
+ $ wget http://www.cs.washington.edu/education/courses/cse544/15au/hw/hw2/dblp_data.tar.gz
+ $ tar xvzf dblp_data.tar.gz
+ $ mv dblp_data/* .
+ $ rm -r dblp_data.tar.gz dblp_data
+```
+
+You can then run the parser with:
+```bash
+ $ java -jar dist/simpledb.jar parser dblp_simpledb.schema
+```
+We will start a thread on the course message board inviting anyone
+interested to post their runtimes for the following three queries
+(please run the queries on a lab machine and indicate which one you used
+so it becomes easier to compare runtimes). The contest is just for fun.
+It will not affect your grade:
+
+1.
+
+        SELECT p.title
+        FROM papers p
+        WHERE p.title LIKE 'selectivity';
+
+2.
+
+        SELECT p.title, v.name
+        FROM papers p, authors a, paperauths pa, venues v
+        WHERE a.name = 'E. F. Codd'
+        AND pa.authorid = a.id
+        AND pa.paperid = p.id
+        AND p.venueid = v.id;
+
+3.
+
+        SELECT a2.name, count(p.id)
+        FROM papers p, authors a1, authors a2, paperauths pa1, paperauths pa2
+        WHERE a1.name = 'Michael Stonebraker'
+        AND pa1.authorid = a1.id
+        AND pa1.paperid = p.id
+        AND pa2.authorid = a2.id
+        AND pa1.paperid = pa2.paperid
+        GROUP BY a2.name
+        ORDER BY a2.name;
+
+
+Note that each query will print out its runtime after it executes.
+
+You may wish to create optimized implementations of some of the
+operators; in particular, a fast join operator (e.g., not nested loops)
+will be essential for good performance on queries 2 and 3.
+
+There is currently no optimizer in the parser, so the queries above have
+been written to cause the parser to generate reasonable plans. Here are
+some helpful hints about how the parser works that you may wish to
+exploit while running these queries:
+
+- The table on the left side of the joins in these queries is passed
+  in as the first `DbIterator` parameter to `Join`.
+- Expressions in the WHERE clause are added to the plan from top to
+  bottom, such that the first expression will be the bottom-most operator
+  in the generated query plan. For example, the generated plan for
+  Query 2 is:
+
+        Project(Join(Join(Filter(a),pa),p))
+
+Our reference implementation can run Query 1 in about 0.35 seconds, Query
+2 in about 10 seconds, and Query 3 in about 20 seconds. We implemented a
+special-purpose join operator for equality joins but did little else to
+optimize performance. Actual runtimes might vary depending on your
+machine setup.
+
+Depending on the efficiency of your implementation, each of these
+queries will take seconds to minutes to run to completion, outputting
+tuples as they are computed. Certainly don't expect the level of
+performance of Postgres. :)
+
+Turn in instructions
+--------------------
+
+You must submit your code (see below) as well as a short (2 pages,
+maximum) writeup file called `writeup.txt` describing your approach. This writeup should:
+
+- Describe any design decisions you made. For example, any class or
+  complex data structure you add to the project. If you used something
+  other than a nested-loops join, describe the tradeoffs of the
+  algorithm you chose.
+- Discuss and justify any changes you made to the API.
+- Describe any missing or incomplete elements of your code.
+- Describe how long you spent on the assignment, and whether there was
+  anything you found particularly difficult or confusing.
+
+Put all your code as well as the `writeup.txt` file in the `starter-code` folder,
+and run the `turnInHw.sh` script:
+```bash
+# also remember to add the writeup.txt!!
+$ ./turnInHw.sh hw3
+```
+
+Submitting a bug
+----------------
+
+Please submit (friendly!) bug reports to the TA and instructor. When you
+do, please try to include:
+
+- A description of the bug.
+- A .java file we can drop in the test/simpledb directory, compile,
+  and run.
+- A .txt file with the data that reproduces the bug. We should be able
+  to convert it to a .dat file using HeapFileEncoder.
+
+Grading
+-------
+
+50% of your grade will be based on whether or not your code passes the
+system test suite we will run over it. These tests will be a superset of
+the tests we have provided. Before handing in your code, you should make
+sure it produces no errors (passes all of the tests) from both `ant test`
+and `ant systemtest`.
+
+**Important**: before testing, we will replace your build.xml,
+`HeapFileEncoder.java`, and the entire contents of the test directory
+with our version of these files. This means you cannot change the format
+of .dat files! You should also be careful when changing our APIs. You
+should test that your code compiles with the unmodified tests. In other
+words, we will untar your tarball, replace the files mentioned above,
+compile it, and then grade it. It will look roughly like this:
+
+```bash
+$ cd ./hw3/starter-code
+[replace build.xml, HeapFileEncoder.java, and test]
+$ ant test
+$ ant systemtest
+[additional tests]
+```
+
+An additional 50% of your grade will be based on the quality of your
+writeup and our subjective evaluation of your code.
+
+Extra credit: 2% for each of the extra credit exercises.
+
+We hope you will enjoy this assignment and will learn a lot about how a
+simple DBMS can be implemented!
diff --git a/hw/hw3/starter-code/.gitignore b/hw/hw3/starter-code/.gitignore new file mode 100644 index 0000000000000000000000000000000000000000..c63eb4c14fd83ef674e512858cd63d5d7327faaf --- /dev/null +++ b/hw/hw3/starter-code/.gitignore @@ -0,0 +1,9 @@ +*.iml +.classpath +.project +bin/ +out/ +.idea/ +log +*.dat +dblp_simpledb.schema \ No newline at end of file diff --git a/hw/hw3/starter-code/build.xml b/hw/hw3/starter-code/build.xml new file mode 100644 index 0000000000000000000000000000000000000000..cb0258a8592b91333aa53930a31c630e5cc940bd --- /dev/null +++ b/hw/hw3/starter-code/build.xml @@ -0,0 +1,309 @@ +<?xml version="1.0" encoding="UTF-8"?> +<project name="simpledb" default="dist" basedir="."> + <property name="src" location="src"/> + <property name="testd" location="test"/> + + <property name="build" location="bin"/> + <property name="build.src" location="${build}/src"/> + <property name="build.test" location="${build}/test"/> + <property name="depcache" location="${build}/depcache"/> + + <property name="lib" location="lib"/> + <property name="doc" location="javadoc"/> + <property name="dist" location="dist"/> + <property name="jarfile" location="${dist}/${ant.project.name}.jar"/> + <property name="compile.debug" value="true"/> + <property name="test.reports" location="testreport"/> + + <property name="sourceversion" value="1.7"/> + + <path id="classpath.base"> + <pathelement location="${build.src}"/> + <pathelement location="${lib}/zql.jar"/> + <pathelement location="${lib}/jline-0.9.94.jar"/> + <pathelement location="${lib}/mina-core-2.0.4.jar"/> + <pathelement location="${lib}/mina-filter-compression-2.0.4.jar"/> + <pathelement location="${lib}/slf4j-api-1.6.1.jar"/> + <pathelement location="${lib}/slf4j-log4j12-1.6.1.jar"/> + <pathelement location="${lib}/log4j-1.2.17.jar"/> + <pathelement location="${lib}/jzlib-1.0.7.jar"/> + </path> + + <path id="classpath.test"> + <path refid="classpath.base"/> + <pathelement location="${build.test}"/> + <pathelement location="${lib}/junit-4.5.jar"/> + <pathelement location="${lib}/javassist-3.16.1-GA.jar"/> + </path> + <!-- Common macro for compiling Java source --> + <macrodef name="Compile"> + <attribute name="srcdir"/> + <attribute name="destdir"/> + <element name="compileoptions" implicit="true" optional="true"/> + <sequential> + <mkdir dir="@{destdir}"/> + <!-- avoids needing ant clean when changing interfaces --> + <depend srcdir="${srcdir}" destdir="${destdir}" cache="${depcache}"/> + <javac srcdir="@{srcdir}" destdir="@{destdir}" includeAntRuntime="no" + debug="${compile.debug}" source="${sourceversion}"> + <compilerarg value="-Xlint:unchecked" /> + <!--<compilerarg value="-Xlint:deprecation" />--> + <compileoptions/> + </javac> + </sequential> + </macrodef> + + + <!-- Common macro for running junit tests in both the test and runtest targets --> + <macrodef name="RunJunit"> + <attribute name="haltonfailure" default="yes" /> + <element name="testspecification" implicit="yes" /> + <sequential> + <!-- timeout at 10.5 minutes, since TransactionTest is limited to 10 minutes. 
--> + <junit printsummary="on" fork="yes" timeout="630000" haltonfailure="@{haltonfailure}" maxmemory="128M" failureproperty="junit.failed"> + <classpath refid="classpath.test" /> + <formatter type="plain" usefile="false"/> + <assertions><enable/></assertions> + <testspecification/> + </junit> + </sequential> + </macrodef> + + <taskdef resource="net/sf/antcontrib/antlib.xml"> + <classpath> + <pathelement location="lib/ant-contrib-1.0b3.jar"/> + </classpath> + </taskdef> + + <target name="eclipse" description="Make current directory an eclipse project"> + <echo file=".project" append="false"><?xml version="1.0" encoding="UTF-8"?> +<projectDescription> + <name>simpledb</name> + <comment></comment> + <projects> + </projects> + <buildSpec> + <buildCommand> + <name>org.eclipse.jdt.core.javabuilder</name> + <arguments> + </arguments> + </buildCommand> + </buildSpec> + <natures> + <nature>org.eclipse.jdt.core.javanature</nature> + </natures> +</projectDescription></echo> + <echo file=".classpath" append="false"><?xml version="1.0" encoding="UTF-8"?> +<classpath> + <classpathentry kind="src" output="bin/src" path="src/java"/> + <classpathentry kind="src" output="bin/test" path="test"/> + <classpathentry kind="con" path="org.eclipse.jdt.launching.JRE_CONTAINER"/> + <classpathentry kind="output" path="bin/src"/> + </echo> + <if> <available file="${lib}/junit-4.5.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/junit-4.5.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/jline-0.9.94.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/jline-0.9.94.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/zql.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/zql.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/mina-core-2.0.4.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/mina-core-2.0.4.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/mina-filter-compression-2.0.4.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/mina-filter-compression-2.0.4.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/jzlib-1.0.7.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/jzlib-1.0.7.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/slf4j-api-1.6.1.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/slf4j-api-1.6.1.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/slf4j-log4j12-1.6.1.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/slf4j-log4j12-1.6.1.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/log4j-1.2.17.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/log4j-1.2.17.jar"/> + </echo> + </then> + </if> + <if> <available file="${lib}/javassist-3.16.1-GA.jar" /> <then> + <echo file=".classpath" append="true"> + <classpathentry kind="lib" path="lib/javassist-3.16.1-GA.jar"/> + </echo> + </then> + </if> + <echo file=".classpath" append="true"> + </classpath> + </echo> + </target> + + <target name="compile" description="Compile code"> + <Compile srcdir="${src}/java" destdir="${build.src}"> + <classpath refid="classpath.base"/> + </Compile> + <copy todir="${build}" flatten="true"> + <fileset 
dir="${src}"> + <include name="bin/*.sh"/> + </fileset> + </copy> + </target> + + <target name="javadocs" description="Build javadoc documentation"> + <javadoc destdir="${doc}" access="private" failonerror="true" source="${sourceversion}"> + <classpath refid="classpath.base" /> + <fileset dir="src/java" defaultexcludes="yes"> + <include name="simpledb/**/*.java"/> + </fileset> + </javadoc> + </target> + + <target name="dist" depends="compile" description="Build jar"> + <mkdir dir="${dist}"/> + <jar jarfile="${jarfile}" basedir="${build.src}"> + <manifest> + <attribute name="Main-Class" value="simpledb.SimpleDb"/> + <attribute name="Class-Path" value="../lib/zql.jar ../lib/jline-0.9.94.jar ../lib/jzlib-1.0.7.jar ../lib/mina-core-2.0.4.jar ../lib/mina-filter-compression-2.0.4.jar ../lib/slf4j-api-1.6.1.jar ../lib/slf4j-log4j12-1.6.1.jar ../lib/log4j-1.2.17.jar "/> + </manifest> + <!-- Merge library jars into final jar file --> + <!--<zipgroupfileset refid="lib.jars"/>--> + </jar> + </target> + + <target name="clean" description="Remove build and dist directories"> + <delete dir="${build}"/> + <delete dir="${dist}"/> + <delete dir="${doc}"/> + <delete dir="${test.reports}"/> + </target> + + <target name="testcompile" depends="compile" description="Compile all unit and system tests"> + <Compile srcdir="${testd}" destdir="${build.test}"> + <classpath refid="classpath.test"/> + </Compile> + </target> + + <target name="test" depends="testcompile" description="Run all unit tests"> + <RunJunit> + <batchtest> + <fileset dir="${build.test}"> + <include name="**/*Test.class"/> + <exclude name="**/*$*.class"/> + <exclude name="simpledb/systemtest/*.class"/> + </fileset> + </batchtest> + </RunJunit> + </target> + + <target name="systemtest" depends="testcompile" description="Run all system tests"> + <RunJunit> + <batchtest> + <fileset dir="${build.test}"> + <include name="simpledb/systemtest/*Test.class"/> + </fileset> + </batchtest> + </RunJunit> + </target> + + <target name="runtest" depends="testcompile" + description="Runs the test you specify on the command line with -Dtest="> + <!-- Check for -Dtest command line argument --> + <fail unless="test" message="You must run this target with -Dtest=TestName"/> + + <!-- Check if the class exists --> + <available property="test.exists" classname="simpledb.${test}"> + <classpath refid="classpath.test" /> + </available> + <fail unless="test.exists" message="Test ${test} could not be found"/> + + <RunJunit> + <test name="simpledb.${test}"/> + </RunJunit> + </target> + + <target name="runsystest" depends="testcompile" + description="Runs the system test you specify on the command line with -Dtest="> + <!-- Check for -Dtest command line argument --> + <fail unless="test" message="You must run this target with -Dtest=TestName"/> + + <!-- Check if the class exists --> + <available property="test.exists" classname="simpledb.systemtest.${test}"> + <classpath refid="classpath.test" /> + </available> + <fail unless="test.exists" message="Test ${test} could not be found"/> + + <RunJunit> + <test name="simpledb.systemtest.${test}"/> + </RunJunit> + </target> + + + <!-- The following target is used for automated grading. --> + <target name="test-report" depends="testcompile" + description="Generates HTML test reports in ${test.reports}"> + <mkdir dir="${test.reports}"/> + + <!-- do not halt on failure so we always produce HTML reports. 
--> + <RunJunit haltonfailure="no"> + <formatter type="xml"/> + <formatter type="plain" usefile="true"/> + <batchtest todir="${test.reports}" > + <fileset dir="${build.test}"> + <include name="**/*Test.class"/> + <exclude name="**/*$*.class"/> + </fileset> + </batchtest> + </RunJunit> + + <junitreport todir="${test.reports}"> + <fileset dir="${test.reports}"> + <include name="TEST-*.xml" /> + </fileset> + <report todir="${test.reports}" /> + </junitreport> + + <!-- Fail here if the junit tests failed. --> + <fail if="junit.failed" message="Some JUnit tests failed"/> + </target> + + <target name="handin" depends="clean" + description="Create a tarball of your code to hand in"> + <tar destfile="lab-handin.tar.bz2" compression="bzip2" + basedir="." /> + <echo message="Tarball created! Please submit 'lab-handin.tar.bz2' per the instructions in the lab document." /> + <subant target="dist"> + <fileset dir="." includes="build.xml"/> + </subant> + </target> + + <target name="test-and-handin" depends="test,systemtest,handin" + description="Run all the tests and system tests; if they succeed, create a tarball of the source code to submit" /> + +</project> diff --git a/hw/hw3/starter-code/lib/LICENSE.javassist.html b/hw/hw3/starter-code/lib/LICENSE.javassist.html new file mode 100644 index 0000000000000000000000000000000000000000..7d842b40bcb66a19c745720e0b6024dea6355de1 --- /dev/null +++ b/hw/hw3/starter-code/lib/LICENSE.javassist.html @@ -0,0 +1,373 @@ +<HTML> +<HEAD> +<TITLE>Javassist License</TITLE> +<META http-equiv=Content-Type content="text/html; charset=iso-8859-1"> +<META content="MSHTML 5.50.4916.2300" name=GENERATOR></HEAD> + +<BODY text=#000000 vLink=#551a8b aLink=#ff0000 link=#0000ee bgColor=#ffffff> +<CENTER><B><FONT size=+2>MOZILLA PUBLIC LICENSE</FONT></B> <BR><B>Version +1.1</B> +<P> +<HR width="20%"> +</CENTER> +<P><B>1. Definitions.</B> +<UL><B>1.0.1. "Commercial Use" </B>means distribution or otherwise making the + Covered Code available to a third party. + <P><B>1.1. ''Contributor''</B> means each entity that creates or contributes + to the creation of Modifications. + <P><B>1.2. ''Contributor Version''</B> means the combination of the Original + Code, prior Modifications used by a Contributor, and the Modifications made by + that particular Contributor. + <P><B>1.3. ''Covered Code''</B> means the Original Code or Modifications or + the combination of the Original Code and Modifications, in each case including + portions thereof<B>.</B> + <P><B>1.4. ''Electronic Distribution Mechanism''</B> means a mechanism + generally accepted in the software development community for the electronic + transfer of data. + <P><B>1.5. ''Executable''</B> means Covered Code in any form other than Source + Code. + <P><B>1.6. ''Initial Developer''</B> means the individual or entity identified + as the Initial Developer in the Source Code notice required by <B>Exhibit + A</B>. + <P><B>1.7. ''Larger Work''</B> means a work which combines Covered Code or + portions thereof with code not governed by the terms of this License. + <P><B>1.8. ''License''</B> means this document. + <P><B>1.8.1. "Licensable"</B> means having the right to grant, to the maximum + extent possible, whether at the time of the initial grant or subsequently + acquired, any and all of the rights conveyed herein. + <P><B>1.9. ''Modifications''</B> means any addition to or deletion from the + substance or structure of either the Original Code or any previous + Modifications. 
When Covered Code is released as a series of files, a + Modification is: + <UL><B>A.</B> Any addition to or deletion from the contents of a file + containing Original Code or previous Modifications. + <P><B>B.</B> Any new file that contains any part of the Original Code or + previous Modifications. <BR> </P></UL><B>1.10. ''Original Code''</B> + means Source Code of computer software code which is described in the Source + Code notice required by <B>Exhibit A</B> as Original Code, and which, at the + time of its release under this License is not already Covered Code governed by + this License. + <P><B>1.10.1. "Patent Claims"</B> means any patent claim(s), now owned or + hereafter acquired, including without limitation, method, process, and + apparatus claims, in any patent Licensable by grantor. + <P><B>1.11. ''Source Code''</B> means the preferred form of the Covered Code + for making modifications to it, including all modules it contains, plus any + associated interface definition files, scripts used to control compilation and + installation of an Executable, or source code differential comparisons against + either the Original Code or another well known, available Covered Code of the + Contributor's choice. The Source Code can be in a compressed or archival form, + provided the appropriate decompression or de-archiving software is widely + available for no charge. + <P><B>1.12. "You'' (or "Your") </B> means an individual or a legal entity + exercising rights under, and complying with all of the terms of, this License + or a future version of this License issued under Section 6.1. For legal + entities, "You'' includes any entity which controls, is controlled by, or is + under common control with You. For purposes of this definition, "control'' + means (a) the power, direct or indirect, to cause the direction or management + of such entity, whether by contract or otherwise, or (b) ownership of more + than fifty percent (50%) of the outstanding shares or beneficial ownership of + such entity.</P></UL><B>2. Source Code License.</B> +<UL><B>2.1. The Initial Developer Grant.</B> <BR>The Initial Developer hereby + grants You a world-wide, royalty-free, non-exclusive license, subject to third + party intellectual property claims: + <UL><B>(a)</B> <B> </B>under intellectual property rights (other than + patent or trademark) Licensable by Initial Developer to use, reproduce, + modify, display, perform, sublicense and distribute the Original Code (or + portions thereof) with or without Modifications, and/or as part of a Larger + Work; and + <P><B>(b)</B> under Patents Claims infringed by the making, using or selling + of Original Code, to make, have made, use, practice, sell, and offer for + sale, and/or otherwise dispose of the Original Code (or portions thereof). + <UL> + <UL></UL></UL><B>(c) </B>the licenses granted in this Section 2.1(a) and (b) + are effective on the date Initial Developer first distributes Original Code + under the terms of this License. + <P><B>(d) </B>Notwithstanding Section 2.1(b) above, no patent license is + granted: 1) for code that You delete from the Original Code; 2) separate + from the Original Code; or 3) for infringements caused by: i) the + modification of the Original Code or ii) the combination of the Original + Code with other software or devices. <BR> </P></UL><B>2.2. 
Contributor + Grant.</B> <BR>Subject to third party intellectual property claims, each + Contributor hereby grants You a world-wide, royalty-free, non-exclusive + license + <UL> <BR><B>(a)</B> <B> </B>under intellectual property rights (other + than patent or trademark) Licensable by Contributor, to use, reproduce, + modify, display, perform, sublicense and distribute the Modifications + created by such Contributor (or portions thereof) either on an unmodified + basis, with other Modifications, as Covered Code and/or as part of a Larger + Work; and + <P><B>(b)</B> under Patent Claims infringed by the making, using, or selling + of Modifications made by that Contributor either alone and/or in<FONT + color=#000000> combination with its Contributor Version (or portions of such + combination), to make, use, sell, offer for sale, have made, and/or + otherwise dispose of: 1) Modifications made by that Contributor (or portions + thereof); and 2) the combination of Modifications made by that + Contributor with its Contributor Version (or portions of such + combination).</FONT> + <P><B>(c) </B>the licenses granted in Sections 2.2(a) and 2.2(b) are + effective on the date Contributor first makes Commercial Use of the Covered + Code. + <P><B>(d) </B> Notwithstanding Section 2.2(b) above, no + patent license is granted: 1) for any code that Contributor has deleted from + the Contributor Version; 2) separate from the Contributor + Version; 3) for infringements caused by: i) third party + modifications of Contributor Version or ii) the combination of + Modifications made by that Contributor with other software (except as + part of the Contributor Version) or other devices; or 4) under Patent Claims + infringed by Covered Code in the absence of Modifications made by that + Contributor.</P></UL></UL> +<P><BR><B>3. Distribution Obligations.</B> +<UL><B>3.1. Application of License.</B> <BR>The Modifications which You create + or to which You contribute are governed by the terms of this License, + including without limitation Section <B>2.2</B>. The Source Code version of + Covered Code may be distributed only under the terms of this License or a + future version of this License released under Section <B>6.1</B>, and You must + include a copy of this License with every copy of the Source Code You + distribute. You may not offer or impose any terms on any Source Code version + that alters or restricts the applicable version of this License or the + recipients' rights hereunder. However, You may include an additional document + offering the additional rights described in Section <B>3.5</B>. + <P><B>3.2. Availability of Source Code.</B> <BR>Any Modification which You + create or to which You contribute must be made available in Source Code form + under the terms of this License either on the same media as an Executable + version or via an accepted Electronic Distribution Mechanism to anyone to whom + you made an Executable version available; and if made available via Electronic + Distribution Mechanism, must remain available for at least twelve (12) months + after the date it initially became available, or at least six (6) months after + a subsequent version of that particular Modification has been made available + to such recipients. You are responsible for ensuring that the Source Code + version remains available even if the Electronic Distribution Mechanism is + maintained by a third party. + <P><B>3.3. 
Description of Modifications.</B> <BR>You must cause all Covered + Code to which You contribute to contain a file documenting the changes You + made to create that Covered Code and the date of any change. You must include + a prominent statement that the Modification is derived, directly or + indirectly, from Original Code provided by the Initial Developer and including + the name of the Initial Developer in (a) the Source Code, and (b) in any + notice in an Executable version or related documentation in which You describe + the origin or ownership of the Covered Code. + <P><B>3.4. Intellectual Property Matters</B> + <UL><B>(a) Third Party Claims</B>. <BR>If Contributor has knowledge that a + license under a third party's intellectual property rights is required to + exercise the rights granted by such Contributor under Sections 2.1 or 2.2, + Contributor must include a text file with the Source Code distribution + titled "LEGAL'' which describes the claim and the party making the claim in + sufficient detail that a recipient will know whom to contact. If Contributor + obtains such knowledge after the Modification is made available as described + in Section 3.2, Contributor shall promptly modify the LEGAL file in all + copies Contributor makes available thereafter and shall take other steps + (such as notifying appropriate mailing lists or newsgroups) reasonably + calculated to inform those who received the Covered Code that new knowledge + has been obtained. + <P><B>(b) Contributor APIs</B>. <BR>If Contributor's Modifications include + an application programming interface and Contributor has knowledge of patent + licenses which are reasonably necessary to implement that API, Contributor + must also include this information in the LEGAL file. + <BR> </P></UL> + <B>(c) Representations.</B> + <UL>Contributor represents that, except as disclosed pursuant to Section + 3.4(a) above, Contributor believes that Contributor's Modifications are + Contributor's original creation(s) and/or Contributor has sufficient rights + to grant the rights conveyed by this License.</UL> + <P><BR><B>3.5. Required Notices.</B> <BR>You must duplicate the notice in + <B>Exhibit A</B> in each file of the Source Code. If it is not possible + to put such notice in a particular Source Code file due to its structure, then + You must include such notice in a location (such as a relevant directory) + where a user would be likely to look for such a notice. If You created + one or more Modification(s) You may add your name as a Contributor to the + notice described in <B>Exhibit A</B>. You must also duplicate this + License in any documentation for the Source Code where You describe + recipients' rights or ownership rights relating to Covered Code. You may + choose to offer, and to charge a fee for, warranty, support, indemnity or + liability obligations to one or more recipients of Covered Code. However, You + may do so only on Your own behalf, and not on behalf of the Initial Developer + or any Contributor. You must make it absolutely clear than any such warranty, + support, indemnity or liability obligation is offered by You alone, and You + hereby agree to indemnify the Initial Developer and every Contributor for any + liability incurred by the Initial Developer or such Contributor as a result of + warranty, support, indemnity or liability terms You offer. + <P><B>3.6. 
Distribution of Executable Versions.</B> <BR>You may distribute + Covered Code in Executable form only if the requirements of Section + <B>3.1-3.5</B> have been met for that Covered Code, and if You include a + notice stating that the Source Code version of the Covered Code is available + under the terms of this License, including a description of how and where You + have fulfilled the obligations of Section <B>3.2</B>. The notice must be + conspicuously included in any notice in an Executable version, related + documentation or collateral in which You describe recipients' rights relating + to the Covered Code. You may distribute the Executable version of Covered Code + or ownership rights under a license of Your choice, which may contain terms + different from this License, provided that You are in compliance with the + terms of this License and that the license for the Executable version does not + attempt to limit or alter the recipient's rights in the Source Code version + from the rights set forth in this License. If You distribute the Executable + version under a different license You must make it absolutely clear that any + terms which differ from this License are offered by You alone, not by the + Initial Developer or any Contributor. You hereby agree to indemnify the + Initial Developer and every Contributor for any liability incurred by the + Initial Developer or such Contributor as a result of any such terms You offer. + + <P><B>3.7. Larger Works.</B> <BR>You may create a Larger Work by combining + Covered Code with other code not governed by the terms of this License and + distribute the Larger Work as a single product. In such a case, You must make + sure the requirements of this License are fulfilled for the Covered +Code.</P></UL><B>4. Inability to Comply Due to Statute or Regulation.</B> +<UL>If it is impossible for You to comply with any of the terms of this + License with respect to some or all of the Covered Code due to statute, + judicial order, or regulation then You must: (a) comply with the terms of this + License to the maximum extent possible; and (b) describe the limitations and + the code they affect. Such description must be included in the LEGAL file + described in Section <B>3.4</B> and must be included with all distributions of + the Source Code. Except to the extent prohibited by statute or regulation, + such description must be sufficiently detailed for a recipient of ordinary + skill to be able to understand it.</UL><B>5. Application of this License.</B> +<UL>This License applies to code to which the Initial Developer has attached + the notice in <B>Exhibit A</B> and to related Covered Code.</UL><B>6. Versions +of the License.</B> +<UL><B>6.1. New Versions</B>. <BR>Netscape Communications Corporation + (''Netscape'') may publish revised and/or new versions of the License from + time to time. Each version will be given a distinguishing version number. + <P><B>6.2. Effect of New Versions</B>. <BR>Once Covered Code has been + published under a particular version of the License, You may always continue + to use it under the terms of that version. You may also choose to use such + Covered Code under the terms of any subsequent version of the License + published by Netscape. No one other than Netscape has the right to modify the + terms applicable to Covered Code created under this License. + <P><B>6.3. Derivative Works</B>. 
<BR>If You create or use a modified version + of this License (which you may only do in order to apply it to code which is + not already Covered Code governed by this License), You must (a) rename Your + license so that the phrases ''Mozilla'', ''MOZILLAPL'', ''MOZPL'', + ''Netscape'', "MPL", ''NPL'' or any confusingly similar phrase do not appear + in your license (except to note that your license differs from this License) + and (b) otherwise make it clear that Your version of the license contains + terms which differ from the Mozilla Public License and Netscape Public + License. (Filling in the name of the Initial Developer, Original Code or + Contributor in the notice described in <B>Exhibit A</B> shall not of + themselves be deemed to be modifications of this License.)</P></UL><B>7. +DISCLAIMER OF WARRANTY.</B> +<UL>COVERED CODE IS PROVIDED UNDER THIS LICENSE ON AN "AS IS'' BASIS, WITHOUT + WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, WITHOUT + LIMITATION, WARRANTIES THAT THE COVERED CODE IS FREE OF DEFECTS, MERCHANTABLE, + FIT FOR A PARTICULAR PURPOSE OR NON-INFRINGING. THE ENTIRE RISK AS TO THE + QUALITY AND PERFORMANCE OF THE COVERED CODE IS WITH YOU. SHOULD ANY COVERED + CODE PROVE DEFECTIVE IN ANY RESPECT, YOU (NOT THE INITIAL DEVELOPER OR ANY + OTHER CONTRIBUTOR) ASSUME THE COST OF ANY NECESSARY SERVICING, REPAIR OR + CORRECTION. THIS DISCLAIMER OF WARRANTY CONSTITUTES AN ESSENTIAL PART OF THIS + LICENSE. NO USE OF ANY COVERED CODE IS AUTHORIZED HEREUNDER EXCEPT UNDER THIS + DISCLAIMER.</UL><B>8. TERMINATION.</B> +<UL><B>8.1. </B>This License and the rights granted hereunder will + terminate automatically if You fail to comply with terms herein and fail to + cure such breach within 30 days of becoming aware of the breach. All + sublicenses to the Covered Code which are properly granted shall survive any + termination of this License. Provisions which, by their nature, must remain in + effect beyond the termination of this License shall survive. + <P><B>8.2. </B>If You initiate litigation by asserting a patent + infringement claim (excluding declatory judgment actions) against Initial + Developer or a Contributor (the Initial Developer or Contributor against whom + You file such action is referred to as "Participant") alleging that: + <P><B>(a) </B>such Participant's Contributor Version directly or + indirectly infringes any patent, then any and all rights granted by such + Participant to You under Sections 2.1 and/or 2.2 of this License shall, upon + 60 days notice from Participant terminate prospectively, unless if within 60 + days after receipt of notice You either: (i) agree in writing to pay + Participant a mutually agreeable reasonable royalty for Your past and future + use of Modifications made by such Participant, or (ii) withdraw Your + litigation claim with respect to the Contributor Version against such + Participant. If within 60 days of notice, a reasonable royalty and + payment arrangement are not mutually agreed upon in writing by the parties or + the litigation claim is not withdrawn, the rights granted by Participant to + You under Sections 2.1 and/or 2.2 automatically terminate at the expiration of + the 60 day notice period specified above. 
+ <P><B>(b)</B> any software, hardware, or device, other than such + Participant's Contributor Version, directly or indirectly infringes any + patent, then any rights granted to You by such Participant under Sections + 2.1(b) and 2.2(b) are revoked effective as of the date You first made, used, + sold, distributed, or had made, Modifications made by that Participant. + <P><B>8.3. </B>If You assert a patent infringement claim against + Participant alleging that such Participant's Contributor Version directly or + indirectly infringes any patent where such claim is resolved (such as by + license or settlement) prior to the initiation of patent infringement + litigation, then the reasonable value of the licenses granted by such + Participant under Sections 2.1 or 2.2 shall be taken into account in + determining the amount or value of any payment or license. + <P><B>8.4.</B> In the event of termination under Sections 8.1 or 8.2 + above, all end user license agreements (excluding distributors and + resellers) which have been validly granted by You or any distributor hereunder + prior to termination shall survive termination.</P></UL><B>9. LIMITATION OF +LIABILITY.</B> +<UL>UNDER NO CIRCUMSTANCES AND UNDER NO LEGAL THEORY, WHETHER TORT (INCLUDING + NEGLIGENCE), CONTRACT, OR OTHERWISE, SHALL YOU, THE INITIAL DEVELOPER, ANY + OTHER CONTRIBUTOR, OR ANY DISTRIBUTOR OF COVERED CODE, OR ANY SUPPLIER OF ANY + OF SUCH PARTIES, BE LIABLE TO ANY PERSON FOR ANY INDIRECT, SPECIAL, + INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER INCLUDING, WITHOUT + LIMITATION, DAMAGES FOR LOSS OF GOODWILL, WORK STOPPAGE, COMPUTER FAILURE OR + MALFUNCTION, OR ANY AND ALL OTHER COMMERCIAL DAMAGES OR LOSSES, EVEN IF SUCH + PARTY SHALL HAVE BEEN INFORMED OF THE POSSIBILITY OF SUCH DAMAGES. THIS + LIMITATION OF LIABILITY SHALL NOT APPLY TO LIABILITY FOR DEATH OR PERSONAL + INJURY RESULTING FROM SUCH PARTY'S NEGLIGENCE TO THE EXTENT APPLICABLE LAW + PROHIBITS SUCH LIMITATION. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR + LIMITATION OF INCIDENTAL OR CONSEQUENTIAL DAMAGES, SO THIS EXCLUSION AND + LIMITATION MAY NOT APPLY TO YOU.</UL><B>10. U.S. GOVERNMENT END USERS.</B> +<UL>The Covered Code is a ''commercial item,'' as that term is defined in 48 + C.F.R. 2.101 (Oct. 1995), consisting of ''commercial computer software'' and + ''commercial computer software documentation,'' as such terms are used in 48 + C.F.R. 12.212 (Sept. 1995). Consistent with 48 C.F.R. 12.212 and 48 C.F.R. + 227.7202-1 through 227.7202-4 (June 1995), all U.S. Government End Users + acquire Covered Code with only those rights set forth herein.</UL><B>11. +MISCELLANEOUS.</B> +<UL>This License represents the complete agreement concerning subject matter + hereof. If any provision of this License is held to be unenforceable, such + provision shall be reformed only to the extent necessary to make it + enforceable. This License shall be governed by California law provisions + (except to the extent applicable law, if any, provides otherwise), excluding + its conflict-of-law provisions. 
With respect to disputes in which at least one + party is a citizen of, or an entity chartered or registered to do business in + the United States of America, any litigation relating to this License shall be + subject to the jurisdiction of the Federal Courts of the Northern District of + California, with venue lying in Santa Clara County, California, with the + losing party responsible for costs, including without limitation, court costs + and reasonable attorneys' fees and expenses. The application of the United + Nations Convention on Contracts for the International Sale of Goods is + expressly excluded. Any law or regulation which provides that the language of + a contract shall be construed against the drafter shall not apply to this + License.</UL><B>12. RESPONSIBILITY FOR CLAIMS.</B> +<UL>As between Initial Developer and the Contributors, each party is + responsible for claims and damages arising, directly or indirectly, out of its + utilization of rights under this License and You agree to work with Initial + Developer and Contributors to distribute such responsibility on an equitable + basis. Nothing herein is intended or shall be deemed to constitute any + admission of liability.</UL><B>13. MULTIPLE-LICENSED CODE.</B> +<UL>Initial Developer may designate portions of the Covered Code as + "Multiple-Licensed". "Multiple-Licensed" means that the Initial + Developer permits you to utilize portions of the Covered Code under Your + choice of the MPL or the alternative licenses, if any, specified by the + Initial Developer in the file described in Exhibit A.</UL> +<P><BR><B>EXHIBIT A -Mozilla Public License.</B> +<UL>The contents of this file are subject to the Mozilla Public License + Version 1.1 (the "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + <BR>http://www.mozilla.org/MPL/ + <P>Software distributed under the License is distributed on an "AS IS" basis, + WITHOUT WARRANTY OF <BR>ANY KIND, either express or implied. See the License + for the specific language governing rights and <BR>limitations under the + License. + <P>The Original Code is Javassist. + <P>The Initial Developer of the Original Code is Shigeru Chiba. + Portions created by the Initial Developer are<BR> + Copyright (C) 1999- Shigeru Chiba. All Rights Reserved. + <P>Contributor(s): __Bill Burke, Jason T. Greene______________. + +<p>Alternatively, the contents of this software may be used under the +terms of the GNU Lesser General Public License Version 2.1 or later +(the "LGPL"), or the Apache License Version 2.0 (the "AL"), +in which case the provisions of the LGPL or the AL are applicable +instead of those above. If you wish to allow use of your version of +this software only under the terms of either the LGPL or the AL, and not to allow others to +use your version of this software under the terms of the MPL, indicate +your decision by deleting the provisions above and replace them with +the notice and other provisions required by the LGPL or the AL. If you do not +delete the provisions above, a recipient may use your version of this +software under the terms of any one of the MPL, the LGPL or the AL. 
+ + <P></P></UL> +</BODY> +</HTML> diff --git a/hw/hw3/starter-code/lib/LICENSE.jzlib.txt b/hw/hw3/starter-code/lib/LICENSE.jzlib.txt new file mode 100644 index 0000000000000000000000000000000000000000..cdce5007d0ec51a8f19ac48e188d0f0d7474c8ff --- /dev/null +++ b/hw/hw3/starter-code/lib/LICENSE.jzlib.txt @@ -0,0 +1,29 @@ +JZlib 0.0.* were released under the GNU LGPL license. Later, we have switched +over to a BSD-style license. + +------------------------------------------------------------------------------ +Copyright (c) 2000,2001,2002,2003 ymnk, JCraft,Inc. All rights reserved. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the distribution. + + 3. The names of the authors may not be used to endorse or promote products + derived from this software without specific prior written permission. + +THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, +INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL JCRAFT, +INC. OR ANY CONTRIBUTORS TO THIS SOFTWARE BE LIABLE FOR ANY DIRECT, INDIRECT, +INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, +OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING +NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, +EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. diff --git a/hw/hw3/starter-code/lib/LICENSE.mina.txt b/hw/hw3/starter-code/lib/LICENSE.mina.txt new file mode 100644 index 0000000000000000000000000000000000000000..66a27ec5ff940d3a9652d2948746ebac4c9d0188 --- /dev/null +++ b/hw/hw3/starter-code/lib/LICENSE.mina.txt @@ -0,0 +1,177 @@ + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. 
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + diff --git a/hw/hw3/starter-code/lib/LICENSE.slf4j.txt b/hw/hw3/starter-code/lib/LICENSE.slf4j.txt new file mode 100644 index 0000000000000000000000000000000000000000..e663b1d7f0ebbfcd7afba663d7e333e41978f999 --- /dev/null +++ b/hw/hw3/starter-code/lib/LICENSE.slf4j.txt @@ -0,0 +1,28 @@ +Copyright (c) 2004-2007 QOS.ch +All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +"Software"), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, and/or sell copies of the Software, and to permit persons +to whom the Software is furnished to do so, provided that the above +copyright notice(s) and this permission notice appear in all copies of +the Software and that both the above copyright notice(s) and this +permission notice appear in supporting documentation. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT +OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR +HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY +SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER +RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF +CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN +CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + +Except as contained in this notice, the name of a copyright holder +shall not be used in advertising or otherwise to promote the sale, use +or other dealings in this Software without prior written authorization +of the copyright holder. 
+
diff --git a/hw/hw3/starter-code/lib/README b/hw/hw3/starter-code/lib/README
new file mode 100644
index 0000000000000000000000000000000000000000..7a6efb645c2fb156beca728f8b8d1a9e9b9e2239
--- /dev/null
+++ b/hw/hw3/starter-code/lib/README
@@ -0,0 +1,39 @@
+junit-4.5.jar
+* http://junit.sourceforge.net/
+* CPL (free for all use)
+
+zql.jar
+* http://www.gibello.com/code/zql/
+* Free for non-commercial use
+
+JLine
+* http://jline.sourceforge.net/
+* BSD (free for all use)
+
+mina-core-2.0.4.jar
+mina-filter-compression-2.0.4.jar
+* http://mina.apache.org/
+* Apache License v2.0 (free for all use)
+
+slf4j-api-1.6.1.jar
+slf4j-log4j12-1.6.1.jar
+* http://www.slf4j.org/license.html
+* MIT license (free for all use)
+
+jzlib-1.0.7.jar
+* http://www.jcraft.com/jzlib/
+* BSD (free for all use)
+
+javassist-3.16.1-GA.jar
+* http://www.javassist.org/
+* MPL v1.1, LGPL and Apache License
+
+ant-contrib-1.0b3.jar
+* http://ant-contrib.sourceforge.net/
+* Apache Software License
+
+log4j-1.2.17.jar
+* logging.apache.org/log4j/1.2/
+* Apache Software license 2.0
+
+
diff --git a/hw/hw3/starter-code/lib/ant-contrib-1.0b3.jar b/hw/hw3/starter-code/lib/ant-contrib-1.0b3.jar
new file mode 100644
index 0000000000000000000000000000000000000000..062537661a514c2ce97d18948f4f25f7226cc1a0
Binary files /dev/null and b/hw/hw3/starter-code/lib/ant-contrib-1.0b3.jar differ
diff --git a/hw/hw3/starter-code/lib/javassist-3.16.1-GA.jar b/hw/hw3/starter-code/lib/javassist-3.16.1-GA.jar
new file mode 100644
index 0000000000000000000000000000000000000000..e8abb1971439e2508c6f370cdd025bc9067202db
Binary files /dev/null and b/hw/hw3/starter-code/lib/javassist-3.16.1-GA.jar differ
diff --git a/hw/hw3/starter-code/lib/jline-0.9.94.jar b/hw/hw3/starter-code/lib/jline-0.9.94.jar
new file mode 100644
index 0000000000000000000000000000000000000000..dafca7c46e96ce462ef8e2457a4bbd6c21dcd0b7
Binary files /dev/null and b/hw/hw3/starter-code/lib/jline-0.9.94.jar differ
diff --git a/hw/hw3/starter-code/lib/junit-4.5.jar b/hw/hw3/starter-code/lib/junit-4.5.jar
new file mode 100644
index 0000000000000000000000000000000000000000..733921623d4a71ae2ae1432228e6eba5e508ae4c
Binary files /dev/null and b/hw/hw3/starter-code/lib/junit-4.5.jar differ
diff --git a/hw/hw3/starter-code/lib/jzlib-1.0.7.jar b/hw/hw3/starter-code/lib/jzlib-1.0.7.jar
new file mode 100644
index 0000000000000000000000000000000000000000..112d4fd43d14c9880b204235587158264cdf723c
Binary files /dev/null and b/hw/hw3/starter-code/lib/jzlib-1.0.7.jar differ
diff --git a/hw/hw3/starter-code/lib/log4j-1.2.17.jar b/hw/hw3/starter-code/lib/log4j-1.2.17.jar
new file mode 100644
index 0000000000000000000000000000000000000000..068867ebfd231db09a7775794eea8127420380ed
Binary files /dev/null and b/hw/hw3/starter-code/lib/log4j-1.2.17.jar differ
diff --git a/hw/hw3/starter-code/lib/mina-core-2.0.4.jar b/hw/hw3/starter-code/lib/mina-core-2.0.4.jar
new file mode 100644
index 0000000000000000000000000000000000000000..985fb797a44f7eccdf27b4ec7da9e667f0683f91
Binary files /dev/null and b/hw/hw3/starter-code/lib/mina-core-2.0.4.jar differ
diff --git a/hw/hw3/starter-code/lib/mina-filter-compression-2.0.4.jar b/hw/hw3/starter-code/lib/mina-filter-compression-2.0.4.jar
new file mode 100644
index 0000000000000000000000000000000000000000..3db65710c167f5ac3c482132aa43e7d1c072e4e1
Binary files /dev/null and b/hw/hw3/starter-code/lib/mina-filter-compression-2.0.4.jar differ
diff --git a/hw/hw3/starter-code/lib/slf4j-api-1.6.1.jar b/hw/hw3/starter-code/lib/slf4j-api-1.6.1.jar
new file mode 100644
index 0000000000000000000000000000000000000000..f1f4fdd214940c76cefaad2419e538e4c13cef6b
Binary files /dev/null and b/hw/hw3/starter-code/lib/slf4j-api-1.6.1.jar differ
diff --git a/hw/hw3/starter-code/lib/slf4j-log4j12-1.6.1.jar b/hw/hw3/starter-code/lib/slf4j-log4j12-1.6.1.jar
new file mode 100644
index 0000000000000000000000000000000000000000..641159959c21651235677a4912bb58bc47cfef41
Binary files /dev/null and b/hw/hw3/starter-code/lib/slf4j-log4j12-1.6.1.jar differ
diff --git a/hw/hw3/starter-code/lib/zql.jar b/hw/hw3/starter-code/lib/zql.jar
new file mode 100644
index 0000000000000000000000000000000000000000..7e7b256f41e7a77947be636dfd089139374ccd0f
Binary files /dev/null and b/hw/hw3/starter-code/lib/zql.jar differ
diff --git a/hw/hw3/starter-code/src/java/simpledb/Aggregate.java b/hw/hw3/starter-code/src/java/simpledb/Aggregate.java
new file mode 100644
index 0000000000000000000000000000000000000000..11f169427cb12ea5adb096f29cbdcabe4dc6e817
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Aggregate.java
@@ -0,0 +1,137 @@
+package simpledb;
+
+import java.util.*;
+
+/**
+ * The Aggregation operator that computes an aggregate (e.g., sum, avg, max,
+ * min). Note that we only support aggregates over a single column, grouped by a
+ * single column.
+ */
+public class Aggregate extends Operator {
+
+ private static final long serialVersionUID = 1L;
+
+ /**
+ * Constructor.
+ *
+ * Implementation hint: depending on the type of afield, you will want to
+ * construct an {@link IntAggregator} or {@link StringAggregator} to help
+ * you with your implementation of fetchNext().
+ *
+ *
+ * @param child
+ * The DbIterator that is feeding us tuples.
+ * @param afield
+ * The column over which we are computing an aggregate.
+ * @param gfield
+ * The column over which we are grouping the result, or -1 if
+ * there is no grouping
+ * @param aop
+ * The aggregation operator to use
+ */
+ public Aggregate(DbIterator child, int afield, int gfield, Aggregator.Op aop) {
+ // some code goes here
+ }
+
+ /**
+ * @return If this aggregate is accompanied by a groupby, return the groupby
+ * field index in the <b>INPUT</b> tuples. If not, return
+ * {@link simpledb.Aggregator#NO_GROUPING}
+ * */
+ public int groupField() {
+ // some code goes here
+ return -1;
+ }
+
+ /**
+ * @return If this aggregate is accompanied by a group by, return the name
+ * of the groupby field in the <b>OUTPUT</b> tuples. If not, return
+ * null.
+ * */
+ public String groupFieldName() {
+ // some code goes here
+ return null;
+ }
+
+ /**
+ * @return the aggregate field
+ * */
+ public int aggregateField() {
+ // some code goes here
+ return -1;
+ }
+
+ /**
+ * @return the name of the aggregate field in the <b>OUTPUT</b>
+ * tuples
+ * */
+ public String aggregateFieldName() {
+ // some code goes here
+ return null;
+ }
+
+ /**
+ * @return the aggregate operator
+ * */
+ public Aggregator.Op aggregateOp() {
+ // some code goes here
+ return null;
+ }
+
+ public static String nameOfAggregatorOp(Aggregator.Op aop) {
+ return aop.toString();
+ }
+
+ public void open() throws NoSuchElementException, DbException,
+ TransactionAbortedException {
+ // some code goes here
+ }
+
+ /**
+ * Returns the next tuple. If there is a group by field, then the first
+ * field is the field by which we are grouping, and the second field is the
+ * result of computing the aggregate. If there is no group by field, then
+ * the result tuple should contain one field representing the result of the
+ * aggregate. Should return null if there are no more tuples.
+ */
+ protected Tuple fetchNext() throws TransactionAbortedException, DbException {
+ // some code goes here
+ return null;
+ }
+
+ public void rewind() throws DbException, TransactionAbortedException {
+ // some code goes here
+ }
+
+ /**
+ * Returns the TupleDesc of this Aggregate. If there is no group by field,
+ * this will have one field - the aggregate column. If there is a group by
+ * field, the first field will be the group by field, and the second will be
+ * the aggregate value column.
+ *
+ * The name of an aggregate column should be informative. For example:
+ * "aggName(aop) (child_td.getFieldName(afield))" where aop and afield are
+ * given in the constructor, and child_td is the TupleDesc of the child
+ * iterator.
+ */
+ public TupleDesc getTupleDesc() {
+ // some code goes here
+ return null;
+ }
+
+ public void close() {
+ // some code goes here
+ }
+
+ @Override
+ public DbIterator[] getChildren() {
+ // some code goes here
+ return null;
+ }
+
+ @Override
+ public void setChildren(DbIterator[] children) {
+ // some code goes here
+ }
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Aggregator.java b/hw/hw3/starter-code/src/java/simpledb/Aggregator.java
new file mode 100644
index 0000000000000000000000000000000000000000..ee53ada6f5b76956643bdefff70e6348d2ebd3bd
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Aggregator.java
@@ -0,0 +1,85 @@
+package simpledb;
+
+import java.io.Serializable;
+
+/**
+ * The common interface for any class that can compute an aggregate over a
+ * list of Tuples.
+ */
+public interface Aggregator extends Serializable {
+ static final int NO_GROUPING = -1;
+
+ /**
+ * SUM_COUNT and SC_AVG will
+ * only be used in lab6; you are not required
+ * to implement them until then.
+ * */
+ public enum Op implements Serializable {
+ MIN, MAX, SUM, AVG, COUNT,
+ /**
+ * SUM_COUNT: compute sum and count simultaneously, will be
+ * needed to compute distributed avg in lab6.
+ * */
+ SUM_COUNT,
+ /**
+ * SC_AVG: compute the avg of a set of SUM_COUNT tuples,
+ * will be used to compute distributed avg in lab6.
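+ * (For intuition, a hedged illustration not needed for this homework:
+ * combining two SUM_COUNT pairs (s1, c1) and (s2, c2) would give the
+ * average (s1 + s2) / (c1 + c2).)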
+ * */
+ SC_AVG;
+
+ /**
+ * Interface to access operations by a string containing an integer
+ * index for command-line convenience.
+ *
+ * @param s a string containing a valid integer Op index
+ */
+ public static Op getOp(String s) {
+ return getOp(Integer.parseInt(s));
+ }
+
+ /**
+ * Interface to access operations by integer value for command-line
+ * convenience.
+ *
+ * @param i a valid integer Op index
+ */
+ public static Op getOp(int i) {
+ return values()[i];
+ }
+
+ public String toString()
+ {
+ if (this==MIN)
+ return "min";
+ if (this==MAX)
+ return "max";
+ if (this==SUM)
+ return "sum";
+ if (this==SUM_COUNT)
+ return "sum_count";
+ if (this==AVG)
+ return "avg";
+ if (this==COUNT)
+ return "count";
+ if (this==SC_AVG)
+ return "sc_avg";
+ throw new IllegalStateException("impossible to reach here");
+ }
+ }
+
+ /**
+ * Merge a new tuple into the aggregate for a distinct group value;
+ * creates a new group aggregate result if the group value has not yet
+ * been encountered.
+ *
+ * @param tup the Tuple containing an aggregate field and a group-by field
+ */
+ public void mergeTupleIntoGroup(Tuple tup);
+
+ /**
+ * Create a DbIterator over group aggregate results.
+ * @see simpledb.TupleIterator for a possible helper
+ */
+ public DbIterator iterator();
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/BufferPool.java b/hw/hw3/starter-code/src/java/simpledb/BufferPool.java
new file mode 100644
index 0000000000000000000000000000000000000000..b8a9a2d6f2cd59b743ab7e1f9c19f10eb4de42ff
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/BufferPool.java
@@ -0,0 +1,195 @@
+package simpledb;
+
+import java.io.*;
+
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * BufferPool manages the reading and writing of pages into memory from
+ * disk. Access methods call into it to retrieve pages, and it fetches
+ * pages from the appropriate location.
+ * <p>
+ * The BufferPool is also responsible for locking; when a transaction fetches
+ * a page, BufferPool checks that the transaction has the appropriate
+ * locks to read/write the page.
+ *
+ * @Threadsafe, all fields are final
+ */
+public class BufferPool {
+ /** Bytes per page, including header. */
+ public static final int PAGE_SIZE = 4096;
+
+ private static int pageSize = PAGE_SIZE;
+
+ /** Default number of pages passed to the constructor. This is used by
+ other classes. BufferPool should use the numPages argument to the
+ constructor instead. */
+ public static final int DEFAULT_PAGES = 50;
+
+ /**
+ * Creates a BufferPool that caches up to numPages pages.
+ *
+ * @param numPages maximum number of pages in this buffer pool.
+ */
+ public BufferPool(int numPages) {
+ // some code goes here
+ }
+
+ public static int getPageSize() {
+ return pageSize;
+ }
+
+ // THIS FUNCTION SHOULD ONLY BE USED FOR TESTING!!
+ public static void setPageSize(int pageSize) {
+ BufferPool.pageSize = pageSize;
+ }
+
+ /**
+ * Retrieve the specified page with the associated permissions.
+ * Will acquire a lock and may block if that lock is held by another
+ * transaction.
+ * <p>
+ * The retrieved page should be looked up in the buffer pool. If it
+ * is present, it should be returned. If it is not present, it should
+ * be added to the buffer pool and returned. If there is insufficient
+ * space in the buffer pool, a page should be evicted and the new page
+ * should be added in its place.
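+ * <p>
+ * A rough sketch of one possible approach (the names <code>cache</code>
+ * and <code>numPages</code> are hypothetical fields, and we assume PageId
+ * exposes getTableId() as in the standard SimpleDB code):
+ * <pre>
+ * Page p = cache.get(pid);          // hit: reuse the cached page
+ * if (p == null) {                  // miss: make room, then load from disk
+ *     if (cache.size() == numPages)
+ *         evictPage();
+ *     p = Database.getCatalog()
+ *                 .getDatabaseFile(pid.getTableId())
+ *                 .readPage(pid);
+ *     cache.put(pid, p);
+ * }
+ * return p;
+ * </pre>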
+ *
+ * @param tid the ID of the transaction requesting the page
+ * @param pid the ID of the requested page
+ * @param perm the requested permissions on the page
+ */
+ public Page getPage(TransactionId tid, PageId pid, Permissions perm)
+ throws TransactionAbortedException, DbException {
+ // some code goes here
+ return null;
+ }
+
+ /**
+ * Releases the lock on a page.
+ * Calling this is very risky, and may result in wrong behavior. Think hard
+ * about who needs to call this and why, and why they can run the risk of
+ * calling it.
+ *
+ * @param tid the ID of the transaction requesting the unlock
+ * @param pid the ID of the page to unlock
+ */
+ public void releasePage(TransactionId tid, PageId pid) {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Release all locks associated with a given transaction.
+ *
+ * @param tid the ID of the transaction requesting the unlock
+ */
+ public void transactionComplete(TransactionId tid) throws IOException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /** Return true if the specified transaction has a lock on the specified page */
+ public boolean holdsLock(TransactionId tid, PageId p) {
+ // some code goes here
+ // not necessary for this assignment
+ return false;
+ }
+
+ /**
+ * Commit or abort a given transaction; release all locks associated to
+ * the transaction.
+ *
+ * @param tid the ID of the transaction requesting the unlock
+ * @param commit a flag indicating whether we should commit or abort
+ */
+ public void transactionComplete(TransactionId tid, boolean commit)
+ throws IOException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Add a tuple to the specified table on behalf of transaction tid. Will
+ * acquire a write lock on the page the tuple is added to and any other
+ * pages that are updated (Lock acquisition is not needed for lab2).
+ * May block if the lock(s) cannot be acquired.
+ *
+ * Marks any pages that were dirtied by the operation as dirty by calling
+ * their markDirty method, and updates cached versions of any pages that have
+ * been dirtied so that future requests see up-to-date pages.
+ *
+ * @param tid the transaction adding the tuple
+ * @param tableId the table to add the tuple to
+ * @param t the tuple to add
+ */
+ public void insertTuple(TransactionId tid, int tableId, Tuple t)
+ throws DbException, IOException, TransactionAbortedException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Remove the specified tuple from the buffer pool.
+ * Will acquire a write lock on the page the tuple is removed from and any
+ * other pages that are updated. May block if the lock(s) cannot be acquired.
+ *
+ * Marks any pages that were dirtied by the operation as dirty by calling
+ * their markDirty method, and updates cached versions of any pages that have
+ * been dirtied so that future requests see up-to-date pages.
+ *
+ * @param tid the transaction deleting the tuple.
+ * @param t the tuple to delete
+ */
+ public void deleteTuple(TransactionId tid, Tuple t)
+ throws DbException, IOException, TransactionAbortedException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Flush all dirty pages to disk.
+ * NB: Be careful using this routine -- it writes dirty data to disk so will
+ * break simpledb if running in NO STEAL mode.
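+ * (Under NO STEAL, pages dirtied by uncommitted transactions must stay in
+ * the buffer pool; flushing them here could push uncommitted data to disk.)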
+ */
+ public synchronized void flushAllPages() throws IOException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /** Remove the specific page id from the buffer pool.
+ Needed by the recovery manager to ensure that the
+ buffer pool doesn't keep a rolled back page in its
+ cache.
+ */
+ public synchronized void discardPage(PageId pid) {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Flushes a certain page to disk
+ * @param pid an ID indicating the page to flush
+ */
+ private synchronized void flushPage(PageId pid) throws IOException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /** Write all pages of the specified transaction to disk.
+ */
+ public synchronized void flushPages(TransactionId tid) throws IOException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+ /**
+ * Discards a page from the buffer pool.
+ * Flushes the page to disk to ensure dirty pages are updated on disk.
+ */
+ private synchronized void evictPage() throws DbException {
+ // some code goes here
+ // not necessary for this assignment
+ }
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Catalog.java b/hw/hw3/starter-code/src/java/simpledb/Catalog.java
new file mode 100644
index 0000000000000000000000000000000000000000..1b181daaf33af067c9524395e15195c96f3262c5
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Catalog.java
@@ -0,0 +1,163 @@
+package simpledb;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.util.*;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * The Catalog keeps track of all available tables in the database and their
+ * associated schemas.
+ * For now, this is a stub catalog that must be populated with tables by a
+ * user program before it can be used -- eventually, this should be converted
+ * to a catalog that reads a catalog table from disk.
+ *
+ * @Threadsafe
+ */
+public class Catalog {
+
+ /**
+ * Constructor.
+ * Creates a new, empty catalog.
+ */
+ public Catalog() {
+ // some code goes here
+ }
+
+ /**
+ * Add a new table to the catalog.
+ * This table's contents are stored in the specified DbFile.
+ * @param file the contents of the table to add; file.getId() is the identifier of
+ * this file/tupledesc param for the calls getTupleDesc and getFile
+ * @param name the name of the table -- may be an empty string. May not be null. If a name
+ * conflict exists, use the last table to be added as the table for a given name.
+ * @param pkeyField the name of the primary key field
+ */
+ public void addTable(DbFile file, String name, String pkeyField) {
+ // some code goes here
+ }
+
+ public void addTable(DbFile file, String name) {
+ addTable(file, name, "");
+ }
+
+ /**
+ * Add a new table to the catalog.
+ * This table has tuples formatted using the specified TupleDesc and its
+ * contents are stored in the specified DbFile.
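+ * (As the implementation below shows, the table is registered under a
+ * randomly generated UUID string as its name.)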
+ * @param file the contents of the table to add; file.getId() is the identifier of
+ * this file/tupledesc param for the calls getTupleDesc and getFile
+ */
+ public void addTable(DbFile file) {
+ addTable(file, (UUID.randomUUID()).toString());
+ }
+
+ /**
+ * Return the id of the table with a specified name.
+ * @throws NoSuchElementException if the table doesn't exist
+ */
+ public int getTableId(String name) throws NoSuchElementException {
+ // some code goes here
+ return 0;
+ }
+
+ /**
+ * Returns the tuple descriptor (schema) of the specified table
+ * @param tableid The id of the table, as specified by the DbFile.getId()
+ * function passed to addTable
+ * @throws NoSuchElementException if the table doesn't exist
+ */
+ public TupleDesc getTupleDesc(int tableid) throws NoSuchElementException {
+ // some code goes here
+ return null;
+ }
+
+ /**
+ * Returns the DbFile that can be used to read the contents of the
+ * specified table.
+ * @param tableid The id of the table, as specified by the DbFile.getId()
+ * function passed to addTable
+ */
+ public DbFile getDatabaseFile(int tableid) throws NoSuchElementException {
+ // some code goes here
+ return null;
+ }
+
+ public String getPrimaryKey(int tableid) {
+ // some code goes here
+ return null;
+ }
+
+ public Iterator<Integer> tableIdIterator() {
+ // some code goes here
+ return null;
+ }
+
+ public String getTableName(int id) {
+ // some code goes here
+ return null;
+ }
+
+ /** Delete all tables from the catalog */
+ public void clear() {
+ // some code goes here
+ }
+
+ /**
+ * Reads the schema from a file and creates the appropriate tables in the database.
+ * @param catalogFile
+ */
+ public void loadSchema(String catalogFile) {
+ String line = "";
+ String baseFolder=new File(new File(catalogFile).getAbsolutePath()).getParent();
+ try {
+ BufferedReader br = new BufferedReader(new FileReader(new File(catalogFile)));
+
+ while ((line = br.readLine()) != null) {
+ //assume line is of the format name (field type, field type, ...)
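+ //for example, a catalog line for a hypothetical table might read:
+ // students (sid int pk, name string)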
+ String name = line.substring(0, line.indexOf("(")).trim();
+ //System.out.println("TABLE NAME: " + name);
+ String fields = line.substring(line.indexOf("(") + 1, line.indexOf(")")).trim();
+ String[] els = fields.split(",");
+ ArrayList<String> names = new ArrayList<String>();
+ ArrayList<Type> types = new ArrayList<Type>();
+ String primaryKey = "";
+ for (String e : els) {
+ String[] els2 = e.trim().split(" ");
+ names.add(els2[0].trim());
+ if (els2[1].trim().toLowerCase().equals("int"))
+ types.add(Type.INT_TYPE);
+ else if (els2[1].trim().toLowerCase().equals("string"))
+ types.add(Type.STRING_TYPE);
+ else {
+ System.out.println("Unknown type " + els2[1]);
+ System.exit(0);
+ }
+ if (els2.length == 3) {
+ if (els2[2].trim().equals("pk"))
+ primaryKey = els2[0].trim();
+ else {
+ System.out.println("Unknown annotation " + els2[2]);
+ System.exit(0);
+ }
+ }
+ }
+ Type[] typeAr = types.toArray(new Type[0]);
+ String[] namesAr = names.toArray(new String[0]);
+ TupleDesc t = new TupleDesc(typeAr, namesAr);
+ HeapFile tabHf = new HeapFile(new File(baseFolder+"/"+name + ".dat"), t);
+ addTable(tabHf,name,primaryKey);
+ System.out.println("Added table : " + name + " with schema " + t);
+ }
+ } catch (IOException e) {
+ e.printStackTrace();
+ System.exit(0);
+ } catch (IndexOutOfBoundsException e) {
+ System.out.println("Invalid catalog entry : " + line);
+ System.exit(0);
+ }
+ }
+}
+
diff --git a/hw/hw3/starter-code/src/java/simpledb/CostCard.java b/hw/hw3/starter-code/src/java/simpledb/CostCard.java
new file mode 100644
index 0000000000000000000000000000000000000000..8114a5fffceacbb6a5758005e6d51f69d4af5dcb
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/CostCard.java
@@ -0,0 +1,14 @@
+package simpledb;
+import java.util.Vector;
+
+/** Class returned by {@link JoinOptimizer#computeCostAndCardOfSubplan} specifying the
+ cost and cardinality of the optimal plan represented by plan.
+*/
+public class CostCard {
+ /** The cost of the optimal subplan */
+ public double cost;
+ /** The cardinality of the optimal subplan */
+ public int card;
+ /** The optimal subplan */
+ public Vector<LogicalJoinNode> plan;
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Database.java b/hw/hw3/starter-code/src/java/simpledb/Database.java
new file mode 100644
index 0000000000000000000000000000000000000000..c9fc338bc7b17d06f826723bac7b06f1e57082b6
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Database.java
@@ -0,0 +1,81 @@
+package simpledb;
+
+import java.io.*;
+import java.util.concurrent.atomic.AtomicReference;
+
+/**
+ * Database is a class that initializes several static variables used by the
+ * database system (the catalog, the buffer pool, and the log files, in
+ * particular).
+ * <p>
+ * Provides a set of methods that can be used to access these variables from
+ * anywhere.
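+ * For example, operator code can call Database.getCatalog() or
+ * Database.getBufferPool() to reach the shared catalog and buffer pool
+ * (both defined below).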
+ * + * @Threadsafe + */ +public class Database { + private static AtomicReference<Database> _instance = new AtomicReference<Database>(new Database()); + private final Catalog _catalog; + private final BufferPool _bufferpool; + + private final static String LOGFILENAME = "log"; + private final LogFile _logfile; + + private Database() { + _catalog = new Catalog(); + _bufferpool = new BufferPool(BufferPool.DEFAULT_PAGES); + LogFile tmp = null; + try { + tmp = new LogFile(new File(LOGFILENAME)); + } catch (IOException e) { + e.printStackTrace(); + System.exit(1); + } + _logfile = tmp; + // startControllerThread(); + } + + /** Return the log file of the static Database instance */ + public static LogFile getLogFile() { + return _instance.get()._logfile; + } + + /** Return the buffer pool of the static Database instance */ + public static BufferPool getBufferPool() { + return _instance.get()._bufferpool; + } + + /** Return the catalog of the static Database instance */ + public static Catalog getCatalog() { + return _instance.get()._catalog; + } + + /** + * Method used for testing -- create a new instance of the buffer pool and + * return it + */ + public static BufferPool resetBufferPool(int pages) { + java.lang.reflect.Field bufferPoolF=null; + try { + bufferPoolF = Database.class.getDeclaredField("_bufferpool"); + bufferPoolF.setAccessible(true); + bufferPoolF.set(_instance.get(), new BufferPool(pages)); + } catch (NoSuchFieldException e) { + e.printStackTrace(); + } catch (SecurityException e) { + e.printStackTrace(); + } catch (IllegalArgumentException e) { + e.printStackTrace(); + } catch (IllegalAccessException e) { + e.printStackTrace(); + } +// _instance._bufferpool = new BufferPool(pages); + return _instance.get()._bufferpool; + } + + // reset the database, used for unit tests only. + public static void reset() { + _instance.set(new Database()); + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/DbException.java b/hw/hw3/starter-code/src/java/simpledb/DbException.java new file mode 100644 index 0000000000000000000000000000000000000000..fe23217edca03152250372f845766b0db892603f --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/DbException.java @@ -0,0 +1,12 @@ +package simpledb; + +import java.lang.Exception; + +/** Generic database exception class */ +public class DbException extends Exception { + private static final long serialVersionUID = 1L; + + public DbException(String s) { + super(s); + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/DbFile.java b/hw/hw3/starter-code/src/java/simpledb/DbFile.java new file mode 100644 index 0000000000000000000000000000000000000000..294bf05bf2cb31646230424ef904c9cc5ecc4b0f --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/DbFile.java @@ -0,0 +1,91 @@ + +package simpledb; + +import java.util.*; +import java.io.*; + +/** + * The interface for database files on disk. Each table is represented by a + * single DbFile. DbFiles can fetch pages and iterate through tuples. Each + * file has a unique id used to store metadata about the table in the Catalog. + * DbFiles are generally accessed through the buffer pool, rather than directly + * by operators. + */ +public interface DbFile { + /** + * Read the specified page from disk. + * + * @throws IllegalArgumentException if the page does not exist in this file. + */ + public Page readPage(PageId id); + + /** + * Push the specified page to disk. + * + * @param p The page to write. page.getId().pageno() specifies the offset into the file where the page should be written. 
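+ * (In a typical HeapFile implementation the byte offset would be the page
+ * number multiplied by BufferPool.getPageSize(); this is a common
+ * convention rather than something this interface mandates.)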
+     * @throws IOException if the write fails
+     *
+     */
+    public void writePage(Page p) throws IOException;
+
+    /**
+     * Inserts the specified tuple into the file on behalf of the transaction.
+     * This method will acquire a lock on the affected pages of the file, and
+     * may block until the lock can be acquired.
+     *
+     * @param tid The transaction performing the update
+     * @param t The tuple to add. This tuple should be updated to reflect that
+     *          it is now stored in this file.
+     * @return An ArrayList containing the pages that were modified
+     * @throws DbException if the tuple cannot be added
+     * @throws IOException if the needed file can't be read/written
+     */
+    public ArrayList<Page> insertTuple(TransactionId tid, Tuple t)
+        throws DbException, IOException, TransactionAbortedException;
+
+    /**
+     * Removes the specified tuple from the file on behalf of the specified
+     * transaction.
+     * This method will acquire a lock on the affected pages of the file, and
+     * may block until the lock can be acquired.
+     *
+     * @param tid The transaction performing the update
+     * @param t The tuple to delete. This tuple should be updated to reflect that
+     *          it is no longer stored on any page.
+     * @return An ArrayList containing the pages that were modified
+     * @throws DbException if the tuple cannot be deleted or is not a member
+     *         of the file
+     */
+    public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t)
+        throws DbException, IOException, TransactionAbortedException;
+
+    /**
+     * Returns an iterator over all the tuples stored in this DbFile. The
+     * iterator must use {@link BufferPool#getPage}, rather than
+     * {@link #readPage}, to iterate through the pages.
+     *
+     * @return an iterator over all the tuples stored in this DbFile.
+     */
+    public DbFileIterator iterator(TransactionId tid);
+
+    /**
+     * Returns a unique ID used to identify this DbFile in the Catalog. This id
+     * can be used to look up the table via {@link Catalog#getDatabaseFile} and
+     * {@link Catalog#getTupleDesc}.
+     * <p>
+     * Implementation note: you will need to generate this tableid somewhere,
+     * ensure that each HeapFile has a "unique id," and that you always
+     * return the same value for a particular HeapFile. A simple implementation
+     * is to use the hash code of the absolute path of the file underlying
+     * the HeapFile, i.e. <code>f.getAbsoluteFile().hashCode()</code>.
+     *
+     * @return an ID uniquely identifying this HeapFile.
+     */
+    public int getId();
+
+    /**
+     * Returns the TupleDesc of the table stored in this DbFile.
+     * @return TupleDesc of this DbFile.
+     */
+    public TupleDesc getTupleDesc();
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/DbFileIterator.java b/hw/hw3/starter-code/src/java/simpledb/DbFileIterator.java
new file mode 100644
index 0000000000000000000000000000000000000000..cb9161eaa0bdf50d462a21a59b5ca578b93778da
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/DbFileIterator.java
@@ -0,0 +1,40 @@
+package simpledb;
+import java.util.*;
+
+/**
+ * DbFileIterator is the iterator interface that every SimpleDB DbFile should
+ * implement.
+ */
+public interface DbFileIterator {
+    /**
+     * Opens the iterator.
+     * @throws DbException when there are problems opening/accessing the database.
+     */
+    public void open()
+        throws DbException, TransactionAbortedException;
+
+    /** @return true if there are more tuples available.
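+     * <p>A typical client loop, as a sketch ({@code process} stands in for
+     * the caller's own code):
+     * <pre>
+     *   it.open();
+     *   while (it.hasNext()) { process(it.next()); }
+     *   it.close();
+     * </pre>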
+     */
+    public boolean hasNext()
+        throws DbException, TransactionAbortedException;
+
+    /**
+     * Gets the next tuple from the operator (typically implemented by reading
+     * from a child operator or an access method).
+     *
+     * @return The next tuple in the iterator.
+     * @throws NoSuchElementException if there are no more tuples
+     */
+    public Tuple next()
+        throws DbException, TransactionAbortedException, NoSuchElementException;
+
+    /**
+     * Resets the iterator to the start.
+     * @throws DbException When rewind is unsupported.
+     */
+    public void rewind() throws DbException, TransactionAbortedException;
+
+    /**
+     * Closes the iterator.
+     */
+    public void close();
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/DbIterator.java b/hw/hw3/starter-code/src/java/simpledb/DbIterator.java
new file mode 100644
index 0000000000000000000000000000000000000000..3831605ed1a2d7fb91c4ed41dfe793c1088129a4
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/DbIterator.java
@@ -0,0 +1,56 @@
+package simpledb;
+import java.io.Serializable;
+import java.util.*;
+
+/**
+ * DbIterator is the iterator interface that all SimpleDB operators should
+ * implement. If the iterator is not open, none of the methods should work;
+ * they should throw an IllegalStateException instead. In addition to any
+ * resource allocation/deallocation, an open method should call any
+ * child iterator open methods, and in a close method, an iterator
+ * should call its children's close methods.
+ */
+public interface DbIterator extends Serializable {
+    /**
+     * Opens the iterator. This must be called before any of the other methods.
+     * @throws DbException when there are problems opening/accessing the database.
+     */
+    public void open()
+        throws DbException, TransactionAbortedException;
+
+    /** Returns true if the iterator has more tuples.
+     * @return true if the iterator has more tuples.
+     * @throws IllegalStateException If the iterator has not been opened
+     */
+    public boolean hasNext() throws DbException, TransactionAbortedException;
+
+    /**
+     * Returns the next tuple from the operator (typically implemented by reading
+     * from a child operator or an access method).
+     *
+     * @return the next tuple in the iteration.
+     * @throws NoSuchElementException if there are no more tuples.
+     * @throws IllegalStateException If the iterator has not been opened
+     */
+    public Tuple next() throws DbException, TransactionAbortedException, NoSuchElementException;
+
+    /**
+     * Resets the iterator to the start.
+     * @throws DbException when rewind is unsupported.
+     * @throws IllegalStateException If the iterator has not been opened
+     */
+    public void rewind() throws DbException, TransactionAbortedException;
+
+    /**
+     * Returns the TupleDesc associated with this DbIterator.
+     * @return the TupleDesc associated with this DbIterator.
+     */
+    public TupleDesc getTupleDesc();
+
+    /**
+     * Closes the iterator. When the iterator is closed, calling next(),
+     * hasNext(), or rewind() should fail by throwing IllegalStateException.
+     */
+    public void close();
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Debug.java b/hw/hw3/starter-code/src/java/simpledb/Debug.java
new file mode 100644
index 0000000000000000000000000000000000000000..62281b5d87e6731e31663cc63fca62fdce79a6ff
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Debug.java
@@ -0,0 +1,54 @@
+
+package simpledb;
+
+/**
+ * Debug is a utility class that wraps println statements and allows
+ * more or less command line output to be turned on.
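+ * <p>
+ * Example (sketch): {@code Debug.log(1, "flushed page %d", pgno)} prints
+ * only when the configured debug level is at least 1.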
+ * <p>
+ * Change the value of the DEBUG_LEVEL constant using a system property:
+ * simpledb.Debug. For example, on the command line, use -Dsimpledb.Debug=x,
+ * or simply -Dsimpledb.Debug to enable it at level 0.
+ * The log(level, message, ...) method will print to standard output if the
+ * level number is less than or equal to the currently set DEBUG_LEVEL.
+ */
+
+public class Debug {
+    private static final int DEBUG_LEVEL;
+    static {
+        String debug = System.getProperty("simpledb.Debug");
+        if (debug == null) {
+            // No system property = disabled
+            DEBUG_LEVEL = -1;
+        } else if (debug.isEmpty()) {
+            // Empty property = level 0
+            DEBUG_LEVEL = 0;
+        } else {
+            DEBUG_LEVEL = Integer.parseInt(debug);
+        }
+    }
+
+    private static final int DEFAULT_LEVEL = 0;
+
+    /** Log message if the log level >= level. Uses printf. */
+    public static void log(int level, String message, Object... args) {
+        if (isEnabled(level)) {
+            System.out.printf(message, args);
+            System.out.println();
+        }
+    }
+
+    /** @return true if level is being logged. */
+    public static boolean isEnabled(int level) {
+        return level <= DEBUG_LEVEL;
+    }
+
+    /** @return true if the default level is being logged. */
+    public static boolean isEnabled() {
+        return isEnabled(DEFAULT_LEVEL);
+    }
+
+    /** Logs message at the default log level. */
+    public static void log(String message, Object... args) {
+        log(DEFAULT_LEVEL, message, args);
+    }
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Delete.java b/hw/hw3/starter-code/src/java/simpledb/Delete.java
new file mode 100644
index 0000000000000000000000000000000000000000..5532cf59557615285e885dde9378da851734853a
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Delete.java
@@ -0,0 +1,68 @@
+package simpledb;
+
+import java.io.IOException;
+
+/**
+ * The delete operator. Delete reads tuples from its child operator and removes
+ * them from the table they belong to.
+ */
+public class Delete extends Operator {
+
+    private static final long serialVersionUID = 1L;
+
+    /**
+     * Constructor specifying the transaction that this delete belongs to as
+     * well as the child to read from.
+     *
+     * @param t
+     *            The transaction this delete runs in
+     * @param child
+     *            The child operator from which to read tuples for deletion
+     */
+    public Delete(TransactionId t, DbIterator child) {
+        // some code goes here
+    }
+
+    public TupleDesc getTupleDesc() {
+        // some code goes here
+        return null;
+    }
+
+    public void open() throws DbException, TransactionAbortedException {
+        // some code goes here
+    }
+
+    public void close() {
+        // some code goes here
+    }
+
+    public void rewind() throws DbException, TransactionAbortedException {
+        // some code goes here
+    }
+
+    /**
+     * Deletes tuples as they are read from the child operator. Deletes are
+     * processed via the buffer pool (which can be accessed via the
+     * Database.getBufferPool() method).
+     *
+     * @return A 1-field tuple containing the number of deleted records.
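+     *         <p>A sketch of the expected flow (illustrative only; assumes the
+     *         constructor's {@code t} and {@code child} were saved in fields,
+     *         and that a repeated call returns null):
+     *         <pre>
+     *         int count = 0;
+     *         while (child.hasNext()) {
+     *             Database.getBufferPool().deleteTuple(t, child.next());
+     *             count++;
+     *         }
+     *         // wrap count in a one-field INT tuple and return it
+     *         </pre>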
+ * @see Database#getBufferPool + * @see BufferPool#deleteTuple + */ + protected Tuple fetchNext() throws TransactionAbortedException, DbException { + // some code goes here + return null; + } + + @Override + public DbIterator[] getChildren() { + // some code goes here + return null; + } + + @Override + public void setChildren(DbIterator[] children) { + // some code goes here + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Field.java b/hw/hw3/starter-code/src/java/simpledb/Field.java new file mode 100644 index 0000000000000000000000000000000000000000..0b872ebc8796f20c39f8c730e6aefafbbd584e0d --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Field.java @@ -0,0 +1,40 @@ +package simpledb; + +import java.io.*; + +/** + * Interface for values of fields in tuples in SimpleDB. + */ +public interface Field extends Serializable{ + /** + * Write the bytes representing this field to the specified + * DataOutputStream. + * @see DataOutputStream + * @param dos The DataOutputStream to write to. + */ + void serialize(DataOutputStream dos) throws IOException; + + /** + * Compare the value of this field object to the passed in value. + * @param op The operator + * @param value The value to compare this Field to + * @return Whether or not the comparison yields true. + */ + public boolean compare(Predicate.Op op, Field value); + + /** + * Returns the type of this field (see {@link Type#INT_TYPE} or {@link Type#STRING_TYPE} + * @return type of this field + */ + public Type getType(); + + /** + * Hash code. + * Different Field objects representing the same value should probably + * return the same hashCode. + */ + public int hashCode(); + public boolean equals(Object field); + + public String toString(); +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Filter.java b/hw/hw3/starter-code/src/java/simpledb/Filter.java new file mode 100644 index 0000000000000000000000000000000000000000..dc192974ce14a583a9c98a8e4d0f60f1bcc2a414 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Filter.java @@ -0,0 +1,74 @@ +package simpledb; + +import java.util.*; + +/** + * Filter is an operator that implements a relational select. + */ +public class Filter extends Operator { + + private static final long serialVersionUID = 1L; + + /** + * Constructor accepts a predicate to apply and a child operator to read + * tuples to filter from. + * + * @param p + * The predicate to filter tuples with + * @param child + * The child operator + */ + public Filter(Predicate p, DbIterator child) { + // some code goes here + } + + public Predicate getPredicate() { + // some code goes here + return null; + } + + public TupleDesc getTupleDesc() { + // some code goes here + return null; + } + + public void open() throws DbException, NoSuchElementException, + TransactionAbortedException { + // some code goes here + } + + public void close() { + // some code goes here + } + + public void rewind() throws DbException, TransactionAbortedException { + // some code goes here + } + + /** + * AbstractDbIterator.readNext implementation. Iterates over tuples from the + * child operator, applying the predicate to them and returning those that + * pass the predicate (i.e. for which the Predicate.filter() returns true.) 
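+     * <p>A minimal sketch (assumes the constructor's {@code p} and
+     * {@code child} are kept in fields):
+     * <pre>
+     *   while (child.hasNext()) {
+     *       Tuple t = child.next();
+     *       if (p.filter(t)) return t;
+     *   }
+     *   return null;
+     * </pre>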
+ * + * @return The next tuple that passes the filter, or null if there are no + * more tuples + * @see Predicate#filter + */ + protected Tuple fetchNext() throws NoSuchElementException, + TransactionAbortedException, DbException { + // some code goes here + return null; + } + + @Override + public DbIterator[] getChildren() { + // some code goes here + return null; + } + + @Override + public void setChildren(DbIterator[] children) { + // some code goes here + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/HeapFile.java b/hw/hw3/starter-code/src/java/simpledb/HeapFile.java new file mode 100644 index 0000000000000000000000000000000000000000..dfa07b576dd0b608f9086ff82a4ef25bbb29b3d8 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/HeapFile.java @@ -0,0 +1,106 @@ +package simpledb; + +import java.io.*; +import java.util.*; + +/** + * HeapFile is an implementation of a DbFile that stores a collection of tuples + * in no particular order. Tuples are stored on pages, each of which is a fixed + * size, and the file is simply a collection of those pages. HeapFile works + * closely with HeapPage. The format of HeapPages is described in the HeapPage + * constructor. + * + * @see simpledb.HeapPage#HeapPage + * @author Sam Madden + */ +public class HeapFile implements DbFile { + + /** + * Constructs a heap file backed by the specified file. + * + * @param f + * the file that stores the on-disk backing store for this heap + * file. + */ + public HeapFile(File f, TupleDesc td) { + // some code goes here + } + + /** + * Returns the File backing this HeapFile on disk. + * + * @return the File backing this HeapFile on disk. + */ + public File getFile() { + // some code goes here + return null; + } + + /** + * Returns an ID uniquely identifying this HeapFile. Implementation note: + * you will need to generate this tableid somewhere ensure that each + * HeapFile has a "unique id," and that you always return the same value for + * a particular HeapFile. We suggest hashing the absolute file name of the + * file underlying the heapfile, i.e. f.getAbsoluteFile().hashCode(). + * + * @return an ID uniquely identifying this HeapFile. + */ + public int getId() { + // some code goes here + throw new UnsupportedOperationException("implement this"); + } + + /** + * Returns the TupleDesc of the table stored in this DbFile. + * + * @return TupleDesc of this DbFile. + */ + public TupleDesc getTupleDesc() { + // some code goes here + throw new UnsupportedOperationException("implement this"); + } + + // see DbFile.java for javadocs + public Page readPage(PageId pid) { + // some code goes here + return null; + } + + // see DbFile.java for javadocs + public void writePage(Page page) throws IOException { + // some code goes here + // not necessary for this assignment + } + + /** + * Returns the number of pages in this HeapFile. 
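+     * A common implementation (sketch, not prescribed) derives this from the
+     * backing file's length:
+     * {@code (int) (getFile().length() / BufferPool.getPageSize())}.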
+ */ + public int numPages() { + // some code goes here + return 0; + } + + // see DbFile.java for javadocs + public ArrayList<Page> insertTuple(TransactionId tid, Tuple t) + throws DbException, IOException, TransactionAbortedException { + // some code goes here + // not necessary for this assignment + return null; + } + + // see DbFile.java for javadocs + public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t) throws DbException, + TransactionAbortedException { + // some code goes here + // not necessary for this assignment + return null; + } + + // see DbFile.java for javadocs + public DbFileIterator iterator(TransactionId tid) { + // some code goes here + return null; + } + +} + diff --git a/hw/hw3/starter-code/src/java/simpledb/HeapFileEncoder.java b/hw/hw3/starter-code/src/java/simpledb/HeapFileEncoder.java new file mode 100644 index 0000000000000000000000000000000000000000..dc575f9f1c9a464bfc329801de28941952bfadad --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/HeapFileEncoder.java @@ -0,0 +1,226 @@ +package simpledb; + +import java.io.*; +import java.util.ArrayList; + +/** + * HeapFileEncoder reads a comma delimited text file or accepts + * an array of tuples and converts it to + * pages of binary data in the appropriate format for simpledb heap pages + * Pages are padded out to a specified length, and written consecutive in a + * data file. + */ + +public class HeapFileEncoder { + + /** Convert the specified tuple list (with only integer fields) into a binary + * page file. <br> + * + * The format of the output file will be as specified in HeapPage and + * HeapFile. + * + * @see HeapPage + * @see HeapFile + * @param tuples the tuples - a list of tuples, each represented by a list of integers that are + * the field values for that tuple. + * @param outFile The output file to write data to + * @param npagebytes The number of bytes per page in the output file + * @param numFields the number of fields in each input tuple + * @throws IOException if the temporary/output file can't be opened + */ + public static void convert(ArrayList<ArrayList<Integer>> tuples, File outFile, int npagebytes, int numFields) throws IOException { + File tempInput = File.createTempFile("tempTable", ".txt"); + tempInput.deleteOnExit(); + BufferedWriter bw = new BufferedWriter(new FileWriter(tempInput)); + for (ArrayList<Integer> tuple : tuples) { + int writtenFields = 0; + for (Integer field : tuple) { + writtenFields++; + if (writtenFields > numFields) { + throw new RuntimeException("Tuple has more than " + numFields + " fields: (" + + Utility.listToString(tuple) + ")"); + } + bw.write(String.valueOf(field)); + if (writtenFields < numFields) { + bw.write(','); + } + } + bw.write('\n'); + } + bw.close(); + convert(tempInput, outFile, npagebytes, numFields); + } + + public static void convert(File inFile, File outFile, int npagebytes, + int numFields) throws IOException { + Type[] ts = new Type[numFields]; + for (int i = 0; i < ts.length; i++) { + ts[i] = Type.INT_TYPE; + } + convert(inFile,outFile,npagebytes,numFields,ts); + } + + public static void convert(File inFile, File outFile, int npagebytes, + int numFields, Type[] typeAr) + throws IOException { + convert(inFile,outFile,npagebytes,numFields,typeAr,','); + } + + /** Convert the specified input text file into a binary + * page file. 
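+   * A hypothetical invocation, using the all-INT convenience overload above
+   * (file names are placeholders):
+   * <pre>
+   *   HeapFileEncoder.convert(new File("table.txt"), new File("table.dat"),
+   *                           BufferPool.getPageSize(), 3);
+   * </pre>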
<br> + * Assume format of the input file is (note that only integer fields are + * supported):<br> + * int,...,int\n<br> + * int,...,int\n<br> + * ...<br> + * where each row represents a tuple.<br> + * <p> + * The format of the output file will be as specified in HeapPage and + * HeapFile. + * + * @see HeapPage + * @see HeapFile + * @param inFile The input file to read data from + * @param outFile The output file to write data to + * @param npagebytes The number of bytes per page in the output file + * @param numFields the number of fields in each input line/output tuple + * @throws IOException if the input/output file can't be opened or a + * malformed input line is encountered + */ + public static void convert(File inFile, File outFile, int npagebytes, + int numFields, Type[] typeAr, char fieldSeparator) + throws IOException { + + int nrecbytes = 0; + for (int i = 0; i < numFields ; i++) { + nrecbytes += typeAr[i].getLen(); + } + int nrecords = (npagebytes * 8) / (nrecbytes * 8 + 1); //floor comes for free + + // per record, we need one bit; there are nrecords per page, so we need + // nrecords bits, i.e., ((nrecords/32)+1) integers. + int nheaderbytes = (nrecords / 8); + if (nheaderbytes * 8 < nrecords) + nheaderbytes++; //ceiling + int nheaderbits = nheaderbytes * 8; + + BufferedReader br = new BufferedReader(new FileReader(inFile)); + FileOutputStream os = new FileOutputStream(outFile); + + // our numbers probably won't be much larger than 1024 digits + char buf[] = new char[1024]; + + int curpos = 0; + int recordcount = 0; + int npages = 0; + int fieldNo = 0; + + ByteArrayOutputStream headerBAOS = new ByteArrayOutputStream(nheaderbytes); + DataOutputStream headerStream = new DataOutputStream(headerBAOS); + ByteArrayOutputStream pageBAOS = new ByteArrayOutputStream(npagebytes); + DataOutputStream pageStream = new DataOutputStream(pageBAOS); + + boolean done = false; + boolean first = true; + while (!done) { + int c = br.read(); + + // Ignore Windows/Notepad special line endings + if (c == '\r') + continue; + + if (c == '\n') { + if (first) + continue; + recordcount++; + first = true; + } else + first = false; + if (c == fieldSeparator || c == '\n' || c == '\r') { + String s = new String(buf, 0, curpos); + if (typeAr[fieldNo] == Type.INT_TYPE) { + try { + pageStream.writeInt(Integer.parseInt(s.trim())); + } catch (NumberFormatException e) { + System.out.println ("BAD LINE : " + s); + } + } + else if (typeAr[fieldNo] == Type.STRING_TYPE) { + s = s.trim(); + int overflow = Type.STRING_LEN - s.length(); + if (overflow < 0) { + String news = s.substring(0,Type.STRING_LEN); + s = news; + } + pageStream.writeInt(s.length()); + pageStream.writeBytes(s); + while (overflow-- > 0) + pageStream.write((byte)0); + } + curpos = 0; + if (c == '\n') + fieldNo = 0; + else + fieldNo++; + + } else if (c == -1) { + done = true; + + } else { + buf[curpos++] = (char)c; + continue; + } + + // if we wrote a full page of records, or if we're done altogether, + // write out the header of the page. + // + // in the header, write a 1 for bits that correspond to records we've + // written and 0 for empty slots. + // + // when we're done, also flush the page to disk, but only if it has + // records on it. however, if this file is empty, do flush an empty + // page to disk. 
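+      // Note: within each header byte, the low-order bit corresponds to the
+      // lowest-numbered slot it covers, which is why the loop below shifts
+      // by (i % 8).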
+ if (recordcount >= nrecords + || done && recordcount > 0 + || done && npages == 0) { + int i = 0; + byte headerbyte = 0; + + for (i=0; i<nheaderbits; i++) { + if (i < recordcount) + headerbyte |= (1 << (i % 8)); + + if (((i+1) % 8) == 0) { + headerStream.writeByte(headerbyte); + headerbyte = 0; + } + } + + if (i % 8 > 0) + headerStream.writeByte(headerbyte); + + // pad the rest of the page with zeroes + + for (i=0; i<(npagebytes - (recordcount * nrecbytes + nheaderbytes)); i++) + pageStream.writeByte(0); + + // write header and body to file + headerStream.flush(); + headerBAOS.writeTo(os); + pageStream.flush(); + pageBAOS.writeTo(os); + + // reset header and body for next page + headerBAOS = new ByteArrayOutputStream(nheaderbytes); + headerStream = new DataOutputStream(headerBAOS); + pageBAOS = new ByteArrayOutputStream(npagebytes); + pageStream = new DataOutputStream(pageBAOS); + + recordcount = 0; + npages++; + } + } + br.close(); + os.close(); + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/HeapPage.java b/hw/hw3/starter-code/src/java/simpledb/HeapPage.java new file mode 100644 index 0000000000000000000000000000000000000000..126e85430d3c6d3598267787545002801295f53b --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/HeapPage.java @@ -0,0 +1,303 @@ +package simpledb; + +import java.util.*; +import java.io.*; + +/** + * Each instance of HeapPage stores data for one page of HeapFiles and + * implements the Page interface that is used by BufferPool. + * + * @see HeapFile + * @see BufferPool + * + */ +public class HeapPage implements Page { + + final HeapPageId pid; + final TupleDesc td; + final byte header[]; + final Tuple tuples[]; + final int numSlots; + + byte[] oldData; + private final Byte oldDataLock=new Byte((byte)0); + + /** + * Create a HeapPage from a set of bytes of data read from disk. + * The format of a HeapPage is a set of header bytes indicating + * the slots of the page that are in use, some number of tuple slots. + * Specifically, the number of tuples is equal to: <p> + * floor((BufferPool.getPageSize()*8) / (tuple size * 8 + 1)) + * <p> where tuple size is the size of tuples in this + * database table, which can be determined via {@link Catalog#getTupleDesc}. + * The number of 8-bit header words is equal to: + * <p> + * ceiling(no. tuple slots / 8) + * <p> + * @see Database#getCatalog + * @see Catalog#getTupleDesc + * @see BufferPool#getPageSize() + */ + public HeapPage(HeapPageId id, byte[] data) throws IOException { + this.pid = id; + this.td = Database.getCatalog().getTupleDesc(id.getTableId()); + this.numSlots = getNumTuples(); + DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data)); + + // allocate and read the header slots of this page + header = new byte[getHeaderSize()]; + for (int i=0; i<header.length; i++) + header[i] = dis.readByte(); + + tuples = new Tuple[numSlots]; + try{ + // allocate and read the actual records of this page + for (int i=0; i<tuples.length; i++) + tuples[i] = readNextTuple(dis,i); + }catch(NoSuchElementException e){ + e.printStackTrace(); + } + dis.close(); + + setBeforeImage(); + } + + /** Retrieve the number of tuples on this page. 
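+        A sketch, per the formula in the class comment above (td.getSize() is
+        the tuple size in bytes):
+        floor((BufferPool.getPageSize()*8) / (td.getSize()*8 + 1))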
+ @return the number of tuples on this page + */ + private int getNumTuples() { + // some code goes here + return 0; + + } + + /** + * Computes the number of bytes in the header of a page in a HeapFile with each tuple occupying tupleSize bytes + * @return the number of bytes in the header of a page in a HeapFile with each tuple occupying tupleSize bytes + */ + private int getHeaderSize() { + + // some code goes here + return 0; + + } + + /** Return a view of this page before it was modified + -- used by recovery */ + public HeapPage getBeforeImage(){ + // not necessary for this assignment + return null; + } + + public void setBeforeImage() { + synchronized(oldDataLock) + { + oldData = getPageData().clone(); + } + } + + /** + * @return the PageId associated with this page. + */ + public HeapPageId getId() { + // some code goes here + throw new UnsupportedOperationException("implement this"); + } + + /** + * Suck up tuples from the source file. + */ + private Tuple readNextTuple(DataInputStream dis, int slotId) throws NoSuchElementException { + // if associated bit is not set, read forward to the next tuple, and + // return null. + if (!isSlotUsed(slotId)) { + for (int i=0; i<td.getSize(); i++) { + try { + dis.readByte(); + } catch (IOException e) { + throw new NoSuchElementException("error reading empty tuple"); + } + } + return null; + } + + // read fields in the tuple + Tuple t = new Tuple(td); + RecordId rid = new RecordId(pid, slotId); + t.setRecordId(rid); + try { + for (int j=0; j<td.numFields(); j++) { + Field f = td.getFieldType(j).parse(dis); + t.setField(j, f); + } + } catch (java.text.ParseException e) { + e.printStackTrace(); + throw new NoSuchElementException("parsing error!"); + } + + return t; + } + + /** + * Generates a byte array representing the contents of this page. + * Used to serialize this page to disk. + * <p> + * The invariant here is that it should be possible to pass the byte + * array generated by getPageData to the HeapPage constructor and + * have it produce an identical HeapPage object. + * + * @see #HeapPage + * @return A byte array correspond to the bytes of this page. + */ + public byte[] getPageData() { + int len = BufferPool.getPageSize(); + ByteArrayOutputStream baos = new ByteArrayOutputStream(len); + DataOutputStream dos = new DataOutputStream(baos); + + // create the header of the page + for (int i=0; i<header.length; i++) { + try { + dos.writeByte(header[i]); + } catch (IOException e) { + // this really shouldn't happen + e.printStackTrace(); + } + } + + // create the tuples + for (int i=0; i<tuples.length; i++) { + + // empty slot + if (!isSlotUsed(i)) { + for (int j=0; j<td.getSize(); j++) { + try { + dos.writeByte(0); + } catch (IOException e) { + e.printStackTrace(); + } + + } + continue; + } + + // non-empty slot + for (int j=0; j<td.numFields(); j++) { + Field f = tuples[i].getField(j); + try { + f.serialize(dos); + + } catch (IOException e) { + e.printStackTrace(); + } + } + } + + // padding + int zerolen = BufferPool.getPageSize() - (header.length + td.getSize() * tuples.length); //- numSlots * td.getSize(); + byte[] zeroes = new byte[zerolen]; + try { + dos.write(zeroes, 0, zerolen); + } catch (IOException e) { + e.printStackTrace(); + } + + try { + dos.flush(); + } catch (IOException e) { + e.printStackTrace(); + } + + return baos.toByteArray(); + } + + /** + * Static method to generate a byte array corresponding to an empty + * HeapPage. + * Used to add new, empty pages to the file. 
Passing the results of + * this method to the HeapPage constructor will create a HeapPage with + * no valid tuples in it. + * + * @return The returned ByteArray. + */ + public static byte[] createEmptyPageData() { + int len = BufferPool.getPageSize(); + return new byte[len]; //all 0 + } + + /** + * Delete the specified tuple from the page; the tuple should be updated to reflect + * that it is no longer stored on any page. + * @throws DbException if this tuple is not on this page, or tuple slot is + * already empty. + * @param t The tuple to delete + */ + public void deleteTuple(Tuple t) throws DbException { + // some code goes here + // not necessary for this assignment + } + + /** + * Adds the specified tuple to the page; the tuple should be updated to reflect + * that it is now stored on this page. + * @throws DbException if the page is full (no empty slots) or tupledesc + * is mismatch. + * @param t The tuple to add. + */ + public void insertTuple(Tuple t) throws DbException { + // some code goes here + // not necessary for this assignment + } + + /** + * Marks this page as dirty/not dirty and record that transaction + * that did the dirtying + */ + public void markDirty(boolean dirty, TransactionId tid) { + // some code goes here + // not necessary for this assignment + } + + /** + * Returns the tid of the transaction that last dirtied this page, or null if the page is not dirty + */ + public TransactionId isDirty() { + // some code goes here + // not necessary for this assignment + return null; + } + + /** + * Returns the number of empty slots on this page. + */ + public int getNumEmptySlots() { + // some code goes here + return 0; + } + + /** + * Returns true if associated slot on this page is filled. + */ + public boolean isSlotUsed(int i) { + // some code goes here + return false; + } + + /** + * Abstraction to fill or clear a slot on this page. + */ + private void markSlotUsed(int i, boolean value) { + // some code goes here + // not necessary for this assignment + } + + /** + * @return an iterator over all tuples on this page (calling remove on this iterator throws an UnsupportedOperationException) + * (note that this iterator shouldn't return tuples in empty slots!) + */ + public Iterator<Tuple> iterator() { + // some code goes here + return null; + } + +} + diff --git a/hw/hw3/starter-code/src/java/simpledb/HeapPageId.java b/hw/hw3/starter-code/src/java/simpledb/HeapPageId.java new file mode 100644 index 0000000000000000000000000000000000000000..670a4764153a04c00dc1cf7a749602b0f1dfdc76 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/HeapPageId.java @@ -0,0 +1,70 @@ +package simpledb; + +/** Unique identifier for HeapPage objects. */ +public class HeapPageId implements PageId { + + /** + * Constructor. Create a page id structure for a specific page of a + * specific table. + * + * @param tableId The table that is being referenced + * @param pgNo The page number in that table. + */ + public HeapPageId(int tableId, int pgNo) { + // some code goes here + } + + /** @return the table associated with this PageId */ + public int getTableId() { + // some code goes here + return 0; + } + + /** + * @return the page number in the table getTableId() associated with + * this PageId + */ + public int pageNumber() { + // some code goes here + return 0; + } + + /** + * @return a hash code for this page, represented by the concatenation of + * the table number and the page number (needed if a PageId is used as a + * key in a hash table in the BufferPool, for example.) 
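+     * One workable scheme (sketch; assumes the constructor arguments are
+     * stored in fields): {@code ("" + tableId + pgNo).hashCode()}.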
+ * @see BufferPool + */ + public int hashCode() { + // some code goes here + throw new UnsupportedOperationException("implement this"); + } + + /** + * Compares one PageId to another. + * + * @param o The object to compare against (must be a PageId) + * @return true if the objects are equal (e.g., page numbers and table + * ids are the same) + */ + public boolean equals(Object o) { + // some code goes here + return false; + } + + /** + * Return a representation of this object as an array of + * integers, for writing to disk. Size of returned array must contain + * number of integers that corresponds to number of args to one of the + * constructors. + */ + public int[] serialize() { + int data[] = new int[2]; + + data[0] = getTableId(); + data[1] = pageNumber(); + + return data; + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Insert.java b/hw/hw3/starter-code/src/java/simpledb/Insert.java new file mode 100644 index 0000000000000000000000000000000000000000..4d7c587a20e4b1181db74ce4b31c95371fc72b84 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Insert.java @@ -0,0 +1,74 @@ +package simpledb; + +/** + * Inserts tuples read from the child operator into the tableid specified in the + * constructor + */ +public class Insert extends Operator { + + private static final long serialVersionUID = 1L; + + /** + * Constructor. + * + * @param t + * The transaction running the insert. + * @param child + * The child operator from which to read tuples to be inserted. + * @param tableid + * The table in which to insert tuples. + * @throws DbException + * if TupleDesc of child differs from table into which we are to + * insert. + */ + public Insert(TransactionId t,DbIterator child, int tableid) + throws DbException { + // some code goes here + } + + public TupleDesc getTupleDesc() { + // some code goes here + return null; + } + + public void open() throws DbException, TransactionAbortedException { + // some code goes here + } + + public void close() { + // some code goes here + } + + public void rewind() throws DbException, TransactionAbortedException { + // some code goes here + } + + /** + * Inserts tuples read from child into the tableid specified by the + * constructor. It returns a one field tuple containing the number of + * inserted records. Inserts should be passed through BufferPool. An + * instances of BufferPool is available via Database.getBufferPool(). Note + * that insert DOES NOT need check to see if a particular tuple is a + * duplicate before inserting it. + * + * @return A 1-field tuple containing the number of inserted records, or + * null if called more than once. + * @see Database#getBufferPool + * @see BufferPool#insertTuple + */ + protected Tuple fetchNext() throws TransactionAbortedException, DbException { + // some code goes here + return null; + } + + @Override + public DbIterator[] getChildren() { + // some code goes here + return null; + } + + @Override + public void setChildren(DbIterator[] children) { + // some code goes here + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/IntField.java b/hw/hw3/starter-code/src/java/simpledb/IntField.java new file mode 100644 index 0000000000000000000000000000000000000000..e4fbd5c48bce4c7424872b7f2f7933394575fcf4 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/IntField.java @@ -0,0 +1,86 @@ +package simpledb; + +import java.io.*; + +/** + * Instance of Field that stores a single integer. 
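+ * <p>
+ * Example: {@code new IntField(3).compare(Predicate.Op.LESS_THAN, new IntField(5))}
+ * evaluates to true.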
+ */ +public class IntField implements Field { + + private static final long serialVersionUID = 1L; + + private final int value; + + public int getValue() { + return value; + } + + /** + * Constructor. + * + * @param i The value of this field. + */ + public IntField(int i) { + value = i; + } + + public String toString() { + return Integer.toString(value); + } + + public int hashCode() { + return value; + } + + public boolean equals(Object field) { + return ((IntField) field).value == value; + } + + public void serialize(DataOutputStream dos) throws IOException { + dos.writeInt(value); + } + + /** + * Compare the specified field to the value of this Field. + * Return semantics are as specified by Field.compare + * + * @throws IllegalCastException if val is not an IntField + * @see Field#compare + */ + public boolean compare(Predicate.Op op, Field val) { + + IntField iVal = (IntField) val; + + switch (op) { + case EQUALS: + return value == iVal.value; + case NOT_EQUALS: + return value != iVal.value; + + case GREATER_THAN: + return value > iVal.value; + + case GREATER_THAN_OR_EQ: + return value >= iVal.value; + + case LESS_THAN: + return value < iVal.value; + + case LESS_THAN_OR_EQ: + return value <= iVal.value; + + case LIKE: + return value == iVal.value; + } + + return false; + } + + /** + * Return the Type of this field. + * @return Type.INT_TYPE + */ + public Type getType() { + return Type.INT_TYPE; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/IntegerAggregator.java b/hw/hw3/starter-code/src/java/simpledb/IntegerAggregator.java new file mode 100644 index 0000000000000000000000000000000000000000..ce2dcff4c5d14a5325ca9bb5574c495d5fd253b0 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/IntegerAggregator.java @@ -0,0 +1,54 @@ +package simpledb; + +/** + * Knows how to compute some aggregate over a set of IntFields. + */ +public class IntegerAggregator implements Aggregator { + + private static final long serialVersionUID = 1L; + + /** + * Aggregate constructor + * + * @param gbfield + * the 0-based index of the group-by field in the tuple, or + * NO_GROUPING if there is no grouping + * @param gbfieldtype + * the type of the group by field (e.g., Type.INT_TYPE), or null + * if there is no grouping + * @param afield + * the 0-based index of the aggregate field in the tuple + * @param what + * the aggregation operator + */ + + public IntegerAggregator(int gbfield, Type gbfieldtype, int afield, Op what) { + // some code goes here + } + + /** + * Merge a new tuple into the aggregate, grouping as indicated in the + * constructor + * + * @param tup + * the Tuple containing an aggregate field and a group-by field + */ + public void mergeTupleIntoGroup(Tuple tup) { + // some code goes here + } + + /** + * Create a DbIterator over group aggregate results. + * + * @return a DbIterator whose tuples are the pair (groupVal, aggregateVal) + * if using group, or a single (aggregateVal) if no grouping. The + * aggregateVal is determined by the type of aggregate specified in + * the constructor. 
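+     *         <p>Implementation hint (sketch, not prescribed; {@code sum},
+     *         {@code count}, {@code g}, and {@code v} are hypothetical names):
+     *         for an average, accumulate a per-group sum and count and divide
+     *         only when iterating, since an average of averages is not the
+     *         overall average.
+     *         <pre>
+     *         sum.put(g, sum.getOrDefault(g, 0) + v);        // on merge
+     *         count.put(g, count.getOrDefault(g, 0) + 1);
+     *         // on iterate: avg = sum.get(g) / count.get(g)
+     *         </pre>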
+     */
+    public DbIterator iterator() {
+        // some code goes here
+        throw new UnsupportedOperationException("please implement me for lab2");
+    }
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Join.java b/hw/hw3/starter-code/src/java/simpledb/Join.java
new file mode 100644
index 0000000000000000000000000000000000000000..d85c7ba6746b6ce67e46ae515ccb9c9e9aebdeca
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Join.java
@@ -0,0 +1,108 @@
+package simpledb;
+
+import java.util.*;
+
+/**
+ * The Join operator implements the relational join operation.
+ */
+public class Join extends Operator {
+
+    private static final long serialVersionUID = 1L;
+
+    /**
+     * Constructor. Accepts two children to join and the predicate to join them
+     * on.
+     *
+     * @param p
+     *            The predicate to use to join the children
+     * @param child1
+     *            Iterator for the left (outer) relation to join
+     * @param child2
+     *            Iterator for the right (inner) relation to join
+     */
+    public Join(JoinPredicate p, DbIterator child1, DbIterator child2) {
+        // some code goes here
+    }
+
+    public JoinPredicate getJoinPredicate() {
+        // some code goes here
+        return null;
+    }
+
+    /**
+     * @return
+     *       the field name of join field1. Should be quantified by
+     *       alias or table name.
+     */
+    public String getJoinField1Name() {
+        // some code goes here
+        return null;
+    }
+
+    /**
+     * @return
+     *       the field name of join field2. Should be quantified by
+     *       alias or table name.
+     */
+    public String getJoinField2Name() {
+        // some code goes here
+        return null;
+    }
+
+    /**
+     * @see simpledb.TupleDesc#merge(TupleDesc, TupleDesc) for possible
+     *      implementation logic.
+     */
+    public TupleDesc getTupleDesc() {
+        // some code goes here
+        return null;
+    }
+
+    public void open() throws DbException, NoSuchElementException,
+            TransactionAbortedException {
+        // some code goes here
+    }
+
+    public void close() {
+        // some code goes here
+    }
+
+    public void rewind() throws DbException, TransactionAbortedException {
+        // some code goes here
+    }
+
+    /**
+     * Returns the next tuple generated by the join, or null if there are no
+     * more tuples. Logically, this is the next tuple in r1 cross r2 that
+     * satisfies the join predicate. There are many possible implementations;
+     * the simplest is a nested loops join.
+     * <p>
+     * Note that the tuples returned from this particular implementation of Join
+     * are simply the concatenation of joining tuples from the left and right
+     * relation. Therefore, if an equality predicate is used there will be two
+     * copies of the join attribute in the results. (Removing such duplicate
+     * columns can be done with an additional projection operator if needed.)
+     * <p>
+     * For example, if one tuple is {1,2,3} and the other tuple is {1,5,6},
+     * joined on equality of the first column, then this returns {1,2,3,1,5,6}.
+     *
+     * @return The next matching tuple.
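+     *         <p>A nested-loops sketch (illustrative only; assumes fields
+     *         {@code child1} and {@code child2}, a saved current outer tuple
+     *         {@code t1}, and a hypothetical helper {@code merge} that
+     *         concatenates two tuples):
+     *         <pre>
+     *         while (t1 != null) {
+     *             while (child2.hasNext()) {
+     *                 Tuple t2 = child2.next();
+     *                 if (p.filter(t1, t2)) return merge(t1, t2);
+     *             }
+     *             child2.rewind();
+     *             t1 = child1.hasNext() ? child1.next() : null;
+     *         }
+     *         return null;
+     *         </pre>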
+     * @see JoinPredicate#filter
+     */
+    protected Tuple fetchNext() throws TransactionAbortedException, DbException {
+        // some code goes here
+        return null;
+    }
+
+    @Override
+    public DbIterator[] getChildren() {
+        // some code goes here
+        return null;
+    }
+
+    @Override
+    public void setChildren(DbIterator[] children) {
+        // some code goes here
+    }
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/JoinOptimizer.java b/hw/hw3/starter-code/src/java/simpledb/JoinOptimizer.java
new file mode 100644
index 0000000000000000000000000000000000000000..d3ad836a4ac0f513f921145094e4aa2bf05727b7
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/JoinOptimizer.java
@@ -0,0 +1,557 @@
+package simpledb;
+
+import java.util.*;
+
+import javax.swing.*;
+import javax.swing.tree.*;
+
+/**
+ * The JoinOptimizer class is responsible for ordering a series of joins
+ * optimally, and for selecting the best instantiation of a join for a given
+ * logical plan.
+ */
+public class JoinOptimizer {
+    LogicalPlan p;
+    Vector<LogicalJoinNode> joins;
+
+    /**
+     * Constructor
+     *
+     * @param p
+     *            the logical plan being optimized
+     * @param joins
+     *            the list of joins being performed
+     */
+    public JoinOptimizer(LogicalPlan p, Vector<LogicalJoinNode> joins) {
+        this.p = p;
+        this.joins = joins;
+    }
+
+    /**
+     * Return the best iterator for computing a given logical join, given the
+     * specified statistics and the provided left and right subplans. Note that
+     * there is insufficient information to determine which plan should be the
+     * inner/outer here -- because DbIterators don't provide any cardinality
+     * estimates, and stats only has information about the base tables. For this
+     * reason, the plan1 argument is simply used as the outer input.
+     *
+     * @param lj
+     *            The join being considered
+     * @param plan1
+     *            The left join node's child
+     * @param plan2
+     *            The right join node's child
+     */
+    public static DbIterator instantiateJoin(LogicalJoinNode lj,
+            DbIterator plan1, DbIterator plan2) throws ParsingException {
+
+        int t1id = 0, t2id = 0;
+        DbIterator j;
+
+        try {
+            t1id = plan1.getTupleDesc().fieldNameToIndex(lj.f1QuantifiedName);
+        } catch (NoSuchElementException e) {
+            throw new ParsingException("Unknown field " + lj.f1QuantifiedName);
+        }
+
+        if (lj instanceof LogicalSubplanJoinNode) {
+            t2id = 0;
+        } else {
+            try {
+                t2id = plan2.getTupleDesc().fieldNameToIndex(
+                        lj.f2QuantifiedName);
+            } catch (NoSuchElementException e) {
+                throw new ParsingException("Unknown field "
+                        + lj.f2QuantifiedName);
+            }
+        }
+
+        JoinPredicate p = new JoinPredicate(t1id, lj.p, t2id);
+
+        j = new Join(p, plan1, plan2);
+
+        return j;
+    }
+
+    /**
+     * Estimate the cost of a join.
+     *
+     * The cost of the join should be calculated based on the join algorithm (or
+     * algorithms) that you implemented for Lab 2. It should be a function of
+     * the amount of data that must be read over the course of the query, as
+     * well as the number of CPU operations performed by your join. Assume that
+     * the cost of a single predicate application is roughly 1.
+     *
+     *
+     * @param j
+     *            A LogicalJoinNode representing the join operation being
+     *            performed.
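+     *            <p>For a simple nested-loops join, one textbook estimate
+     *            (a sketch, not the required formula) is
+     *            {@code cost1 + card1 * cost2 + card1 * card2}: one scan of
+     *            the outer table, one full scan of the inner per outer tuple,
+     *            plus one predicate application per pair of tuples.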
+ * @param card1 + * Estimated cardinality of the left-hand side of the query + * @param card2 + * Estimated cardinality of the right-hand side of the query + * @param cost1 + * Estimated cost of one full scan of the table on the left-hand + * side of the query + * @param cost2 + * Estimated cost of one full scan of the table on the right-hand + * side of the query + * @return An estimate of the cost of this query, in terms of cost1 and + * cost2 + */ + public double estimateJoinCost(LogicalJoinNode j, int card1, int card2, + double cost1, double cost2) { + if (j instanceof LogicalSubplanJoinNode) { + // A LogicalSubplanJoinNode represents a subquery. + // You do not need to implement proper support for these for Lab 4. + return card1 + cost1 + cost2; + } else { + // Insert your code here. + // HINT: You may need to use the variable "j" if you implemented + // a join algorithm that's more complicated than a basic + // nested-loops join. + return -1.0; + } + } + + /** + * Estimate the cardinality of a join. The cardinality of a join is the + * number of tuples produced by the join. + * + * @param j + * A LogicalJoinNode representing the join operation being + * performed. + * @param card1 + * Cardinality of the left-hand table in the join + * @param card2 + * Cardinality of the right-hand table in the join + * @param t1pkey + * Is the left-hand table a primary-key table? + * @param t2pkey + * Is the right-hand table a primary-key table? + * @param stats + * The table stats, referenced by table names, not alias + * @return The cardinality of the join + */ + public int estimateJoinCardinality(LogicalJoinNode j, int card1, int card2, + boolean t1pkey, boolean t2pkey, Map<String, TableStats> stats) { + if (j instanceof LogicalSubplanJoinNode) { + // A LogicalSubplanJoinNode represents a subquery. + // You do not need to implement proper support for these for Lab 4. + return card1; + } else { + return estimateTableJoinCardinality(j.p, j.t1Alias, j.t2Alias, + j.f1PureName, j.f2PureName, card1, card2, t1pkey, t2pkey, + stats, p.getTableAliasToIdMapping()); + } + } + + /** + * Estimate the join cardinality of two tables. + * */ + public static int estimateTableJoinCardinality(Predicate.Op joinOp, + String table1Alias, String table2Alias, String field1PureName, + String field2PureName, int card1, int card2, boolean t1pkey, + boolean t2pkey, Map<String, TableStats> stats, + Map<String, Integer> tableAliasToId) { + int card = 1; + // some code goes here + return card <= 0 ? 1 : card; + } + + /** + * Helper method to enumerate all of the subsets of a given size of a + * specified vector. + * + * @param v + * The vector whose subsets are desired + * @param size + * The size of the subsets of interest + * @return a set of all subsets of the specified size + */ + @SuppressWarnings("unchecked") + public <T> Set<Set<T>> enumerateSubsets(Vector<T> v, int size) { + Set<Set<T>> els = new HashSet<Set<T>>(); + els.add(new HashSet<T>()); + // Iterator<Set> it; + // long start = System.currentTimeMillis(); + + for (int i = 0; i < size; i++) { + Set<Set<T>> newels = new HashSet<Set<T>>(); + for (Set<T> s : els) { + for (T t : v) { + Set<T> news = (Set<T>) (((HashSet<T>) s).clone()); + if (news.add(t)) + newels.add(news); + } + } + els = newels; + } + + return els; + + } + + /** + * Compute a logical, reasonably efficient join on the specified tables. See + * PS4 for hints on how this should be implemented. 
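+     * <p>The classic Selinger-style dynamic program, as a sketch (the
+     * {@code PlanCache} construction is an assumption here; see
+     * {@link #computeCostAndCardOfSubplan}):
+     * <pre>
+     *   PlanCache pc = new PlanCache();
+     *   for (int size = 1; size <= joins.size(); size++) {
+     *       for (Set<LogicalJoinNode> s : enumerateSubsets(joins, size)) {
+     *           for (LogicalJoinNode j : s) {
+     *               // keep the cheapest CostCard returned by
+     *               // computeCostAndCardOfSubplan(stats, filterSelectivities,
+     *               //                             j, s, bestSoFar, pc)
+     *           }
+     *       }
+     *   }
+     * </pre>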
+ * + * @param stats + * Statistics for each table involved in the join, referenced by + * base table names, not alias + * @param filterSelectivities + * Selectivities of the filter predicates on each table in the + * join, referenced by table alias (if no alias, the base table + * name) + * @param explain + * Indicates whether your code should explain its query plan or + * simply execute it + * @return A Vector<LogicalJoinNode> that stores joins in the left-deep + * order in which they should be executed. + * @throws ParsingException + * when stats or filter selectivities is missing a table in the + * join, or or when another internal error occurs + */ + public Vector<LogicalJoinNode> orderJoins( + HashMap<String, TableStats> stats, + HashMap<String, Double> filterSelectivities, boolean explain) + throws ParsingException { + //Not necessary for labs 1--3 + + // See the Lab 4 writeup for some hints as to how this function + // should work. + + // some code goes here + //Replace the following + return joins; + } + + // ===================== Private Methods ================================= + + /** + * This is a helper method that computes the cost and cardinality of joining + * joinToRemove to joinSet (joinSet should contain joinToRemove), given that + * all of the subsets of size joinSet.size() - 1 have already been computed + * and stored in PlanCache pc. + * + * @param stats + * table stats for all of the tables, referenced by table names + * rather than alias (see {@link #orderJoins}) + * @param filterSelectivities + * the selectivities of the filters over each of the tables + * (where tables are indentified by their alias or name if no + * alias is given) + * @param joinToRemove + * the join to remove from joinSet + * @param joinSet + * the set of joins being considered + * @param bestCostSoFar + * the best way to join joinSet so far (minimum of previous + * invocations of computeCostAndCardOfSubplan for this joinSet, + * from returned CostCard) + * @param pc + * the PlanCache for this join; should have subplans for all + * plans of size joinSet.size()-1 + * @return A {@link CostCard} objects desribing the cost, cardinality, + * optimal subplan + * @throws ParsingException + * when stats, filterSelectivities, or pc object is missing + * tables involved in join + */ + @SuppressWarnings("unchecked") + private CostCard computeCostAndCardOfSubplan( + HashMap<String, TableStats> stats, + HashMap<String, Double> filterSelectivities, + LogicalJoinNode joinToRemove, Set<LogicalJoinNode> joinSet, + double bestCostSoFar, PlanCache pc) throws ParsingException { + + LogicalJoinNode j = joinToRemove; + + Vector<LogicalJoinNode> prevBest; + + if (this.p.getTableId(j.t1Alias) == null) + throw new ParsingException("Unknown table " + j.t1Alias); + if (this.p.getTableId(j.t2Alias) == null) + throw new ParsingException("Unknown table " + j.t2Alias); + + String table1Name = Database.getCatalog().getTableName( + this.p.getTableId(j.t1Alias)); + String table2Name = Database.getCatalog().getTableName( + this.p.getTableId(j.t2Alias)); + String table1Alias = j.t1Alias; + String table2Alias = j.t2Alias; + + Set<LogicalJoinNode> news = (Set<LogicalJoinNode>) ((HashSet<LogicalJoinNode>) joinSet) + .clone(); + news.remove(j); + + double t1cost, t2cost; + int t1card, t2card; + boolean leftPkey, rightPkey; + + if (news.isEmpty()) { // base case -- both are base relations + prevBest = new Vector<LogicalJoinNode>(); + t1cost = stats.get(table1Name).estimateScanCost(); + t1card = 
stats.get(table1Name).estimateTableCardinality( + filterSelectivities.get(j.t1Alias)); + leftPkey = isPkey(j.t1Alias, j.f1PureName); + + t2cost = table2Alias == null ? 0 : stats.get(table2Name) + .estimateScanCost(); + t2card = table2Alias == null ? 0 : stats.get(table2Name) + .estimateTableCardinality( + filterSelectivities.get(j.t2Alias)); + rightPkey = table2Alias == null ? false : isPkey(table2Alias, + j.f2PureName); + } else { + // news is not empty -- figure best way to join j to news + prevBest = pc.getOrder(news); + + // possible that we have not cached an answer, if subset + // includes a cross product + if (prevBest == null) { + return null; + } + + double prevBestCost = pc.getCost(news); + int bestCard = pc.getCard(news); + + // estimate cost of right subtree + if (doesJoin(prevBest, table1Alias)) { // j.t1 is in prevBest + t1cost = prevBestCost; // left side just has cost of whatever + // left + // subtree is + t1card = bestCard; + leftPkey = hasPkey(prevBest); + + t2cost = j.t2Alias == null ? 0 : stats.get(table2Name) + .estimateScanCost(); + t2card = j.t2Alias == null ? 0 : stats.get(table2Name) + .estimateTableCardinality( + filterSelectivities.get(j.t2Alias)); + rightPkey = j.t2Alias == null ? false : isPkey(j.t2Alias, + j.f2PureName); + } else if (doesJoin(prevBest, j.t2Alias)) { // j.t2 is in prevbest + // (both + // shouldn't be) + t2cost = prevBestCost; // left side just has cost of whatever + // left + // subtree is + t2card = bestCard; + rightPkey = hasPkey(prevBest); + + t1cost = stats.get(table1Name).estimateScanCost(); + t1card = stats.get(table1Name).estimateTableCardinality( + filterSelectivities.get(j.t1Alias)); + leftPkey = isPkey(j.t1Alias, j.f1PureName); + + } else { + // don't consider this plan if one of j.t1 or j.t2 + // isn't a table joined in prevBest (cross product) + return null; + } + } + + // case where prevbest is left + double cost1 = estimateJoinCost(j, t1card, t2card, t1cost, t2cost); + + LogicalJoinNode j2 = j.swapInnerOuter(); + double cost2 = estimateJoinCost(j2, t2card, t1card, t2cost, t1cost); + if (cost2 < cost1) { + boolean tmp; + j = j2; + cost1 = cost2; + tmp = rightPkey; + rightPkey = leftPkey; + leftPkey = tmp; + } + if (cost1 >= bestCostSoFar) + return null; + + CostCard cc = new CostCard(); + + cc.card = estimateJoinCardinality(j, t1card, t2card, leftPkey, + rightPkey, stats); + cc.cost = cost1; + cc.plan = (Vector<LogicalJoinNode>) prevBest.clone(); + cc.plan.addElement(j); // prevbest is left -- add new join to end + return cc; + } + + /** + * Return true if the specified table is in the list of joins, false + * otherwise + */ + private boolean doesJoin(Vector<LogicalJoinNode> joinlist, String table) { + for (LogicalJoinNode j : joinlist) { + if (j.t1Alias.equals(table) + || (j.t2Alias != null && j.t2Alias.equals(table))) + return true; + } + return false; + } + + /** + * Return true if field is a primary key of the specified table, false + * otherwise + * + * @param tableAlias + * The alias of the table in the query + * @param field + * The pure name of the field + */ + private boolean isPkey(String tableAlias, String field) { + int tid1 = p.getTableId(tableAlias); + String pkey1 = Database.getCatalog().getPrimaryKey(tid1); + + return pkey1.equals(field); + } + + /** + * Return true if a primary key field is joined by one of the joins in + * joinlist + */ + private boolean hasPkey(Vector<LogicalJoinNode> joinlist) { + for (LogicalJoinNode j : joinlist) { + if (isPkey(j.t1Alias, j.f1PureName) + || (j.t2Alias != null && 
isPkey(j.t2Alias, j.f2PureName))) + return true; + } + return false; + + } + + /** + * Helper function to display a Swing window with a tree representation of + * the specified list of joins. See {@link #orderJoins}, which may want to + * call this when the analyze flag is true. + * + * @param js + * the join plan to visualize + * @param pc + * the PlanCache accumulated whild building the optimal plan + * @param stats + * table statistics for base tables + * @param selectivities + * the selectivities of the filters over each of the tables + * (where tables are indentified by their alias or name if no + * alias is given) + */ + private void printJoins(Vector<LogicalJoinNode> js, PlanCache pc, + HashMap<String, TableStats> stats, + HashMap<String, Double> selectivities) { + + JFrame f = new JFrame("Join Plan for " + p.getQuery()); + + // Set the default close operation for the window, + // or else the program won't exit when clicking close button + f.setDefaultCloseOperation(WindowConstants.DISPOSE_ON_CLOSE); + + f.setVisible(true); + + f.setSize(300, 500); + + HashMap<String, DefaultMutableTreeNode> m = new HashMap<String, DefaultMutableTreeNode>(); + + // int numTabs = 0; + + // int k; + DefaultMutableTreeNode root = null, treetop = null; + HashSet<LogicalJoinNode> pathSoFar = new HashSet<LogicalJoinNode>(); + boolean neither; + + System.out.println(js); + for (LogicalJoinNode j : js) { + pathSoFar.add(j); + System.out.println("PATH SO FAR = " + pathSoFar); + + String table1Name = Database.getCatalog().getTableName( + this.p.getTableId(j.t1Alias)); + String table2Name = Database.getCatalog().getTableName( + this.p.getTableId(j.t2Alias)); + + // Double c = pc.getCost(pathSoFar); + neither = true; + + root = new DefaultMutableTreeNode("Join " + j + " (Cost =" + + pc.getCost(pathSoFar) + ", card = " + + pc.getCard(pathSoFar) + ")"); + DefaultMutableTreeNode n = m.get(j.t1Alias); + if (n == null) { // never seen this table before + n = new DefaultMutableTreeNode(j.t1Alias + + " (Cost = " + + stats.get(table1Name).estimateScanCost() + + ", card = " + + stats.get(table1Name).estimateTableCardinality( + selectivities.get(j.t1Alias)) + ")"); + root.add(n); + } else { + // make left child root n + root.add(n); + neither = false; + } + m.put(j.t1Alias, root); + + n = m.get(j.t2Alias); + if (n == null) { // never seen this table before + + n = new DefaultMutableTreeNode( + j.t2Alias == null ? "Subplan" + : (j.t2Alias + + " (Cost = " + + stats.get(table2Name) + .estimateScanCost() + + ", card = " + + stats.get(table2Name) + .estimateTableCardinality( + selectivities + .get(j.t2Alias)) + ")")); + root.add(n); + } else { + // make right child root n + root.add(n); + neither = false; + } + m.put(j.t2Alias, root); + + // unless this table doesn't join with other tables, + // all tables are accessed from root + if (!neither) { + for (String key : m.keySet()) { + m.put(key, root); + } + } + + treetop = root; + } + + JTree tree = new JTree(treetop); + JScrollPane treeView = new JScrollPane(tree); + + tree.setShowsRootHandles(true); + + // Set the icon for leaf nodes. 
+ ImageIcon leafIcon = new ImageIcon("join.jpg"); + DefaultTreeCellRenderer renderer = new DefaultTreeCellRenderer(); + renderer.setOpenIcon(leafIcon); + renderer.setClosedIcon(leafIcon); + + tree.setCellRenderer(renderer); + + f.setSize(300, 500); + + f.add(treeView); + for (int i = 0; i < tree.getRowCount(); i++) { + tree.expandRow(i); + } + + if (js.size() == 0) { + f.add(new JLabel("No joins in plan.")); + } + + f.pack(); + + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/JoinPredicate.java b/hw/hw3/starter-code/src/java/simpledb/JoinPredicate.java new file mode 100644 index 0000000000000000000000000000000000000000..2ca767245d9eca7db49a34422fee7ff4925f330e --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/JoinPredicate.java @@ -0,0 +1,59 @@ +package simpledb; + +import java.io.Serializable; + +/** + * JoinPredicate compares fields of two tuples using a predicate. JoinPredicate + * is most likely used by the Join operator. + */ +public class JoinPredicate implements Serializable { + + private static final long serialVersionUID = 1L; + + /** + * Constructor -- create a new predicate over two fields of two tuples. + * + * @param field1 + * The field index into the first tuple in the predicate + * @param field2 + * The field index into the second tuple in the predicate + * @param op + * The operation to apply (as defined in Predicate.Op); either + * Predicate.Op.GREATER_THAN, Predicate.Op.LESS_THAN, + * Predicate.Op.EQUAL, Predicate.Op.GREATER_THAN_OR_EQ, or + * Predicate.Op.LESS_THAN_OR_EQ + * @see Predicate + */ + public JoinPredicate(int field1, Predicate.Op op, int field2) { + // some code goes here + } + + /** + * Apply the predicate to the two specified tuples. The comparison can be + * made through Field's compare method. + * + * @return true if the tuples satisfy the predicate. + */ + public boolean filter(Tuple t1, Tuple t2) { + // some code goes here + return false; + } + + public int getField1() + { + // some code goes here + return -1; + } + + public int getField2() + { + // some code goes here + return -1; + } + + public Predicate.Op getOperator() + { + // some code goes here + return null; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/LogFile.java b/hw/hw3/starter-code/src/java/simpledb/LogFile.java new file mode 100644 index 0000000000000000000000000000000000000000..f46053601d3245c7a29dc3696e61fe71bbb24575 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogFile.java @@ -0,0 +1,510 @@ + +package simpledb; + +import java.io.*; +import java.util.*; +import java.lang.reflect.*; + +/** +LogFile implements the recovery subsystem of SimpleDb. This class is +able to write different log records as needed, but it is the +responsibility of the caller to ensure that write ahead logging and +two-phase locking discipline are followed. <p> + +<u> Locking note: </u> +<p> + +Many of the methods here are synchronized (to prevent concurrent log +writes from happening); many of the methods in BufferPool are also +synchronized (for similar reasons.) Problem is that BufferPool writes +log records (on page flushed) and the log file flushes BufferPool +pages (on checkpoints and recovery.) This can lead to deadlock. For +that reason, any LogFile operation that needs to access the BufferPool +must not be declared synchronized and must begin with a block like: + +<p> +<pre> + synchronized (Database.getBufferPool()) { + synchronized (this) { + + .. 
+ + } + } +</pre> +*/ + +/** +<p> The format of the log file is as follows: + +<ul> + +<li> The first long integer of the file represents the offset of the +last written checkpoint, or -1 if there are no checkpoints + +<li> All additional data in the log consists of log records. Log +records are variable length. + +<li> Each log record begins with an integer type and a long integer +transaction id. + +<li> Each log record ends with a long integer file offset representing +the position in the log file where the record began. + +<li> There are five record types: ABORT, COMMIT, UPDATE, BEGIN, and +CHECKPOINT + +<li> ABORT, COMMIT, and BEGIN records contain no additional data + +<li>UPDATE RECORDS consist of two entries, a before image and an +after image. These images are serialized Page objects, and can be +accessed with the LogFile.readPageData() and LogFile.writePageData() +methods. See LogFile.print() for an example. + +<li> CHECKPOINT records consist of active transactions at the time +the checkpoint was taken and their first log record on disk. The format +of the record is an integer count of the number of transactions, as well +as a long integer transaction id and a long integer first record offset +for each active transaction. + +</ul> + +*/ + +public class LogFile { + + final File logFile; + private RandomAccessFile raf; + Boolean recoveryUndecided; // no call to recover() and no append to log + + static final int ABORT_RECORD = 1; + static final int COMMIT_RECORD = 2; + static final int UPDATE_RECORD = 3; + static final int BEGIN_RECORD = 4; + static final int CHECKPOINT_RECORD = 5; + static final long NO_CHECKPOINT_ID = -1; + + final static int INT_SIZE = 4; + final static int LONG_SIZE = 8; + + long currentOffset = -1;//protected by this +// int pageSize; + int totalRecords = 0; // for PatchTest //protected by this + + HashMap<Long,Long> tidToFirstLogRecord = new HashMap<Long,Long>(); + + /** Constructor. + Initialize and back the log file with the specified file. + We're not sure yet whether the caller is creating a brand new DB, + in which case we should ignore the log file, or whether the caller + will eventually want to recover (after populating the Catalog). + So we make this decision lazily: if someone calls recover(), then + do it, while if someone starts adding log file entries, then first + throw out the initial log file contents. + + @param f The log file's name + */ + public LogFile(File f) throws IOException { + this.logFile = f; + raf = new RandomAccessFile(f, "rw"); + recoveryUndecided = true; + + // install shutdown hook to force cleanup on close + // Runtime.getRuntime().addShutdownHook(new Thread() { + // public void run() { shutdown(); } + // }); + + //XXX WARNING -- there is nothing that verifies that the specified + // log file actually corresponds to the current catalog. + // This could cause problems since we log tableids, which may or + // may not match tableids in the current catalog. + } + + // we're about to append a log record. if we weren't sure whether the + // DB wants to do recovery, we're sure now -- it didn't. So truncate + // the log. 
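+    // After this first-append truncation, the file is simply
+    //     [ NO_CHECKPOINT_ID (8 bytes) | log records ... ]
+    // i.e. the header long at offset 0 says "no checkpoint yet", matching
+    // the file format documented at the top of this class.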
+ void preAppend() throws IOException { + totalRecords++; + if(recoveryUndecided){ + recoveryUndecided = false; + raf.seek(0); + raf.setLength(0); + raf.writeLong(NO_CHECKPOINT_ID); + raf.seek(raf.length()); + currentOffset = raf.getFilePointer(); + } + } + + public synchronized int getTotalRecords() { + return totalRecords; + } + + /** Write an abort record to the log for the specified tid, force + the log to disk, and perform a rollback + @param tid The aborting transaction. + */ + public void logAbort(TransactionId tid) throws IOException { + // must have buffer pool lock before proceeding, since this + // calls rollback + + synchronized (Database.getBufferPool()) { + + synchronized(this) { + preAppend(); + //Debug.log("ABORT"); + //should we verify that this is a live transaction? + + // must do this here, since rollback only works for + // live transactions (needs tidToFirstLogRecord) + rollback(tid); + + raf.writeInt(ABORT_RECORD); + raf.writeLong(tid.getId()); + raf.writeLong(currentOffset); + currentOffset = raf.getFilePointer(); + force(); + tidToFirstLogRecord.remove(tid.getId()); + } + } + } + + /** Write a commit record to disk for the specified tid, + and force the log to disk. + + @param tid The committing transaction. + */ + public synchronized void logCommit(TransactionId tid) throws IOException { + preAppend(); + Debug.log("COMMIT " + tid.getId()); + //should we verify that this is a live transaction? + + raf.writeInt(COMMIT_RECORD); + raf.writeLong(tid.getId()); + raf.writeLong(currentOffset); + currentOffset = raf.getFilePointer(); + force(); + tidToFirstLogRecord.remove(tid.getId()); + } + + /** Write an UPDATE record to disk for the specified tid and page + (with provided before and after images.) + @param tid The transaction performing the write + @param before The before image of the page + @param after The after image of the page + + @see simpledb.Page#getBeforeImage + */ + public synchronized void logWrite(TransactionId tid, Page before, + Page after) + throws IOException { + Debug.log("WRITE, offset = " + raf.getFilePointer()); + preAppend(); + /* update record conists of + + record type + transaction id + before page data (see writePageData) + after page data + start offset + */ + raf.writeInt(UPDATE_RECORD); + raf.writeLong(tid.getId()); + + writePageData(raf,before); + writePageData(raf,after); + raf.writeLong(currentOffset); + currentOffset = raf.getFilePointer(); + + Debug.log("WRITE OFFSET = " + currentOffset); + } + + void writePageData(RandomAccessFile raf, Page p) throws IOException{ + PageId pid = p.getId(); + int pageInfo[] = pid.serialize(); + + //page data is: + // page class name + // id class name + // id class bytes + // id class data + // page class bytes + // page class data + + String pageClassName = p.getClass().getName(); + String idClassName = pid.getClass().getName(); + + raf.writeUTF(pageClassName); + raf.writeUTF(idClassName); + + raf.writeInt(pageInfo.length); + for (int i = 0; i < pageInfo.length; i++) { + raf.writeInt(pageInfo[i]); + } + byte[] pageData = p.getPageData(); + raf.writeInt(pageData.length); + raf.write(pageData); + // Debug.log ("WROTE PAGE DATA, CLASS = " + pageClassName + ", table = " + pid.getTableId() + ", page = " + pid.pageno()); + } + + Page readPageData(RandomAccessFile raf) throws IOException { + PageId pid; + Page newPage = null; + + String pageClassName = raf.readUTF(); + String idClassName = raf.readUTF(); + + try { + Class<?> idClass = Class.forName(idClassName); + Class<?> pageClass = 
Class.forName(pageClassName);
+
+            Constructor<?>[] idConsts = idClass.getDeclaredConstructors();
+            int numIdArgs = raf.readInt();
+            Object idArgs[] = new Object[numIdArgs];
+            for (int i = 0; i<numIdArgs;i++) {
+                idArgs[i] = new Integer(raf.readInt());
+            }
+            pid = (PageId)idConsts[0].newInstance(idArgs);
+
+            Constructor<?>[] pageConsts = pageClass.getDeclaredConstructors();
+            int pageSize = raf.readInt();
+
+            byte[] pageData = new byte[pageSize];
+            raf.readFully(pageData); // read the full before image (a bare read() may return fewer bytes than requested)
+
+            Object[] pageArgs = new Object[2];
+            pageArgs[0] = pid;
+            pageArgs[1] = pageData;
+
+            newPage = (Page)pageConsts[0].newInstance(pageArgs);
+
+            //  Debug.log("READ PAGE OF TYPE " + pageClassName + ", table = " + newPage.getId().getTableId() + ", page = " + newPage.getId().pageno());
+        } catch (ClassNotFoundException e){
+            e.printStackTrace();
+            throw new IOException();
+        } catch (InstantiationException e) {
+            e.printStackTrace();
+            throw new IOException();
+        } catch (IllegalAccessException e) {
+            e.printStackTrace();
+            throw new IOException();
+        } catch (InvocationTargetException e) {
+            e.printStackTrace();
+            throw new IOException();
+        }
+        return newPage;
+
+    }
+
+    /** Write a BEGIN record for the specified transaction
+        @param tid The transaction that is beginning
+
+    */
+    public synchronized void logXactionBegin(TransactionId tid)
+        throws IOException {
+        Debug.log("BEGIN");
+        if(tidToFirstLogRecord.get(tid.getId()) != null){
+            System.err.printf("logXactionBegin: already began this tid\n");
+            throw new IOException("double logXactionBegin()");
+        }
+        preAppend();
+        raf.writeInt(BEGIN_RECORD);
+        raf.writeLong(tid.getId());
+        raf.writeLong(currentOffset);
+        tidToFirstLogRecord.put(tid.getId(), currentOffset);
+        currentOffset = raf.getFilePointer();
+
+        Debug.log("BEGIN OFFSET = " + currentOffset);
+    }
+
+    /** Checkpoint the log and write a checkpoint record.
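+        In summary of the implementation below: force the log, flush all
+        buffer-pool pages, append a CHECKPOINT record listing each active
+        transaction and the offset of its first log record, then update
+        the checkpoint pointer at offset 0 and truncate the log.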
*/ + public void logCheckpoint() throws IOException { + //make sure we have buffer pool lock before proceeding + synchronized (Database.getBufferPool()) { + synchronized (this) { + //Debug.log("CHECKPOINT, offset = " + raf.getFilePointer()); + preAppend(); + long startCpOffset, endCpOffset; + Set<Long> keys = tidToFirstLogRecord.keySet(); + Iterator<Long> els = keys.iterator(); + force(); + Database.getBufferPool().flushAllPages(); + startCpOffset = raf.getFilePointer(); + raf.writeInt(CHECKPOINT_RECORD); + raf.writeLong(-1); //no tid , but leave space for convenience + + //write list of outstanding transactions + raf.writeInt(keys.size()); + while (els.hasNext()) { + Long key = els.next(); + Debug.log("WRITING CHECKPOINT TRANSACTION ID: " + key); + raf.writeLong(key); + //Debug.log("WRITING CHECKPOINT TRANSACTION OFFSET: " + tidToFirstLogRecord.get(key)); + raf.writeLong(tidToFirstLogRecord.get(key)); + } + + //once the CP is written, make sure the CP location at the + // beginning of the log file is updated + endCpOffset = raf.getFilePointer(); + raf.seek(0); + raf.writeLong(startCpOffset); + raf.seek(endCpOffset); + raf.writeLong(currentOffset); + currentOffset = raf.getFilePointer(); + //Debug.log("CP OFFSET = " + currentOffset); + } + } + + logTruncate(); + } + + /** Truncate any unneeded portion of the log to reduce its space + consumption */ + public synchronized void logTruncate() throws IOException { + preAppend(); + raf.seek(0); + long cpLoc = raf.readLong(); + + long minLogRecord = cpLoc; + + if (cpLoc != -1L) { + raf.seek(cpLoc); + int cpType = raf.readInt(); + @SuppressWarnings("unused") + long cpTid = raf.readLong(); + + if (cpType != CHECKPOINT_RECORD) { + throw new RuntimeException("Checkpoint pointer does not point to checkpoint record"); + } + + int numOutstanding = raf.readInt(); + + for (int i = 0; i < numOutstanding; i++) { + @SuppressWarnings("unused") + long tid = raf.readLong(); + long firstLogRecord = raf.readLong(); + if (firstLogRecord < minLogRecord) { + minLogRecord = firstLogRecord; + } + } + } + + // we can truncate everything before minLogRecord + File newFile = new File("logtmp" + System.currentTimeMillis()); + RandomAccessFile logNew = new RandomAccessFile(newFile, "rw"); + logNew.seek(0); + logNew.writeLong((cpLoc - minLogRecord) + LONG_SIZE); + + raf.seek(minLogRecord); + + //have to rewrite log records since offsets are different after truncation + while (true) { + try { + int type = raf.readInt(); + long record_tid = raf.readLong(); + long newStart = logNew.getFilePointer(); + + Debug.log("NEW START = " + newStart); + + logNew.writeInt(type); + logNew.writeLong(record_tid); + + switch (type) { + case UPDATE_RECORD: + Page before = readPageData(raf); + Page after = readPageData(raf); + + writePageData(logNew, before); + writePageData(logNew, after); + break; + case CHECKPOINT_RECORD: + int numXactions = raf.readInt(); + logNew.writeInt(numXactions); + while (numXactions-- > 0) { + long xid = raf.readLong(); + long xoffset = raf.readLong(); + logNew.writeLong(xid); + logNew.writeLong((xoffset - minLogRecord) + LONG_SIZE); + } + break; + case BEGIN_RECORD: + tidToFirstLogRecord.put(record_tid,newStart); + break; + } + + //all xactions finish with a pointer + logNew.writeLong(newStart); + raf.readLong(); + + } catch (EOFException e) { + break; + } + } + + Debug.log("TRUNCATING LOG; WAS " + raf.length() + " BYTES ; NEW START : " + minLogRecord + " NEW LENGTH: " + (raf.length() - minLogRecord)); + + raf.close(); + logFile.delete(); + 
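+        // install the compacted file under the original log's name; a record
+        // that used to live at offset x now lives at (x - minLogRecord) +
+        // LONG_SIZE, which is why the loop above rewrote every stored offset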
newFile.renameTo(logFile); + raf = new RandomAccessFile(logFile, "rw"); + raf.seek(raf.length()); + newFile.delete(); + + currentOffset = raf.getFilePointer(); + //print(); + } + + /** Rollback the specified transaction, setting the state of any + of pages it updated to their pre-updated state. To preserve + transaction semantics, this should not be called on + transactions that have already committed (though this may not + be enforced by this method.) + + @param tid The transaction to rollback + */ + public void rollback(TransactionId tid) + throws NoSuchElementException, IOException { + synchronized (Database.getBufferPool()) { + synchronized(this) { + preAppend(); + // some code goes here + } + } + } + + /** Shutdown the logging system, writing out whatever state + is necessary so that start up can happen quickly (without + extensive recovery.) + */ + public synchronized void shutdown() { + try { + logCheckpoint(); //simple way to shutdown is to write a checkpoint record + raf.close(); + } catch (IOException e) { + System.out.println("ERROR SHUTTING DOWN -- IGNORING."); + e.printStackTrace(); + } + } + + /** Recover the database system by ensuring that the updates of + committed transactions are installed and that the + updates of uncommitted transactions are not installed. + */ + public void recover() throws IOException { + synchronized (Database.getBufferPool()) { + synchronized (this) { + recoveryUndecided = false; + // some code goes here + } + } + } + + /** Print out a human readable represenation of the log */ + public void print() throws IOException { + // some code goes here + } + + public synchronized void force() throws IOException { + raf.getChannel().force(true); + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalFilterNode.java b/hw/hw3/starter-code/src/java/simpledb/LogicalFilterNode.java new file mode 100644 index 0000000000000000000000000000000000000000..d3b59085efa714a0fba6dd788f0772f1814f39b9 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogicalFilterNode.java @@ -0,0 +1,35 @@ +package simpledb; + +/** A LogicalFilterNode represents the parameters of a filter in the WHERE clause of a query. + <p> + Filter is of the form t.f p c + <p> + Where t is a table, f is a field in t, p is a predicate, and c is a constant +*/ +public class LogicalFilterNode { + /** The alias of a table (or the name if no alias) over which the filter ranges */ + public String tableAlias; + + /** The predicate in the filter */ + public Predicate.Op p; + + /* The constant on the right side of the filter */ + public String c; + + /** The field from t which is in the filter. 
The pure name, without alias or tablename. */
+    public String fieldPureName;
+
+    public String fieldQuantifiedName;
+
+    public LogicalFilterNode(String table, String field, Predicate.Op pred, String constant) {
+        tableAlias = table;
+        p = pred;
+        c = constant;
+        String[] tmps = field.split("[.]");
+        if (tmps.length>1)
+            fieldPureName = tmps[tmps.length-1];
+        else
+            fieldPureName=field;
+        this.fieldQuantifiedName = tableAlias+"."+fieldPureName;
+    }
+}
\ No newline at end of file
diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalJoinNode.java b/hw/hw3/starter-code/src/java/simpledb/LogicalJoinNode.java
new file mode 100644
index 0000000000000000000000000000000000000000..f81cb26ac19286ef2ecbae45aa95b0078f71a3cf
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/LogicalJoinNode.java
@@ -0,0 +1,79 @@
+package simpledb;
+
+/** A LogicalJoinNode represents the state needed for a join of two
+ * tables in a LogicalQueryPlan */
+public class LogicalJoinNode {
+
+    /** The first table to join (may be null). It's the alias of the table (if no alias, the true table name) */
+    public String t1Alias;
+
+    /** The second table to join (may be null). It's the alias of the table (if no alias, the true table name).*/
+    public String t2Alias;
+
+    /** The name of the field in t1 to join with. It's the pure name of a field, rather than alias.field. */
+    public String f1PureName;
+
+    public String f1QuantifiedName;
+
+    /** The name of the field in t2 to join with. It's the pure name of a field.*/
+    public String f2PureName;
+
+    public String f2QuantifiedName;
+
+    /** The join predicate */
+    public Predicate.Op p;
+
+    public LogicalJoinNode() {
+    }
+
+    public LogicalJoinNode(String table1, String table2, String joinField1, String joinField2, Predicate.Op pred) {
+        t1Alias = table1;
+        t2Alias = table2;
+        String[] tmps = joinField1.split("[.]");
+        if (tmps.length>1)
+            f1PureName = tmps[tmps.length-1];
+        else
+            f1PureName=joinField1;
+        tmps = joinField2.split("[.]");
+        if (tmps.length>1)
+            f2PureName = tmps[tmps.length-1];
+        else
+            f2PureName = joinField2;
+        p = pred;
+        this.f1QuantifiedName = t1Alias+"."+this.f1PureName;
+        this.f2QuantifiedName = t2Alias+"."+this.f2PureName;
+    }
+
+    /** Return a new LogicalJoinNode with the inner and outer (t1.f1
+     *  and t2.f2) tables swapped.
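+     *  The comparison operator is mirrored so the predicate keeps its
+     *  meaning; an illustrative sketch (using only the constructor shown
+     *  above):
+     *  <pre>
+     *  LogicalJoinNode n = new LogicalJoinNode("t1", "t2", "a", "b",
+     *                                          Predicate.Op.GREATER_THAN);
+     *  LogicalJoinNode s = n.swapInnerOuter();
+     *  // s now represents t2.b LESS_THAN t1.a
+     *  </pre>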
*/ + public LogicalJoinNode swapInnerOuter() { + Predicate.Op newp; + if (p == Predicate.Op.GREATER_THAN) + newp = Predicate.Op.LESS_THAN; + else if (p == Predicate.Op.GREATER_THAN_OR_EQ) + newp = Predicate.Op.LESS_THAN_OR_EQ; + else if (p == Predicate.Op.LESS_THAN) + newp = Predicate.Op.GREATER_THAN; + else if (p == Predicate.Op.LESS_THAN_OR_EQ) + newp = Predicate.Op.GREATER_THAN_OR_EQ; + else + newp = p; + + LogicalJoinNode j2 = new LogicalJoinNode(t2Alias,t1Alias,f2PureName,f1PureName, newp); + return j2; + } + + @Override public boolean equals(Object o) { + LogicalJoinNode j2 =(LogicalJoinNode)o; + return (j2.t1Alias.equals(t1Alias) || j2.t1Alias.equals(t2Alias)) && (j2.t2Alias.equals(t1Alias) || j2.t2Alias.equals(t2Alias)); + } + + @Override public String toString() { + return t1Alias + ":" + t2Alias ;//+ ";" + f1 + " " + p + " " + f2; + } + + @Override public int hashCode() { + return t1Alias.hashCode() + t2Alias.hashCode() + f1PureName.hashCode() + f2PureName.hashCode(); + } +} + diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalPlan.java b/hw/hw3/starter-code/src/java/simpledb/LogicalPlan.java new file mode 100644 index 0000000000000000000000000000000000000000..6809a8e287997f38b13e3f9a04680d4e3df5586c --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogicalPlan.java @@ -0,0 +1,546 @@ +package simpledb; +import java.util.Map; +import java.util.Vector; +import java.util.HashMap; +import java.util.Iterator; +import java.io.File; +import java.util.ArrayList; +import java.util.NoSuchElementException; + +/** + * LogicalPlan represents a logical query plan that has been through + * the parser and is ready to be processed by the optimizer. + * <p> + * A LogicalPlan consits of a collection of table scan nodes, join + * nodes, filter nodes, a select list, and a group by field. + * LogicalPlans can only represent queries with one aggregation field + * and one group by field. + * <p> + * LogicalPlans can be converted to physical (optimized) plans using + * the {@link #physicalPlan} method, which uses the + * {@link JoinOptimizer} to order joins optimally and to select the + * best implementations for joins. + */ +public class LogicalPlan { + private Vector<LogicalJoinNode> joins; + private Vector<LogicalScanNode> tables; + private Vector<LogicalFilterNode> filters; + private HashMap<String,DbIterator> subplanMap; + private HashMap<String,Integer> tableMap; + + private Vector<LogicalSelectListNode> selectList; + private String groupByField = null; + private boolean hasAgg = false; + private String aggOp; + private String aggField; + private boolean oByAsc, hasOrderBy = false; + private String oByField; + private String query; +// private Query owner; + + /** Constructor -- generate an empty logical plan */ + public LogicalPlan() { + joins = new Vector<LogicalJoinNode>(); + filters = new Vector<LogicalFilterNode>(); + tables = new Vector<LogicalScanNode>(); + subplanMap = new HashMap<String,DbIterator>(); + tableMap = new HashMap<String,Integer>(); + + selectList = new Vector<LogicalSelectListNode>(); + this.query = ""; + } + + /** Set the text of the query representing this logical plan. Does NOT parse the + specified query -- this method is just used so that the object can print the + SQL it represents. + + @param query the text of the query associated with this plan + */ + public void setQuery(String query) { + this.query = query; + } + + /** Get the query text associated with this plan via {@link #setQuery}. 
+ */ + public String getQuery() { + return query; + } + + /** Given a table alias, return id of the table object (this id can be supplied to {@link Catalog#getDatabaseFile(int)}). + Aliases are added as base tables are added via {@link #addScan}. + + @param alias the table alias to return a table id for + @return the id of the table corresponding to alias, or null if the alias is unknown + */ + public Integer getTableId(String alias) { + return tableMap.get(alias); + } + + public HashMap<String,Integer> getTableAliasToIdMapping() + { + return this.tableMap; + } + + /** Add a new filter to the logical plan + * @param field The name of the over which the filter applies; + * this can be a fully qualified field (tablename.field or + * alias.field), or can be a unique field name without a + * tablename qualifier. If it is an ambiguous name, it will + * throw a ParsingException + * @param p The predicate for the filter + * @param constantValue the constant to compare the predicate + * against; if field is an integer field, this should be a + * String representing an integer + * @throws ParsingException if field is not in one of the tables + * added via {@link #addScan} or if field is ambiguous (e.g., two + * tables contain a field named field.) + */ + public void addFilter(String field, Predicate.Op p, String + constantValue) throws ParsingException{ + + field = disambiguateName(field); + String table = field.split("[.]")[0]; + + LogicalFilterNode lf = new LogicalFilterNode(table, field.split("[.]")[1], p, constantValue); + filters.addElement(lf); + } + + /** Add a join between two fields of two different tables. + * @param joinField1 The name of the first join field; this can + * be a fully qualified name (e.g., tableName.field or + * alias.field) or may be an unqualified unique field name. If + * the name is ambiguous or unknown, a ParsingException will be + * thrown. + * @param joinField2 The name of the second join field + * @param pred The join predicate + * @throws ParsingException if either of the fields is ambiguous, + * or is not in one of the tables added via {@link #addScan} + */ + + public void addJoin( String joinField1, String joinField2, Predicate.Op pred) throws ParsingException { + joinField1 = disambiguateName(joinField1); + joinField2 = disambiguateName(joinField2); + String table1Alias = joinField1.split("[.]")[0]; + String table2Alias = joinField2.split("[.]")[0]; + String pureField1 = joinField1.split("[.]")[1]; + String pureField2 = joinField2.split("[.]")[1]; + + if (table1Alias.equals(table2Alias)) + throw new ParsingException("Cannot join on two fields from same table"); + LogicalJoinNode lj = new LogicalJoinNode(table1Alias,table2Alias,pureField1, pureField2, pred); + System.out.println("Added join between " + joinField1 + " and " + joinField2); + joins.addElement(lj); + + } + + /** Add a join between a field and a subquery. + * @param joinField1 The name of the first join field; this can + * be a fully qualified name (e.g., tableName.field or + * alias.field) or may be an unqualified unique field name. If + * the name is ambiguous or unknown, a ParsingException will be + * thrown. + * @param joinField2 the subquery to join with -- the join field + * of the subquery is the first field in the result set of the query + * @param pred The join predicate. 
+ * @throws ParsingException if either of the fields is ambiguous, + * or is not in one of the tables added via {@link #addScan} + */ + public void addJoin( String joinField1, DbIterator joinField2, Predicate.Op pred) throws ParsingException { + joinField1 = disambiguateName(joinField1); + + String table1 = joinField1.split("[.]")[0]; + String pureField = joinField1.split("[.]")[1]; + + LogicalSubplanJoinNode lj = new LogicalSubplanJoinNode(table1,pureField, joinField2, pred); + System.out.println("Added subplan join on " + joinField1); + joins.addElement(lj); + } + + /** Add a scan to the plan. One scan node needs to be added for each alias of a table + accessed by the plan. + @param table the id of the table accessed by the plan (can be resolved to a DbFile using {@link Catalog#getDatabaseFile} + @param name the alias of the table in the plan + */ + + public void addScan(int table, String name) { + System.out.println("Added scan of table " + name); + tables.addElement(new LogicalScanNode(table,name)); + tableMap.put(name,table); + } + + /** Add a specified field/aggregate combination to the select list of the query. + Fields are output by the query such that the rightmost field is the first added via addProjectField. + @param fname the field to add to the output + @param aggOp the aggregate operation over the field. + * @throws ParsingException + */ + public void addProjectField(String fname, String aggOp) throws ParsingException { + fname=disambiguateName(fname); + if (fname.equals("*")) + fname="null.*"; + System.out.println("Added select list field " + fname); + if (aggOp != null) { + System.out.println("\t with aggregator " + aggOp); + } + selectList.addElement(new LogicalSelectListNode(aggOp, fname)); + } + + /** Add an aggregate over the field with the specified grouping to + the query. SimpleDb only supports a single aggregate + expression and GROUP BY field. + @param op the aggregation operator + @param afield the field to aggregate over + @param gfield the field to group by + * @throws ParsingException + */ + public void addAggregate(String op, String afield, String gfield) throws ParsingException { + afield=disambiguateName(afield); + if (gfield!=null) + gfield=disambiguateName(gfield); + aggOp = op; + aggField = afield; + groupByField = gfield; + hasAgg = true; + } + + /** Add an ORDER BY expression in the specified order on the specified field. SimpleDb only supports + a single ORDER BY field. + @param field the field to order by + @param asc true if should be ordered in ascending order, false for descending order + * @throws ParsingException + */ + public void addOrderBy(String field, boolean asc) throws ParsingException { + field=disambiguateName(field); + oByField = field; + oByAsc = asc; + hasOrderBy = true; + } + + /** Given a name of a field, try to figure out what table it belongs to by looking + * through all of the tables added via {@link #addScan}. + * @return A fully qualified name of the form tableAlias.name. If the name parameter is already qualified + * with a table name, simply returns name. 
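+     *   For example (illustrative names): if exactly one scanned table,
+     *   aliased "t1", has a field "a", then disambiguateName("a") returns
+     *   "t1.a", while disambiguateName("t1.a") is returned unchanged.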
+ * @throws ParsingException if the field cannot be found in any of the tables, or if the + * field is ambiguous (appears in multiple tables) + */ + String disambiguateName(String name) throws ParsingException { + + String[] fields = name.split("[.]"); + if (fields.length == 2 && (!fields[0].equals("null"))) + return name; + if (fields.length > 2) + throw new ParsingException("Field " + name + " is not a valid field reference."); + if (fields.length == 2) + name = fields[1]; + if (name.equals("*")) return name; + //now look for occurrences of name in all of the tables + Iterator<LogicalScanNode> tableIt = tables.iterator(); + String tableName = null; + while (tableIt.hasNext()) { + LogicalScanNode table = tableIt.next(); + try { + TupleDesc td = Database.getCatalog().getDatabaseFile(table.t).getTupleDesc(); +// int id = + td.fieldNameToIndex(name); + if (tableName == null) { + tableName = table.alias; + } else { + throw new ParsingException("Field " + name + " appears in multiple tables; disambiguate by referring to it as tablename." + name); + } + } catch (NoSuchElementException e) { + //ignore + } + } + if (tableName != null) + return tableName + "." + name; + else + throw new ParsingException("Field " + name + " does not appear in any tables."); + + } + + /** Convert the aggregate operator name s into an Aggregator.op operation. + * @throws ParsingException if s is not a valid operator name + */ + static Aggregator.Op getAggOp(String s) throws ParsingException { + s = s.toUpperCase(); + if (s.equals("AVG")) return Aggregator.Op.AVG; + if (s.equals("SUM")) return Aggregator.Op.SUM; + if (s.equals("COUNT")) return Aggregator.Op.COUNT; + if (s.equals("MIN")) return Aggregator.Op.MIN; + if (s.equals("MAX")) return Aggregator.Op.MAX; + throw new ParsingException("Unknown predicate " + s); + } + + /** Convert this LogicalPlan into a physicalPlan represented by a {@link DbIterator}. Attempts to + * find the optimal plan by using {@link JoinOptimizer#orderJoins} to order the joins in the plan. + * @param t The transaction that the returned DbIterator will run as a part of + * @param baseTableStats a HashMap providing a {@link TableStats} + * object for each table used in the LogicalPlan. This should + * have one entry for each table referenced by the plan, not one + * entry for each table alias (so a table t aliases as t1 and + * t2 would have just one entry with key 't' in this HashMap). + * @param explain flag indicating whether output visualizing the physical + * query plan should be given. + * @throws ParsingException if the logical plan is not valid + * @return A DbIterator representing this plan. 
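+     *   The iterator tree is assembled bottom-up by the code below: one
+     *   SeqScan per scan node, a Filter per WHERE predicate, the joins in
+     *   the order chosen by {@link JoinOptimizer#orderJoins}, then an
+     *   optional Aggregate and OrderBy, and finally a Project over the
+     *   select list.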
+ */ + public DbIterator physicalPlan(TransactionId t, Map<String,TableStats> baseTableStats, boolean explain) throws ParsingException { + Iterator<LogicalScanNode> tableIt = tables.iterator(); + HashMap<String,String> equivMap = new HashMap<String,String>(); + HashMap<String,Double> filterSelectivities = new HashMap<String, Double>(); + HashMap<String,TableStats> statsMap = new HashMap<String,TableStats>(); + + while (tableIt.hasNext()) { + LogicalScanNode table = tableIt.next(); + SeqScan ss = null; + try { + ss = new SeqScan(t, Database.getCatalog().getDatabaseFile(table.t).getId(), table.alias); + } catch (NoSuchElementException e) { + throw new ParsingException("Unknown table " + table.t); + } + + subplanMap.put(table.alias,ss); + String baseTableName = Database.getCatalog().getTableName(table.t); + statsMap.put(baseTableName, baseTableStats.get(baseTableName)); + filterSelectivities.put(table.alias, 1.0); + + } + + Iterator<LogicalFilterNode> filterIt = filters.iterator(); + while (filterIt.hasNext()) { + LogicalFilterNode lf = filterIt.next(); + DbIterator subplan = subplanMap.get(lf.tableAlias); + if (subplan == null) { + throw new ParsingException("Unknown table in WHERE clause " + lf.tableAlias); + } + + Field f; + Type ftyp; + TupleDesc td = subplanMap.get(lf.tableAlias).getTupleDesc(); + + try {//td.fieldNameToIndex(disambiguateName(lf.fieldPureName)) + ftyp = td.getFieldType(td.fieldNameToIndex(lf.fieldQuantifiedName)); + } catch (java.util.NoSuchElementException e) { + throw new ParsingException("Unknown field in filter expression " + lf.fieldQuantifiedName); + } + if (ftyp == Type.INT_TYPE) + f = new IntField(new Integer(lf.c).intValue()); + else + f = new StringField(lf.c, Type.STRING_LEN); + + Predicate p = null; + try { + p = new Predicate(subplan.getTupleDesc().fieldNameToIndex(lf.fieldQuantifiedName), lf.p,f); + } catch (NoSuchElementException e) { + throw new ParsingException("Unknown field " + lf.fieldQuantifiedName); + } + subplanMap.put(lf.tableAlias, new Filter(p, subplan)); + + TableStats s = statsMap.get(Database.getCatalog().getTableName(this.getTableId(lf.tableAlias))); + + double sel= s.estimateSelectivity(subplan.getTupleDesc().fieldNameToIndex(lf.fieldQuantifiedName), lf.p, f); + filterSelectivities.put(lf.tableAlias, filterSelectivities.get(lf.tableAlias) * sel); + + //s.addSelectivityFactor(estimateFilterSelectivity(lf,statsMap)); + } + + JoinOptimizer jo = new JoinOptimizer(this,joins); + + joins = jo.orderJoins(statsMap,filterSelectivities,explain); + + Iterator<LogicalJoinNode> joinIt = joins.iterator(); + while (joinIt.hasNext()) { + LogicalJoinNode lj = joinIt.next(); + DbIterator plan1; + DbIterator plan2; + boolean isSubqueryJoin = lj instanceof LogicalSubplanJoinNode; + String t1name, t2name; + + if (equivMap.get(lj.t1Alias)!=null) + t1name = equivMap.get(lj.t1Alias); + else + t1name = lj.t1Alias; + + if (equivMap.get(lj.t2Alias)!=null) + t2name = equivMap.get(lj.t2Alias); + else + t2name = lj.t2Alias; + + plan1 = subplanMap.get(t1name); + + if (isSubqueryJoin) { + plan2 = ((LogicalSubplanJoinNode)lj).subPlan; + if (plan2 == null) + throw new ParsingException("Invalid subquery."); + } else { + plan2 = subplanMap.get(t2name); + } + + if (plan1 == null) + throw new ParsingException("Unknown table in WHERE clause " + lj.t1Alias); + if (plan2 == null) + throw new ParsingException("Unknown table in WHERE clause " + lj.t2Alias); + + DbIterator j; + j = jo.instantiateJoin(lj,plan1,plan2); + subplanMap.put(t1name, j); + + if (!isSubqueryJoin) { + 
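+                // the join's result now stands in for both of its inputs:
+                // drop t2's subplan and point every alias that resolved to
+                // t2 at t1's entry instead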
subplanMap.remove(t2name); + equivMap.put(t2name,t1name); //keep track of the fact that this new node contains both tables + //make sure anything that was equiv to lj.t2 (which we are just removed) is + // marked as equiv to lj.t1 (which we are replacing lj.t2 with.) + for (java.util.Map.Entry<String, String> s: equivMap.entrySet()) { + String val = s.getValue(); + if (val.equals(t2name)) { + s.setValue(t1name); + } + } + + // subplanMap.put(lj.t2, j); + } + + } + + if (subplanMap.size() > 1) { + throw new ParsingException("Query does not include join expressions joining all nodes!"); + } + + DbIterator node = (DbIterator)(subplanMap.entrySet().iterator().next().getValue()); + + //walk the select list, to determine order in which to project output fields + ArrayList<Integer> outFields = new ArrayList<Integer>(); + ArrayList<Type> outTypes = new ArrayList<Type>(); + for (int i = 0; i < selectList.size(); i++) { + LogicalSelectListNode si = selectList.elementAt(i); + if (si.aggOp != null) { + outFields.add(groupByField!=null?1:0); + TupleDesc td = node.getTupleDesc(); +// int id; + try { +// id = + td.fieldNameToIndex(si.fname); + } catch (NoSuchElementException e) { + throw new ParsingException("Unknown field " + si.fname + " in SELECT list"); + } + outTypes.add(Type.INT_TYPE); //the type of all aggregate functions is INT + + } else if (hasAgg) { + if (groupByField == null) { + throw new ParsingException("Field " + si.fname + " does not appear in GROUP BY list"); + } + outFields.add(0); + TupleDesc td = node.getTupleDesc(); + int id; + try { + id = td.fieldNameToIndex(groupByField); + } catch (NoSuchElementException e) { + throw new ParsingException("Unknown field " + groupByField + " in GROUP BY statement"); + } + outTypes.add(td.getFieldType(id)); + } else if (si.fname.equals("null.*")) { + TupleDesc td = node.getTupleDesc(); + for ( i = 0; i < td.numFields(); i++) { + outFields.add(i); + outTypes.add(td.getFieldType(i)); + } + } else { + TupleDesc td = node.getTupleDesc(); + int id; + try { + id = td.fieldNameToIndex(si.fname); + } catch (NoSuchElementException e) { + throw new ParsingException("Unknown field " + si.fname + " in SELECT list"); + } + outFields.add(id); + outTypes.add(td.getFieldType(id)); + + } + } + + if (hasAgg) { + TupleDesc td = node.getTupleDesc(); + Aggregate aggNode; + try { + aggNode = new Aggregate(node, + td.fieldNameToIndex(aggField), + groupByField == null?Aggregator.NO_GROUPING:td.fieldNameToIndex(groupByField), + getAggOp(aggOp)); + } catch (NoSuchElementException e) { + throw new simpledb.ParsingException(e); + } catch (IllegalArgumentException e) { + throw new simpledb.ParsingException(e); + } + node = aggNode; + } + + if (hasOrderBy) { + node = new OrderBy(node.getTupleDesc().fieldNameToIndex(oByField), oByAsc, node); + } + + return new Project(outFields, outTypes, node); + } + + public static void main(String argv[]) { + // construct a 3-column table schema + Type types[] = new Type[]{ Type.INT_TYPE, Type.INT_TYPE, Type.INT_TYPE }; + String names[] = new String[]{ "field0", "field1", "field2" }; + + TupleDesc td = new TupleDesc(types, names); + TableStats ts; + HashMap<String, TableStats> tableMap = new HashMap<String,TableStats>(); + + // create the tables, associate them with the data files + // and tell the catalog about the schema the tables. 
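+        // (assumes some_data_file1.dat exists in the working directory;
+        // otherwise the scan will fail when the plan is executed below)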
+ HeapFile table1 = new HeapFile(new File("some_data_file1.dat"), td); + Database.getCatalog().addTable(table1, "t1"); + ts = new TableStats(table1.getId(), 1); + tableMap.put("t1", ts); + + TransactionId tid = new TransactionId(); + + LogicalPlan lp = new LogicalPlan(); + + lp.addScan(table1.getId(), "t1"); + + try { + lp.addFilter("t1.field0", Predicate.Op.GREATER_THAN, "1"); + } catch (Exception e) { + } + + /* + SeqScan ss1 = new SeqScan(tid, table1.getId(), "t1"); + SeqScan ss2 = new SeqScan(tid, table2.getId(), "t2"); + + // create a filter for the where condition + Filter sf1 = new Filter( + new Predicate(0, + Predicate.Op.GREATER_THAN, new IntField(1)), ss1); + + JoinPredicate p = new JoinPredicate(1, Predicate.Op.EQUALS, 1); + Join j = new Join(p, sf1, ss2); + */ + DbIterator j = null; + try { + j = lp.physicalPlan(tid,tableMap, false); + } catch (ParsingException e) { + e.printStackTrace(); + System.exit(0); + } + // and run it + try { + j.open(); + while (j.hasNext()) { + Tuple tup = j.next(); + System.out.println(tup); + } + j.close(); + Database.getBufferPool().transactionComplete(tid); + + } catch (Exception e) { + e.printStackTrace(); + } + + } + +} \ No newline at end of file diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalScanNode.java b/hw/hw3/starter-code/src/java/simpledb/LogicalScanNode.java new file mode 100644 index 0000000000000000000000000000000000000000..476a25d82df93a08840b21bf452c55408ff24259 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogicalScanNode.java @@ -0,0 +1,19 @@ +package simpledb; + +/** A LogicalScanNode represents table in the FROM list in a + * LogicalQueryPlan */ +public class LogicalScanNode { + + /** The name (alias) of the table as it is used in the query */ + public String alias; + + /** The table identifier (can be passed to {@link Catalog#getDatabaseFile}) + * to retrieve a DbFile */ + public int t; + + public LogicalScanNode(int table, String tableAlias) { + this.alias = tableAlias; + this.t = table; + } +} + diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalSelectListNode.java b/hw/hw3/starter-code/src/java/simpledb/LogicalSelectListNode.java new file mode 100644 index 0000000000000000000000000000000000000000..c32e718e1bfaea53419da3fdf85ac74e4ee2b1e5 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogicalSelectListNode.java @@ -0,0 +1,19 @@ +package simpledb; + +/** A LogicalSelectListNode represents a clause in the select list in + * a LogicalQueryPlan +*/ +public class LogicalSelectListNode { + /** The field name being selected; the name may be (optionally) be + * qualified with a table name or alias. 
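+ *  e.g. "t1.a", a plain "a", or "null.*" (the internal form that
+ *  {@link LogicalPlan#addProjectField} uses for SELECT *).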
+ */ + public String fname; + + /** The aggregation operation over the field (if any) */ + public String aggOp; + + public LogicalSelectListNode(String aggOp, String fname) { + this.aggOp = aggOp; + this.fname = fname; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/LogicalSubplanJoinNode.java b/hw/hw3/starter-code/src/java/simpledb/LogicalSubplanJoinNode.java new file mode 100644 index 0000000000000000000000000000000000000000..be7f803fd74ffc3b0c2fa86ba5e0ee6be51a4711 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/LogicalSubplanJoinNode.java @@ -0,0 +1,41 @@ +package simpledb; + +/** A LogicalSubplanJoinNode represens the state needed of a join of a + * table to a subplan in a LogicalQueryPlan -- inherits state from + * {@link LogicalJoinNode}; t2 and f2 should always be null + */ +public class LogicalSubplanJoinNode extends LogicalJoinNode { + + /** The subplan (used on the inner) of the join */ + DbIterator subPlan; + + public LogicalSubplanJoinNode(String table1, String joinField1, DbIterator sp, Predicate.Op pred) { + t1Alias = table1; + String[] tmps = joinField1.split("[.]"); + if (tmps.length>1) + f1PureName = tmps[tmps.length-1]; + else + f1PureName=joinField1; + f1QuantifiedName=t1Alias+"."+f1PureName; + subPlan = sp; + p = pred; + } + + @Override public int hashCode() { + return t1Alias.hashCode() + f1PureName.hashCode() + subPlan.hashCode(); + } + + @Override public boolean equals(Object o) { + LogicalJoinNode j2 =(LogicalJoinNode)o; + if (!(o instanceof LogicalSubplanJoinNode)) + return false; + + return (j2.t1Alias.equals(t1Alias) && j2.f1PureName.equals(f1PureName) && ((LogicalSubplanJoinNode)o).subPlan.equals(subPlan)); + } + + public LogicalSubplanJoinNode swapInnerOuter() { + LogicalSubplanJoinNode j2 = new LogicalSubplanJoinNode(t1Alias,f1PureName,subPlan, p); + return j2; + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Operator.java b/hw/hw3/starter-code/src/java/simpledb/Operator.java new file mode 100644 index 0000000000000000000000000000000000000000..356107811a2852b91c1e81e63ef279bb1a501d24 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Operator.java @@ -0,0 +1,107 @@ +package simpledb; + +import java.util.NoSuchElementException; + +/** + * Abstract class for implementing operators. It handles <code>close</code>, + * <code>next</code> and <code>hasNext</code>. Subclasses only need to implement + * <code>open</code> and <code>readNext</code>. + */ +public abstract class Operator implements DbIterator { + + private static final long serialVersionUID = 1L; + + public boolean hasNext() throws DbException, TransactionAbortedException { + if (!this.open) + throw new IllegalStateException("Operator not yet open"); + + if (next == null) + next = fetchNext(); + return next != null; + } + + public Tuple next() throws DbException, TransactionAbortedException, + NoSuchElementException { + if (next == null) { + next = fetchNext(); + if (next == null) + throw new NoSuchElementException(); + } + + Tuple result = next; + next = null; + return result; + } + + /** + * Returns the next Tuple in the iterator, or null if the iteration is + * finished. Operator uses this method to implement both <code>next</code> + * and <code>hasNext</code>. + * + * @return the next Tuple in the iterator, or null if the iteration is + * finished. + */ + protected abstract Tuple fetchNext() throws DbException, + TransactionAbortedException; + + /** + * Closes this iterator. 
If overridden by a subclass, they should call + * super.close() in order for Operator's internal state to be consistent. + */ + public void close() { + // Ensures that a future call to next() will fail + next = null; + this.open = false; + } + + private Tuple next = null; + private boolean open = false; + private int estimatedCardinality = 0; + + public void open() throws DbException, TransactionAbortedException { + this.open = true; + } + + /** + * @return return the children DbIterators of this operator. If there is + * only one child, return an array of only one element. For join + * operators, the order of the children is not important. But they + * should be consistent among multiple calls. + * */ + public abstract DbIterator[] getChildren(); + + /** + * Set the children(child) of this operator. If the operator has only one + * child, children[0] should be used. If the operator is a join, children[0] + * and children[1] should be used. + * + * + * @param children + * the DbIterators which are to be set as the children(child) of + * this operator + * */ + public abstract void setChildren(DbIterator[] children); + + /** + * @return return the TupleDesc of the output tuples of this operator + * */ + public abstract TupleDesc getTupleDesc(); + + /** + * @return The estimated cardinality of this operator. Will only be used in + * lab6 + * */ + public int getEstimatedCardinality() { + return this.estimatedCardinality; + } + + /** + * @param card + * The estimated cardinality of this operator Will only be used + * in lab6 + * */ + protected void setEstimatedCardinality(int card) { + this.estimatedCardinality = card; + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/OrderBy.java b/hw/hw3/starter-code/src/java/simpledb/OrderBy.java new file mode 100644 index 0000000000000000000000000000000000000000..0874c152950d1da66477ccf73ebae74cb74b8310 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/OrderBy.java @@ -0,0 +1,123 @@ +package simpledb; + +import java.util.*; + +/** + * OrderBy is an operator that implements a relational ORDER BY. + */ +public class OrderBy extends Operator { + + private static final long serialVersionUID = 1L; + private DbIterator child; + private TupleDesc td; + private ArrayList<Tuple> childTups = new ArrayList<Tuple>(); + private int orderByField; + private String orderByFieldName; + private Iterator<Tuple> it; + private boolean asc; + + /** + * Creates a new OrderBy node over the tuples from the iterator. + * + * @param orderbyField + * the field to which the sort is applied. + * @param asc + * true if the sort order is ascending. + * @param child + * the tuples to sort. 
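+     *            Note (per the implementation below): open() drains this
+     *            child into an in-memory list and sorts it, so OrderBy
+     *            materializes its entire input.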
+ */ + public OrderBy(int orderbyField, boolean asc, DbIterator child) { + this.child = child; + td = child.getTupleDesc(); + this.orderByField = orderbyField; + this.orderByFieldName = td.getFieldName(orderbyField); + this.asc = asc; + } + + public boolean isASC() + { + return this.asc; + } + + public int getOrderByField() + { + return this.orderByField; + } + + public String getOrderFieldName() + { + return this.orderByFieldName; + } + + public TupleDesc getTupleDesc() { + return td; + } + + public void open() throws DbException, NoSuchElementException, + TransactionAbortedException { + child.open(); + // load all the tuples in a collection, and sort it + while (child.hasNext()) + childTups.add((Tuple) child.next()); + Collections.sort(childTups, new TupleComparator(orderByField, asc)); + it = childTups.iterator(); + super.open(); + } + + public void close() { + super.close(); + it = null; + } + + public void rewind() throws DbException, TransactionAbortedException { + it = childTups.iterator(); + } + + /** + * Operator.fetchNext implementation. Returns tuples from the child operator + * in order + * + * @return The next tuple in the ordering, or null if there are no more + * tuples + */ + protected Tuple fetchNext() throws NoSuchElementException, + TransactionAbortedException, DbException { + if (it != null && it.hasNext()) { + return it.next(); + } else + return null; + } + + @Override + public DbIterator[] getChildren() { + return new DbIterator[] { this.child }; + } + + @Override + public void setChildren(DbIterator[] children) { + this.child = children[0]; + } + +} + +class TupleComparator implements Comparator<Tuple> { + int field; + boolean asc; + + public TupleComparator(int field, boolean asc) { + this.field = field; + this.asc = asc; + } + + public int compare(Tuple o1, Tuple o2) { + Field t1 = (o1).getField(field); + Field t2 = (o2).getField(field); + if (t1.compare(Predicate.Op.EQUALS, t2)) + return 0; + if (t1.compare(Predicate.Op.GREATER_THAN, t2)) + return asc ? 1 : -1; + else + return asc ? -1 : 1; + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Page.java b/hw/hw3/starter-code/src/java/simpledb/Page.java new file mode 100644 index 0000000000000000000000000000000000000000..1f7ab022db7424f853f0ba41ef710b56408bace7 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Page.java @@ -0,0 +1,59 @@ +package simpledb; + +/** + * Page is the interface used to represent pages that are resident in the + * BufferPool. Typically, DbFiles will read and write pages from disk. + * <p> + * Pages may be "dirty", indicating that they have been modified since they + * were last written out to disk. + * + * For recovery purposes, pages MUST have a single constructor of the form: + * Page(PageId id, byte[] data) + */ +public interface Page { + + /** + * Return the id of this page. The id is a unique identifier for a page + * that can be used to look up the page on disk or determine if the page + * is resident in the buffer pool. + * + * @return the id of this page + */ + public PageId getId(); + + /** + * Get the id of the transaction that last dirtied this page, or null if the page is clean.. + * + * @return The id of the transaction that last dirtied this page, or null + */ + public TransactionId isDirty(); + + /** + * Set the dirty state of this page as dirtied by a particular transaction + */ + public void markDirty(boolean dirty, TransactionId tid); + + /** + * Generates a byte array representing the contents of this page. + * Used to serialize this page to disk. 
+ * <p> + * The invariant here is that it should be possible to pass the byte array + * generated by getPageData to the Page constructor and have it produce + * an identical Page object. + * + * @return A byte array correspond to the bytes of this page. + */ + + public byte[] getPageData(); + + /** Provide a representation of this page before any modifications were made + to it. Used by recovery. + */ + public Page getBeforeImage(); + + /* + * a transaction that wrote this page just committed it. + * copy current content to the before image. + */ + public void setBeforeImage(); +} diff --git a/hw/hw3/starter-code/src/java/simpledb/PageId.java b/hw/hw3/starter-code/src/java/simpledb/PageId.java new file mode 100644 index 0000000000000000000000000000000000000000..679ffaacf1b1b6a2c089b82058d34c065dcb33d0 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/PageId.java @@ -0,0 +1,36 @@ +package simpledb; + +/** PageId is an interface to a specific page of a specific table. */ +public interface PageId { + + /** Return a representation of this page id object as a collection of + integers (used for logging) + + This class MUST have a constructor that accepts n integer parameters, + where n is the number of integers returned in the array from serialize. + */ + public int[] serialize(); + + /** @return the unique tableid hashcode with this PageId */ + public int getTableId(); + + /** + * @return a hash code for this page, represented by the concatenation of + * the table number and the page number (needed if a PageId is used as a + * key in a hash table in the BufferPool, for example.) + * @see BufferPool + */ + public int hashCode(); + + /** + * Compares one PageId to another. + * + * @param o The object to compare against (must be a PageId) + * @return true if the objects are equal (e.g., page numbers and table + * ids are the same) + */ + public boolean equals(Object o); + + public int pageNumber(); +} + diff --git a/hw/hw3/starter-code/src/java/simpledb/Parser.java b/hw/hw3/starter-code/src/java/simpledb/Parser.java new file mode 100644 index 0000000000000000000000000000000000000000..e372f7598d6ab04b8bf9f090dc3d7facba88abac --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Parser.java @@ -0,0 +1,754 @@ +package simpledb; + +import Zql.*; +import java.io.*; +import java.lang.reflect.InvocationTargetException; +import java.util.*; + +import jline.ArgumentCompletor; +import jline.ConsoleReader; +import jline.SimpleCompletor; + +public class Parser { + static boolean explain = false; + + public static Predicate.Op getOp(String s) throws simpledb.ParsingException { + if (s.equals("=")) + return Predicate.Op.EQUALS; + if (s.equals(">")) + return Predicate.Op.GREATER_THAN; + if (s.equals(">=")) + return Predicate.Op.GREATER_THAN_OR_EQ; + if (s.equals("<")) + return Predicate.Op.LESS_THAN; + if (s.equals("<=")) + return Predicate.Op.LESS_THAN_OR_EQ; + if (s.equals("LIKE")) + return Predicate.Op.LIKE; + if (s.equals("~")) + return Predicate.Op.LIKE; + if (s.equals("<>")) + return Predicate.Op.NOT_EQUALS; + if (s.equals("!=")) + return Predicate.Op.NOT_EQUALS; + + throw new simpledb.ParsingException("Unknown predicate " + s); + } + + void processExpression(TransactionId tid, ZExpression wx, LogicalPlan lp) + throws simpledb.ParsingException { + if (wx.getOperator().equals("AND")) { + for (int i = 0; i < wx.nbOperands(); i++) { + if (!(wx.getOperand(i) instanceof ZExpression)) { + throw new simpledb.ParsingException( + "Nested queries are currently unsupported."); + } + ZExpression newWx = 
(ZExpression) wx.getOperand(i); + processExpression(tid, newWx, lp); + + } + } else if (wx.getOperator().equals("OR")) { + throw new simpledb.ParsingException( + "OR expressions currently unsupported."); + } else { + // this is a binary expression comparing two constants + @SuppressWarnings("unchecked") + Vector<ZExp> ops = wx.getOperands(); + if (ops.size() != 2) { + throw new simpledb.ParsingException( + "Only simple binary expresssions of the form A op B are currently supported."); + } + + boolean isJoin = false; + Predicate.Op op = getOp(wx.getOperator()); + + boolean op1const = ops.elementAt(0) instanceof ZConstant; // otherwise + // is a + // Query + boolean op2const = ops.elementAt(1) instanceof ZConstant; // otherwise + // is a + // Query + if (op1const && op2const) { + isJoin = ((ZConstant) ops.elementAt(0)).getType() == ZConstant.COLUMNNAME + && ((ZConstant) ops.elementAt(1)).getType() == ZConstant.COLUMNNAME; + } else if (ops.elementAt(0) instanceof ZQuery + || ops.elementAt(1) instanceof ZQuery) { + isJoin = true; + } else if (ops.elementAt(0) instanceof ZExpression + || ops.elementAt(1) instanceof ZExpression) { + throw new simpledb.ParsingException( + "Only simple binary expresssions of the form A op B are currently supported, where A or B are fields, constants, or subqueries."); + } else + isJoin = false; + + if (isJoin) { // join node + + String tab1field = "", tab2field = ""; + + if (!op1const) { // left op is a nested query + // generate a virtual table for the left op + // this isn't a valid ZQL query + } else { + tab1field = ((ZConstant) ops.elementAt(0)).getValue(); + + } + + if (!op2const) { // right op is a nested query + try { + LogicalPlan sublp = parseQueryLogicalPlan(tid, + (ZQuery) ops.elementAt(1)); + DbIterator pp = sublp.physicalPlan(tid, + TableStats.getStatsMap(), explain); + lp.addJoin(tab1field, pp, op); + } catch (IOException e) { + throw new simpledb.ParsingException("Invalid subquery " + + ops.elementAt(1)); + } catch (Zql.ParseException e) { + throw new simpledb.ParsingException("Invalid subquery " + + ops.elementAt(1)); + } + } else { + tab2field = ((ZConstant) ops.elementAt(1)).getValue(); + lp.addJoin(tab1field, tab2field, op); + } + + } else { // select node + String column; + String compValue; + ZConstant op1 = (ZConstant) ops.elementAt(0); + ZConstant op2 = (ZConstant) ops.elementAt(1); + if (op1.getType() == ZConstant.COLUMNNAME) { + column = op1.getValue(); + compValue = new String(op2.getValue()); + } else { + column = op2.getValue(); + compValue = new String(op1.getValue()); + } + + lp.addFilter(column, op, compValue); + + } + } + + } + + public LogicalPlan parseQueryLogicalPlan(TransactionId tid, ZQuery q) + throws IOException, Zql.ParseException, simpledb.ParsingException { + @SuppressWarnings("unchecked") + Vector<ZFromItem> from = q.getFrom(); + LogicalPlan lp = new LogicalPlan(); + lp.setQuery(q.toString()); + // walk through tables in the FROM clause + for (int i = 0; i < from.size(); i++) { + ZFromItem fromIt = from.elementAt(i); + try { + + int id = Database.getCatalog().getTableId(fromIt.getTable()); // will + // fall + // through + // if + // table + // doesn't + // exist + String name; + + if (fromIt.getAlias() != null) + name = fromIt.getAlias(); + else + name = fromIt.getTable(); + + lp.addScan(id, name); + + // XXX handle subquery? 
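+                // (as written, only base tables are resolved here; the XXX
+                // above notes that subqueries in the FROM list are not
+                // handled)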
+ } catch (NoSuchElementException e) { + e.printStackTrace(); + throw new simpledb.ParsingException("Table " + + fromIt.getTable() + " is not in catalog"); + } + } + + // now parse the where clause, creating Filter and Join nodes as needed + ZExp w = q.getWhere(); + if (w != null) { + + if (!(w instanceof ZExpression)) { + throw new simpledb.ParsingException( + "Nested queries are currently unsupported."); + } + ZExpression wx = (ZExpression) w; + processExpression(tid, wx, lp); + + } + + // now look for group by fields + ZGroupBy gby = q.getGroupBy(); + String groupByField = null; + if (gby != null) { + @SuppressWarnings("unchecked") + Vector<ZExp> gbs = gby.getGroupBy(); + if (gbs.size() > 1) { + throw new simpledb.ParsingException( + "At most one grouping field expression supported."); + } + if (gbs.size() == 1) { + ZExp gbe = gbs.elementAt(0); + if (!(gbe instanceof ZConstant)) { + throw new simpledb.ParsingException( + "Complex grouping expressions (" + gbe + + ") not supported."); + } + groupByField = ((ZConstant) gbe).getValue(); + System.out.println("GROUP BY FIELD : " + groupByField); + } + + } + + // walk the select list, pick out aggregates, and check for query + // validity + @SuppressWarnings("unchecked") + Vector<ZSelectItem> selectList = q.getSelect(); + String aggField = null; + String aggFun = null; + + for (int i = 0; i < selectList.size(); i++) { + ZSelectItem si = selectList.elementAt(i); + if (si.getAggregate() == null + && (si.isExpression() && !(si.getExpression() instanceof ZConstant))) { + throw new simpledb.ParsingException( + "Expressions in SELECT list are not supported."); + } + if (si.getAggregate() != null) { + if (aggField != null) { + throw new simpledb.ParsingException( + "Aggregates over multiple fields not supported."); + } + aggField = ((ZConstant) ((ZExpression) si.getExpression()) + .getOperand(0)).getValue(); + aggFun = si.getAggregate(); + System.out.println("Aggregate field is " + aggField + + ", agg fun is : " + aggFun); + lp.addProjectField(aggField, aggFun); + } else { + if (groupByField != null + && !(groupByField.equals(si.getTable() + "." + + si.getColumn()) || groupByField.equals(si + .getColumn()))) { + throw new simpledb.ParsingException("Non-aggregate field " + + si.getColumn() + + " does not appear in GROUP BY list."); + } + lp.addProjectField(si.getTable() + "." 
+ si.getColumn(), null); + } + } + + if (groupByField != null && aggFun == null) { + throw new simpledb.ParsingException("GROUP BY without aggregation."); + } + + if (aggFun != null) { + lp.addAggregate(aggFun, aggField, groupByField); + } + // sort the data + + if (q.getOrderBy() != null) { + @SuppressWarnings("unchecked") + Vector<ZOrderBy> obys = q.getOrderBy(); + if (obys.size() > 1) { + throw new simpledb.ParsingException( + "Multi-attribute ORDER BY is not supported."); + } + ZOrderBy oby = obys.elementAt(0); + if (!(oby.getExpression() instanceof ZConstant)) { + throw new simpledb.ParsingException( + "Complex ORDER BY's are not supported"); + } + ZConstant f = (ZConstant) oby.getExpression(); + + lp.addOrderBy(f.getValue(), oby.getAscOrder()); + + } + return lp; + } + + private Transaction curtrans = null; + private boolean inUserTrans = false; + + public Query handleQueryStatement(ZQuery s, TransactionId tId) + throws TransactionAbortedException, DbException, IOException, + simpledb.ParsingException, Zql.ParseException { + Query query = new Query(tId); + + LogicalPlan lp = parseQueryLogicalPlan(tId, s); + DbIterator physicalPlan = lp.physicalPlan(tId, + TableStats.getStatsMap(), explain); + query.setPhysicalPlan(physicalPlan); + query.setLogicalPlan(lp); + + if (physicalPlan != null) { + Class<?> c; + try { + c = Class.forName("simpledb.OperatorCardinality"); + + Class<?> p = Operator.class; + Class<?> h = Map.class; + + java.lang.reflect.Method m = c.getMethod( + "updateOperatorCardinality", p, h, h); + + System.out.println("The query plan is:"); + m.invoke(null, (Operator) physicalPlan, + lp.getTableAliasToIdMapping(), TableStats.getStatsMap()); + c = Class.forName("simpledb.QueryPlanVisualizer"); + m = c.getMethod( + "printQueryPlanTree", DbIterator.class, System.out.getClass()); + m.invoke(c.newInstance(), physicalPlan,System.out); + } catch (ClassNotFoundException e) { + } catch (SecurityException e) { + } catch (NoSuchMethodException e) { + e.printStackTrace(); + } catch (IllegalArgumentException e) { + e.printStackTrace(); + } catch (IllegalAccessException e) { + e.printStackTrace(); + } catch (InvocationTargetException e) { + e.printStackTrace(); + } catch (InstantiationException e) { + e.printStackTrace(); + } + } + + return query; + } + + public Query handleInsertStatement(ZInsert s, TransactionId tId) + throws TransactionAbortedException, DbException, IOException, + simpledb.ParsingException, Zql.ParseException { + int tableId; + try { + tableId = Database.getCatalog().getTableId(s.getTable()); // will + // fall + // through if + // table + // doesn't + // exist + } catch (NoSuchElementException e) { + throw new simpledb.ParsingException("Unknown table : " + + s.getTable()); + } + + TupleDesc td = Database.getCatalog().getTupleDesc(tableId); + + Tuple t = new Tuple(td); + int i = 0; + DbIterator newTups; + + if (s.getValues() != null) { + @SuppressWarnings("unchecked") + Vector<ZExp> values = (Vector<ZExp>) s.getValues(); + if (td.numFields() != values.size()) { + throw new simpledb.ParsingException( + "INSERT statement does not contain same number of fields as table " + + s.getTable()); + } + for (ZExp e : values) { + + if (!(e instanceof ZConstant)) + throw new simpledb.ParsingException( + "Complex expressions not allowed in INSERT statements."); + ZConstant zc = (ZConstant) e; + if (zc.getType() == ZConstant.NUMBER) { + if (td.getFieldType(i) != Type.INT_TYPE) { + throw new simpledb.ParsingException("Value " + + zc.getValue() + + " is not an integer, expected a 
string."); + } + IntField f = new IntField(new Integer(zc.getValue())); + t.setField(i, f); + } else if (zc.getType() == ZConstant.STRING) { + if (td.getFieldType(i) != Type.STRING_TYPE) { + throw new simpledb.ParsingException("Value " + + zc.getValue() + + " is a string, expected an integer."); + } + StringField f = new StringField(zc.getValue(), + Type.STRING_LEN); + t.setField(i, f); + } else { + throw new simpledb.ParsingException( + "Only string or int fields are supported."); + } + + i++; + } + ArrayList<Tuple> tups = new ArrayList<Tuple>(); + tups.add(t); + newTups = new TupleArrayIterator(tups); + + } else { + ZQuery zq = (ZQuery) s.getQuery(); + LogicalPlan lp = parseQueryLogicalPlan(tId, zq); + newTups = lp.physicalPlan(tId, TableStats.getStatsMap(), explain); + } + Query insertQ = new Query(tId); + insertQ.setPhysicalPlan(new Insert(tId, newTups, tableId)); + return insertQ; + } + + public Query handleDeleteStatement(ZDelete s, TransactionId tid) + throws TransactionAbortedException, DbException, IOException, + simpledb.ParsingException, Zql.ParseException { + int id; + try { + id = Database.getCatalog().getTableId(s.getTable()); // will fall + // through if + // table + // doesn't + // exist + } catch (NoSuchElementException e) { + throw new simpledb.ParsingException("Unknown table : " + + s.getTable()); + } + String name = s.getTable(); + Query sdbq = new Query(tid); + + LogicalPlan lp = new LogicalPlan(); + lp.setQuery(s.toString()); + + lp.addScan(id, name); + if (s.getWhere() != null) + processExpression(tid, (ZExpression) s.getWhere(), lp); + lp.addProjectField("null.*", null); + + DbIterator op = new Delete(tid, lp.physicalPlan(tid, + TableStats.getStatsMap(), false)); + sdbq.setPhysicalPlan(op); + + return sdbq; + + } + + public void handleTransactStatement(ZTransactStmt s) + throws TransactionAbortedException, DbException, IOException, + simpledb.ParsingException, Zql.ParseException { + if (s.getStmtType().equals("COMMIT")) { + if (curtrans == null) + throw new simpledb.ParsingException( + "No transaction is currently running"); + curtrans.commit(); + curtrans = null; + inUserTrans = false; + System.out.println("Transaction " + curtrans.getId().getId() + + " committed."); + } else if (s.getStmtType().equals("ROLLBACK")) { + if (curtrans == null) + throw new simpledb.ParsingException( + "No transaction is currently running"); + curtrans.abort(); + curtrans = null; + inUserTrans = false; + System.out.println("Transaction " + curtrans.getId().getId() + + " aborted."); + + } else if (s.getStmtType().equals("SET TRANSACTION")) { + if (curtrans != null) + throw new simpledb.ParsingException( + "Can't start new transactions until current transaction has been committed or rolledback."); + curtrans = new Transaction(); + curtrans.start(); + inUserTrans = true; + System.out.println("Started a new transaction tid = " + + curtrans.getId().getId()); + } else { + throw new simpledb.ParsingException("Unsupported operation"); + } + } + + public LogicalPlan generateLogicalPlan(TransactionId tid, String s) + throws simpledb.ParsingException { + ByteArrayInputStream bis = new ByteArrayInputStream(s.getBytes()); + ZqlParser p = new ZqlParser(bis); + try { + ZStatement stmt = p.readStatement(); + if (stmt instanceof ZQuery) { + LogicalPlan lp = parseQueryLogicalPlan(tid, (ZQuery) stmt); + return lp; + } + } catch (Zql.ParseException e) { + throw new simpledb.ParsingException( + "Invalid SQL expression: \n \t " + e); + } catch (IOException e) { + throw new simpledb.ParsingException(e); + } + 
+ throw new simpledb.ParsingException( + "Cannot generate logical plan for expression : " + s); + } + + public void setTransaction(Transaction t) { + curtrans = t; + } + + public Transaction getTransaction() { + return curtrans; + } + + public void processNextStatement(String s) { + try { + processNextStatement(new ByteArrayInputStream(s.getBytes("UTF-8"))); + } catch (UnsupportedEncodingException e) { + e.printStackTrace(); + throw new RuntimeException(e); + } + } + + public void processNextStatement(InputStream is) { + try { + ZqlParser p = new ZqlParser(is); + ZStatement s = p.readStatement(); + + Query query = null; + if (s instanceof ZTransactStmt) + handleTransactStatement((ZTransactStmt) s); + else { + if (!this.inUserTrans) { + curtrans = new Transaction(); + curtrans.start(); + System.out.println("Started a new transaction tid = " + + curtrans.getId().getId()); + } + try { + if (s instanceof ZInsert) + query = handleInsertStatement((ZInsert) s, + curtrans.getId()); + else if (s instanceof ZDelete) + query = handleDeleteStatement((ZDelete) s, + curtrans.getId()); + else if (s instanceof ZQuery) + query = handleQueryStatement((ZQuery) s, + curtrans.getId()); + else { + System.out + .println("Can't parse " + + s + + "\n -- parser only handles SQL transactions, insert, delete, and select statements"); + } + if (query != null) + query.execute(); + + if (!inUserTrans && curtrans != null) { + curtrans.commit(); + System.out.println("Transaction " + + curtrans.getId().getId() + " committed."); + } + } catch (Throwable a) { + // Whenever error happens, abort the current transaction + if (curtrans != null) { + curtrans.abort(); + System.out.println("Transaction " + + curtrans.getId().getId() + + " aborted because of unhandled error"); + } + this.inUserTrans = false; + + if (a instanceof simpledb.ParsingException + || a instanceof Zql.ParseException) + throw new ParsingException((Exception) a); + if (a instanceof Zql.TokenMgrError) + throw (Zql.TokenMgrError) a; + throw new DbException(a.getMessage()); + } finally { + if (!inUserTrans) + curtrans = null; + } + } + + } catch (TransactionAbortedException e) { + e.printStackTrace(); + } catch (DbException e) { + e.printStackTrace(); + } catch (IOException e) { + e.printStackTrace(); + } catch (simpledb.ParsingException e) { + System.out + .println("Invalid SQL expression: \n \t" + e.getMessage()); + } catch (Zql.ParseException e) { + System.out.println("Invalid SQL expression: \n \t " + e); + } catch (Zql.TokenMgrError e) { + System.out.println("Invalid SQL expression: \n \t " + e); + } + } + + // Basic SQL completions + public static final String[] SQL_COMMANDS = { "select", "from", "where", + "group by", "max(", "min(", "avg(", "count", "rollback", "commit", + "insert", "delete", "values", "into" }; + + public static void main(String argv[]) throws IOException { + + if (argv.length < 1 || argv.length > 4) { + System.out.println("Invalid number of arguments.\n" + usage); + System.exit(0); + } + + Parser p = new Parser(); + + p.start(argv); + } + + static final String usage = "Usage: parser catalogFile [-explain] [-f queryFile]"; + + protected void shutdown() { + System.out.println("Bye"); + } + + protected boolean interactive = true; + + protected void start(String[] argv) throws IOException { + // first add tables to database + Database.getCatalog().loadSchema(argv[0]); + TableStats.computeStatistics(); + + String queryFile = null; + + if (argv.length > 1) { + for (int i = 1; i < argv.length; i++) { + if (argv[i].equals("-explain")) { + 
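+                    // explain mode is consulted later, when each query's
+                    // logical plan is turned into a physical plan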
explain = true;
+                    System.out.println("Explain mode enabled.");
+                } else if (argv[i].equals("-f")) {
+                    interactive = false;
+                    // advance to the filename argument; the original
+                    // post-increment check (i++ == argv.length) could never
+                    // be true and risked reading past the end of argv
+                    if (++i == argv.length) {
+                        System.out.println("Expected file name after -f\n"
+                                + usage);
+                        System.exit(0);
+                    }
+                    queryFile = argv[i];
+
+                } else {
+                    System.out.println("Unknown argument " + argv[i] + "\n "
+                            + usage);
+                }
+            }
+        }
+        if (!interactive) {
+            try {
+                // curtrans = new Transaction();
+                // curtrans.start();
+                long startTime = System.currentTimeMillis();
+                processNextStatement(new FileInputStream(new File(queryFile)));
+                long time = System.currentTimeMillis() - startTime;
+                System.out.printf("----------------\n%.2f seconds\n\n",
+                        ((double) time / 1000.0));
+                System.out.println("Press Enter to exit");
+                System.in.read();
+                this.shutdown();
+            } catch (FileNotFoundException e) {
+                System.out.println("Unable to find query file " + queryFile);
+                e.printStackTrace();
+            }
+        } else { // no query file, run interactive prompt
+            ConsoleReader reader = new ConsoleReader();
+
+            // Add simple tab completion for basic SQL keywords
+            ArgumentCompletor completor = new ArgumentCompletor(
+                    new SimpleCompletor(SQL_COMMANDS));
+            completor.setStrict(false); // match at any position
+            reader.addCompletor(completor);
+
+            StringBuilder buffer = new StringBuilder();
+            String line;
+            boolean quit = false;
+            while (!quit && (line = reader.readLine("SimpleDB> ")) != null) {
+                // Split statements at ';': handles multiple statements on
+                // one line, or one statement spread across many lines
+                while (line.indexOf(';') >= 0) {
+                    int split = line.indexOf(';');
+                    buffer.append(line.substring(0, split + 1));
+                    String cmd = buffer.toString().trim();
+                    cmd = cmd.substring(0, cmd.length() - 1).trim() + ";";
+                    byte[] statementBytes = cmd.getBytes("UTF-8");
+                    if (cmd.equalsIgnoreCase("quit;")
+                            || cmd.equalsIgnoreCase("exit;")) {
+                        shutdown();
+                        quit = true;
+                        break;
+                    }
+
+                    long startTime = System.currentTimeMillis();
+                    processNextStatement(new ByteArrayInputStream(
+                            statementBytes));
+                    long time = System.currentTimeMillis() - startTime;
+                    System.out.printf("----------------\n%.2f seconds\n\n",
+                            ((double) time / 1000.0));
+
+                    // Grab the remainder of the line
+                    line = line.substring(split + 1);
+                    buffer = new StringBuilder();
+                }
+                if (line.length() > 0) {
+                    buffer.append(line);
+                    buffer.append("\n");
+                }
+            }
+        }
+    }
+}
+
+class TupleArrayIterator implements DbIterator {
+    private static final long serialVersionUID = 1L;
+    ArrayList<Tuple> tups;
+    Iterator<Tuple> it = null;
+
+    public TupleArrayIterator(ArrayList<Tuple> tups) {
+        this.tups = tups;
+    }
+
+    public void open() throws DbException, TransactionAbortedException {
+        it = tups.iterator();
+    }
+
+    /** @return true if the iterator has more items. */
+    public boolean hasNext() throws DbException, TransactionAbortedException {
+        return it.hasNext();
+    }
+
+    /**
+     * Gets the next tuple from the operator (typically implemented by reading
+     * from a child operator or an access method).
+     *
+     * @return The next tuple in the iterator, or null if there are no more
+     *         tuples.
+     */
+    public Tuple next() throws DbException, TransactionAbortedException,
+            NoSuchElementException {
+        return it.next();
+    }
+
+    /**
+     * Resets the iterator to the start.
+     *
+     * @throws DbException
+     *             When rewind is unsupported.
+     */
+    public void rewind() throws DbException, TransactionAbortedException {
+        it = tups.iterator();
+    }
+
+    /**
+     * Returns the TupleDesc associated with this DbIterator.
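+     * Assumes the backing tuple list is non-empty, since the descriptor is
+     * read from the first tuple.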
+ */ + public TupleDesc getTupleDesc() { + return tups.get(0).getTupleDesc(); + } + + /** + * Closes the iterator. + */ + public void close() { + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/ParsingException.java b/hw/hw3/starter-code/src/java/simpledb/ParsingException.java new file mode 100644 index 0000000000000000000000000000000000000000..be8e93667f8c83c864c6d70ca88f71678c14c35c --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/ParsingException.java @@ -0,0 +1,17 @@ +package simpledb; +import java.lang.Exception; + +public class ParsingException extends Exception { + public ParsingException(String string) { + super(string); + } + + public ParsingException(Exception e) { + super(e); + } + + /** + * + */ + private static final long serialVersionUID = 1L; +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Permissions.java b/hw/hw3/starter-code/src/java/simpledb/Permissions.java new file mode 100644 index 0000000000000000000000000000000000000000..3cdb3f6bc6180061b86637e5cad0b004e77367ad --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Permissions.java @@ -0,0 +1,26 @@ +package simpledb; + +/** + * Class representing requested permissions to a relation/file. + * Private constructor with two static objects READ_ONLY and READ_WRITE that + * represent the two levels of permission. + */ +public class Permissions { + int permLevel; + + private Permissions(int permLevel) { + this.permLevel = permLevel; + } + + public String toString() { + if (permLevel == 0) + return "READ_ONLY"; + if (permLevel == 1) + return "READ_WRITE"; + return "UNKNOWN"; + } + + public static final Permissions READ_ONLY = new Permissions(0); + public static final Permissions READ_WRITE = new Permissions(1); + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/PlanCache.java b/hw/hw3/starter-code/src/java/simpledb/PlanCache.java new file mode 100644 index 0000000000000000000000000000000000000000..43ccfe49931a84308f49f50d7443669f2c538113 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/PlanCache.java @@ -0,0 +1,50 @@ +package simpledb; +import java.util.HashMap; +import java.util.Set; +import java.util.Vector; + +/** A PlanCache is a helper class that can be used to store the best + * way to order a given set of joins */ +public class PlanCache { + HashMap<Set<LogicalJoinNode>,Vector<LogicalJoinNode>> bestOrders= new HashMap<Set<LogicalJoinNode>,Vector<LogicalJoinNode>>(); + HashMap<Set<LogicalJoinNode>,Double> bestCosts= new HashMap<Set<LogicalJoinNode>,Double>(); + HashMap<Set<LogicalJoinNode>,Integer> bestCardinalities = new HashMap<Set<LogicalJoinNode>,Integer>(); + + /** Add a new cost, cardinality and ordering for a particular join set. 
Does not verify that the + new cost is less than any previously added cost -- simply adds or replaces an existing plan for the + specified join set + @param s the set of joins for which a new ordering (plan) is being added + @param cost the estimated cost of the specified plan + @param card the estimatied cardinality of the specified plan + @param order the ordering of the joins in the plan + */ + void addPlan(Set<LogicalJoinNode> s, double cost, int card, Vector<LogicalJoinNode> order) { + bestOrders.put(s,order); + bestCosts.put(s,cost); + bestCardinalities.put(s,card); + } + + /** Find the best join order in the cache for the specified plan + @param s the set of joins to look up the best order for + @return the best order for s in the cache + */ + Vector<LogicalJoinNode> getOrder(Set<LogicalJoinNode> s) { + return bestOrders.get(s); + } + + /** Find the cost of the best join order in the cache for the specified plan + @param s the set of joins to look up the best cost for + @return the cost of the best order for s in the cache + */ + double getCost(Set<LogicalJoinNode> s) { + return bestCosts.get(s); + } + + /** Find the cardinality of the best join order in the cache for the specified plan + @param s the set of joins to look up the best cardinality for + @return the cardinality of the best order for s in the cache + */ + int getCard(Set<LogicalJoinNode> s) { + return bestCardinalities.get(s); + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Predicate.java b/hw/hw3/starter-code/src/java/simpledb/Predicate.java new file mode 100644 index 0000000000000000000000000000000000000000..c751a2c2dd2f6ac2a6e8b5e493c5955f0f91b603 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Predicate.java @@ -0,0 +1,111 @@ +package simpledb; + +import java.io.Serializable; + +/** + * Predicate compares tuples to a specified Field value. + */ +public class Predicate implements Serializable { + + private static final long serialVersionUID = 1L; + + /** Constants used for return codes in Field.compare */ + public enum Op implements Serializable { + EQUALS, GREATER_THAN, LESS_THAN, LESS_THAN_OR_EQ, GREATER_THAN_OR_EQ, LIKE, NOT_EQUALS; + + /** + * Interface to access operations by integer value for command-line + * convenience. + * + * @param i + * a valid integer Op index + */ + public static Op getOp(int i) { + return values()[i]; + } + + public String toString() { + if (this == EQUALS) + return "="; + if (this == GREATER_THAN) + return ">"; + if (this == LESS_THAN) + return "<"; + if (this == LESS_THAN_OR_EQ) + return "<="; + if (this == GREATER_THAN_OR_EQ) + return ">="; + if (this == LIKE) + return "LIKE"; + if (this == NOT_EQUALS) + return "<>"; + throw new IllegalStateException("impossible to reach here"); + } + + } + + /** + * Constructor. + * + * @param field + * field number of passed in tuples to compare against. 
+ * @param op + * operation to use for comparison + * @param operand + * field value to compare passed in tuples to + */ + public Predicate(int field, Op op, Field operand) { + // some code goes here + } + + /** + * @return the field number + */ + public int getField() + { + // some code goes here + return -1; + } + + /** + * @return the operator + */ + public Op getOp() + { + // some code goes here + return null; + } + + /** + * @return the operand + */ + public Field getOperand() + { + // some code goes here + return null; + } + + /** + * Compares the field number of t specified in the constructor to the + * operand field specified in the constructor using the operator specific in + * the constructor. The comparison can be made through Field's compare + * method. + * + * @param t + * The tuple to compare against + * @return true if the comparison is true, false otherwise. + */ + public boolean filter(Tuple t) { + // some code goes here + return false; + } + + /** + * Returns something useful, like "f = field_id op = op_string operand = + * operand_string + */ + public String toString() { + // some code goes here + return ""; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Project.java b/hw/hw3/starter-code/src/java/simpledb/Project.java new file mode 100644 index 0000000000000000000000000000000000000000..e54fef74fbaedee45f4913dd8f126ee8e5348a5a --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Project.java @@ -0,0 +1,96 @@ +package simpledb; + +import java.util.*; + +/** + * Project is an operator that implements a relational projection. + */ +public class Project extends Operator { + + private static final long serialVersionUID = 1L; + private DbIterator child; + private TupleDesc td; + private ArrayList<Integer> outFieldIds; + + /** + * Constructor accepts a child operator to read tuples to apply projection + * to and a list of fields in output tuple + * + * @param fieldList + * The ids of the fields child's tupleDesc to project out + * @param typesList + * the types of the fields in the final projection + * @param child + * The child operator + */ + public Project(ArrayList<Integer> fieldList, ArrayList<Type> typesList, + DbIterator child) { + this(fieldList,typesList.toArray(new Type[]{}),child); + } + + public Project(ArrayList<Integer> fieldList, Type[] types, + DbIterator child) { + this.child = child; + outFieldIds = fieldList; + String[] fieldAr = new String[fieldList.size()]; + TupleDesc childtd = child.getTupleDesc(); + + for (int i = 0; i < fieldAr.length; i++) { + fieldAr[i] = childtd.getFieldName(fieldList.get(i)); + } + td = new TupleDesc(types, fieldAr); + } + + public TupleDesc getTupleDesc() { + return td; + } + + public void open() throws DbException, NoSuchElementException, + TransactionAbortedException { + child.open(); + super.open(); + } + + public void close() { + super.close(); + child.close(); + } + + public void rewind() throws DbException, TransactionAbortedException { + child.rewind(); + } + + /** + * Operator.fetchNext implementation. 
Iterates over tuples from the child + * operator, projecting out the fields from the tuple + * + * @return The next tuple, or null if there are no more tuples + */ + protected Tuple fetchNext() throws NoSuchElementException, + TransactionAbortedException, DbException { + while (child.hasNext()) { + Tuple t = child.next(); + Tuple newTuple = new Tuple(td); + newTuple.setRecordId(t.getRecordId()); + for (int i = 0; i < td.numFields(); i++) { + newTuple.setField(i, t.getField(outFieldIds.get(i))); + } + return newTuple; + } + return null; + } + + @Override + public DbIterator[] getChildren() { + return new DbIterator[] { this.child }; + } + + @Override + public void setChildren(DbIterator[] children) { + if (this.child!=children[0]) + { + this.child = children[0]; + } + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Query.java b/hw/hw3/starter-code/src/java/simpledb/Query.java new file mode 100644 index 0000000000000000000000000000000000000000..45567f8db29599291bb3ea6d47c9646e71b3f219 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Query.java @@ -0,0 +1,118 @@ +package simpledb; + +import java.io.*; +import java.util.*; + +/** + * Query is a wrapper class to manage the execution of queries. It takes a query + * plan in the form of a high level DbIterator (built by initiating the + * constructors of query plans) and runs it as a part of a specified + * transaction. + * + * @author Sam Madden + */ + +public class Query implements Serializable { + + private static final long serialVersionUID = 1L; + + transient private DbIterator op; + transient private LogicalPlan logicalPlan; + TransactionId tid; + transient private boolean started = false; + + public TransactionId getTransactionId() { + return this.tid; + } + + public void setLogicalPlan(LogicalPlan lp) { + this.logicalPlan = lp; + } + + public LogicalPlan getLogicalPlan() { + return this.logicalPlan; + } + + public void setPhysicalPlan(DbIterator pp) { + this.op = pp; + } + + public DbIterator getPhysicalPlan() { + return this.op; + } + + public Query(TransactionId t) { + tid = t; + } + + public Query(DbIterator root, TransactionId t) { + op = root; + tid = t; + } + + public void start() throws IOException, DbException, + TransactionAbortedException { + op.open(); + + started = true; + } + + public TupleDesc getOutputTupleDesc() { + return this.op.getTupleDesc(); + } + + /** @return true if there are more tuples remaining. */ + public boolean hasNext() throws DbException, TransactionAbortedException { + return op.hasNext(); + } + + /** + * Returns the next tuple, or throws NoSuchElementException if the iterator + * is closed. 
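+     * The query must first be started with start(); calling next() on an
+     * unstarted query raises a DbException.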
+ * + * @return The next tuple in the iterator + * @throws DbException + * If there is an error in the database system + * @throws NoSuchElementException + * If the iterator has finished iterating + * @throws TransactionAbortedException + * If the transaction is aborted (e.g., due to a deadlock) + */ + public Tuple next() throws DbException, NoSuchElementException, + TransactionAbortedException { + if (!started) + throw new DbException("Database not started."); + + return op.next(); + } + + /** Close the iterator */ + public void close() throws IOException { + op.close(); + started = false; + } + + public void execute() throws IOException, DbException, TransactionAbortedException { + TupleDesc td = this.getOutputTupleDesc(); + + String names = ""; + for (int i = 0; i < td.numFields(); i++) { + names += td.getFieldName(i) + "\t"; + } + System.out.println(names); + for (int i = 0; i < names.length() + td.numFields() * 4; i++) { + System.out.print("-"); + } + System.out.println(""); + + this.start(); + int cnt = 0; + while (this.hasNext()) { + Tuple tup = this.next(); + System.out.println(tup); + cnt++; + } + System.out.println("\n " + cnt + " rows."); + this.close(); + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/RecordId.java b/hw/hw3/starter-code/src/java/simpledb/RecordId.java new file mode 100644 index 0000000000000000000000000000000000000000..d87865384181c0e34f203fd47a8f03daf0819e24 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/RecordId.java @@ -0,0 +1,67 @@ +package simpledb; + +import java.io.Serializable; + +/** + * A RecordId is a reference to a specific tuple on a specific page of a + * specific table. + */ +public class RecordId implements Serializable { + + private static final long serialVersionUID = 1L; + + /** + * Creates a new RecordId referring to the specified PageId and tuple + * number. + * + * @param pid + * the pageid of the page on which the tuple resides + * @param tupleno + * the tuple number within the page. + */ + public RecordId(PageId pid, int tupleno) { + // some code goes here + } + + /** + * @return the tuple number this RecordId references. + */ + public int tupleno() { + // some code goes here + return 0; + } + + /** + * @return the page id this RecordId references. + */ + public PageId getPageId() { + // some code goes here + return null; + } + + /** + * Two RecordId objects are considered equal if they represent the same + * tuple. + * + * @return True if this and o represent the same tuple + */ + @Override + public boolean equals(Object o) { + // some code goes here + throw new UnsupportedOperationException("implement this"); + } + + /** + * You should implement the hashCode() so that two equal RecordId instances + * (with respect to equals()) have the same hashCode(). + * + * @return An int that is the same for equal RecordId objects. + */ + @Override + public int hashCode() { + // some code goes here + throw new UnsupportedOperationException("implement this"); + + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/SeqScan.java b/hw/hw3/starter-code/src/java/simpledb/SeqScan.java new file mode 100644 index 0000000000000000000000000000000000000000..5f5f539d5134ba09dc48f296e1f17d6a27359489 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/SeqScan.java @@ -0,0 +1,109 @@ +package simpledb; + +import java.util.*; + +/** + * SeqScan is an implementation of a sequential scan access method that reads + * each tuple of a table in no particular order (e.g., as they are laid out on + * disk). 
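+ *
+ * A minimal usage sketch (assuming tid is a live TransactionId and tableid
+ * was obtained from the catalog):
+ *
+ * <pre>
+ *   SeqScan ss = new SeqScan(tid, tableid, "t");
+ *   ss.open();
+ *   while (ss.hasNext()) {
+ *       Tuple tup = ss.next();
+ *       // process tup
+ *   }
+ *   ss.close();
+ * </pre>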
+ */ +public class SeqScan implements DbIterator { + + private static final long serialVersionUID = 1L; + + /** + * Creates a sequential scan over the specified table as a part of the + * specified transaction. + * + * @param tid + * The transaction this scan is running as a part of. + * @param tableid + * the table to scan. + * @param tableAlias + * the alias of this table (needed by the parser); the returned + * tupleDesc should have fields with name tableAlias.fieldName + * (note: this class is not responsible for handling a case where + * tableAlias or fieldName are null. It shouldn't crash if they + * are, but the resulting name can be null.fieldName, + * tableAlias.null, or null.null). + */ + public SeqScan(TransactionId tid, int tableid, String tableAlias) { + // some code goes here + } + + /** + * @return + * return the table name of the table the operator scans. This should + * be the actual name of the table in the catalog of the database + * */ + public String getTableName() { + return null; + } + + /** + * @return Return the alias of the table this operator scans. + * */ + public String getAlias() + { + // some code goes here + return null; + } + + /** + * Reset the tableid, and tableAlias of this operator. + * @param tableid + * the table to scan. + * @param tableAlias + * the alias of this table (needed by the parser); the returned + * tupleDesc should have fields with name tableAlias.fieldName + * (note: this class is not responsible for handling a case where + * tableAlias or fieldName are null. It shouldn't crash if they + * are, but the resulting name can be null.fieldName, + * tableAlias.null, or null.null). + */ + public void reset(int tableid, String tableAlias) { + // some code goes here + } + + public SeqScan(TransactionId tid, int tableid) { + this(tid, tableid, Database.getCatalog().getTableName(tableid)); + } + + public void open() throws DbException, TransactionAbortedException { + // some code goes here + } + + /** + * Returns the TupleDesc with field names from the underlying HeapFile, + * prefixed with the tableAlias string from the constructor. This prefix + * becomes useful when joining tables containing a field(s) with the same + * name. + * + * @return the TupleDesc with field names from the underlying HeapFile, + * prefixed with the tableAlias string from the constructor. 
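+     *         For example, a field "name" scanned under the alias "e"
+     *         appears as "e.name" in the returned descriptor.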
+ */ + public TupleDesc getTupleDesc() { + // some code goes here + return null; + } + + public boolean hasNext() throws TransactionAbortedException, DbException { + // some code goes here + return false; + } + + public Tuple next() throws NoSuchElementException, + TransactionAbortedException, DbException { + // some code goes here + return null; + } + + public void close() { + // some code goes here + } + + public void rewind() throws DbException, NoSuchElementException, + TransactionAbortedException { + // some code goes here + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/SimpleDb.java b/hw/hw3/starter-code/src/java/simpledb/SimpleDb.java new file mode 100644 index 0000000000000000000000000000000000000000..02f7636d4151125e2a123ef08022434040a40ce8 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/SimpleDb.java @@ -0,0 +1,99 @@ +package simpledb; +import java.io.*; + +public class SimpleDb { + public static void main (String args[]) + throws DbException, TransactionAbortedException, IOException { + // convert a file + if(args[0].equals("convert")) { + try { + if (args.length<3 || args.length>5){ + System.err.println("Unexpected number of arguments to convert "); + return; + } + File sourceTxtFile=new File(args[1]); + File targetDatFile=new File(args[1].replaceAll(".txt", ".dat")); + int numOfAttributes=Integer.parseInt(args[2]); + Type[] ts = new Type[numOfAttributes]; + char fieldSeparator=','; + + if (args.length == 3) + for (int i=0;i<numOfAttributes;i++) + ts[i]=Type.INT_TYPE; + else { + String typeString=args[3]; + String[] typeStringAr = typeString.split(","); + if (typeStringAr.length!=numOfAttributes) + { + System.err.println("The number of types does not agree with the number of columns"); + return; + } + int index=0; + for (String s: typeStringAr) { + if (s.toLowerCase().equals("int")) + ts[index++]=Type.INT_TYPE; + else if (s.toLowerCase().equals("string")) + ts[index++]=Type.STRING_TYPE; + else { + System.err.println("Unknown type " + s); + return; + } + } + if (args.length==5) + fieldSeparator=args[4].charAt(0); + } + + HeapFileEncoder.convert(sourceTxtFile,targetDatFile, + BufferPool.getPageSize(),numOfAttributes,ts,fieldSeparator); + + } catch (IOException e) { + throw new RuntimeException(e); + } + } else if (args[0].equals("print")) { + File tableFile = new File(args[1]); + int columns = Integer.parseInt(args[2]); + DbFile table = Utility.openHeapFile(columns, tableFile); + TransactionId tid = new TransactionId(); + DbFileIterator it = table.iterator(tid); + + if(null == it){ + System.out.println("Error: method HeapFile.iterator(TransactionId tid) not yet implemented!"); + } else { + it.open(); + while (it.hasNext()) { + Tuple t = it.next(); + System.out.println(t); + } + it.close(); + } + } + else if (args[0].equals("parser")) { + // Strip the first argument and call the parser + String[] newargs = new String[args.length-1]; + for (int i = 1; i < args.length; ++i) { + newargs[i-1] = args[i]; + } + + try { + //dynamically load Parser -- if it doesn't exist, print error message + Class<?> c = Class.forName("simpledb.Parser"); + Class<?> s = String[].class; + + java.lang.reflect.Method m = c.getMethod("main", s); + m.invoke(null, (java.lang.Object)newargs); + } catch (ClassNotFoundException cne) { + System.out.println("Class Parser not found -- perhaps you are trying to run the parser as a part of lab1?"); + } + catch (Exception e) { + System.out.println("Error in parser."); + e.printStackTrace(); + } + + } + else { + System.err.println("Unknown command: " 
+ args[0]); + System.exit(1); + } + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/StringAggregator.java b/hw/hw3/starter-code/src/java/simpledb/StringAggregator.java new file mode 100644 index 0000000000000000000000000000000000000000..04fb6579aa2b87efc46d44f9aaae8ea890e284a4 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/StringAggregator.java @@ -0,0 +1,44 @@ +package simpledb; + +/** + * Knows how to compute some aggregate over a set of StringFields. + */ +public class StringAggregator implements Aggregator { + + private static final long serialVersionUID = 1L; + + /** + * Aggregate constructor + * @param gbfield the 0-based index of the group-by field in the tuple, or NO_GROUPING if there is no grouping + * @param gbfieldtype the type of the group by field (e.g., Type.INT_TYPE), or null if there is no grouping + * @param afield the 0-based index of the aggregate field in the tuple + * @param what aggregation operator to use -- only supports COUNT + * @throws IllegalArgumentException if what != COUNT + */ + + public StringAggregator(int gbfield, Type gbfieldtype, int afield, Op what) { + // some code goes here + } + + /** + * Merge a new tuple into the aggregate, grouping as indicated in the constructor + * @param tup the Tuple containing an aggregate field and a group-by field + */ + public void mergeTupleIntoGroup(Tuple tup) { + // some code goes here + } + + /** + * Create a DbIterator over group aggregate results. + * + * @return a DbIterator whose tuples are the pair (groupVal, + * aggregateVal) if using group, or a single (aggregateVal) if no + * grouping. The aggregateVal is determined by the type of + * aggregate specified in the constructor. + */ + public DbIterator iterator() { + // some code goes here + throw new UnsupportedOperationException("please implement me for lab2"); + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/StringField.java b/hw/hw3/starter-code/src/java/simpledb/StringField.java new file mode 100644 index 0000000000000000000000000000000000000000..309ce07ab787957695e389f6d99d0742816f3edf --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/StringField.java @@ -0,0 +1,115 @@ +package simpledb; + +import java.io.*; + +/** + * Instance of Field that stores a single String of a fixed length. + */ +public class StringField implements Field { + + private static final long serialVersionUID = 1L; + + private final String value; + private final int maxSize; + + public String getValue() { + return value; + } + + /** + * Constructor. + * + * @param s + * The value of this field. + * @param maxSize + * The maximum size of this string + */ + public StringField(String s, int maxSize) { + this.maxSize = maxSize; + + if (s.length() > maxSize) + value = s.substring(0, maxSize); + else + value = s; + } + + public String toString() { + return value; + } + + public int hashCode() { + return value.hashCode(); + } + + public boolean equals(Object field) { + return ((StringField) field).value.equals(value); + } + + /** + * Write this string to dos. Always writes maxSize + 4 bytes to the passed + * in dos. First four bytes are string length, next bytes are string, with + * remainder padded with 0 to maxSize. 
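+     * For example, with maxSize = 4 the string "ab" is written as the int 2
+     * followed by the bytes 'a', 'b', 0, 0 (maxSize + 4 = 8 bytes in total).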
+ * + * @param dos + * Where the string is written + */ + public void serialize(DataOutputStream dos) throws IOException { + String s = value; + int overflow = maxSize - s.length(); + if (overflow < 0) { + String news = s.substring(0, maxSize); + s = news; + } + dos.writeInt(s.length()); + dos.writeBytes(s); + while (overflow-- > 0) + dos.write((byte) 0); + } + + /** + * Compare the specified field to the value of this Field. Return semantics + * are as specified by Field.compare + * + * @throws IllegalCastException + * if val is not a StringField + * @see Field#compare + */ + public boolean compare(Predicate.Op op, Field val) { + + StringField iVal = (StringField) val; + int cmpVal = value.compareTo(iVal.value); + + switch (op) { + case EQUALS: + return cmpVal == 0; + + case NOT_EQUALS: + return cmpVal != 0; + + case GREATER_THAN: + return cmpVal > 0; + + case GREATER_THAN_OR_EQ: + return cmpVal >= 0; + + case LESS_THAN: + return cmpVal < 0; + + case LESS_THAN_OR_EQ: + return cmpVal <= 0; + + case LIKE: + return value.indexOf(iVal.value) >= 0; + } + + return false; + } + + /** + * @return the Type for this Field + */ + public Type getType() { + + return Type.STRING_TYPE; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/TableStats.java b/hw/hw3/starter-code/src/java/simpledb/TableStats.java new file mode 100644 index 0000000000000000000000000000000000000000..9829cfe4c98d8099b59927d3c9505df05cda8e4e --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/TableStats.java @@ -0,0 +1,162 @@ +package simpledb; + +import java.util.HashMap; +import java.util.Iterator; +import java.util.Map; +import java.util.concurrent.ConcurrentHashMap; + +/** + * TableStats represents statistics (e.g., histograms) about base tables in a + * query. + * + * This class is not needed in implementing lab1 and lab2. + */ +public class TableStats { + + private static final ConcurrentHashMap<String, TableStats> statsMap = new ConcurrentHashMap<String, TableStats>(); + + static final int IOCOSTPERPAGE = 1000; + + public static TableStats getTableStats(String tablename) { + return statsMap.get(tablename); + } + + public static void setTableStats(String tablename, TableStats stats) { + statsMap.put(tablename, stats); + } + + public static void setStatsMap(HashMap<String,TableStats> s) + { + try { + java.lang.reflect.Field statsMapF = TableStats.class.getDeclaredField("statsMap"); + statsMapF.setAccessible(true); + statsMapF.set(null, s); + } catch (NoSuchFieldException e) { + e.printStackTrace(); + } catch (SecurityException e) { + e.printStackTrace(); + } catch (IllegalArgumentException e) { + e.printStackTrace(); + } catch (IllegalAccessException e) { + e.printStackTrace(); + } + + } + + public static Map<String, TableStats> getStatsMap() { + return statsMap; + } + + public static void computeStatistics() { + Iterator<Integer> tableIt = Database.getCatalog().tableIdIterator(); + + System.out.println("Computing table stats."); + while (tableIt.hasNext()) { + int tableid = tableIt.next(); + TableStats s = new TableStats(tableid, IOCOSTPERPAGE); + setTableStats(Database.getCatalog().getTableName(tableid), s); + } + System.out.println("Done."); + } + + /** + * Number of bins for the histogram. Feel free to increase this value over + * 100, though our tests assume that you have at least 100 bins in your + * histograms. 
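+     * More bins give finer-grained selectivity estimates at the cost of a
+     * little extra memory per column.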
+ */ + static final int NUM_HIST_BINS = 100; + + /** + * Create a new TableStats object, that keeps track of statistics on each + * column of a table + * + * @param tableid + * The table over which to compute statistics + * @param ioCostPerPage + * The cost per page of IO. This doesn't differentiate between + * sequential-scan IO and disk seeks. + */ + public TableStats(int tableid, int ioCostPerPage) { + // For this function, you'll have to get the + // DbFile for the table in question, + // then scan through its tuples and calculate + // the values that you need. + // You should try to do this reasonably efficiently, but you don't + // necessarily have to (for example) do everything + // in a single scan of the table. + // some code goes here + } + + /** + * Estimates the cost of sequentially scanning the file, given that the cost + * to read a page is costPerPageIO. You can assume that there are no seeks + * and that no pages are in the buffer pool. + * + * Also, assume that your hard drive can only read entire pages at once, so + * if the last page of the table only has one tuple on it, it's just as + * expensive to read as a full page. (Most real hard drives can't + * efficiently address regions smaller than a page at a time.) + * + * @return The estimated cost of scanning the table. + */ + public double estimateScanCost() { + // some code goes here + return 0; + } + + /** + * This method returns the number of tuples in the relation, given that a + * predicate with selectivity selectivityFactor is applied. + * + * @param selectivityFactor + * The selectivity of any predicates over the table + * @return The estimated cardinality of the scan with the specified + * selectivityFactor + */ + public int estimateTableCardinality(double selectivityFactor) { + // some code goes here + return 0; + } + + /** + * The average selectivity of the field under op. + * @param field + * the index of the field + * @param op + * the operator in the predicate + * The semantic of the method is that, given the table, and then given a + * tuple, of which we do not know the value of the field, return the + * expected selectivity. You may estimate this value from the histograms. + * */ + public double avgSelectivity(int field, Predicate.Op op) { + // some code goes here + return 1.0; + } + + /** + * Estimate the selectivity of predicate <tt>field op constant</tt> on the + * table. + * + * @param field + * The field over which the predicate ranges + * @param op + * The logical operation in the predicate + * @param constant + * The value against which the field is compared + * @return The estimated selectivity (fraction of tuples that satisfy) the + * predicate + */ + public double estimateSelectivity(int field, Predicate.Op op, Field constant) { + // some code goes here + return 1.0; + } + + /** + * return the total number of tuples in this table + * */ + public int totalTuples() { + // some code goes here + return 0; + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Transaction.java b/hw/hw3/starter-code/src/java/simpledb/Transaction.java new file mode 100644 index 0000000000000000000000000000000000000000..8db3b1842540d80700823b0334937096354cbcff --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Transaction.java @@ -0,0 +1,69 @@ +package simpledb; + +import java.io.*; + +/** + * Transaction encapsulates information about the state of + * a transaction and manages transaction commit / abort. 
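+ *
+ * Typical lifecycle, as a sketch:
+ *
+ * <pre>
+ *   Transaction t = new Transaction();
+ *   t.start();
+ *   // ... run queries using t.getId() ...
+ *   t.commit(); // or t.abort()
+ * </pre>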
+ */ + +public class Transaction { + private final TransactionId tid; + volatile boolean started = false; + + public Transaction() { + tid = new TransactionId(); + } + + /** Start the transaction running */ + public void start() { + started = true; + try { + Database.getLogFile().logXactionBegin(tid); + } catch (IOException e) { + e.printStackTrace(); + } + } + + public TransactionId getId() { + return tid; + } + + /** Finish the transaction */ + public void commit() throws IOException { + transactionComplete(false); + } + + /** Finish the transaction */ + public void abort() throws IOException { + transactionComplete(true); + } + + /** Handle the details of transaction commit / abort */ + public void transactionComplete(boolean abort) throws IOException { + + if (started) { + //write commit / abort records + if (abort) { + Database.getLogFile().logAbort(tid); //does rollback too + } else { + //write all the dirty pages for this transaction out + Database.getBufferPool().flushPages(tid); + Database.getLogFile().logCommit(tid); + } + + try { + + Database.getBufferPool().transactionComplete(tid, !abort); // release locks + + } catch (IOException e) { + e.printStackTrace(); + } + + //setting this here means we could possibly write multiple abort records -- OK? + started = false; + } + + } + +} diff --git a/hw/hw3/starter-code/src/java/simpledb/TransactionAbortedException.java b/hw/hw3/starter-code/src/java/simpledb/TransactionAbortedException.java new file mode 100644 index 0000000000000000000000000000000000000000..fb8c38f1bff27265d5708e6e56c8294c5a3c411f --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/TransactionAbortedException.java @@ -0,0 +1,11 @@ +package simpledb; + +import java.lang.Exception; + +/** Exception that is thrown when a transaction has aborted. */ +public class TransactionAbortedException extends Exception { + private static final long serialVersionUID = 1L; + + public TransactionAbortedException() { + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/TransactionId.java b/hw/hw3/starter-code/src/java/simpledb/TransactionId.java new file mode 100644 index 0000000000000000000000000000000000000000..ebdb182a1b8cf14a3fe598398acd3c76cd5c745c --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/TransactionId.java @@ -0,0 +1,31 @@ +package simpledb; + +import java.io.Serializable; +import java.util.concurrent.atomic.AtomicLong; + +/** + * TransactionId is a class that contains the identifier of a transaction. + */ +public class TransactionId implements Serializable { + + private static final long serialVersionUID = 1L; + + static AtomicLong counter = new AtomicLong(0); + final long myid; + + public TransactionId() { + myid = counter.getAndIncrement(); + } + + public long getId() { + return myid; + } + + public boolean equals(Object tid) { + return ((TransactionId) tid).myid == myid; + } + + public int hashCode() { + return (int) myid; + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/Tuple.java b/hw/hw3/starter-code/src/java/simpledb/Tuple.java new file mode 100644 index 0000000000000000000000000000000000000000..2ca624a8704c7cfbdf2bccc4a3042479b4225c0c --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/Tuple.java @@ -0,0 +1,107 @@ +package simpledb; + +import java.io.Serializable; +import java.util.Arrays; +import java.util.Iterator; + +/** + * Tuple maintains information about the contents of a tuple. Tuples have a + * specified schema specified by a TupleDesc object and contain Field objects + * with the data for each field. 
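+ *
+ * A sketch of building a one-column tuple (assuming an INT_TYPE column):
+ *
+ * <pre>
+ *   TupleDesc td = new TupleDesc(new Type[] { Type.INT_TYPE });
+ *   Tuple t = new Tuple(td);
+ *   t.setField(0, new IntField(42));
+ * </pre>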
+ */ +public class Tuple implements Serializable { + + private static final long serialVersionUID = 1L; + + /** + * Create a new tuple with the specified schema (type). + * + * @param td + * the schema of this tuple. It must be a valid TupleDesc + * instance with at least one field. + */ + public Tuple(TupleDesc td) { + // some code goes here + } + + /** + * @return The TupleDesc representing the schema of this tuple. + */ + public TupleDesc getTupleDesc() { + // some code goes here + return null; + } + + /** + * @return The RecordId representing the location of this tuple on disk. May + * be null. + */ + public RecordId getRecordId() { + // some code goes here + return null; + } + + /** + * Set the RecordId information for this tuple. + * + * @param rid + * the new RecordId for this tuple. + */ + public void setRecordId(RecordId rid) { + // some code goes here + } + + /** + * Change the value of the ith field of this tuple. + * + * @param i + * index of the field to change. It must be a valid index. + * @param f + * new value for the field. + */ + public void setField(int i, Field f) { + // some code goes here + } + + /** + * @return the value of the ith field, or null if it has not been set. + * + * @param i + * field index to return. Must be a valid index. + */ + public Field getField(int i) { + // some code goes here + return null; + } + + /** + * Returns the contents of this Tuple as a string. Note that to pass the + * system tests, the format needs to be as follows: + * + * column1\tcolumn2\tcolumn3\t...\tcolumnN\n + * + * where \t is any whitespace, except newline, and \n is a newline + */ + public String toString() { + // some code goes here + throw new UnsupportedOperationException("Implement this"); + } + + /** + * @return + * An iterator which iterates over all the fields of this tuple + * */ + public Iterator<Field> fields() + { + // some code goes here + return null; + } + + /** + * reset the TupleDesc of thi tuple + * */ + public void resetTupleDesc(TupleDesc td) + { + // some code goes here + } +} diff --git a/hw/hw3/starter-code/src/java/simpledb/TupleDesc.java b/hw/hw3/starter-code/src/java/simpledb/TupleDesc.java new file mode 100644 index 0000000000000000000000000000000000000000..c7ba3eec921fed0b1f34e873cd437dad6ccf13b2 --- /dev/null +++ b/hw/hw3/starter-code/src/java/simpledb/TupleDesc.java @@ -0,0 +1,183 @@ +package simpledb; + +import java.io.Serializable; +import java.util.*; + +/** + * TupleDesc describes the schema of a tuple. + */ +public class TupleDesc implements Serializable { + + /** + * A help class to facilitate organizing the information of each field + * */ + public static class TDItem implements Serializable { + + private static final long serialVersionUID = 1L; + + /** + * The type of the field + * */ + public final Type fieldType; + + /** + * The name of the field + * */ + public final String fieldName; + + public TDItem(Type t, String n) { + this.fieldName = n; + this.fieldType = t; + } + + public String toString() { + return fieldName + "(" + fieldType + ")"; + } + } + + /** + * @return + * An iterator which iterates over all the field TDItems + * that are included in this TupleDesc + * */ + public Iterator<TDItem> iterator() { + // some code goes here + return null; + } + + private static final long serialVersionUID = 1L; + + /** + * Create a new TupleDesc with typeAr.length fields with fields of the + * specified types, with associated named fields. + * + * @param typeAr + * array specifying the number of and types of fields in this + * TupleDesc. 
It must contain at least one entry. + * @param fieldAr + * array specifying the names of the fields. Note that names may + * be null. + */ + public TupleDesc(Type[] typeAr, String[] fieldAr) { + // some code goes here + } + + /** + * Constructor. Create a new tuple desc with typeAr.length fields with + * fields of the specified types, with anonymous (unnamed) fields. + * + * @param typeAr + * array specifying the number of and types of fields in this + * TupleDesc. It must contain at least one entry. + */ + public TupleDesc(Type[] typeAr) { + // some code goes here + } + + /** + * @return the number of fields in this TupleDesc + */ + public int numFields() { + // some code goes here + return 0; + } + + /** + * Gets the (possibly null) field name of the ith field of this TupleDesc. + * + * @param i + * index of the field name to return. It must be a valid index. + * @return the name of the ith field + * @throws NoSuchElementException + * if i is not a valid field reference. + */ + public String getFieldName(int i) throws NoSuchElementException { + // some code goes here + return null; + } + + /** + * Gets the type of the ith field of this TupleDesc. + * + * @param i + * The index of the field to get the type of. It must be a valid + * index. + * @return the type of the ith field + * @throws NoSuchElementException + * if i is not a valid field reference. + */ + public Type getFieldType(int i) throws NoSuchElementException { + // some code goes here + return null; + } + + /** + * Find the index of the field with a given name. + * + * @param name + * name of the field. + * @return the index of the field that is first to have the given name. + * @throws NoSuchElementException + * if no field with a matching name is found. + */ + public int fieldNameToIndex(String name) throws NoSuchElementException { + // some code goes here + return 0; + } + + /** + * @return The size (in bytes) of tuples corresponding to this TupleDesc. + * Note that tuples from a given TupleDesc are of a fixed size. + */ + public int getSize() { + // some code goes here + return 0; + } + + /** + * Merge two TupleDescs into one, with td1.numFields + td2.numFields fields, + * with the first td1.numFields coming from td1 and the remaining from td2. + * + * @param td1 + * The TupleDesc with the first fields of the new TupleDesc + * @param td2 + * The TupleDesc with the last fields of the TupleDesc + * @return the new TupleDesc + */ + public static TupleDesc merge(TupleDesc td1, TupleDesc td2) { + // some code goes here + return null; + } + + /** + * Compares the specified object with this TupleDesc for equality. Two + * TupleDescs are considered equal if they are the same size and if the n-th + * type in this TupleDesc is equal to the n-th type in td. + * + * @param o + * the Object to be compared for equality with this TupleDesc. + * @return true if the object is equal to this TupleDesc. + */ + public boolean equals(Object o) { + // some code goes here + return false; + } + + public int hashCode() { + // If you want to use TupleDesc as keys for HashMap, implement this so + // that equal objects have equals hashCode() results + throw new UnsupportedOperationException("unimplemented"); + } + + /** + * Returns a String describing this descriptor. It should be of the form + * "fieldType[0](fieldName[0]), ..., fieldType[M](fieldName[M])", although + * the exact format does not matter. + * + * @return String describing this descriptor. 
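+     *         For example: "INT_TYPE(id), STRING_TYPE(name)".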
+     */
+    public String toString() {
+        // some code goes here
+        return "";
+    }
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/TupleIterator.java b/hw/hw3/starter-code/src/java/simpledb/TupleIterator.java
new file mode 100644
index 0000000000000000000000000000000000000000..496b8071ab191e8c183d57952cfaf8fbe2c199ed
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/TupleIterator.java
@@ -0,0 +1,60 @@
+package simpledb;
+
+import java.util.*;
+
+/**
+ * Implements a DbIterator by wrapping an Iterable<Tuple>.
+ */
+public class TupleIterator implements DbIterator {
+    /**
+     *
+     */
+    private static final long serialVersionUID = 1L;
+    Iterator<Tuple> i = null;
+    TupleDesc td = null;
+    Iterable<Tuple> tuples = null;
+
+    /**
+     * Constructs an iterator from the specified Iterable, and the specified
+     * descriptor.
+     *
+     * @param tuples
+     *            The set of tuples to iterate over
+     */
+    public TupleIterator(TupleDesc td, Iterable<Tuple> tuples) {
+        this.td = td;
+        this.tuples = tuples;
+
+        // check that all tuples have the right TupleDesc
+        for (Tuple t : tuples) {
+            if (!t.getTupleDesc().equals(td))
+                throw new IllegalArgumentException(
+                        "incompatible tuple in tuple set");
+        }
+    }
+
+    public void open() {
+        i = tuples.iterator();
+    }
+
+    public boolean hasNext() {
+        return i.hasNext();
+    }
+
+    public Tuple next() {
+        return i.next();
+    }
+
+    public void rewind() {
+        close();
+        open();
+    }
+
+    public TupleDesc getTupleDesc() {
+        return td;
+    }
+
+    public void close() {
+        i = null;
+    }
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Type.java b/hw/hw3/starter-code/src/java/simpledb/Type.java
new file mode 100644
index 0000000000000000000000000000000000000000..b1f128614eb441d569cc6ebbe80cb29b72e58257
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Type.java
@@ -0,0 +1,63 @@
+package simpledb;
+
+import java.text.ParseException;
+import java.io.*;
+
+/**
+ * Class representing a type in SimpleDB.
+ * Types are static objects defined by this class; hence, the Type
+ * constructor is private.
+ */
+public enum Type implements Serializable {
+    INT_TYPE() {
+        @Override
+        public int getLen() {
+            return 4;
+        }
+
+        @Override
+        public Field parse(DataInputStream dis) throws ParseException {
+            try {
+                return new IntField(dis.readInt());
+            } catch (IOException e) {
+                throw new ParseException("couldn't parse", 0);
+            }
+        }
+
+    }, STRING_TYPE() {
+        @Override
+        public int getLen() {
+            return STRING_LEN+4;
+        }
+
+        @Override
+        public Field parse(DataInputStream dis) throws ParseException {
+            try {
+                int strLen = dis.readInt();
+                byte bs[] = new byte[strLen];
+                // readFully (rather than read) guarantees the whole string is consumed
+                dis.readFully(bs);
+                dis.skipBytes(STRING_LEN-strLen);
+                return new StringField(new String(bs), STRING_LEN);
+            } catch (IOException e) {
+                throw new ParseException("couldn't parse", 0);
+            }
+        }
+    };
+
+    public static final int STRING_LEN = 128;
+
+    /**
+     * @return the number of bytes required to store a field of this type.
+     */
+    public abstract int getLen();
+
+    /**
+     * @return a Field object of the same type as this object that has contents
+     *         read from the specified DataInputStream.
+     * @param dis The input stream to read from
+     * @throws ParseException if the data read from the input stream is not
+     *         of the appropriate type.
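+     *
+     *         For example (sketch of the contract above): INT_TYPE.parse
+     *         consumes exactly 4 bytes and returns an IntField, while
+     *         STRING_TYPE.parse consumes a fixed STRING_LEN+4 byte region
+     *         and returns a StringField.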
+     */
+    public abstract Field parse(DataInputStream dis) throws ParseException;
+
+}
diff --git a/hw/hw3/starter-code/src/java/simpledb/Utility.java b/hw/hw3/starter-code/src/java/simpledb/Utility.java
new file mode 100644
index 0000000000000000000000000000000000000000..5e77d1f45b6d0a88728980f3bee33d477667fae0
--- /dev/null
+++ b/hw/hw3/starter-code/src/java/simpledb/Utility.java
@@ -0,0 +1,157 @@
+package simpledb;
+
+import java.io.*;
+import java.util.ArrayList;
+import java.util.UUID;
+
+/** Helper methods used for testing and implementing random features. */
+public class Utility {
+    /**
+     * @return a Type array of length len populated with Type.INT_TYPE
+     */
+    public static Type[] getTypes(int len) {
+        Type[] types = new Type[len];
+        for (int i = 0; i < len; ++i)
+            types[i] = Type.INT_TYPE;
+        return types;
+    }
+
+    /**
+     * @return a String array of length len where entry i is val with the
+     *         integer i appended (val0, val1, etc.).
+     */
+    public static String[] getStrings(int len, String val) {
+        String[] strings = new String[len];
+        for (int i = 0; i < len; ++i)
+            strings[i] = val + i;
+        return strings;
+    }
+
+    /**
+     * @return a TupleDesc with n fields of type Type.INT_TYPE, each named
+     *         name followed by its index (name0, name1, etc.).
+     */
+    public static TupleDesc getTupleDesc(int n, String name) {
+        return new TupleDesc(getTypes(n), getStrings(n, name));
+    }
+
+    /**
+     * @return a TupleDesc with n fields of type Type.INT_TYPE
+     */
+    public static TupleDesc getTupleDesc(int n) {
+        return new TupleDesc(getTypes(n));
+    }
+
+    /**
+     * @return a Tuple with a single IntField with value n and with
+     *         RecordId(HeapPageId(1,2), 3)
+     */
+    public static Tuple getHeapTuple(int n) {
+        Tuple tup = new Tuple(getTupleDesc(1));
+        tup.setRecordId(new RecordId(new HeapPageId(1, 2), 3));
+        tup.setField(0, new IntField(n));
+        return tup;
+    }
+
+    /**
+     * @return a Tuple with an IntField for every element of tupdata
+     *         and RecordId(HeapPageId(1, 2), 3)
+     */
+    public static Tuple getHeapTuple(int[] tupdata) {
+        Tuple tup = new Tuple(getTupleDesc(tupdata.length));
+        tup.setRecordId(new RecordId(new HeapPageId(1, 2), 3));
+        for (int i = 0; i < tupdata.length; ++i)
+            tup.setField(i, new IntField(tupdata[i]));
+        return tup;
+    }
+
+    /**
+     * @return a Tuple with 'width' IntFields, each with value n, and
+     *         with RecordId(HeapPageId(1, 2), 3)
+     */
+    public static Tuple getHeapTuple(int n, int width) {
+        Tuple tup = new Tuple(getTupleDesc(width));
+        tup.setRecordId(new RecordId(new HeapPageId(1, 2), 3));
+        for (int i = 0; i < width; ++i)
+            tup.setField(i, new IntField(n));
+        return tup;
+    }
+
+    /**
+     * @return a Tuple with 'width' IntFields whose ith field holds
+     *         tupledata[i]. Its RecordId is not set, so the tuple is not
+     *         tied to any particular file.
+     */
+    public static Tuple getTuple(int[] tupledata, int width) {
+        if(tupledata.length != width) {
+            System.out.println("getTuple() called with tuple of the wrong length");
+            System.exit(1);
+        }
+        Tuple tup = new Tuple(getTupleDesc(width));
+        for (int i = 0; i < width; ++i)
+            tup.setField(i, new IntField(tupledata[i]));
+        return tup;
+    }
+
+    /**
+     * A utility method to create a new HeapFile with a single empty page.
+     * If the path already exists, the file is overwritten. The new table
+     * will be added to the Catalog with the specified number of columns
+     * as IntFields.
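+     *
+     *            Example use (hypothetical path):
+     *            HeapFile hf = Utility.createEmptyHeapFile("/tmp/empty.dat", 2);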
+ */ + public static HeapFile createEmptyHeapFile(String path, int cols) + throws IOException { + File f = new File(path); + // touch the file + FileOutputStream fos = new FileOutputStream(f); + fos.write(new byte[0]); + fos.close(); + + HeapFile hf = openHeapFile(cols, f); + HeapPageId pid = new HeapPageId(hf.getId(), 0); + + HeapPage page = null; + try { + page = new HeapPage(pid, HeapPage.createEmptyPageData()); + } catch (IOException e) { + // this should never happen for an empty page; bail; + throw new RuntimeException("failed to create empty page in HeapFile"); + } + + hf.writePage(page); + return hf; + } + + /** Opens a HeapFile and adds it to the catalog. + * + * @param cols number of columns in the table. + * @param f location of the file storing the table. + * @return the opened table. + */ + public static HeapFile openHeapFile(int cols, File f) { + // create the HeapFile and add it to the catalog + TupleDesc td = getTupleDesc(cols); + HeapFile hf = new HeapFile(f, td); + Database.getCatalog().addTable(hf, UUID.randomUUID().toString()); + return hf; + } + + public static HeapFile openHeapFile(int cols, String colPrefix, File f) { + // create the HeapFile and add it to the catalog + TupleDesc td = getTupleDesc(cols, colPrefix); + HeapFile hf = new HeapFile(f, td); + Database.getCatalog().addTable(hf, UUID.randomUUID().toString()); + return hf; + } + + public static String listToString(ArrayList<Integer> list) { + String out = ""; + for (Integer i : list) { + if (out.length() > 0) out += "\t"; + out += i; + } + return out; + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/AggregateTest.java b/hw/hw3/starter-code/test/simpledb/AggregateTest.java new file mode 100644 index 0000000000000000000000000000000000000000..38217d2a9288d78b49791c2ecb5177697e5db319 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/AggregateTest.java @@ -0,0 +1,186 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import junit.framework.JUnit4TestAdapter; + +import org.junit.Before; +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class AggregateTest extends SimpleDbTestBase { + + int width1 = 2; + DbIterator scan1; + DbIterator scan2; + DbIterator scan3; + + DbIterator sum; + DbIterator sumstring; + + DbIterator avg; + DbIterator max; + DbIterator min; + DbIterator count; + + /** + * Initialize each unit test + */ + @Before public void createTupleLists() throws Exception { + this.scan1 = TestUtil.createTupleList(width1, + new int[] { 1, 2, + 1, 4, + 1, 6, + 3, 2, + 3, 4, + 3, 6, + 5, 7 }); + this.scan2 = TestUtil.createTupleList(width1, + new Object[] { 1, "a", + 1, "a", + 1, "a", + 3, "a", + 3, "a", + 3, "a", + 5, "a" }); + this.scan3 = TestUtil.createTupleList(width1, + new Object[] { "a", 2, + "a", 4, + "a", 6, + "b", 2, + "b", 4, + "b", 6, + "c", 7 }); + + this.sum = TestUtil.createTupleList(width1, + new int[] { 1, 12, + 3, 12, + 5, 7 }); + this.sumstring = TestUtil.createTupleList(width1, + new Object[] { "a", 12, + "b", 12, + "c", 7 }); + + this.avg = TestUtil.createTupleList(width1, + new int[] { 1, 4, + 3, 4, + 5, 7 }); + this.min = TestUtil.createTupleList(width1, + new int[] { 1, 2, + 3, 2, + 5, 7 }); + this.max = TestUtil.createTupleList(width1, + new int[] { 1, 6, + 3, 6, + 5, 7 }); + this.count = TestUtil.createTupleList(width1, + new int[] { 1, 3, + 3, 3, + 5, 1 }); + + } + + /** + * Unit test for Aggregate.getTupleDesc() + */ + @Test public 
void getTupleDesc() {
+        Aggregate op = new Aggregate(scan1, 0, 0,
+                Aggregator.Op.MIN);
+        TupleDesc expected = Utility.getTupleDesc(2);
+        TupleDesc actual = op.getTupleDesc();
+        assertEquals(expected, actual);
+    }
+
+    /**
+     * Unit test for Aggregate.rewind()
+     */
+    @Test public void rewind() throws Exception {
+        Aggregate op = new Aggregate(scan1, 1, 0,
+                Aggregator.Op.MIN);
+        op.open();
+        while (op.hasNext()) {
+            assertNotNull(op.next());
+        }
+        assertTrue(TestUtil.checkExhausted(op));
+
+        op.rewind();
+        min.open();
+        TestUtil.matchAllTuples(min, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using a count aggregate with string types
+     */
+    @Test public void countStringAggregate() throws Exception {
+        Aggregate op = new Aggregate(scan2, 1, 0,
+                Aggregator.Op.COUNT);
+        op.open();
+        count.open();
+        TestUtil.matchAllTuples(count, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using a sum aggregate grouped by a
+     * string column
+     */
+    @Test public void sumStringGroupBy() throws Exception {
+        Aggregate op = new Aggregate(scan3, 1, 0,
+                Aggregator.Op.SUM);
+        op.open();
+        sumstring.open();
+        TestUtil.matchAllTuples(sumstring, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using a sum aggregate
+     */
+    @Test public void sumAggregate() throws Exception {
+        Aggregate op = new Aggregate(scan1, 1, 0,
+                Aggregator.Op.SUM);
+        op.open();
+        sum.open();
+        TestUtil.matchAllTuples(sum, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using an avg aggregate
+     */
+    @Test public void avgAggregate() throws Exception {
+        Aggregate op = new Aggregate(scan1, 1, 0,
+                Aggregator.Op.AVG);
+        op.open();
+        avg.open();
+        TestUtil.matchAllTuples(avg, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using a max aggregate
+     */
+    @Test public void maxAggregate() throws Exception {
+        Aggregate op = new Aggregate(scan1, 1, 0,
+                Aggregator.Op.MAX);
+        op.open();
+        max.open();
+        TestUtil.matchAllTuples(max, op);
+    }
+
+    /**
+     * Unit test for Aggregate.getNext() using a min aggregate
+     */
+    @Test public void minAggregate() throws Exception {
+        Aggregate op = new Aggregate(scan1, 1, 0,
+                Aggregator.Op.MIN);
+        op.open();
+        min.open();
+        TestUtil.matchAllTuples(min, op);
+    }
+
+    /**
+     * JUnit suite target
+     */
+    public static junit.framework.Test suite() {
+        return new JUnit4TestAdapter(AggregateTest.class);
+    }
+}
+
diff --git a/hw/hw3/starter-code/test/simpledb/CatalogTest.java b/hw/hw3/starter-code/test/simpledb/CatalogTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..43d058eb3d9bd59bff1883c9e2f1188b21ab6a8a
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/CatalogTest.java
@@ -0,0 +1,79 @@
+package simpledb;
+
+import static org.junit.Assert.assertEquals;
+
+import java.util.NoSuchElementException;
+
+import junit.framework.Assert;
+import junit.framework.JUnit4TestAdapter;
+
+import org.junit.Before;
+import org.junit.Test;
+
+import simpledb.TestUtil.SkeletonFile;
+import simpledb.systemtest.SimpleDbTestBase;
+import simpledb.systemtest.SystemTestUtil;
+
+public class CatalogTest extends SimpleDbTestBase {
+    private static String name = "test";
+    private String nameThisTestRun;
+
+    @Before public void addTables() throws Exception {
+        Database.getCatalog().clear();
+        nameThisTestRun = SystemTestUtil.getUUID();
+        Database.getCatalog().addTable(new SkeletonFile(-1, Utility.getTupleDesc(2)), nameThisTestRun);
+        Database.getCatalog().addTable(new SkeletonFile(-2, Utility.getTupleDesc(2)), name);
+    }
+
+    /**
+     * Unit test for Catalog.getTupleDesc()
+     */
+    @Test public void 
getTupleDesc() throws Exception { + TupleDesc expected = Utility.getTupleDesc(2); + TupleDesc actual = Database.getCatalog().getTupleDesc(-1); + + assertEquals(expected, actual); + } + + /** + * Unit test for Catalog.getTableId() + */ + @Test public void getTableId() { + assertEquals(-2, Database.getCatalog().getTableId(name)); + assertEquals(-1, Database.getCatalog().getTableId(nameThisTestRun)); + + try { + Database.getCatalog().getTableId(null); + Assert.fail("Should not find table with null name"); + } catch (NoSuchElementException e) { + // Expected to get here + } + + try { + Database.getCatalog().getTableId("foo"); + Assert.fail("Should not find table with name foo"); + } catch (NoSuchElementException e) { + // Expected to get here + } + } + + /** + * Unit test for Catalog.getDatabaseFile() + */ + + @Test public void getDatabaseFile() throws Exception { + DbFile f = Database.getCatalog().getDatabaseFile(-1); + + // NOTE(ghuo): we try not to dig too deeply into the DbFile API here; we + // rely on HeapFileTest for that. perform some basic checks. + assertEquals(-1, f.getId()); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(CatalogTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/FilterTest.java b/hw/hw3/starter-code/test/simpledb/FilterTest.java new file mode 100644 index 0000000000000000000000000000000000000000..153fc374a81141af90f3b3dc4dcb093d4afd93a0 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/FilterTest.java @@ -0,0 +1,130 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import junit.framework.JUnit4TestAdapter; + +import org.junit.Before; +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class FilterTest extends SimpleDbTestBase { + + int testWidth = 3; + DbIterator scan; + + /** + * Initialize each unit test + */ + @Before public void setUp() { + this.scan = new TestUtil.MockScan(-5, 5, testWidth); + } + + /** + * Unit test for Filter.getTupleDesc() + */ + @Test public void getTupleDesc() { + Predicate pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(0)); + Filter op = new Filter(pred, scan); + TupleDesc expected = Utility.getTupleDesc(testWidth); + TupleDesc actual = op.getTupleDesc(); + assertEquals(expected, actual); + } + + /** + * Unit test for Filter.rewind() + */ + @Test public void rewind() throws Exception { + Predicate pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(0)); + Filter op = new Filter(pred, scan); + op.open(); + assertTrue(op.hasNext()); + assertNotNull(op.next()); + assertTrue(TestUtil.checkExhausted(op)); + + op.rewind(); + Tuple expected = Utility.getHeapTuple(0, testWidth); + Tuple actual = op.next(); + assertTrue(TestUtil.compareTuples(expected, actual)); + op.close(); + } + + /** + * Unit test for Filter.getNext() using a < predicate that filters + * some tuples + */ + @Test public void filterSomeLessThan() throws Exception { + Predicate pred; + pred = new Predicate(0, Predicate.Op.LESS_THAN, TestUtil.getField(2)); + Filter op = new Filter(pred, scan); + TestUtil.MockScan expectedOut = new TestUtil.MockScan(-5, 2, testWidth); + op.open(); + TestUtil.compareDbIterators(op, expectedOut); + op.close(); + } + + /** + * Unit test for Filter.getNext() using a < predicate that filters + * everything + */ + @Test public void filterAllLessThan() throws Exception { + Predicate pred; + pred = 
new Predicate(0, Predicate.Op.LESS_THAN, TestUtil.getField(-5));
+        Filter op = new Filter(pred, scan);
+        op.open();
+        assertTrue(TestUtil.checkExhausted(op));
+        op.close();
+    }
+
+    /**
+     * Unit test for Filter.getNext() using an = predicate
+     */
+    @Test public void filterEqual() throws Exception {
+        Predicate pred;
+        this.scan = new TestUtil.MockScan(-5, 5, testWidth);
+        pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(-5));
+        Filter op = new Filter(pred, scan);
+        op.open();
+        assertTrue(TestUtil.compareTuples(Utility.getHeapTuple(-5, testWidth),
+                op.next()));
+        op.close();
+
+        this.scan = new TestUtil.MockScan(-5, 5, testWidth);
+        pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(0));
+        op = new Filter(pred, scan);
+        op.open();
+        assertTrue(TestUtil.compareTuples(Utility.getHeapTuple(0, testWidth),
+                op.next()));
+        op.close();
+
+        this.scan = new TestUtil.MockScan(-5, 5, testWidth);
+        pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(4));
+        op = new Filter(pred, scan);
+        op.open();
+        assertTrue(TestUtil.compareTuples(Utility.getHeapTuple(4, testWidth),
+                op.next()));
+        op.close();
+    }
+
+    /**
+     * Unit test for Filter.getNext() using an = predicate passing no tuples
+     */
+    @Test public void filterEqualNoTuples() throws Exception {
+        Predicate pred;
+        pred = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(5));
+        Filter op = new Filter(pred, scan);
+        op.open();
+        assertTrue(TestUtil.checkExhausted(op));
+        op.close();
+    }
+
+    /**
+     * JUnit suite target
+     */
+    public static junit.framework.Test suite() {
+        return new JUnit4TestAdapter(FilterTest.class);
+    }
+}
+
diff --git a/hw/hw3/starter-code/test/simpledb/HeapFileReadTest.java b/hw/hw3/starter-code/test/simpledb/HeapFileReadTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..67df051fa02804c2b086f0dab8433d085c15f325
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/HeapFileReadTest.java
@@ -0,0 +1,196 @@
+package simpledb;
+
+import simpledb.systemtest.SimpleDbTestBase;
+import simpledb.systemtest.SystemTestUtil;
+
+import java.util.*;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import static org.junit.Assert.*;
+import junit.framework.JUnit4TestAdapter;
+
+public class HeapFileReadTest extends SimpleDbTestBase {
+    private HeapFile hf;
+    private TransactionId tid;
+    private TupleDesc td;
+
+    /**
+     * Set up initial resources for each unit test.
+     */
+    @Before
+    public void setUp() throws Exception {
+        hf = SystemTestUtil.createRandomHeapFile(2, 20, null, null);
+        td = Utility.getTupleDesc(2);
+        tid = new TransactionId();
+    }
+
+    @After
+    public void tearDown() throws Exception {
+        Database.getBufferPool().transactionComplete(tid);
+    }
+
+    /**
+     * Unit test for HeapFile.getId()
+     */
+    @Test
+    public void getId() throws Exception {
+        int id = hf.getId();
+
+        // NOTE(ghuo): the value could be anything. test determinism, at least.
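+        // repeated calls on the same HeapFile must return the same id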
+ assertEquals(id, hf.getId()); + assertEquals(id, hf.getId()); + + HeapFile other = SystemTestUtil.createRandomHeapFile(1, 1, null, null); + assertTrue(id != other.getId()); + } + + /** + * Unit test for HeapFile.getTupleDesc() + */ + @Test + public void getTupleDesc() throws Exception { + assertEquals(td, hf.getTupleDesc()); + } + /** + * Unit test for HeapFile.numPages() + */ + @Test + public void numPages() throws Exception { + assertEquals(1, hf.numPages()); + // assertEquals(1, empty.numPages()); + } + + /** + * Unit test for HeapFile.readPage() + */ + @Test + public void readPage() throws Exception { + HeapPageId pid = new HeapPageId(hf.getId(), 0); + HeapPage page = (HeapPage) hf.readPage(pid); + + // NOTE(ghuo): we try not to dig too deeply into the Page API here; we + // rely on HeapPageTest for that. perform some basic checks. + assertEquals(484, page.getNumEmptySlots()); + assertTrue(page.isSlotUsed(1)); + assertFalse(page.isSlotUsed(20)); + } + + @Test + public void readFromFileNotMemoryTest() throws Exception { + ArrayList<ArrayList<Integer>> tuples = new ArrayList<ArrayList<Integer>>(10); + for (int i = 0; i < 10; ++i) { + ArrayList<Integer> tuple = new ArrayList<Integer>(2); + for (int j = 0; j < 2; ++j) { + tuple.add(0); + } + tuples.add(tuple); + } + HeapFileEncoder.convert(tuples, hf.getFile(), BufferPool.PAGE_SIZE, 2); + HeapPageId pid = new HeapPageId(hf.getId(), 0); + HeapPage page = (HeapPage) hf.readPage(pid); + + tuples.clear(); + for (int i = 0; i < 10; ++i) { + ArrayList<Integer> tuple = new ArrayList<Integer>(2); + for (int j = 0; j < 2; ++j) { + tuple.add(1); + } + tuples.add(tuple); + } + HeapFileEncoder.convert(tuples, hf.getFile(), BufferPool.PAGE_SIZE, 2); + HeapPageId pid1 = new HeapPageId(hf.getId(), 0); + HeapPage page1 = (HeapPage) hf.readPage(pid1); + + Iterator<Tuple> it = page.iterator(); + Iterator<Tuple> it1 = page1.iterator(); + while (it.hasNext()) { + Tuple tup = it.next(); + Tuple tup1 = it1.next(); + assertTrue(!tup.toString().equals(tup1.toString())); + } + } + + @Test + public void readTwoPages() throws Exception { + hf = SystemTestUtil.createRandomHeapFile(2, 2000, null, null); + ArrayList<ArrayList<Integer>> tuples = new ArrayList<ArrayList<Integer>>(10); + tuples.clear(); + for (int i = 0; i < 2000; ++i) { + ArrayList<Integer> tuple = new ArrayList<Integer>(2); + for (int j = 0; j < 2; ++j) { + if (i == 0) + tuple.add(0); + else + tuple.add(1); + } + tuples.add(tuple); + } + HeapFileEncoder.convert(tuples, hf.getFile(), BufferPool.PAGE_SIZE, 2); + + HeapPageId pid0 = new HeapPageId(hf.getId(), 0); + HeapPage page0 = (HeapPage) hf.readPage(pid0); + Iterator<Tuple> it0 = page0.iterator(); + Tuple tup0 = it0.next(); + assertTrue(tup0.getField(0).toString().equals("0")); + + HeapPageId pid1 = new HeapPageId(hf.getId(), 1); + HeapPage page1 = (HeapPage) hf.readPage(pid1); + Iterator<Tuple> it1 = page1.iterator(); + Tuple tup1 = it1.next(); + assertTrue(tup1.getField(0).toString().equals("1")); + } + + @Test + public void testIteratorBasic() throws Exception { + HeapFile smallFile = SystemTestUtil.createRandomHeapFile(2, 3, null, + null); + + DbFileIterator it = smallFile.iterator(tid); + // Not open yet + assertFalse(it.hasNext()); + try { + it.next(); + fail("expected exception"); + } catch (NoSuchElementException e) { + } + + it.open(); + int count = 0; + while (it.hasNext()) { + assertNotNull(it.next()); + count += 1; + } + assertEquals(3, count); + it.close(); + } + + @Test + public void testIteratorClose() throws Exception { + // make 
more than 1 page. Previous closed iterator would start fetching + // from page 1. + HeapFile twoPageFile = SystemTestUtil.createRandomHeapFile(2, 520, + null, null); + + DbFileIterator it = twoPageFile.iterator(tid); + it.open(); + assertTrue(it.hasNext()); + it.close(); + try { + it.next(); + fail("expected exception"); + } catch (NoSuchElementException e) { + } + // close twice is harmless + it.close(); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(HeapFileReadTest.class); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/HeapPageIdTest.java b/hw/hw3/starter-code/test/simpledb/HeapPageIdTest.java new file mode 100644 index 0000000000000000000000000000000000000000..4c644211c700e8e75b693ba4386dfade17d063ce --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/HeapPageIdTest.java @@ -0,0 +1,86 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; +import junit.framework.JUnit4TestAdapter; + +import org.junit.Before; +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class HeapPageIdTest extends SimpleDbTestBase { + + private HeapPageId pid; + + @Before public void createPid() { + pid = new HeapPageId(1, 1); + } + + /** + * Unit test for HeapPageId.getTableId() + */ + @Test public void getTableId() { + assertEquals(1, pid.getTableId()); + } + + /** + * Unit test for HeapPageId.pageno() + */ + @Test public void pageno() { + assertEquals(1, pid.pageNumber()); + } + + /** + * Unit test for HeapPageId.hashCode() + */ + @Test public void testHashCode() { + int code1, code2; + + // NOTE(ghuo): the hashCode could be anything. test determinism, + // at least. 
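+        // hashing the same (table id, page number) pair twice must give the
+        // same result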
+ pid = new HeapPageId(1, 1); + code1 = pid.hashCode(); + assertEquals(code1, pid.hashCode()); + assertEquals(code1, pid.hashCode()); + + pid = new HeapPageId(2, 2); + code2 = pid.hashCode(); + assertEquals(code2, pid.hashCode()); + assertEquals(code2, pid.hashCode()); + } + + /** + * Unit test for HeapPageId.equals() + */ + @Test public void equals() { + HeapPageId pid1 = new HeapPageId(1, 1); + HeapPageId pid1Copy = new HeapPageId(1, 1); + HeapPageId pid2 = new HeapPageId(2, 2); + + // .equals() with null should return false + assertFalse(pid1.equals(null)); + + // .equals() with the wrong type should return false + assertFalse(pid1.equals(new Object())); + + assertTrue(pid1.equals(pid1)); + assertTrue(pid1.equals(pid1Copy)); + assertTrue(pid1Copy.equals(pid1)); + assertTrue(pid2.equals(pid2)); + + assertFalse(pid1.equals(pid2)); + assertFalse(pid1Copy.equals(pid2)); + assertFalse(pid2.equals(pid1)); + assertFalse(pid2.equals(pid1Copy)); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(HeapPageIdTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/HeapPageReadTest.java b/hw/hw3/starter-code/test/simpledb/HeapPageReadTest.java new file mode 100644 index 0000000000000000000000000000000000000000..fa08d715d8840263a4301bfda28870c9c295d185 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/HeapPageReadTest.java @@ -0,0 +1,129 @@ +package simpledb; + +import simpledb.TestUtil.SkeletonFile; +import simpledb.systemtest.SimpleDbTestBase; +import simpledb.systemtest.SystemTestUtil; + +import java.io.File; +import java.io.IOException; +import java.util.*; + +import org.junit.Before; +import org.junit.Test; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; +import junit.framework.JUnit4TestAdapter; + +public class HeapPageReadTest extends SimpleDbTestBase { + private HeapPageId pid; + + public static final int[][] EXAMPLE_VALUES = new int[][] { + { 31933, 862 }, + { 29402, 56883 }, + { 1468, 5825 }, + { 17876, 52278 }, + { 6350, 36090 }, + { 34784, 43771 }, + { 28617, 56874 }, + { 19209, 23253 }, + { 56462, 24979 }, + { 51440, 56685 }, + { 3596, 62307 }, + { 45569, 2719 }, + { 22064, 43575 }, + { 42812, 44947 }, + { 22189, 19724 }, + { 33549, 36554 }, + { 9086, 53184 }, + { 42878, 33394 }, + { 62778, 21122 }, + { 17197, 16388 } + }; + + public static final byte[] EXAMPLE_DATA; + static { + // Build the input table + ArrayList<ArrayList<Integer>> table = new ArrayList<ArrayList<Integer>>(); + for (int[] tuple : EXAMPLE_VALUES) { + ArrayList<Integer> listTuple = new ArrayList<Integer>(); + for (int value : tuple) { + listTuple.add(value); + } + table.add(listTuple); + } + + // Convert it to a HeapFile and read in the bytes + try { + File temp = File.createTempFile("table", ".dat"); + temp.deleteOnExit(); + HeapFileEncoder.convert(table, temp, BufferPool.getPageSize(), 2); + EXAMPLE_DATA = TestUtil.readFileBytes(temp.getAbsolutePath()); + } catch (IOException e) { + throw new RuntimeException(e); + } + } + + /** + * Set up initial resources for each unit test. 
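+     *
+     *  (The synthetic page id (-1, -1) refers to the SkeletonFile registered
+     *  under table id -1 below.)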
+ */ + @Before public void addTable() throws Exception { + this.pid = new HeapPageId(-1, -1); + Database.getCatalog().addTable(new SkeletonFile(-1, Utility.getTupleDesc(2)), SystemTestUtil.getUUID()); + } + + /** + * Unit test for HeapPage.getId() + */ + @Test public void getId() throws Exception { + HeapPage page = new HeapPage(pid, EXAMPLE_DATA); + assertEquals(pid, page.getId()); + } + + /** + * Unit test for HeapPage.iterator() + */ + @Test public void testIterator() throws Exception { + HeapPage page = new HeapPage(pid, EXAMPLE_DATA); + Iterator<Tuple> it = page.iterator(); + + int row = 0; + while (it.hasNext()) { + Tuple tup = it.next(); + IntField f0 = (IntField) tup.getField(0); + IntField f1 = (IntField) tup.getField(1); + + assertEquals(EXAMPLE_VALUES[row][0], f0.getValue()); + assertEquals(EXAMPLE_VALUES[row][1], f1.getValue()); + row++; + } + } + + /** + * Unit test for HeapPage.getNumEmptySlots() + */ + @Test public void getNumEmptySlots() throws Exception { + HeapPage page = new HeapPage(pid, EXAMPLE_DATA); + assertEquals(484, page.getNumEmptySlots()); + } + + /** + * Unit test for HeapPage.isSlotUsed() + */ + @Test public void getSlot() throws Exception { + HeapPage page = new HeapPage(pid, EXAMPLE_DATA); + + for (int i = 0; i < 20; ++i) + assertTrue(page.isSlotUsed(i)); + + for (int i = 20; i < 504; ++i) + assertFalse(page.isSlotUsed(i)); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(HeapPageReadTest.class); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/IntegerAggregatorTest.java b/hw/hw3/starter-code/test/simpledb/IntegerAggregatorTest.java new file mode 100644 index 0000000000000000000000000000000000000000..91a932cdcb56d2350081aaad18f0136f1fa9451c --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/IntegerAggregatorTest.java @@ -0,0 +1,188 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; + +import java.util.NoSuchElementException; + +import junit.framework.JUnit4TestAdapter; + +import org.junit.Before; +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class IntegerAggregatorTest extends SimpleDbTestBase { + + int width1 = 2; + DbIterator scan1; + int[][] sum = null; + int[][] min = null; + int[][] max = null; + int[][] avg = null; + + /** + * Initialize each unit test + */ + @Before public void createTupleList() throws Exception { + this.scan1 = TestUtil.createTupleList(width1, + new int[] { 1, 2, + 1, 4, + 1, 6, + 3, 2, + 3, 4, + 3, 6, + 5, 7 }); + + // verify how the results progress after a few merges + this.sum = new int[][] { + { 1, 2 }, + { 1, 6 }, + { 1, 12 }, + { 1, 12, 3, 2 } + }; + + this.min = new int[][] { + { 1, 2 }, + { 1, 2 }, + { 1, 2 }, + { 1, 2, 3, 2 } + }; + + this.max = new int[][] { + { 1, 2 }, + { 1, 4 }, + { 1, 6 }, + { 1, 6, 3, 2 } + }; + + this.avg = new int[][] { + { 1, 2 }, + { 1, 3 }, + { 1, 4 }, + { 1, 4, 3, 2 } + }; + } + + /** + * Test IntegerAggregator.mergeTupleIntoGroup() and iterator() over a sum + */ + @Test public void mergeSum() throws Exception { + scan1.open(); + IntegerAggregator agg = new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.SUM); + + for (int[] step : sum) { + agg.mergeTupleIntoGroup(scan1.next()); + DbIterator it = agg.iterator(); + it.open(); + TestUtil.matchAllTuples(TestUtil.createTupleList(width1, step), it); + } + } + + /** + * Test IntegerAggregator.mergeTupleIntoGroup() and iterator() over a min + */ + @Test public void mergeMin() throws Exception { + scan1.open(); + 
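+        // group by field 0 (an INT field) and take the MIN of field 1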
IntegerAggregator agg = new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.MIN);
+
+        DbIterator it;
+        for (int[] step : min) {
+            agg.mergeTupleIntoGroup(scan1.next());
+            it = agg.iterator();
+            it.open();
+            TestUtil.matchAllTuples(TestUtil.createTupleList(width1, step), it);
+        }
+    }
+
+    /**
+     * Test IntegerAggregator.mergeTupleIntoGroup() and iterator() over a max
+     */
+    @Test public void mergeMax() throws Exception {
+        scan1.open();
+        IntegerAggregator agg = new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.MAX);
+
+        DbIterator it;
+        for (int[] step : max) {
+            agg.mergeTupleIntoGroup(scan1.next());
+            it = agg.iterator();
+            it.open();
+            TestUtil.matchAllTuples(TestUtil.createTupleList(width1, step), it);
+        }
+    }
+
+    /**
+     * Test IntegerAggregator.mergeTupleIntoGroup() and iterator() over an avg
+     */
+    @Test public void mergeAvg() throws Exception {
+        scan1.open();
+        IntegerAggregator agg = new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.AVG);
+
+        DbIterator it;
+        for (int[] step : avg) {
+            agg.mergeTupleIntoGroup(scan1.next());
+            it = agg.iterator();
+            it.open();
+            TestUtil.matchAllTuples(TestUtil.createTupleList(width1, step), it);
+        }
+    }
+
+    /**
+     * Test IntegerAggregator.iterator() for DbIterator behaviour
+     */
+    @Test public void testIterator() throws Exception {
+        // first, populate the aggregator via sum over scan1
+        scan1.open();
+        IntegerAggregator agg = new IntegerAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.SUM);
+        try {
+            while (true)
+                agg.mergeTupleIntoGroup(scan1.next());
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+
+        DbIterator it = agg.iterator();
+        it.open();
+
+        // verify it has three elements
+        int count = 0;
+        try {
+            while (true) {
+                it.next();
+                count++;
+            }
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+        assertEquals(3, count);
+
+        // rewind and try again
+        it.rewind();
+        count = 0;
+        try {
+            while (true) {
+                it.next();
+                count++;
+            }
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+        assertEquals(3, count);
+
+        // close it and check that we don't get anything
+        it.close();
+        try {
+            it.next();
+            // an AssertionError is not an Exception, so the failure below is
+            // not swallowed by the catch clause
+            throw new AssertionError("IntegerAggregator iterator yielded tuple after close");
+        } catch (Exception e) {
+            // explicitly ignored
+        }
+    }
+
+    /**
+     * JUnit suite target
+     */
+    public static junit.framework.Test suite() {
+        return new JUnit4TestAdapter(IntegerAggregatorTest.class);
+    }
+}
+
diff --git a/hw/hw3/starter-code/test/simpledb/JoinPredicateTest.java b/hw/hw3/starter-code/test/simpledb/JoinPredicateTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..1f11322d5fb1714421290812ca56b5dc1de5d415
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/JoinPredicateTest.java
@@ -0,0 +1,66 @@
+package simpledb;
+
+import org.junit.Test;
+
+import simpledb.systemtest.SimpleDbTestBase;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.assertFalse;
+import junit.framework.JUnit4TestAdapter;
+
+public class JoinPredicateTest extends SimpleDbTestBase {
+
+    /**
+     * Unit test for JoinPredicate.filter()
+     */
+    @Test public void filterVaryingVals() {
+        int[] vals = new int[] { -1, 0, 1 };
+
+        for (int i : vals) {
+            JoinPredicate p = new JoinPredicate(0,
+                    Predicate.Op.EQUALS, 0);
+            assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i - 1)));
+            assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i)));
+            assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i + 1)));
+        }
+
+        for (int i : vals) {
+            JoinPredicate p
= new JoinPredicate(0, + Predicate.Op.GREATER_THAN, 0); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i - 1))); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i))); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + JoinPredicate p = new JoinPredicate(0, + Predicate.Op.GREATER_THAN_OR_EQ, 0); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i - 1))); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i))); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + JoinPredicate p = new JoinPredicate(0, + Predicate.Op.LESS_THAN, 0); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i - 1))); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i))); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + JoinPredicate p = new JoinPredicate(0, + Predicate.Op.LESS_THAN_OR_EQ, 0); + assertFalse(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i - 1))); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i))); + assertTrue(p.filter(Utility.getHeapTuple(i), Utility.getHeapTuple(i + 1))); + } + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(JoinPredicateTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/JoinTest.java b/hw/hw3/starter-code/test/simpledb/JoinTest.java new file mode 100644 index 0000000000000000000000000000000000000000..9c406c92278ff3d90bb408a0cd530ff610475c52 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/JoinTest.java @@ -0,0 +1,115 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import junit.framework.JUnit4TestAdapter; + +import org.junit.Before; +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class JoinTest extends SimpleDbTestBase { + + int width1 = 2; + int width2 = 3; + DbIterator scan1; + DbIterator scan2; + DbIterator eqJoin; + DbIterator gtJoin; + + /** + * Initialize each unit test + */ + @Before public void createTupleLists() throws Exception { + this.scan1 = TestUtil.createTupleList(width1, + new int[] { 1, 2, + 3, 4, + 5, 6, + 7, 8 }); + this.scan2 = TestUtil.createTupleList(width2, + new int[] { 1, 2, 3, + 2, 3, 4, + 3, 4, 5, + 4, 5, 6, + 5, 6, 7 }); + this.eqJoin = TestUtil.createTupleList(width1 + width2, + new int[] { 1, 2, 1, 2, 3, + 3, 4, 3, 4, 5, + 5, 6, 5, 6, 7 }); + this.gtJoin = TestUtil.createTupleList(width1 + width2, + new int[] { + 3, 4, 1, 2, 3, // 1, 2 < 3 + 3, 4, 2, 3, 4, + 5, 6, 1, 2, 3, // 1, 2, 3, 4 < 5 + 5, 6, 2, 3, 4, + 5, 6, 3, 4, 5, + 5, 6, 4, 5, 6, + 7, 8, 1, 2, 3, // 1, 2, 3, 4, 5 < 7 + 7, 8, 2, 3, 4, + 7, 8, 3, 4, 5, + 7, 8, 4, 5, 6, + 7, 8, 5, 6, 7 }); + } + + /** + * Unit test for Join.getTupleDesc() + */ + @Test public void getTupleDesc() { + JoinPredicate pred = new JoinPredicate(0, Predicate.Op.EQUALS, 0); + Join op = new Join(pred, scan1, scan2); + TupleDesc expected = Utility.getTupleDesc(width1 + width2); + TupleDesc actual = op.getTupleDesc(); + assertEquals(expected, actual); + } + + /** + * Unit test for Join.rewind() + */ + @Test public void rewind() throws Exception { + JoinPredicate pred = new JoinPredicate(0, Predicate.Op.EQUALS, 0); + Join op = new Join(pred, scan1, scan2); + op.open(); + while 
(op.hasNext()) { + assertNotNull(op.next()); + } + assertTrue(TestUtil.checkExhausted(op)); + op.rewind(); + + eqJoin.open(); + Tuple expected = eqJoin.next(); + Tuple actual = op.next(); + assertTrue(TestUtil.compareTuples(expected, actual)); + } + + /** + * Unit test for Join.getNext() using a > predicate + */ + @Test public void gtJoin() throws Exception { + JoinPredicate pred = new JoinPredicate(0, Predicate.Op.GREATER_THAN, 0); + Join op = new Join(pred, scan1, scan2); + op.open(); + gtJoin.open(); + TestUtil.matchAllTuples(gtJoin, op); + } + + /** + * Unit test for Join.getNext() using an = predicate + */ + @Test public void eqJoin() throws Exception { + JoinPredicate pred = new JoinPredicate(0, Predicate.Op.EQUALS, 0); + Join op = new Join(pred, scan1, scan2); + op.open(); + eqJoin.open(); + TestUtil.matchAllTuples(eqJoin, op); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(JoinTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/PredicateTest.java b/hw/hw3/starter-code/test/simpledb/PredicateTest.java new file mode 100644 index 0000000000000000000000000000000000000000..bbacf79a3758a45bc947f7f1c41a06acb09cbe5d --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/PredicateTest.java @@ -0,0 +1,65 @@ +package simpledb; + +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; +import static org.junit.Assert.assertTrue; +import static org.junit.Assert.assertFalse; +import junit.framework.JUnit4TestAdapter; + +public class PredicateTest extends SimpleDbTestBase{ + + /** + * Unit test for Predicate.filter() + */ + @Test public void filter() { + int[] vals = new int[] { -1, 0, 1 }; + + for (int i : vals) { + Predicate p = new Predicate(0, Predicate.Op.EQUALS, TestUtil.getField(i)); + assertFalse(p.filter(Utility.getHeapTuple(i - 1))); + assertTrue(p.filter(Utility.getHeapTuple(i))); + assertFalse(p.filter(Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + Predicate p = new Predicate(0, Predicate.Op.GREATER_THAN, + TestUtil.getField(i)); + assertFalse(p.filter(Utility.getHeapTuple(i - 1))); + assertFalse(p.filter(Utility.getHeapTuple(i))); + assertTrue(p.filter(Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + Predicate p = new Predicate(0, Predicate.Op.GREATER_THAN_OR_EQ, + TestUtil.getField(i)); + assertFalse(p.filter(Utility.getHeapTuple(i - 1))); + assertTrue(p.filter(Utility.getHeapTuple(i))); + assertTrue(p.filter(Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + Predicate p = new Predicate(0, Predicate.Op.LESS_THAN, + TestUtil.getField(i)); + assertTrue(p.filter(Utility.getHeapTuple(i - 1))); + assertFalse(p.filter(Utility.getHeapTuple(i))); + assertFalse(p.filter(Utility.getHeapTuple(i + 1))); + } + + for (int i : vals) { + Predicate p = new Predicate(0, Predicate.Op.LESS_THAN_OR_EQ, + TestUtil.getField(i)); + assertTrue(p.filter(Utility.getHeapTuple(i - 1))); + assertTrue(p.filter(Utility.getHeapTuple(i))); + assertFalse(p.filter(Utility.getHeapTuple(i + 1))); + } + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(PredicateTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/RecordIdTest.java b/hw/hw3/starter-code/test/simpledb/RecordIdTest.java new file mode 100644 index 0000000000000000000000000000000000000000..660655153577ffb1f459a0d0c7868383c94e5b04 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/RecordIdTest.java @@ -0,0 +1,72 @@ +package simpledb; + 
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import junit.framework.JUnit4TestAdapter;
+
+import org.junit.Before;
+import org.junit.Test;
+
+import simpledb.systemtest.SimpleDbTestBase;
+
+public class RecordIdTest extends SimpleDbTestBase {
+
+    private static RecordId hrid;
+    private static RecordId hrid2;
+    private static RecordId hrid3;
+    private static RecordId hrid4;
+
+    @Before public void createPids() {
+        HeapPageId hpid = new HeapPageId(-1, 2);
+        HeapPageId hpid2 = new HeapPageId(-1, 2);
+        HeapPageId hpid3 = new HeapPageId(-2, 2);
+        hrid = new RecordId(hpid, 3);
+        hrid2 = new RecordId(hpid2, 3);
+        hrid3 = new RecordId(hpid, 4);
+        hrid4 = new RecordId(hpid3, 3);
+
+    }
+
+    /**
+     * Unit test for RecordId.getPageId()
+     */
+    @Test public void getPageId() {
+        HeapPageId hpid = new HeapPageId(-1, 2);
+        assertEquals(hpid, hrid.getPageId());
+
+    }
+
+    /**
+     * Unit test for RecordId.tupleno()
+     */
+    @Test public void tupleno() {
+        assertEquals(3, hrid.tupleno());
+    }
+
+    /**
+     * Unit test for RecordId.equals()
+     */
+    @Test public void equals() {
+        assertEquals(hrid, hrid2);
+        assertEquals(hrid2, hrid);
+        assertFalse(hrid.equals(hrid3));
+        assertFalse(hrid3.equals(hrid));
+        assertFalse(hrid2.equals(hrid4));
+        assertFalse(hrid4.equals(hrid2));
+    }
+
+    /**
+     * Unit test for RecordId.hashCode()
+     */
+    @Test public void hCode() {
+        assertEquals(hrid.hashCode(), hrid2.hashCode());
+    }
+
+    /**
+     * JUnit suite target
+     */
+    public static junit.framework.Test suite() {
+        return new JUnit4TestAdapter(RecordIdTest.class);
+    }
+}
+
diff --git a/hw/hw3/starter-code/test/simpledb/StringAggregatorTest.java b/hw/hw3/starter-code/test/simpledb/StringAggregatorTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..9615236d3e4e902a77c6331b168dd098eaa3b01d
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/StringAggregatorTest.java
@@ -0,0 +1,115 @@
+package simpledb;
+
+import java.util.*;
+
+import org.junit.Before;
+import org.junit.Test;
+
+import simpledb.systemtest.SimpleDbTestBase;
+import static org.junit.Assert.assertEquals;
+import junit.framework.JUnit4TestAdapter;
+
+public class StringAggregatorTest extends SimpleDbTestBase {
+
+    int width1 = 2;
+    DbIterator scan1;
+    int[][] count = null;
+
+    /**
+     * Initialize each unit test
+     */
+    @Before public void createTupleList() throws Exception {
+        this.scan1 = TestUtil.createTupleList(width1,
+                new Object[] { 1, "a",
+                        1, "b",
+                        1, "c",
+                        3, "d",
+                        3, "e",
+                        3, "f",
+                        5, "g" });
+
+        // verify how the results progress after a few merges
+        this.count = new int[][] {
+                { 1, 1 },
+                { 1, 2 },
+                { 1, 3 },
+                { 1, 3, 3, 1 }
+        };
+
+    }
+
+    /**
+     * Test StringAggregator.mergeTupleIntoGroup() and iterator() over a COUNT
+     */
+    @Test public void mergeCount() throws Exception {
+        scan1.open();
+        StringAggregator agg = new StringAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.COUNT);
+
+        for (int[] step : count) {
+            agg.mergeTupleIntoGroup(scan1.next());
+            DbIterator it = agg.iterator();
+            it.open();
+            TestUtil.matchAllTuples(TestUtil.createTupleList(width1, step), it);
+        }
+    }
+
+    /**
+     * Test StringAggregator.iterator() for DbIterator behaviour
+     */
+    @Test public void testIterator() throws Exception {
+        // first, populate the aggregator via count over scan1
+        scan1.open();
+        StringAggregator agg = new StringAggregator(0, Type.INT_TYPE, 1, Aggregator.Op.COUNT);
+        try {
+            while (true)
+                agg.mergeTupleIntoGroup(scan1.next());
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+
+        DbIterator it
= agg.iterator();
+        it.open();
+
+        // verify it has three elements
+        int count = 0;
+        try {
+            while (true) {
+                it.next();
+                count++;
+            }
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+        assertEquals(3, count);
+
+        // rewind and try again
+        it.rewind();
+        count = 0;
+        try {
+            while (true) {
+                it.next();
+                count++;
+            }
+        } catch (NoSuchElementException e) {
+            // explicitly ignored
+        }
+        assertEquals(3, count);
+
+        // close it and check that we don't get anything
+        it.close();
+        try {
+            it.next();
+            // an AssertionError is not an Exception, so the failure below is
+            // not swallowed by the catch clause
+            throw new AssertionError("StringAggregator iterator yielded tuple after close");
+        } catch (Exception e) {
+            // explicitly ignored
+        }
+    }
+
+    /**
+     * JUnit suite target
+     */
+    public static junit.framework.Test suite() {
+        return new JUnit4TestAdapter(StringAggregatorTest.class);
+    }
+}
+
diff --git a/hw/hw3/starter-code/test/simpledb/TestUtil.java b/hw/hw3/starter-code/test/simpledb/TestUtil.java
new file mode 100644
index 0000000000000000000000000000000000000000..48c75856e190ad90db760ee1e75f28f6c23e18cc
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/TestUtil.java
@@ -0,0 +1,396 @@
+package simpledb;
+
+import java.io.*;
+import java.util.*;
+
+import static org.junit.Assert.*;
+
+public class TestUtil {
+    /**
+     * @return an IntField with value n
+     */
+    public static Field getField(int n) {
+        return new IntField(n);
+    }
+
+    /**
+     * @return a DbIterator over a list of tuples constructed over the data
+     *         provided in the constructor. This iterator is already open.
+     * @param width the number of fields in each tuple
+     * @param tupdata an array such that the ith field of the jth tuple lives
+     *        in slot j * width + i
+     * @require tupdata.length % width == 0
+     * @throws DbException if we encounter an error creating the
+     *         TupleIterator
+     */
+    public static TupleIterator createTupleList(int width, int[] tupdata) {
+        int i = 0;
+        ArrayList<Tuple> tuplist = new ArrayList<Tuple>();
+        while (i < tupdata.length) {
+            Tuple tup = new Tuple(Utility.getTupleDesc(width));
+            for (int j = 0; j < width; ++j)
+                tup.setField(j, getField(tupdata[i++]));
+            tuplist.add(tup);
+        }
+
+        TupleIterator result = new TupleIterator(Utility.getTupleDesc(width), tuplist);
+        result.open();
+        return result;
+    }
+
+    /**
+     * @return a DbIterator over a list of tuples constructed over the data
+     *         provided in the constructor. This iterator is already open.
+     * @param width the number of fields in each tuple
+     * @param tupdata an array such that the ith field of the jth tuple lives
+     *        in slot j * width + i. Objects can be strings or ints; tuples
+     *        must all be of the same type.
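+     *        For example (illustrative), width=2 with {1, "a", 3, "b"} yields
+     *        the two tuples (1, "a") and (3, "b").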
* @require tupdata.length % width == 0
+     * @throws DbException if we encounter an error creating the
+     *         TupleIterator
+     */
+    public static TupleIterator createTupleList(int width, Object[] tupdata) {
+        ArrayList<Tuple> tuplist = new ArrayList<Tuple>();
+        TupleDesc td;
+        Type[] types = new Type[width];
+        int i = 0;
+        for (int j = 0; j < width; j++) {
+            if (tupdata[j] instanceof String) {
+                types[j] = Type.STRING_TYPE;
+            }
+            if (tupdata[j] instanceof Integer) {
+                types[j] = Type.INT_TYPE;
+            }
+        }
+        td = new TupleDesc(types);
+
+        while (i < tupdata.length) {
+            Tuple tup = new Tuple(td);
+            for (int j = 0; j < width; j++) {
+                Field f;
+                Object t = tupdata[i++];
+                if (t instanceof String)
+                    f = new StringField((String)t, Type.STRING_LEN);
+                else
+                    f = new IntField((Integer)t);
+
+                tup.setField(j, f);
+            }
+            tuplist.add(tup);
+        }
+
+        TupleIterator result = new TupleIterator(td, tuplist);
+        result.open();
+        return result;
+    }
+
+    /**
+     * @return true iff the tuples have the same number of fields and
+     *         corresponding fields in the two Tuples are all equal.
+     */
+    public static boolean compareTuples(Tuple t1, Tuple t2) {
+        if (t1.getTupleDesc().numFields() != t2.getTupleDesc().numFields())
+            return false;
+
+        for (int i = 0; i < t1.getTupleDesc().numFields(); ++i) {
+            if (!(t1.getTupleDesc().getFieldType(i).equals(t2.getTupleDesc().getFieldType(i))))
+                return false;
+            if (!(t1.getField(i).equals(t2.getField(i))))
+                return false;
+        }
+
+        return true;
+    }
+
+    /**
+     * Check to see if the DbIterators have the same number of tuples and
+     * each tuple pair in parallel iteration satisfies compareTuples.
+     * If not, fail with an assertion.
+     */
+    public static void compareDbIterators(DbIterator expected, DbIterator actual)
+            throws DbException, TransactionAbortedException {
+        while (expected.hasNext()) {
+            assertTrue(actual.hasNext());
+
+            Tuple expectedTup = expected.next();
+            Tuple actualTup = actual.next();
+            assertTrue(compareTuples(expectedTup, actualTup));
+        }
+        // Both must now be exhausted
+        assertFalse(expected.hasNext());
+        assertFalse(actual.hasNext());
+    }
+
+    /**
+     * Check to see if every tuple in expected matches <b>some</b> tuple
+     * in actual via compareTuples. Note that actual may be a superset.
+     * If no match is found, throw a RuntimeException.
+     */
+    public static void matchAllTuples(DbIterator expected, DbIterator actual) throws
+            DbException, TransactionAbortedException {
+        // TODO(ghuo): this n^2 set comparison is kind of dumb, but we haven't
+        // implemented hashCode or equals for tuples.
+        boolean matched = false;
+        while (expected.hasNext()) {
+            Tuple expectedTup = expected.next();
+            matched = false;
+            actual.rewind();
+
+            while (actual.hasNext()) {
+                Tuple next = actual.next();
+                if (compareTuples(expectedTup, next)) {
+                    matched = true;
+                    break;
+                }
+            }
+
+            if (!matched) {
+                throw new RuntimeException("expected tuple not found: " + expectedTup);
+            }
+        }
+    }
+
+    /**
+     * Verifies that the DbIterator has been exhausted of all elements.
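+     *
+     * @return true iff hasNext() reports false and next() throws
+     *         NoSuchElementException; false otherwise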
+     */
+    public static boolean checkExhausted(DbIterator it)
+            throws TransactionAbortedException, DbException {
+
+        if (it.hasNext()) return false;
+
+        try {
+            Tuple t = it.next();
+            System.out.println("Got unexpected tuple: " + t);
+            return false;
+        } catch (NoSuchElementException e) {
+            return true;
+        }
+    }
+
+    /**
+     * @return a byte array containing the contents of the file 'path'
+     */
+    public static byte[] readFileBytes(String path) throws IOException {
+        File f = new File(path);
+        InputStream is = new FileInputStream(f);
+        byte[] buf = new byte[(int) f.length()];
+
+        int offset = 0;
+        int count = 0;
+        while (offset < buf.length
+                && (count = is.read(buf, offset, buf.length - offset)) >= 0) {
+            offset += count;
+        }
+
+        // check that we grabbed the entire file
+        if (offset < buf.length)
+            throw new IOException("failed to read test data");
+
+        // Close the input stream and return bytes
+        is.close();
+        return buf;
+    }
+
+    /**
+     * Stub DbFile class for unit testing.
+     */
+    public static class SkeletonFile implements DbFile {
+        private int tableid;
+        private TupleDesc td;
+
+        public SkeletonFile(int tableid, TupleDesc td) {
+            this.tableid = tableid;
+            this.td = td;
+        }
+
+        public Page readPage(PageId id) throws NoSuchElementException {
+            throw new RuntimeException("not implemented");
+        }
+
+        public int numPages() {
+            throw new RuntimeException("not implemented");
+        }
+
+        public void writePage(Page p) throws IOException {
+            throw new RuntimeException("not implemented");
+        }
+
+        public ArrayList<Page> insertTuple(TransactionId tid, Tuple t)
+                throws DbException, IOException, TransactionAbortedException {
+            throw new RuntimeException("not implemented");
+        }
+
+        public ArrayList<Page> deleteTuple(TransactionId tid, Tuple t)
+                throws DbException, TransactionAbortedException {
+            throw new RuntimeException("not implemented");
+        }
+
+        public int bytesPerPage() {
+            throw new RuntimeException("not implemented");
+        }
+
+        public int getId() {
+            return tableid;
+        }
+
+        public DbFileIterator iterator(TransactionId tid) {
+            throw new RuntimeException("not implemented");
+        }
+
+        public TupleDesc getTupleDesc() {
+            return td;
+        }
+    }
+
+    /**
+     * Mock SeqScan class for unit testing.
+     */
+    public static class MockScan implements DbIterator {
+        private int cur, low, high, width;
+
+        /**
+         * Creates a fake SeqScan that returns tuples sequentially with 'width'
+         * fields, each with the same value, that increases from low (inclusive)
+         * to high (exclusive) over getNext calls.
+         */
+        public MockScan(int low, int high, int width) {
+            this.low = low;
+            this.high = high;
+            this.width = width;
+            this.cur = low;
+        }
+
+        public void open() {
+        }
+
+        public void close() {
+        }
+
+        public void rewind() {
+            cur = low;
+        }
+
+        public TupleDesc getTupleDesc() {
+            return Utility.getTupleDesc(width);
+        }
+
+        protected Tuple readNext() {
+            if (cur >= high) return null;
+
+            Tuple tup = new Tuple(getTupleDesc());
+            for (int i = 0; i < width; ++i)
+                tup.setField(i, new IntField(cur));
+            cur++;
+            return tup;
+        }
+
+        public boolean hasNext() throws DbException, TransactionAbortedException {
+            if (cur >= high) return false;
+            return true;
+        }
+
+        public Tuple next() throws DbException, TransactionAbortedException, NoSuchElementException {
+            if (cur >= high) throw new NoSuchElementException();
+            Tuple tup = new Tuple(getTupleDesc());
+            for (int i = 0; i < width; ++i)
+                tup.setField(i, new IntField(cur));
+            cur++;
+            return tup;
+        }
+    }
+
+    /**
+     * Helper class that attempts to acquire a lock on a given page in a new
+     * thread.
+     *
+     * @return a handle to the Thread that will attempt lock acquisition after it
+     *         has been started
+     */
+    static class LockGrabber extends Thread {
+
+        TransactionId tid;
+        PageId pid;
+        Permissions perm;
+        boolean acquired;
+        Exception error;
+        Object alock;
+        Object elock;
+
+        /**
+         * @param tid the transaction on whose behalf we want to acquire the lock
+         * @param pid the page over which we want to acquire the lock
+         * @param perm the desired lock permissions
+         */
+        public LockGrabber(TransactionId tid, PageId pid, Permissions perm) {
+            this.tid = tid;
+            this.pid = pid;
+            this.perm = perm;
+            this.acquired = false;
+            this.error = null;
+            this.alock = new Object();
+            this.elock = new Object();
+        }
+
+        public void run() {
+            try {
+                Database.getBufferPool().getPage(tid, pid, perm);
+                synchronized(alock) {
+                    acquired = true;
+                }
+            } catch (Exception e) {
+                e.printStackTrace();
+                synchronized(elock) {
+                    error = e;
+                }
+
+                try {
+                    Database.getBufferPool().transactionComplete(tid, false);
+                } catch (java.io.IOException e2) {
+                    e2.printStackTrace();
+                }
+            }
+        }
+
+        /**
+         * @return true if we successfully acquired the specified lock
+         */
+        public boolean acquired() {
+            synchronized(alock) {
+                return acquired;
+            }
+        }
+
+        /**
+         * @return an Exception instance if one occurred during lock acquisition;
+         *         null otherwise
+         */
+        public Exception getError() {
+            synchronized(elock) {
+                return error;
+            }
+        }
+    }
+
+    /** JUnit fixture that creates a heap file and cleans it up afterward. */
+    public static abstract class CreateHeapFile {
+        protected CreateHeapFile() {
+            try {
+                emptyFile = File.createTempFile("empty", ".dat");
+            } catch (IOException e) {
+                throw new RuntimeException(e);
+            }
+            emptyFile.deleteOnExit();
+        }
+
+        protected void setUp() throws Exception {
+            try {
+                Database.reset();
+                empty = Utility.createEmptyHeapFile(emptyFile.getAbsolutePath(), 2);
+            } catch (IOException e) {
+                throw new RuntimeException(e);
+            }
+        }
+
+        protected HeapFile empty;
+        private final File emptyFile;
+    }
+}
diff --git a/hw/hw3/starter-code/test/simpledb/TupleDescTest.java b/hw/hw3/starter-code/test/simpledb/TupleDescTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..1cbe39601908e59495e68ac47160af10adfb9b9b
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/TupleDescTest.java
@@ -0,0 +1,177 @@
+package simpledb;
+
+import java.util.NoSuchElementException;
+
+import org.junit.Test;
+
+import simpledb.systemtest.SimpleDbTestBase;
+
+import static org.junit.Assert.*;
+import junit.framework.Assert;
+import junit.framework.JUnit4TestAdapter;
+
+public class TupleDescTest extends SimpleDbTestBase {
+
+    /**
+     * Unit test for TupleDesc.combine()
+     */
+    @Test public void combine() {
+        TupleDesc td1, td2, td3;
+
+        td1 = Utility.getTupleDesc(1, "td1");
+        td2 = Utility.getTupleDesc(2, "td2");
+
+        // test td1.combine(td2)
+        td3 = TupleDesc.merge(td1, td2);
+        assertEquals(3, td3.numFields());
+        assertEquals(3 * Type.INT_TYPE.getLen(), td3.getSize());
+        for (int i = 0; i < 3; ++i)
+            assertEquals(Type.INT_TYPE, td3.getFieldType(i));
+        assertEquals(combinedStringArrays(td1, td2, td3), true);
+
+        // test td2.combine(td1)
+        td3 = TupleDesc.merge(td2, td1);
+        assertEquals(3, td3.numFields());
+        assertEquals(3 * Type.INT_TYPE.getLen(), td3.getSize());
+        for (int i = 0; i < 3; ++i)
+            assertEquals(Type.INT_TYPE, td3.getFieldType(i));
+        assertEquals(combinedStringArrays(td2, td1, td3), true);
+
+        // test td2.combine(td2)
+        td3 = TupleDesc.merge(td2, td2);
+        assertEquals(4, td3.numFields());
+
+        assertEquals(4 * Type.INT_TYPE.getLen(), td3.getSize());
+        for (int i = 0; i < 4; ++i)
+            assertEquals(Type.INT_TYPE, td3.getFieldType(i));
+        assertEquals(combinedStringArrays(td2, td2, td3), true);
+    }
+
+    /**
+     * Checks that combined's field names are td1's field names followed by
+     * td2's field names.
+     */
+    private boolean combinedStringArrays(TupleDesc td1, TupleDesc td2, TupleDesc combined) {
+        for (int i = 0; i < td1.numFields(); i++) {
+            if (!(((td1.getFieldName(i) == null) && (combined.getFieldName(i) == null)) ||
+                  td1.getFieldName(i).equals(combined.getFieldName(i)))) {
+                return false;
+            }
+        }
+
+        for (int i = td1.numFields(); i < td1.numFields() + td2.numFields(); i++) {
+            if (!(((td2.getFieldName(i-td1.numFields()) == null) && (combined.getFieldName(i) == null)) ||
+                  td2.getFieldName(i-td1.numFields()).equals(combined.getFieldName(i)))) {
+                return false;
+            }
+        }
+
+        return true;
+    }
+
+    /**
+     * Unit test for TupleDesc.getFieldType()
+     */
+    @Test public void getType() {
+        int[] lengths = new int[] { 1, 2, 1000 };
+
+        for (int len: lengths) {
+            TupleDesc td = Utility.getTupleDesc(len);
+            for (int i = 0; i < len; ++i)
+                assertEquals(Type.INT_TYPE, td.getFieldType(i));
+        }
+    }
+
+    /**
+     * Unit test for TupleDesc.fieldNameToIndex()
+     */
+    @Test public void nameToId() {
+        int[] lengths = new int[] { 1, 2, 1000 };
+        String prefix = "test";
+
+        for (int len: lengths) {
+            // Make sure you retrieve well-named fields
+            TupleDesc td = Utility.getTupleDesc(len, prefix);
+            for (int i = 0; i < len; ++i) {
+                assertEquals(i, td.fieldNameToIndex(prefix + i));
+            }
+
+            // Make sure you throw exception for non-existent fields
+            try {
+                td.fieldNameToIndex("foo");
+                Assert.fail("foo is not a valid field name");
+            } catch (NoSuchElementException e) {
+                // expected to get here
+            }
+
+            // Make sure you throw exception for null searches
+            try {
+                td.fieldNameToIndex(null);
+                Assert.fail("null is not a valid field name");
+            } catch (NoSuchElementException e) {
+                // expected to get here
+            }
+
+            // Make sure you throw exception when all field names are null
+            td = Utility.getTupleDesc(len);
+            try {
+                td.fieldNameToIndex(prefix);
+                Assert.fail("no fields are named, so you can't find it");
+            } catch (NoSuchElementException e) {
+                // expected to get here
+            }
+        }
+    }
+
+    /**
+     * Unit test for TupleDesc.getSize()
+     */
+    @Test public void getSize() {
+        int[] lengths = new int[] { 1, 2, 1000 };
+
+        for (int len: lengths) {
+            TupleDesc td = Utility.getTupleDesc(len);
+            assertEquals(len * Type.INT_TYPE.getLen(), td.getSize());
+        }
+    }
+
+    /**
+     * Unit test for TupleDesc.numFields()
+     */
+    @Test public void numFields() {
+        int[] lengths = new int[] { 1, 2, 1000 };
+
+        for (int len : lengths) {
+            TupleDesc td = Utility.getTupleDesc(len);
+            assertEquals(len, td.numFields());
+        }
+    }
+
+    @Test public void testEquals() {
+        TupleDesc singleInt = new TupleDesc(new Type[]{Type.INT_TYPE});
+        TupleDesc singleInt2 = new TupleDesc(new Type[]{Type.INT_TYPE});
+        TupleDesc intString = new TupleDesc(new Type[]{Type.INT_TYPE, Type.STRING_TYPE});
+
+        // .equals() with null should return false
+        assertFalse(singleInt.equals(null));
+
+        // .equals() with the wrong type should return false
+        assertFalse(singleInt.equals(new Object()));
+
+        assertTrue(singleInt.equals(singleInt));
+        assertTrue(singleInt.equals(singleInt2));
+        assertTrue(singleInt2.equals(singleInt));
+        assertTrue(intString.equals(intString));
+
+        assertFalse(singleInt.equals(intString));
+        assertFalse(singleInt2.equals(intString));
+        assertFalse(intString.equals(singleInt));
+
assertFalse(intString.equals(singleInt2)); + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(TupleDescTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/TupleTest.java b/hw/hw3/starter-code/test/simpledb/TupleTest.java new file mode 100644 index 0000000000000000000000000000000000000000..e3bf6aef057041c7ba22104a5b55aecbbbd99512 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/TupleTest.java @@ -0,0 +1,67 @@ +package simpledb; + +import static org.junit.Assert.assertEquals; +import junit.framework.JUnit4TestAdapter; + +import org.junit.Test; + +import simpledb.systemtest.SimpleDbTestBase; + +public class TupleTest extends SimpleDbTestBase { + + /** + * Unit test for Tuple.getField() and Tuple.setField() + */ + @Test public void modifyFields() { + TupleDesc td = Utility.getTupleDesc(2); + + Tuple tup = new Tuple(td); + tup.setField(0, new IntField(-1)); + tup.setField(1, new IntField(0)); + + assertEquals(new IntField(-1), tup.getField(0)); + assertEquals(new IntField(0), tup.getField(1)); + + tup.setField(0, new IntField(1)); + tup.setField(1, new IntField(37)); + + assertEquals(new IntField(1), tup.getField(0)); + assertEquals(new IntField(37), tup.getField(1)); + } + + /** + * Unit test for Tuple.getTupleDesc() + */ + @Test public void getTupleDesc() { + TupleDesc td = Utility.getTupleDesc(5); + Tuple tup = new Tuple(td); + assertEquals(td, tup.getTupleDesc()); + } + + /** + * Unit test for Tuple.getRecordId() and Tuple.setRecordId() + */ + @Test public void modifyRecordId() { + Tuple tup1 = new Tuple(Utility.getTupleDesc(1)); + HeapPageId pid1 = new HeapPageId(0,0); + RecordId rid1 = new RecordId(pid1, 0); + tup1.setRecordId(rid1); + + try { + assertEquals(rid1, tup1.getRecordId()); + } catch (java.lang.UnsupportedOperationException e) { + //rethrow the exception with an explanation + throw new UnsupportedOperationException("modifyRecordId() test failed due to " + + "RecordId.equals() not being implemented. 
This is not required for Lab 1, " + + "but should pass when you do implement the RecordId class."); + } + } + + /** + * JUnit suite target + */ + public static junit.framework.Test suite() { + return new JUnit4TestAdapter(TupleTest.class); + } +} + diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/AggregateTest.java b/hw/hw3/starter-code/test/simpledb/systemtest/AggregateTest.java new file mode 100644 index 0000000000000000000000000000000000000000..39cd9763cd9b3dd5dced87af40717b8d23974e49 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/AggregateTest.java @@ -0,0 +1,120 @@ +package simpledb.systemtest; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.Map; + +import simpledb.*; + +import org.junit.Test; + +public class AggregateTest extends SimpleDbTestBase { + public void validateAggregate(DbFile table, Aggregator.Op operation, int aggregateColumn, int groupColumn, ArrayList<ArrayList<Integer>> expectedResult) + throws DbException, TransactionAbortedException, IOException { + TransactionId tid = new TransactionId(); + SeqScan ss = new SeqScan(tid, table.getId(), ""); + Aggregate ag = new Aggregate(ss, aggregateColumn, groupColumn, operation); + + SystemTestUtil.matchTuples(ag, expectedResult); + Database.getBufferPool().transactionComplete(tid); + } + + private int computeAggregate(ArrayList<Integer> values, Aggregator.Op operation) { + if (operation == Aggregator.Op.COUNT) return values.size(); + + int value = 0; + if (operation == Aggregator.Op.MIN) value = Integer.MAX_VALUE; + else if (operation == Aggregator.Op.MAX) value = Integer.MIN_VALUE; + + for (int v : values) { + switch (operation) { + case MAX: + if (v > value) value = v; + break; + case MIN: + if (v < value) value = v; + break; + case AVG: + case SUM: + value += v; + break; + default: + throw new IllegalArgumentException("Unsupported operation " + operation); + } + } + + if (operation == Aggregator.Op.AVG) value /= values.size(); + return value; + } + + private ArrayList<ArrayList<Integer>> aggregate(ArrayList<ArrayList<Integer>> tuples, Aggregator.Op operation, int aggregateColumn, int groupColumn) { + // Group the values + HashMap<Integer, ArrayList<Integer>> values = new HashMap<Integer, ArrayList<Integer>>(); + for (ArrayList<Integer> t : tuples) { + Integer key = null; + if (groupColumn != Aggregator.NO_GROUPING) key = t.get(groupColumn); + Integer value = t.get(aggregateColumn); + + if (!values.containsKey(key)) values.put(key, new ArrayList<Integer>()); + values.get(key).add(value); + } + + ArrayList<ArrayList<Integer>> results = new ArrayList<ArrayList<Integer>>(); + for (Map.Entry<Integer, ArrayList<Integer>> e : values.entrySet()) { + ArrayList<Integer> result = new ArrayList<Integer>(); + if (groupColumn != Aggregator.NO_GROUPING) result.add(e.getKey()); + result.add(computeAggregate(e.getValue(), operation)); + results.add(result); + } + return results; + } + + private final static int ROWS = 1024; + private final static int MAX_VALUE = 64; + private final static int COLUMNS = 3; + private void doAggregate(Aggregator.Op operation, int groupColumn) + throws IOException, DbException, TransactionAbortedException { + // Create the table + ArrayList<ArrayList<Integer>> createdTuples = new ArrayList<ArrayList<Integer>>(); + HeapFile table = SystemTestUtil.createRandomHeapFile( + COLUMNS, ROWS, MAX_VALUE, null, createdTuples); + + // Compute the expected answer + ArrayList<ArrayList<Integer>> expected = + aggregate(createdTuples, operation, 
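+            // column 1 holds the aggregated values; groupColumn names the
+            // grouping key (or Aggregator.NO_GROUPING for a plain aggregate)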
1, groupColumn); + + // validate that we get the answer + validateAggregate(table, operation, 1, groupColumn, expected); + } + + @Test public void testSum() throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.SUM, 0); + } + + @Test public void testMin() throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.MIN, 0); + } + + @Test public void testMax() throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.MAX, 0); + } + + @Test public void testCount() throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.COUNT, 0); + } + + @Test public void testAverage() throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.AVG, 0); + } + + @Test public void testAverageNoGroup() + throws IOException, DbException, TransactionAbortedException { + doAggregate(Aggregator.Op.AVG, Aggregator.NO_GROUPING); + } + + /** Make test compatible with older version of ant. */ + public static junit.framework.Test suite() { + return new junit.framework.JUnit4TestAdapter(AggregateTest.class); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/FilterBase.java b/hw/hw3/starter-code/test/simpledb/systemtest/FilterBase.java new file mode 100644 index 0000000000000000000000000000000000000000..cd18945f8ca19801a2e5aca9b162a2d90b574095 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/FilterBase.java @@ -0,0 +1,85 @@ +package simpledb.systemtest; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.Map; + +import static org.junit.Assert.*; +import org.junit.Test; + +import simpledb.*; + +public abstract class FilterBase extends SimpleDbTestBase { + private static final int COLUMNS = 3; + private static final int ROWS = 1097; + + /** Should apply the predicate to table. This will be executed in transaction tid. */ + protected abstract int applyPredicate(HeapFile table, TransactionId tid, Predicate predicate) + throws DbException, TransactionAbortedException, IOException; + + /** Optional hook for validating database state after applyPredicate. 
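+     * The default implementation below is a no-op; subclasses may override it
+     * to run extra consistency checks on the table.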
*/ + protected void validateAfter(HeapFile table) + throws DbException, TransactionAbortedException, IOException {} + + protected ArrayList<ArrayList<Integer>> createdTuples; + + private int runTransactionForPredicate(HeapFile table, Predicate predicate) + throws IOException, DbException, TransactionAbortedException { + TransactionId tid = new TransactionId(); + int result = applyPredicate(table, tid, predicate); + Database.getBufferPool().transactionComplete(tid); + return result; + } + + private void validatePredicate(int column, int columnValue, int trueValue, int falseValue, + Predicate.Op operation) throws IOException, DbException, TransactionAbortedException { + // Test the true value + HeapFile f = createTable(column, columnValue); + Predicate predicate = new Predicate(column, operation, new IntField(trueValue)); + assertEquals(ROWS, runTransactionForPredicate(f, predicate)); + f = Utility.openHeapFile(COLUMNS, f.getFile()); + validateAfter(f); + + // Test the false value + f = createTable(column, columnValue); + predicate = new Predicate(column, operation, new IntField(falseValue)); + assertEquals(0, runTransactionForPredicate(f, predicate)); + f = Utility.openHeapFile(COLUMNS, f.getFile()); + validateAfter(f); + } + + private HeapFile createTable(int column, int columnValue) + throws IOException, DbException, TransactionAbortedException { + Map<Integer, Integer> columnSpecification = new HashMap<Integer, Integer>(); + columnSpecification.put(column, columnValue); + createdTuples = new ArrayList<ArrayList<Integer>>(); + return SystemTestUtil.createRandomHeapFile( + COLUMNS, ROWS, columnSpecification, createdTuples); + } + + @Test public void testEquals() throws + DbException, TransactionAbortedException, IOException { + validatePredicate(0, 1, 1, 2, Predicate.Op.EQUALS); + } + + @Test public void testLessThan() throws + DbException, TransactionAbortedException, IOException { + validatePredicate(1, 1, 2, 1, Predicate.Op.LESS_THAN); + } + + @Test public void testLessThanOrEq() throws + DbException, TransactionAbortedException, IOException { + validatePredicate(2, 42, 42, 41, Predicate.Op.LESS_THAN_OR_EQ); + } + + @Test public void testGreaterThan() throws + DbException, TransactionAbortedException, IOException { + validatePredicate(2, 42, 41, 42, Predicate.Op.GREATER_THAN); + } + + @Test public void testGreaterThanOrEq() throws + DbException, TransactionAbortedException, IOException { + validatePredicate(2, 42, 42, 43, Predicate.Op.GREATER_THAN_OR_EQ); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/FilterTest.java b/hw/hw3/starter-code/test/simpledb/systemtest/FilterTest.java new file mode 100644 index 0000000000000000000000000000000000000000..eeba3df487baa4b063dc4130527833e0b4b21a34 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/FilterTest.java @@ -0,0 +1,29 @@ +package simpledb.systemtest; + +import java.io.IOException; +import static org.junit.Assert.*; +import simpledb.*; + +public class FilterTest extends FilterBase { + @Override + protected int applyPredicate(HeapFile table, TransactionId tid, Predicate predicate) + throws DbException, TransactionAbortedException, IOException { + SeqScan ss = new SeqScan(tid, table.getId(), ""); + Filter filter = new Filter(predicate, ss); + filter.open(); + + int resultCount = 0; + while (filter.hasNext()) { + assertNotNull(filter.next()); + resultCount += 1; + } + + filter.close(); + return resultCount; + } + + /** Make test compatible with older version of ant. 
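+     * (Wrapping the class in a JUnit4TestAdapter lets JUnit 3 style runners,
+     * such as the junit task in older ant releases, execute these tests.)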
*/ + public static junit.framework.Test suite() { + return new junit.framework.JUnit4TestAdapter(FilterTest.class); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/JoinTest.java b/hw/hw3/starter-code/test/simpledb/systemtest/JoinTest.java new file mode 100644 index 0000000000000000000000000000000000000000..337d4232eaa73db610e3d21c67998919fc1f18b9 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/JoinTest.java @@ -0,0 +1,76 @@ +package simpledb.systemtest; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; + +import org.junit.Test; + +import simpledb.*; + +public class JoinTest extends SimpleDbTestBase { + private static final int COLUMNS = 2; + public void validateJoin(int table1ColumnValue, int table1Rows, int table2ColumnValue, + int table2Rows) + throws IOException, DbException, TransactionAbortedException { + // Create the two tables + HashMap<Integer, Integer> columnSpecification = new HashMap<Integer, Integer>(); + columnSpecification.put(0, table1ColumnValue); + ArrayList<ArrayList<Integer>> t1Tuples = new ArrayList<ArrayList<Integer>>(); + HeapFile table1 = SystemTestUtil.createRandomHeapFile( + COLUMNS, table1Rows, columnSpecification, t1Tuples); + assert t1Tuples.size() == table1Rows; + + columnSpecification.put(0, table2ColumnValue); + ArrayList<ArrayList<Integer>> t2Tuples = new ArrayList<ArrayList<Integer>>(); + HeapFile table2 = SystemTestUtil.createRandomHeapFile( + COLUMNS, table2Rows, columnSpecification, t2Tuples); + assert t2Tuples.size() == table2Rows; + + // Generate the expected results + ArrayList<ArrayList<Integer>> expectedResults = new ArrayList<ArrayList<Integer>>(); + for (ArrayList<Integer> t1 : t1Tuples) { + for (ArrayList<Integer> t2 : t2Tuples) { + // If the columns match, join the tuples + if (t1.get(0).equals(t2.get(0))) { + ArrayList<Integer> out = new ArrayList<Integer>(t1); + out.addAll(t2); + expectedResults.add(out); + } + } + } + + // Begin the join + TransactionId tid = new TransactionId(); + SeqScan ss1 = new SeqScan(tid, table1.getId(), ""); + SeqScan ss2 = new SeqScan(tid, table2.getId(), ""); + JoinPredicate p = new JoinPredicate(0, Predicate.Op.EQUALS, 0); + Join joinOp = new Join(p, ss1, ss2); + + // test the join results + SystemTestUtil.matchTuples(joinOp, expectedResults); + + joinOp.close(); + Database.getBufferPool().transactionComplete(tid); + } + + @Test public void testSingleMatch() + throws IOException, DbException, TransactionAbortedException { + validateJoin(1, 1, 1, 1); + } + + @Test public void testNoMatch() + throws IOException, DbException, TransactionAbortedException { + validateJoin(1, 2, 2, 10); + } + + @Test public void testMultipleMatch() + throws IOException, DbException, TransactionAbortedException { + validateJoin(1, 3, 1, 3); + } + + /** Make test compatible with older version of ant. 
*/
+    public static junit.framework.Test suite() {
+        return new junit.framework.JUnit4TestAdapter(JoinTest.class);
+    }
+}
diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/ScanTest.java b/hw/hw3/starter-code/test/simpledb/systemtest/ScanTest.java
new file mode 100644
index 0000000000000000000000000000000000000000..5352516a4105f0790710ea9b0365a6043f5f547b
--- /dev/null
+++ b/hw/hw3/starter-code/test/simpledb/systemtest/ScanTest.java
@@ -0,0 +1,111 @@
+package simpledb.systemtest;
+
+import simpledb.systemtest.SystemTestUtil;
+
+import static org.junit.Assert.*;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.NoSuchElementException;
+import java.util.Random;
+
+import org.junit.Test;
+
+import simpledb.*;
+
+/**
+ * System tests for the sequential scan operator: scans tables of various
+ * shapes, checks that rewinding a SeqScan works, and verifies that the
+ * buffer pool caches pages between scans.
+ */
+public class ScanTest extends SimpleDbTestBase {
+    private final static Random r = new Random();
+
+    /** Tests the scan operator for a table with the specified dimensions. */
+    private void validateScan(int[] columnSizes, int[] rowSizes)
+            throws IOException, DbException, TransactionAbortedException {
+        for (int columns : columnSizes) {
+            for (int rows : rowSizes) {
+                ArrayList<ArrayList<Integer>> tuples = new ArrayList<ArrayList<Integer>>();
+                HeapFile f = SystemTestUtil.createRandomHeapFile(columns, rows, null, tuples);
+                SystemTestUtil.matchTuples(f, tuples);
+                Database.resetBufferPool(BufferPool.DEFAULT_PAGES);
+            }
+        }
+    }
+
+    /** Scan 1-4 columns. */
+    @Test public void testSmall() throws IOException, DbException, TransactionAbortedException {
+        int[] columnSizes = new int[]{1, 2, 3, 4};
+        int[] rowSizes =
+                new int[]{0, 1, 2, 511, 512, 513, 1023, 1024, 1025, 4096 + r.nextInt(4096)};
+        validateScan(columnSizes, rowSizes);
+    }
+
+    /** Test that rewinding a SeqScan iterator works. */
+    @Test public void testRewind() throws IOException, DbException, TransactionAbortedException {
+        ArrayList<ArrayList<Integer>> tuples = new ArrayList<ArrayList<Integer>>();
+        HeapFile f = SystemTestUtil.createRandomHeapFile(2, 1000, null, tuples);
+
+        TransactionId tid = new TransactionId();
+        SeqScan scan = new SeqScan(tid, f.getId(), "table");
+        scan.open();
+        for (int i = 0; i < 100; ++i) {
+            assertTrue(scan.hasNext());
+            Tuple t = scan.next();
+            assertEquals(tuples.get(i), SystemTestUtil.tupleToList(t));
+        }
+
+        scan.rewind();
+        for (int i = 0; i < 100; ++i) {
+            assertTrue(scan.hasNext());
+            Tuple t = scan.next();
+            assertEquals(tuples.get(i), SystemTestUtil.tupleToList(t));
+        }
+        scan.close();
+        Database.getBufferPool().transactionComplete(tid);
+    }
+
+    /** Verifies that the buffer pool is actually caching data.
+     * @throws TransactionAbortedException
+     * @throws DbException */
+    @Test public void testCache() throws IOException, DbException, TransactionAbortedException {
+        /** Counts the number of readPage operations.
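+         * The wrapper below overrides readPage to count physical page reads:
+         * the test expects one read per page on the first scan and zero reads
+         * on the second, when every page should come from the buffer pool.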
*/ + class InstrumentedHeapFile extends HeapFile { + public InstrumentedHeapFile(File f, TupleDesc td) { + super(f, td); + } + + @Override + public Page readPage(PageId pid) throws NoSuchElementException { + readCount += 1; + return super.readPage(pid); + } + + public int readCount = 0; + } + + // Create the table + final int PAGES = 30; + ArrayList<ArrayList<Integer>> tuples = new ArrayList<ArrayList<Integer>>(); + File f = SystemTestUtil.createRandomHeapFileUnopened(1, 992*PAGES, 1000, null, tuples); + TupleDesc td = Utility.getTupleDesc(1); + InstrumentedHeapFile table = new InstrumentedHeapFile(f, td); + Database.getCatalog().addTable(table, SystemTestUtil.getUUID()); + + // Scan the table once + SystemTestUtil.matchTuples(table, tuples); + assertEquals(PAGES, table.readCount); + table.readCount = 0; + + // Scan the table again: all pages should be cached + SystemTestUtil.matchTuples(table, tuples); + assertEquals(0, table.readCount); + } + + /** Make test compatible with older version of ant. */ + public static junit.framework.Test suite() { + return new junit.framework.JUnit4TestAdapter(ScanTest.class); + } +} diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/SimpleDbTestBase.java b/hw/hw3/starter-code/test/simpledb/systemtest/SimpleDbTestBase.java new file mode 100644 index 0000000000000000000000000000000000000000..e2b0e68f2a0eb47137208c79c591c54caeea10c5 --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/SimpleDbTestBase.java @@ -0,0 +1,20 @@ +package simpledb.systemtest; + +import org.junit.Before; + +import simpledb.Database; + +/** + * Base class for all SimpleDb test classes. + * @author nizam + * + */ +public class SimpleDbTestBase { + /** + * Reset the database before each test is run. + */ + @Before public void setUp() throws Exception { + Database.reset(); + } + +} diff --git a/hw/hw3/starter-code/test/simpledb/systemtest/SystemTestUtil.java b/hw/hw3/starter-code/test/simpledb/systemtest/SystemTestUtil.java new file mode 100644 index 0000000000000000000000000000000000000000..70b67ef78e59a07f6f11923bfd5b92801f9bf53e --- /dev/null +++ b/hw/hw3/starter-code/test/simpledb/systemtest/SystemTestUtil.java @@ -0,0 +1,230 @@ +package simpledb.systemtest; + +import java.io.File; +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Random; +import java.util.UUID; + +import org.junit.Assert; + +import simpledb.*; + +public class SystemTestUtil { + public static final TupleDesc SINGLE_INT_DESCRIPTOR = + new TupleDesc(new Type[]{Type.INT_TYPE}); + + private static final int MAX_RAND_VALUE = 1 << 16; + + /** @param columnSpecification Mapping between column index and value. */ + public static HeapFile createRandomHeapFile( + int columns, int rows, Map<Integer, Integer> columnSpecification, + ArrayList<ArrayList<Integer>> tuples) + throws IOException, DbException, TransactionAbortedException { + return createRandomHeapFile(columns, rows, MAX_RAND_VALUE, columnSpecification, tuples); + } + + /** @param columnSpecification Mapping between column index and value. 
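+     * Columns without an entry in the map receive uniformly random values in
+     * [0, maxValue).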
*/ + public static HeapFile createRandomHeapFile( + int columns, int rows, int maxValue, Map<Integer, Integer> columnSpecification, + ArrayList<ArrayList<Integer>> tuples) + throws IOException, DbException, TransactionAbortedException { + File temp = createRandomHeapFileUnopened(columns, rows, maxValue, + columnSpecification, tuples); + return Utility.openHeapFile(columns, temp); + } + + public static HeapFile createRandomHeapFile( + int columns, int rows, Map<Integer, Integer> columnSpecification, + ArrayList<ArrayList<Integer>> tuples, String colPrefix) + throws IOException, DbException, TransactionAbortedException { + return createRandomHeapFile(columns, rows, MAX_RAND_VALUE, columnSpecification, tuples, colPrefix); + } + + public static HeapFile createRandomHeapFile( + int columns, int rows, int maxValue, Map<Integer, Integer> columnSpecification, + ArrayList<ArrayList<Integer>> tuples, String colPrefix) + throws IOException, DbException, TransactionAbortedException { + File temp = createRandomHeapFileUnopened(columns, rows, maxValue, + columnSpecification, tuples); + return Utility.openHeapFile(columns, colPrefix, temp); + } + + public static File createRandomHeapFileUnopened(int columns, int rows, + int maxValue, Map<Integer, Integer> columnSpecification, + ArrayList<ArrayList<Integer>> tuples) throws IOException { + if (tuples != null) { + tuples.clear(); + } else { + tuples = new ArrayList<ArrayList<Integer>>(rows); + } + + Random r = new Random(); + + // Fill the tuples list with generated values + for (int i = 0; i < rows; ++i) { + ArrayList<Integer> tuple = new ArrayList<Integer>(columns); + for (int j = 0; j < columns; ++j) { + // Generate random values, or use the column specification + Integer columnValue = null; + if (columnSpecification != null) columnValue = columnSpecification.get(j); + if (columnValue == null) { + columnValue = r.nextInt(maxValue); + } + tuple.add(columnValue); + } + tuples.add(tuple); + } + + // Convert the tuples list to a heap file and open it + File temp = File.createTempFile("table", ".dat"); + temp.deleteOnExit(); + HeapFileEncoder.convert(tuples, temp, BufferPool.getPageSize(), columns); + return temp; + } + + public static ArrayList<Integer> tupleToList(Tuple tuple) { + ArrayList<Integer> list = new ArrayList<Integer>(); + for (int i = 0; i < tuple.getTupleDesc().numFields(); ++i) { + int value = ((IntField)tuple.getField(i)).getValue(); + list.add(value); + } + return list; + } + + public static void matchTuples(DbFile f, List<ArrayList<Integer>> tuples) + throws DbException, TransactionAbortedException, IOException { + TransactionId tid = new TransactionId(); + matchTuples(f, tid, tuples); + Database.getBufferPool().transactionComplete(tid); + } + + public static void matchTuples(DbFile f, TransactionId tid, List<ArrayList<Integer>> tuples) + throws DbException, TransactionAbortedException, IOException { + SeqScan scan = new SeqScan(tid, f.getId(), ""); + matchTuples(scan, tuples); + } + + public static void matchTuples(DbIterator iterator, List<ArrayList<Integer>> tuples) + throws DbException, TransactionAbortedException, IOException { + ArrayList<ArrayList<Integer>> copy = new ArrayList<ArrayList<Integer>>(tuples); + + if (Debug.isEnabled()) { + Debug.log("Expected tuples:"); + for (ArrayList<Integer> t : copy) { + Debug.log("\t" + Utility.listToString(t)); + } + } + + iterator.open(); + while (iterator.hasNext()) { + Tuple t = iterator.next(); + ArrayList<Integer> list = tupleToList(t); + boolean isExpected = copy.remove(list); + 
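+            // remove(Object) deletes a single occurrence, so a duplicate tuple
+            // in the scan is accepted only as many times as it appears in the
+            // expected list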
Debug.log("scanned tuple: %s (%s)", t, isExpected ? "expected" : "not expected"); + if (!isExpected) { + Assert.fail("expected tuples does not contain: " + t); + } + } + iterator.close(); + + if (!copy.isEmpty()) { + String msg = "expected to find the following tuples:\n"; + final int MAX_TUPLES_OUTPUT = 10; + int count = 0; + for (ArrayList<Integer> t : copy) { + if (count == MAX_TUPLES_OUTPUT) { + msg += "[" + (copy.size() - MAX_TUPLES_OUTPUT) + " more tuples]"; + break; + } + msg += "\t" + Utility.listToString(t) + "\n"; + count += 1; + } + Assert.fail(msg); + } + } + + /** + * Returns number of bytes of RAM used by JVM after calling System.gc many times. + * @return amount of RAM (in bytes) used by JVM + */ + public static long getMemoryFootprint() { + // Call System.gc in a loop until it stops freeing memory. This is + // still no guarantee that all the memory is freed, since System.gc is + // just a "hint". + Runtime runtime = Runtime.getRuntime(); + long memAfter = runtime.totalMemory() - runtime.freeMemory(); + long memBefore = memAfter + 1; + while (memBefore != memAfter) { + memBefore = memAfter; + System.gc(); + memAfter = runtime.totalMemory() - runtime.freeMemory(); + } + + return memAfter; + } + + /** + * Generates a unique string each time it is called. + * @return a new unique UUID as a string, using java.util.UUID + */ + public static String getUUID() { + return UUID.randomUUID().toString(); + } + + private static double[] getDiff(double[] sequence) { + double ret[] = new double[sequence.length - 1]; + for (int i = 0; i < sequence.length - 1; ++i) + ret[i] = sequence[i + 1] - sequence[i]; + return ret; + } + /** + * Checks if the sequence represents a quadratic sequence (approximately) + * ret[0] is true if the sequence is quadratic + * ret[1] is the common difference of the sequence if ret[0] is true. + * @param sequence + * @return ret[0] = true if sequence is qudratic(or sub-quadratic or linear), ret[1] = the coefficient of n^2 + */ + public static Object[] checkQuadratic(double[] sequence) { + Object ret[] = checkLinear(getDiff(sequence)); + ret[1] = (Double)ret[1]/2.0; + return ret; + } + + /** + * Checks if the sequence represents an arithmetic sequence (approximately) + * ret[0] is true if the sequence is linear + * ret[1] is the common difference of the sequence if ret[0] is true. + * @param sequence + * @return ret[0] = true if sequence is linear, ret[1] = the common difference + */ + public static Object[] checkLinear(double[] sequence) { + return checkConstant(getDiff(sequence)); + } + + /** + * Checks if the sequence represents approximately a fixed sequence (c,c,c,c,..) + * ret[0] is true if the sequence is linear + * ret[1] is the constant of the sequence if ret[0] is true. + * @param sequence + * @return ret[0] = true if sequence is constant, ret[1] = the constant + */ + public static Object[] checkConstant(double[] sequence) { + Object[] ret = new Object[2]; + //compute average + double sum = .0; + for(int i = 0; i < sequence.length; ++i) + sum += sequence[i]; + double av = sum/(sequence.length + .0); + //compute standard deviation + double sqsum = 0; + for(int i = 0; i < sequence.length; ++i) + sqsum += (sequence[i] - av)*(sequence[i] - av); + double std = Math.sqrt(sqsum/(sequence.length + .0)); + ret[0] = std < 1.0 ? Boolean.TRUE : Boolean.FALSE; + ret[1] = av; + return ret; + } +}