Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (65)
Showing
with 405 additions and 296 deletions
/bin/
.DS_Store
.idea/
*.iml
No preview for this file type
How To Write Unmaintainable Code
Last updated Thursday, 18-Nov-1999 20:27:28 PDT by Roedy Green ©1997-1999 Canadian Mind Products.
This essay is also available in Spanish.
In the interests of creating employment opportunities in the Java programming field, I am passing on these tips from the masters on how to write code that is so difficult to maintain, that the people who come after you will take years to make even the simplest changes. Further, if you follow all these rules religiously, you will even guarantee yourself a lifetime of employment, since no one but you has a hope in hell of maintaining the code.
General Principles
To foil the maintenance programmer, you have to understand how he thinks. He has your giant program. He has no time to read it all, much less understand it. He wants to rapidly find the place to make his change, make it and get out and have no unexpected side effects from the change.
He views your code through a tube taken from the centre of a roll of toilet paper. He can only see a tiny piece of your program at a time. You want to make sure he can never get the big picture from doing that. You want to make it as hard as possible for him to find the code he is looking for. But even more important, you want to make it as awkward as possible for him to safely ignore anything.
Specific Techniques
Lie in the comments. You don't have to actively lie, just fail to keep comments up to date with the code.
Pepper the code with comments like /* add 1 to i */; however, never document woolly stuff like the overall purpose of the package or method.
Make sure that every method does a little bit more (or less) than its name suggests. As a simple example, a method named isValid(x) should as a side effect convert x to binary and store the result in a database.
Use acronyms to keep the code terse. Real men never define acronyms; they understand them genetically.
In the interests of efficiency, avoid encapsulation. Callers of a method need all the external clues they can get to remind them how the method works inside.
If, for example, you were writing an airline reservation system, make sure there are at least 25 places in the code that need to be modified if you were to add another airline. Never document where they are. People who come after you have no business modifying your code without thoroughly understanding every line of it.
In the name of efficiency, use cut/paste/clone/modify. This works much faster than using many small reusable modules.
Never never put a comment on a variable. Facts about how the variable is used, its bounds, its legal values, its implied/displayed number of decimal points, its units of measure, its display format, its data entry rules (e.g. total fill, must enter), when its value can be trusted etc. should be gleaned from the procedural code. If your boss forces you to write comments, lard method bodies with them, but never comment a variable, not even a temporary!
Try to pack as much as possible into a single line. This saves the overhead of temporary variables, and makes source files shorter by eliminating new line characters and white space. Tip: remove all white space around operators. Good programmers can often hit the 255 character line length limit imposed by some editors. The bonus of long lines is that programmers who cannot read 6 point type must scroll to view them.
Cd wrttn wtht vwls s mch trsr. When using abbreviations inside variable or method names, break the boredom with several variants for the same word, and even spell it out longhand once in a while. This helps defeat those lazy bums who use text search to understand only some aspect of your program. Consider variant spellings as a variant on the ploy, e.g. mixing International colour, with American color and dude-speak kulerz. If you spell out names in full, there is only one possible way to spell each name. These are too easy for the maintenance programmer to remember. Because there are so many different ways to abbreviate a word, with abbreviations, you can have several different variables that all have the same apparent purpose. As an added bonus, the maintenance programmer might not even notice they are separate variables.
Never use an automated source code tidier to keep your code aligned. Lobby to have them banned from your company on the grounds they create false deltas in PVCS (version control tracking) or that every programmer should have his own indenting style held forever sacrosanct for any module he wrote. Banning them is quite easy, even though they save the millions of keystrokes doing manual alignment and days wasted misinterpreting poorly aligned code. Just insist that everyone use the same tidied format, not just for storing in the common repository, but while they are editing. This starts an RWAR and the boss, to keep the peace, will ban automated tidying. Without automated tidying, you are now free to accidentally misalign the code to give the optical illusion that bodies of loops and ifs are longer or shorter than they really are, or that else clauses match a different if than they really do. e.g.
if (a)
  if (b) x = y;
else x = z;
Never put in any { } surrounding your if/else blocks unless they are syntactically obligatory. If you have a deeply nested mixture of if/else statements and blocks, especially with misleading indentation, you can trip up even an expert maintenance programmer.
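To see why the misaligned if/else snippet above is deceptive, here is a minimal Java sketch. The class and method names are hypothetical and chosen only for illustration. In Java an else always pairs with the nearest unmatched if, whatever the indentation suggests, so both methods below behave identically.

```java
// Hypothetical demo class; nothing here comes from the essay itself.
final class DanglingElseDemo {
    // Indented the way the essay's snippet is: the else *looks* as if it
    // belongs to `if (a)`, but the compiler pairs it with `if (b)`.
    static int misleading(boolean a, boolean b, int x, int y, int z) {
        if (a)
            if (b) x = y;
        else x = z;
        return x;
    }

    // The same logic with braces, showing what is actually executed.
    static int explicit(boolean a, boolean b, int x, int y, int z) {
        if (a) {
            if (b) {
                x = y;
            } else {
                x = z;
            }
        }
        return x;
    }

    public static void main(String[] args) {
        // a is false, so neither assignment runs and x keeps its value.
        System.out.println(misleading(false, true, 0, 1, 2));
        // a is true and b is false, so the else branch assigns z.
        System.out.println(misleading(true, false, 0, 1, 2));
    }
}
```

A reader who trusts the indentation would expect the first call to take the else branch; it does not.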
Rigidly follow the guidelines about no goto, no early returns, and no labelled breaks especially when you can increase the if/else nesting depth by at least 5 levels.
Use very long variable names that differ from each other by only one character, or only in upper/lower case. An ideal variable name pair is swimmer and swimner. Exploit the failure of most fonts to clearly discriminate between ilI1| or oO08 with identifier pairs like parselnt and parseInt or D0Calc and DOCalc. l is an exceptionally fine choice for a variable name since it will, to the casual glance, masquerade as the constant 1. Create variable names that differ from each other only in case e.g. HashTable and Hashtable.
Wherever scope rules permit, reuse existing unrelated variable names. Similarly, use the same temporary variable for two unrelated purposes (purporting to save stack slots). For a fiendish variant, morph the variable, for example, assign a value to a variable at the top of a very long method, and then somewhere in the middle, change the meaning of the variable in a subtle way, such as converting it from a 0-based coordinate to a 1-based coordinate. Be certain not to document this change in meaning.
Use lower case l to indicate long constants. e.g. 10l is more likely to be mistaken for 101 than 10L is.
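A small Java sketch of the lookalike literals, with a hypothetical class name and constant names. Both suffixes denote the same long value; only the lowercase form is easy to misread as one hundred and one.

```java
// Hypothetical demo class; the constant names are illustrative only.
final class LongSuffixDemo {
    static final long LOWER = 10l;      // lowercase suffix: the long value ten
    static final long UPPER = 10L;      // uppercase suffix: the same value, readable
    static final long LOOKALIKE = 101;  // the int literal one hundred and one

    public static void main(String[] args) {
        System.out.println(LOWER == UPPER);      // same value, different spelling
        System.out.println(LOWER == LOOKALIKE);  // different values, similar glyphs
    }
}
```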
Ignore the conventions in Java for where to use upper case in variable and class names i.e. Classes start with upper case, variables with lower case, constants are all upper case, with internal words capitalised. After all, Sun does (e.g. instanceof vs isInstanceOf, Hashtable). Not to worry, the compiler won't even issue a warning to give you away. If your boss forces you to use the conventions, when there is any doubt about whether an internal word should be capitalised, avoid capitalising or make a random choice, e.g. use both inputFileName and outputfilename. You can of course drive your team members insane by inventing your own insanely complex naming conventions then berate others for not following them. The ultimate technique is to create as many variable names as possible that differ subtly from each other only in case.
Never use i for the innermost loop variable. Use anything but. Use i liberally for any other purpose especially for non-int variables. Similarly use n as a loop index.
Never use local variables. Whenever you feel the temptation to use one, make it into an instance or static variable instead to unselfishly share it with all the other methods of the class. This will save you work later when other methods need similar declarations. C++ programmers can go a step further by making all variables global.
Never document gotchas in the code. If you suspect there may be a bug in a class, keep it to yourself. If you have ideas about how the code should be reorganised or rewritten, for heaven's sake, do not write them down. Remember the words of Thumper "If you can't say anything nice, don't say anything at all". What if the programmer who wrote that code saw your comments? What if the owner of the company saw them? What if a customer did? You could get yourself fired.
To break the boredom, use a thesaurus to look up as much alternate vocabulary as possible to refer to the same action, e.g. display, show, present. Vaguely hint there is some subtle difference, where none exists. However, if there are two similar functions that have a crucial difference, always use the same word in describing both functions (e.g. print to mean write to a file, and to print on a laser, and to display on the screen). Under no circumstances, succumb to demands to write a glossary with the special purpose project vocabulary unambiguously defined. Doing so would be an unprofessional breach of the structured design principle of information hiding.
In naming functions, make heavy use of abstract words like it, everything, data, handle, stuff, do, routine, perform and the digits e.g. routineX48, PerformDataFunction, DoIt, HandleStuff and do_args_method.
In Java, all primitives passed as parameters are effectively read-only because they are passed by value. The callee can modify the parameters, but that has no effect on the caller's variables. In contrast all objects passed are read-write. The reference is passed by value, which means the object itself is effectively passed by reference. The callee can do whatever it wants to the fields in your object. Never document whether a method actually modifies the fields in each of the passed parameters. Name your methods to suggest they only look at the fields when they actually change them.
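The parameter-passing rules described above can be sketched as follows. The class, the nested Point type, and the method names are all hypothetical. The point is that a callee can change an object's fields through the copied reference, but cannot make the caller's variable refer to a different object.

```java
// Hypothetical demo class; Point and the method names are illustrative only.
final class PassByValueDemo {
    static final class Point {
        int x;
        Point(int x) { this.x = x; }
    }

    // Modifies only the callee's private copy of the primitive.
    static void bumpPrimitive(int n) {
        n++;
    }

    // Follows the copied reference and mutates the caller's object.
    static void bumpField(Point p) {
        p.x++;
    }

    // Rebinds only the callee's copy of the reference; the caller never sees it.
    static void reassign(Point p) {
        p = new Point(99);
    }

    public static void main(String[] args) {
        int n = 1;
        bumpPrimitive(n);
        System.out.println(n);    // still the caller's original value

        Point p = new Point(1);
        bumpField(p);
        reassign(p);
        System.out.println(p.x);  // reflects bumpField, not reassign
    }
}
```

Nothing in the signatures of bumpField and reassign tells the caller which of the two behaviours to expect, which is exactly the ambiguity the essay recommends leaving undocumented.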
Never document the units of measure of any variable, input, output or parameter. e.g. feet, metres, cartons. This is not so important in bean counting, but it is very important in engineering work. As a corollary, never document the units of measure of any conversion constants, or how the values were derived. It is mild cheating, but very effective, to salt the code with some incorrect units of measure in the comments. If you are feeling particularly malicious, make up your own unit of measure; name it after yourself or some obscure person and never define it. If somebody challenges you, tell them you did so that you could use integer rather than floating point arithmetic.
In engineering work there are two ways to code. One is to convert all inputs to S.I. (metric) units of measure, then do your calculations then convert back to various civil units of measure for output. The other is to maintain the various mixed measure systems throughout. Always choose the second. It's the American way!
I am going to let you in on a little-known coding secret. Exceptions are a pain in the behind. Properly-written code never fails, so exceptions are actually unnecessary. Don't waste time on them. Subclassing exceptions is for incompetents who know their code will fail. You can greatly simplify your program by having only a single try/catch in the entire application (in main) that calls System.exit(). Just stick a perfectly standard set of throws on every method header whether they could throw any exceptions or not.
C compilers transform myArray[i] into *(myArray + i), which is equivalent to *(i + myArray) which is equivalent to i[myArray]. Experts know to put this to good use. Unfortunately, this technique can only be used in native classes.
If you have an array with 100 elements in it, hard code the literal 100 in as many places in the program as possible. Never use a static final named constant for the 100, or refer to it as myArray.length. To make changing this constant even more difficult, use the literal 50 instead of 100/2, or 99 instead of 100-1. You can further disguise the 100 by checking for a == 101 instead of a > 100 or a > 99 instead of a >= 100.
Consider things like page sizes, where a page consists of x header, y body, and z footer lines; you can apply the obfuscations independently to each of these and to their partial or total sums.
These time-honoured techniques are especially effective in a program with two unrelated arrays that just accidentally happen to both have 100 elements. There are even more fiendish variants. To lull the maintenance programmer into a false sense of security, dutifully create the named constant, but very occasionally "accidentally" use the literal 100 value instead of the named constant. Most fiendish of all, in place of the literal 100 or the correct named constant, sporadically use some other unrelated named constant that just accidentally happens to have the value 100, for now. It almost goes without saying that you should avoid any consistent naming scheme that would associate an array name with its size constant.
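For contrast, here is a minimal sketch of the practice these tips tell you to avoid: one named constant, with everything else derived from it or from the array's length. SeatMap, SEAT_COUNT, and the method names are assumptions made up for this illustration.

```java
// Hypothetical class; none of these names appear in the essay.
final class SeatMap {
    // The only place the size is written down.
    static final int SEAT_COUNT = 100;

    private final boolean[] taken = new boolean[SEAT_COUNT];

    // Bounds come from the array itself, not from a literal 99 or 101.
    boolean isValidSeat(int seat) {
        return seat >= 0 && seat < taken.length;
    }

    // Derived value: computed from the length instead of hard coding 50.
    int midpoint() {
        return taken.length / 2;
    }

    int lastSeat() {
        return taken.length - 1;
    }

    public static void main(String[] args) {
        SeatMap map = new SeatMap();
        System.out.println(map.isValidSeat(map.lastSeat()));
        System.out.println(map.isValidSeat(SEAT_COUNT));
    }
}
```

Changing SEAT_COUNT is then a one-line edit, which is precisely what the literal-scattering techniques are designed to prevent.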
Eschew any form of table-driven logic. It starts out innocently enough, but soon leads to end users proofreading and then shudder, even modifying the tables for themselves.
Nest as deeply as you can. Good coders can get up to 10 levels of ( ) on a single line and 20 { } in a single method. C++ coders have the additional powerful option of preprocessor nesting totally independent of the nest structure of the underlying code. You earn extra Brownie points whenever the beginning and end of a block appear on separate pages in a printed listing. Wherever possible, convert nested ifs into nested [? :] ternaries.
Join a computer book of the month club. Select authors who appear to be too busy writing books to have had any time to actually write any code themselves. Browse the local bookstore for titles with lots of cloud diagrams in them and no coding examples. Skim these books to learn obscure pedantic words you can use to intimidate the whippersnappers that come after you. Your code should impress. If people can't understand your vocabulary, they must assume that you are very intelligent and that your algorithms are very deep. Avoid any sort of homely analogies in your algorithm explanations.
Make "improvements" to your code often, and force users to upgrade often - after all, no one wants to be running an outdated version. Just because they think they're happy with the program as it is, just think how much happier they will be after you've "fixed" it! Don't tell anyone what the differences between versions are unless you are forced to - after all, why tell someone about bugs in the old version they might never have noticed otherwise?
The About Box should contain only the name of the program, the names of the coders and a copyright notice written in legalese. Ideally it should link to several megs of code that produce an entertaining animated display. However, it should never contain a description of what the program is for, its minor version number, or the date of the most recent code revision, or the website where to get the updates, or the author's email address. This way all the users will soon all be running on different versions, and will attempt to install version N+2 before installing version N+1.
The more changes you can make between versions the better; you don't want users to become bored with the same old API or user interface year after year. Finally, if you can make this change without the users noticing, this is better still - it will keep them on their toes, and keep them from becoming complacent.
If you have to write classes for some other programmer to use, put environment-checking code (getenv() in C++ / System.getProperty() in Java) in your classes' nameless static initializers, and pass all your arguments to the classes this way, rather than in the constructor methods. The advantage is that the initializer methods get called as soon as the class program binaries get loaded, even before any of the classes get instantiated, so they will usually get executed before the program main(). In other words, there will be no way for the rest of the program to modify these parameters before they get read into your classes - the users better have set up all their environment variables just the way you had them!
Choose your variable names to have absolutely no relation to the labels used when such variables are displayed on the screen. E.g. on the screen label the field "Postal Code" but in the code call the associated variable "zip".
Java lets you create methods that have the same name as the class, but that are not constructors. Exploit this to sow confusion.
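A short sketch of the lookalike-constructor trick, using a hypothetical class named Gadget. The only thing that distinguishes the ordinary method from the constructor is its return type.

```java
// Hypothetical class; the name and field are illustrative only.
final class Gadget {
    int state;

    // A real constructor: no return type.
    Gadget() {
        state = 1;
    }

    // An ordinary instance method that merely shares the class name.
    // The `void` return type is what makes it a method, not a constructor.
    void Gadget() {
        state = 2;
    }

    public static void main(String[] args) {
        Gadget g = new Gadget();      // runs the constructor only
        System.out.println(g.state);
        g.Gadget();                   // an explicit method call
        System.out.println(g.state);
    }
}
```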
Never use layouts. That way when the maintenance programmer adds one more field he will have to manually adjust the absolute co-ordinates of every other thing displayed on the screen. If your boss forces you to use a layout, use a single giant GridBagLayout, and hard code in absolute grid co-ordinates.
In Java, disdain the interface. If your supervisors complain, tell them that Java interfaces force you to "cut-and-paste" code between different classes that implement the same interface the same way, and they know how hard that would be to maintain. Instead, do as the Java AWT designers did - put lots of functionality in your classes that can only be used by classes that inherit from them, and use lots of "instanceof" checks in your methods. This way, if someone wants to reuse your code, they have to extend your classes. If they want to reuse your code from two different classes - tough luck, they can't extend both of them at once!
Make all of your leaf classes final. After all, you're done with the project - certainly no one else could possibly improve on your work by extending your classes. And it might even be a security flaw - after all, isn't java.lang.String final for just this reason? If other coders in your project complain, tell them about the execution speed improvement you're getting.
Make as many of your variables as possible static. If you don't need more than one instance of the class in this program, no one else ever will either. Again, if other coders in the project complain, tell them about the execution speed improvement you're getting.
Keep all of your unused and outdated methods and variables around in your code. After all - if you needed to use it once in 1976, who knows if you will want to use it again sometime? Sure the program's changed since then, but it might just as easily change back, you "don't want to have to reinvent the wheel" (supervisors love talk like that). If you have left the comments on those methods and variables untouched, and sufficiently cryptic, anyone maintaining the code will be too scared to touch them.
On a method called makeSnafucated insert only the comment /* make snafucated */. Never define what snafucated means anywhere. Only a fool does not already know, with complete certainty, what snafucated means.
Reverse the parameters on a method called drawRectangle(height, width) to drawRectangle(width, height) without making any change whatsoever to the name of the method. Then a few releases later, reverse it back again. The maintenance programmers can't tell by quickly looking at any call if it has been adjusted yet. Generalisations are left as an exercise for the reader.
Instead of using a parameter to a single method, create as many separate methods as you can. For example instead of setAlignment(int alignment) where alignment is an enumerated constant, for left, right, center, create three methods: setLeftAlignment, setRightAlignment, and setCenterAlignment. Of course, for the full effect, you must clone the common logic to make it hard to keep in sync.
The Kama Sutra technique has the added advantage of driving any users or documenters of the package to distraction as well as the maintenance programmers. Create a dozen overloaded variants of the same method that differ in only the most minute detail. I think it was Oscar Wilde who observed that positions 47 and 115 of the Kama Sutra were the same except in 115 the woman had her fingers crossed. Users of the package then have to carefully peruse the long list of methods to figure out just which variant to use. The technique also balloons the documentation and thus ensures it will more likely be out of date. If the boss asks why you are doing this, explain it is solely for the convenience of the users. Again for the full effect, clone any common logic.
Declare every method and variable public. After all, somebody, sometime might want to use it. Once a method has been declared public, it can't very well be retracted, now can it? This makes it very difficult to later change the way anything works under the covers. It also has the delightful side effect of obscuring what a class is for. If the boss asks if you are out of your mind, tell him you are following the classic principles of transparent interfaces.
In C++, overload library functions by using #define. That way it looks like you are using a familiar library function where in actuality you are using something totally different.
In C++, overload +,-,*,/ to do things totally unrelated to addition, subtraction etc. After all, if Stroustrup can use the shift operator to do I/O, why should you not be equally creative? If you overload +, make sure you do it in a way that i = i + 5; has a totally different meaning from i += 5;
When documenting, and you need an arbitrary name to represent a filename use "file". Never use an obviously arbitrary name like "Charlie.dat" or "Frodo.txt". In general, in your examples, use arbitrary names that sound as much like reserved keywords as possible. For example, good names for parameters or variables would be: "bank", "blank", "class", "const", "constant", "input", "key", "keyword", "kind", "output", "parameter", "parm", "system", "type", "value", "var" and "variable". If you use actual reserved words for your arbitrary names, which would be rejected by your command processor or compiler, so much the better. If you do this well, the users will be hopelessly confused between reserved keywords and arbitrary names in your example, but you can look innocent, claiming you did it to help them associate the appropriate purpose with each variable.
Always document your command syntax with your own, unique, undocumented brand of BNF notation. Never explain the syntax by providing a suite of annotated sample valid and invalid commands. That would demonstrate a complete lack of academic rigour. Railway diagrams are almost as gauche. Make sure there is no obvious way of telling a terminal symbol (something you would actually type) from an intermediate one -- something that represents a phrase in the syntax. Never use typeface, colour, caps, or any other visual clues to help the reader distinguish the two. Use the exact same punctuation glyphs in your BNF notation that you use in the command language itself, so the reader can never tell if a (...), [...], {...} or "..." is something you actually type as part of the command, or is intended to give clues about which syntax elements are obligatory, repeatable or optional in your BNF notation. After all, if they are too stupid to figure out your variant of BNF, they have no business using your program.
The macro preprocessor offers great opportunities for obfuscation. The key technique is to nest macro expansions several layers deep so that you have to discover all the various parts in many different *.hpp files. Placing executable code into macros then including those macros in every *.cpp file (even those that never use those macros) will maximize the amount of recompilation necessary if ever that code changes.
Java is schizophrenic about array declarations. You can do them the old C way, String x[], (which uses mixed pre-postfix notation) or the new way, String[] x, which uses pure prefix notation. If you want to really confuse people, mix the notations: e.g.
byte[] rowvector, colvector, matrix[];
which is equivalent to:
byte[] rowvector;
byte[] colvector;
byte[][] matrix;
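The equivalence claimed above can be checked at run time. In this sketch the class and method names are hypothetical, and the array sizes are arbitrary. It declares the three variables in the mixed style and reports the type each one really has.

```java
// Hypothetical demo class; only the three variable names come from the essay.
final class MixedArrayDeclDemo {
    // Returns the simple type names of the three variables, in declaration order.
    static String[] typeNames() {
        // Mixed notation: the trailing [] on matrix adds a second dimension.
        byte[] rowvector = new byte[3], colvector = new byte[3], matrix[] = new byte[3][3];
        return new String[] {
            rowvector.getClass().getSimpleName(),
            colvector.getClass().getSimpleName(),
            matrix.getClass().getSimpleName()
        };
    }

    public static void main(String[] args) {
        for (String name : typeNames()) {
            System.out.println(name);
        }
    }
}
```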
Java offers great opportunity for obfuscation whenever you have to convert. As a simple example, if you have to convert a double to a String, go circuitously, via Double with new Double(d).toString() rather than the more direct Double.toString(d). You can, of course, be far more circuitous than that! Avoid any conversion techniques recommended by the Conversion Amanuensis. You get bonus points for every extra temporary object you leave littering the heap after your conversion.
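A minimal sketch comparing the direct and the circuitous conversion. The class and method names are hypothetical. It substitutes Double.valueOf(d) for the essay's new Double(d), because that constructor has been deprecated since Java 9; the detour through a wrapper object is the same.

```java
// Hypothetical demo class; method names are illustrative only.
final class ConversionDemo {
    // Direct: a static call, no wrapper object needed.
    static String direct(double d) {
        return Double.toString(d);
    }

    // Circuitous: box the value first, then ask the wrapper for its text.
    // Double.valueOf stands in for the essay's `new Double(d)`.
    static String circuitous(double d) {
        Double boxed = Double.valueOf(d);
        return boxed.toString();
    }

    public static void main(String[] args) {
        System.out.println(direct(2.5));
        System.out.println(circuitous(2.5));
    }
}
```

Both routes yield the same string; the second just leaves an extra object behind for the garbage collector.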
Use threads with abandon.
Philosophy
The people who design languages are the people who write the compilers and system classes. Quite naturally they design to make their work easy and mathematically elegant. However, there are 10,000 maintenance programmers to every compiler writer. The grunt maintenance programmers have absolutely no say in the design of languages. Yet the total amount of code they write dwarfs the code in the compilers.
An example of the result of this sort of elitist thinking is the JDBC interface. It makes life easy for the JDBC implementor, but a nightmare for the maintenance programmer. It is far clumsier than the Fortran interface that came out with SQL three decades ago.
Maintenance programmers, if somebody ever consulted them, would demand ways to hide the housekeeping details so they could see the forest for the trees. They would demand all sorts of shortcuts so they would not have to type so much and so they could see more of the program at once on the screen. They would complain loudly about the myriad petty time-wasting tasks the compilers demand of them.
There are some efforts in this direction: NetRexx, Bali, and visual editors (e.g. IBM's Visual Age is a start) that can collapse detail irrelevant to the current purpose.
The Shoemaker Has No Shoes
Imagine having an accountant as a client who insisted on maintaining his general ledgers using a word processor. You would do your best to persuade him that his data should be structured. He needs validation with cross field checks. You would persuade him he could do so much more with that data when stored in a database, including controlled simultaneous update.
Imagine taking on a software developer as a client. He insists on maintaining all his data with a text editor. He is not yet even exploiting the word processor's colour, type size or fonts.
Think of what might happen if we started storing source code as structured data. We could view the same source code in many alternate ways, e.g. as Java, as NetRexx, as a decision table, as a flow chart, as a loop structure skeleton (with the detail stripped off), as Java with various levels of detail or comments removed, as Java with highlights on the variables and method invocations of current interest, or as Java with generated comments about argument names and/or types. We could display complex arithmetic expressions in 2D, the way TeX and mathematicians do. You could see code with additional or fewer parentheses (depending on how comfortable you feel with the precedence rules). Parenthesis nests could use varying size and colour to help matching by eye. With changes as transparent overlay sets that you can optionally remove or apply, you could watch in real time as other programmers on your team, working in a different country, modified code in classes that you were working on too.
You could use the full colour abilities of the modern screen to give subliminal clues, e.g. by automatically assigning a portion of the spectrum to each package/class using pastel shades as the backgrounds to any references to methods or variables of that class. You could bold face the definition of any identifier to make it stand out.
You could ask what methods/constructors will produce an object of type X? What methods will accept an object of type X as a parameter? What variables are accessible at this point in the code? By clicking on a method invocation or variable reference, you could see its definition, helping sort out which version of a given method will actually be invoked. You could ask to globally visit all references to a given method or variable, and tick them off once each was dealt with. You could do quite a bit of code writing by point and click.
Some of these ideas would not pan out. But the best way to find out which would be valuable in practice is to try them. Once we had the basic tool, we could experiment with hundreds of similar ideas to make life easier for the maintenance programmer.
I discuss this further under SCID and in the SCID student project.
Contributors
The following are some of the people who contributed to this list. My lawyers recommended I exclude those who taught by example.
Hugh McDonald, hughmcd@ican
Gareth Meyrick, gareth@pangloss.ucsf.edu
Jarle Stabell, jarle.stabell@dokpro.uio.no
Ko-Haw Nieh, niko@quality.com
Jim Johnson, jimj@jumpmusic.com
Jim Hyslop, Jim.Hyslop@mars.leitch.com
George Ruban, gruban%adsl4@gte.com
Mats Carlid, mats@adbk.se
John P. McGrath, mcgrath@enter.net
Brian Hurt, brianh@bit3.com
Chris Schlenker, Christoph.Schlenker@gfk.de
Nicholas Widdows, nicholas.widdows@traceplc.co.uk
Greg Compestine, gregcompestine@caleb-bldr.com
Carl L. Gay, sigue@thecia.net
This article appeared in Java Developers' Journal (volume 2 issue 6). I also spoke on this topic in 1997 November at the Colorado Summit Conference. It has been gradually growing ever since. I have had quite a few requests for permission to build links here. You are welcome to.
......@@ -30,11 +30,6 @@ import org.alicebot.ab.MagicBooleans;
import org.alicebot.ab.MagicStrings;
import org.alicebot.ab.PCAIMLProcessorExtension;
import com.google.code.chatterbotapi.ChatterBot;
import com.google.code.chatterbotapi.ChatterBotFactory;
import com.google.code.chatterbotapi.ChatterBotSession;
import com.google.code.chatterbotapi.ChatterBotType;
import cse332.misc.WordReader;
import javafx.application.Platform;
import javafx.embed.swing.JFXPanel;
......@@ -53,7 +48,6 @@ public class ChatWindow {
private final StringBuilder content;
public String theirUsername;
public Chat esession;
public ChatterBotSession csession;
private final WordSuggestor[] markov;
private final UMessageServerConnection connection;
private final SpellingCorrector checker;
......@@ -86,38 +80,12 @@ public class ChatWindow {
this.esession = new Chat(bot);
}
else if (this.theirUsername.equals("cleverbot")) {
ChatterBotFactory factory = new ChatterBotFactory();
ChatterBot bot1;
try {
bot1 = factory.create(ChatterBotType.CLEVERBOT);
this.csession = bot1.createSession();
} catch (Exception e) {
}
}
}
/**
* Initialize the contents of the frame.
*/
private void initialize() {
try {
String path = new java.io.File(".").getCanonicalPath();
this.content.append("<link rel='stylesheet' type='text/css' href='file:///"
+ path + "/chat.css'>");
this.content.append("<head>");
this.content.append(
" <script language=\"javascript\" type=\"text/javascript\">");
this.content.append(" function toBottom(){");
this.content
.append(" window.scrollTo(0, document.body.scrollHeight);");
this.content.append(" }");
this.content.append(" </script>");
this.content.append("</head>");
this.content.append("<body onload='toBottom()'>");
} catch (IOException e1) {
}
private void initialize() {
this.frame = new JFrame();
this.frame.setBounds(100, 100, 290, 390);
this.frame.setDefaultCloseOperation(WindowConstants.HIDE_ON_CLOSE);
......@@ -137,17 +105,28 @@ public class ChatWindow {
gbc_msgScrollPane.gridx = 0;
gbc_msgScrollPane.gridy = 0;
this.chatMessagesPanel = new JFXPanel();
this.frame.getContentPane().add(this.chatMessagesPanel, gbc_msgScrollPane);
Platform.runLater(() -> {
ChatWindow.this.chatMessages = new WebView();
BorderPane borderPane = new BorderPane();
borderPane.setCenter(ChatWindow.this.chatMessages);
Scene scene = new Scene(borderPane, 450, 450);
ChatWindow.this.chatMessagesPanel.setScene(scene);
});
show();
(new Thread() {
public void run() {
try {
String path = new java.io.File(".").getCanonicalPath();
content.append("<link rel='stylesheet' type='text/css' href='file:///"
+ path + "/chat.css'>");
content.append("<head>");
content.append(
" <script language=\"javascript\" type=\"text/javascript\">");
content.append(" function toBottom(){");
content
.append(" window.scrollTo(0, document.body.scrollHeight);");
content.append(" }");
content.append(" </script>");
content.append("</head>");
content.append("<body onload='toBottom()'>");
} catch (IOException e1) {
}
}
}).start();
JPanel suggestionsPanel = new JPanel();
GridBagConstraints gbc_suggestionsPanel = new GridBagConstraints();
......@@ -269,11 +248,29 @@ public class ChatWindow {
}
};
this.frame.addKeyListener(giveFocus);
this.chatMessagesPanel.addKeyListener(giveFocus);
suggestionsPanel.addKeyListener(giveFocus);
myMessagePanel.addKeyListener(giveFocus);
(new Thread() {
public void run() {
chatMessagesPanel = new JFXPanel();
chatMessagesPanel.addKeyListener(giveFocus);
frame.getContentPane().add(chatMessagesPanel, gbc_msgScrollPane);
Platform.runLater(() -> {
ChatWindow.this.chatMessages = new WebView();
BorderPane borderPane = new BorderPane();
borderPane.setCenter(ChatWindow.this.chatMessages);
Scene scene = new Scene(borderPane, 450, 450);
ChatWindow.this.chatMessagesPanel.setScene(scene);
});
}
}).start();
this.frame.pack();
show();
this.myMessage.requestFocusInWindow();
}
......@@ -322,13 +319,15 @@ public class ChatWindow {
String text = ("SOL " + this.myMessage.getText()).trim();
int lastSpace = text.lastIndexOf(' ');
String allButLast = lastSpace > -1 ? text.substring(0, lastSpace) : null;
this.undo = this.myMessage.getText();
String newText = (allButLast.replaceAll("SOL", "") + " " + result).trim();
if (this.myMessage.getText().startsWith(newText)) {
return false;
if (allButLast != null) {
this.undo = this.myMessage.getText();
String newText = (allButLast.replaceAll("SOL", "") + " " + result).trim();
if (this.myMessage.getText().startsWith(newText)) {
return false;
}
this.myMessage.setText(newText);
return true;
}
this.myMessage.setText(newText);
return true;
}
return false;
}
......@@ -374,13 +373,6 @@ public class ChatWindow {
receiveMessage(this.esession.multisentenceRespond(msg));
return;
}
else if (this.theirUsername.equals("cleverbot")) {
try {
receiveMessage(this.csession.think(msg));
return;
} catch (Exception e) {
}
}
else {
try {
this.connection.m_channel(this.theirUsername, msg);
......
......@@ -15,7 +15,6 @@ import javax.swing.JList;
import p2.wordsuggestor.WordSuggestor;
public class MainWindow {
private JFrame frame;
private List<String> usernames;
private final List<ChatWindow> chats;
......@@ -51,27 +50,30 @@ public class MainWindow {
list.addMouseListener(new MouseAdapter() {
@Override
public void mouseClicked(MouseEvent e) {
@SuppressWarnings("unchecked")
JList<String> list = (JList<String>) e.getSource();
if (e.getClickCount() == 2) {
int index = list.locationToIndex(e.getPoint());
for (ChatWindow client : MainWindow.this.chats) {
if (client.theirUsername
.equals(MainWindow.this.usernames.get(index))) {
client.show();
return;
(new Thread() {
public void run() {
@SuppressWarnings("unchecked")
JList<String> list = (JList<String>) e.getSource();
if (e.getClickCount() == 2) {
int index = list.locationToIndex(e.getPoint());
for (ChatWindow client : MainWindow.this.chats) {
if (client.theirUsername
.equals(MainWindow.this.usernames.get(index))) {
client.show();
return;
}
}
MainWindow.this.chats
.add(new ChatWindow(MainWindow.this.usernames.get(index),
MainWindow.this.markov, MainWindow.this.connection));
}
}
MainWindow.this.chats
.add(new ChatWindow(MainWindow.this.usernames.get(index),
MainWindow.this.markov, MainWindow.this.connection));
}
}).start();
}
});
this.usernames = new ArrayList<String>();
this.usernames.add("cleverbot");
this.usernames.add("eliza");
this.model = new UsersModel(this.usernames);
list.setModel(this.model);
......@@ -86,6 +88,9 @@ public class MainWindow {
}
usersSet.remove(this.username);
this.usernames = new ArrayList<String>(usersSet);
try {
Thread.sleep(100);
} catch (InterruptedException e) {}
this.model.update(this.usernames);
}
......
......@@ -6,16 +6,24 @@ import java.awt.EventQueue;
import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.Insets;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import java.awt.event.KeyAdapter;
import java.awt.event.KeyEvent;
import java.awt.BorderLayout;
import java.io.IOException;
import java.util.function.Supplier;
import javax.swing.Box;
import javafx.embed.swing.JFXPanel;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JTextField;
import javax.swing.JDialog;
import javax.swing.JProgressBar;
import javax.swing.SwingUtilities;
import javax.swing.SwingWorker;
import cse332.interfaces.misc.Dictionary;
import cse332.types.AlphabeticString;
......@@ -62,35 +70,66 @@ public class uMessage {
}
@Override
public void run() {
int N = uMessage.N;
try {
uMessage.markov[this.i] = new WordSuggestor(uMessage.CORPUS, N - this.i,
4, uMessage.NEW_OUTER, uMessage.NEW_INNER);
uMessage.loading[this.i] = false;
this.window.update();
} catch (IOException e) {
public void run() {
int N = uMessage.N;
try {
uMessage.markov[this.i] = new WordSuggestor(uMessage.CORPUS, N - this.i,
4, uMessage.NEW_OUTER, uMessage.NEW_INNER);
this.window.update();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
/**
* Launch the application.
*/
public static void main(String[] args) {
EventQueue.invokeLater(() -> {
final uMessage window = new uMessage();
window.frmUmessageLogin.setVisible(true);
window.errors.setText("Loading the Markov Data (n = " + uMessage.N + ")...");
uMessage.markov = new WordSuggestor[uMessage.N];
uMessage.loading = new boolean[uMessage.N];
for (int i1 = 0; i1 < uMessage.N; i1++) {
uMessage.loading[i1] = true;
(new Thread() {
public void run() {
new JFXPanel();
}
for (int i2 = 0; i2 < uMessage.N; i2++) {
new Thread(new MarkovLoader(window, i2)).start();
}).start();
final uMessage window = new uMessage();
markov = new WordSuggestor[uMessage.N];
JDialog dialog = new JDialog((JFrame)null, "Please wait...", true);//true means that the dialog created is modal
JLabel lblStatus = new JLabel("<html><b>Loading Markov Data (n = " + uMessage.N + ")...</b><br>Depending on the data structures you're using<br>" +
"and your computer, this might take a bit.</html>");
JProgressBar pbProgress = new JProgressBar(0, 100);
pbProgress.setIndeterminate(true); //we'll use an indeterminate progress bar
dialog.add(BorderLayout.NORTH, lblStatus);
dialog.add(BorderLayout.CENTER, pbProgress);
dialog.addWindowListener(new WindowAdapter() {
@Override public void windowClosing(WindowEvent e) {
System.exit(0);
}
});
dialog.setSize(300, 90);
SwingWorker<Void, Void> sw = new SwingWorker<Void, Void>() {
@Override
protected Void doInBackground() throws Exception {
for (int i2 = 1; i2 < uMessage.N; i2++) {
new Thread(new MarkovLoader(window, i2)).start();
}
new MarkovLoader(window, 0).run();
return null;
}
@Override
protected void done() {
dialog.dispose();//close the modal dialog
window.frmUmessageLogin.setVisible(true);
}
};
sw.execute(); // this will start the processing on a separate thread
dialog.setVisible(true); //this will block user input as long as the processing task is working
}
/**
......@@ -112,7 +151,7 @@ public class uMessage {
gridBagLayout.columnWidths = new int[] { 0, 90, 90, 90, 0, 0 };
gridBagLayout.rowHeights = new int[] { 0, 9, 0, 0 };
gridBagLayout.columnWeights = new double[] { 0.0, 0.0, 1.0, 0.0, 0.0,
Double.MIN_VALUE };
Double.MIN_VALUE };
gridBagLayout.rowWeights = new double[] { 0.0, 0.0, 0.0, Double.MIN_VALUE };
this.frmUmessageLogin.getContentPane().setLayout(gridBagLayout);
......@@ -131,7 +170,7 @@ public class uMessage {
this.frmUmessageLogin.getContentPane().add(this.horizontalStrut,
gbc_horizontalStrut);
JLabel usernameLabel = new JLabel("Username:");
JLabel usernameLabel = new JLabel("UWnetID:");
GridBagConstraints gbc_usernameLabel = new GridBagConstraints();
gbc_usernameLabel.fill = GridBagConstraints.BOTH;
gbc_usernameLabel.insets = new Insets(0, 0, 5, 5);
......@@ -143,10 +182,14 @@ public class uMessage {
this.username.addKeyListener(new KeyAdapter() {
@Override
public void keyReleased(KeyEvent e) {
if (update() && e.getKeyCode() == KeyEvent.VK_ENTER) {
uMessage.this.login.setEnabled(false);
login();
}
(new Thread() {
public void run() {
if (update() && e.getKeyCode() == KeyEvent.VK_ENTER) {
uMessage.this.login.setEnabled(false);
login();
}
}
}).start();
}
});
......@@ -187,14 +230,6 @@ public class uMessage {
}
public boolean update() {
boolean noneLoading = true;
for (int i = 0; i < uMessage.loading.length; i++) {
noneLoading &= !uMessage.loading[i];
}
if (noneLoading) {
this.errors.setText("");
this.errors.setForeground(Color.BLACK);
}
if (!this.loggingIn && this.username.getText().length() > 0) {
this.login.setEnabled(true);
this.errors.setForeground(Color.BLACK);
......@@ -210,11 +245,10 @@ public class uMessage {
this.loggingIn = true;
update();
try {
this.connection = new UMessageServerConnection(this,
this.username.getText().replaceAll(" ", ""));
this.connection.go();
} catch (IOException e1) {
}
connection = new UMessageServerConnection(uMessage.this,
username.getText().replaceAll(" ", ""));
connection.go();
} catch (IOException e1) {}
}
public void badNick() {
......@@ -225,19 +259,6 @@ public class uMessage {
}
public void loggedIn(String username) {
boolean noneLoading = false;
while (!noneLoading) {
noneLoading = true;
for (int i = 0; i < uMessage.loading.length; i++) {
noneLoading &= !uMessage.loading[i];
}
try {
Thread.sleep(200);
} catch (InterruptedException e) {
}
}
this.frmUmessageLogin.dispose();
this.window = new MainWindow(username, uMessage.markov, this.connection);
this.loggingIn = false;
......
......@@ -35,9 +35,11 @@ public class BinarySearchTree<K extends Comparable<K>, V>
public BSTNode[] children; // The children of this node.
/**
* Create a new data node and increment the enclosing tree's size.
* Create a new data node.
*
* @param data
* @param key
* key with which the specified value is to be associated
* @param value
* data element to be stored at this node.
*/
@SuppressWarnings("unchecked")
......@@ -47,7 +49,7 @@ public class BinarySearchTree<K extends Comparable<K>, V>
}
}
private BSTNode find(K key, V value) {
protected BSTNode find(K key, V value) {
BSTNode prev = null;
BSTNode current = this.root;
......@@ -58,9 +60,6 @@ public class BinarySearchTree<K extends Comparable<K>, V>
// We found the key!
if (direction == 0) {
if (value != null) {
current.value = value;
}
return current;
}
else {
......@@ -71,9 +70,9 @@ public class BinarySearchTree<K extends Comparable<K>, V>
}
}
// If value is null, we need to actually add in the new value
// If value is not null, we need to actually add in the new value
if (value != null) {
current = new BSTNode(key, value);
current = new BSTNode(key, null);
if (this.root == null) {
this.root = current;
}
......@@ -90,6 +89,9 @@ public class BinarySearchTree<K extends Comparable<K>, V>
@Override
public V find(K key) {
if (key == null) {
throw new IllegalArgumentException();
}
BSTNode result = find(key, null);
if (result == null) {
return null;
......@@ -99,7 +101,13 @@ public class BinarySearchTree<K extends Comparable<K>, V>
@Override
public V insert(K key, V value) {
return find(key, value).value;
if (key == null || value == null) {
throw new IllegalArgumentException();
}
BSTNode current = find(key, value);
V oldValue = current.value;
current.value = value;
return oldValue;
}
@Override
......
......@@ -89,22 +89,23 @@ public abstract class WorkList<E> implements Iterable<E> {
}
/**
* Note that the toString() method of a WorkList _consumes_ the WorkList.
* This can lead to odd and unpredictable behavior.
*
* @postcondition hasWork() is false
* Returns some basic information about this particular worklist.
*
* Calling this method does not consume the worklist.
*
* @return a string representation of this worklist
*/
@Override
public String toString() {
StringBuilder result = new StringBuilder();
result.append("[");
while (this.hasWork()) {
result.append(this.next().toString() + ", ");
}
if (result.length() > 1) {
result.replace(result.length() - 2, result.length(), "");
if (this.hasWork()) {
return String.format("%s(size = %d, peek = %s)",
this.getClass().getSimpleName(),
this.size(),
this.peek());
} else {
return String.format("%s(size = %d)",
this.getClass().getSimpleName(),
this.size());
}
result.append("]");
return result.toString();
}
}
......@@ -6,7 +6,7 @@ import cse332.datastructures.trees.BinarySearchTree;
* TODO: Replace this comment with your own as appropriate.
*
* AVLTree must be a subclass of BinarySearchTree<E> and must use
* inheritance and callst o superclass methods to avoid unnecessary
* inheritance and calls to superclass methods to avoid unnecessary
* duplication or copying of functionality.
*
* 1. Create a subclass of BSTNode, perhaps named AVLNode.
......
......@@ -14,11 +14,16 @@ import cse332.interfaces.misc.Dictionary;
* restrict the size of the input domain (i.e., it must accept
* any key) or the number of inputs (i.e., it must grow as necessary).
* 3. Your HashTable should rehash as appropriate (use load factor as
* shown in class).
* 5. HashTable should be able to grow at least up to 200,000 elements.
* shown in class!).
* 5. HashTable should be able to resize its capacity to prime numbers for more
* than 200,000 elements. After more than 200,000 elements, it should
* continue to resize using some other mechanism.
* 6. We suggest you hard code some prime numbers. You can use this
* list: http://primes.utm.edu/lists/small/100000.txt
* NOTE: Do NOT copy the whole list!
* 7. When implementing your iterator, you should NOT copy every item to another
* dictionary/list and return that dictionary/list's iterator.
*/
public class ChainingHashTable<K, V> extends DeletelessDictionary<K, V> {
private Supplier<Dictionary<K, V>> newChain;
......
......@@ -9,14 +9,16 @@ import cse332.interfaces.misc.DeletelessDictionary;
/**
* TODO: Replace this comment with your own as appropriate.
* 1. The list is typically not sorted.
* 2. Add new items to the front oft he list.
* 2. Add new items to the front of the list.
* 3. Whenever find is called on an item, move it to the front of the
* list. This means you remove the node from its current position
* and make it the first node in the list.
* 4. You need to implement an iterator. The iterator SHOULD NOT move
* elements to the front. The iterator should return elements in
* the order they are stored in the list, starting with the first
* element in the list.
* element in the list. When implementing your iterator, you should
* NOT copy every item to another dictionary/list and return that
* dictionary/list's iterator.
*/
public class MoveToFrontList<K, V> extends DeletelessDictionary<K, V> {
@Override
......
......@@ -3,10 +3,12 @@ package p2.clients;
import java.io.IOException;
import java.util.function.Supplier;
import cse332.datastructures.trees.BinarySearchTree;
import cse332.interfaces.misc.BString;
import cse332.interfaces.misc.Dictionary;
import cse332.types.AlphabeticString;
import cse332.types.NGram;
import datastructures.dictionaries.AVLTree;
import datastructures.dictionaries.ChainingHashTable;
import datastructures.dictionaries.HashTrieMap;
import p2.wordsuggestor.WordSuggestor;
......@@ -20,6 +22,17 @@ public class NGramTester {
Supplier<Dictionary<K, V>> constructor) {
return () -> new ChainingHashTable<K, V>(constructor);
}
@SuppressWarnings({ "rawtypes", "unchecked" })
public static <K, V> Supplier<Dictionary<K, V>> binarySearchTreeConstructor() {
return () -> new BinarySearchTree();
}
@SuppressWarnings({ "rawtypes", "unchecked" })
public static <K, V> Supplier<Dictionary<K, V>> avlTreeConstructor() {
return () -> new AVLTree();
}
public static void main(String[] args) {
try {
......
package p2.wordsuggestor;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Stack;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import cse332.interfaces.worklists.LIFOWorkList;
import datastructures.worklists.ArrayStack;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;
import org.json.simple.JSONArray;
public final class ParseFBMessages {
private ParseFBMessages() {
/* should not be instantiated */ }
// INSTRUCTIONS:
//
// <Your FB Name> may be either:
// 1) Your name on Messenger (e.g. "Danny Allen")
// 2) Your username on facebook, which can be found by looking at the URL on your profile
// It's typically 1), but for whatever reason Facebook sometimes labels them
// with 2) (sorry!). You can check which one your messages are labeled with by
// opening up one of the message files and taking a look.
//
// <Your FB Archive> is the directory on your computer where the archive is stored.
// (e.g. "/Users/Me/Downloads/MyArchiveName" or "C:\Users\Me\Downloads\MyArchiveName")
// You may be able to use a relative path like "./MyArchiveName", but results can
// vary from machine to machine.
//
// DO NOT PUSH YOUR ME.TXT FILE TO GITLAB. WE DO NOT WANT YOUR PRIVATE CONVERSATIONS!!!!
public static void main(String[] args) throws IOException {
if (args.length != 2) {
System.out.println("USAGE: ParseFBMessages <Your FB Name> <Your FB Archive>");
System.exit(1);
}
String name = args[0];
String archive = args[1];
String name = "<Your FB Name>"; // e.g. "Ruth Anderson"
String archive = "<Your FB Archive>"; // e.g. "/Users/rea/workspace/332/facebook-rea/messages"
LIFOWorkList<String> messages = new ArrayStack<String>();
Document doc = Jsoup
.parse(new File(archive + File.separator + "html/messages.htm"), "UTF-8");
Elements messagesElements = doc.getElementsByTag("p");
Stack<String> corpus = new Stack<>();
File[] listOfFiles = (new File(archive + File.separator + "inbox")).listFiles();
for (Element content : messagesElements) {
if (content.previousElementSibling().getElementsByClass("user").text()
.equals(name)) {
messages.add(content.text());
for (int i = 0; i < listOfFiles.length; i++) {
File conversation = new File(listOfFiles[i], "message.json");
if (conversation.isFile()) {
try {
JSONObject obj = (JSONObject) new JSONParser().parse(new FileReader(conversation));
JSONArray messages = (JSONArray) obj.get("messages");
for (Object m: messages) {
JSONObject msg = (JSONObject) m;
String sender = (String) msg.get("sender_name");
if(sender != null && sender.equals(name)) {
corpus.push((String) msg.get("content"));
}
}
} catch (ParseException e) {
System.err.println("Could not parse: " + conversation.toString());
}
}
}
PrintWriter out = new PrintWriter("me.txt", "UTF-8");
while (messages.hasWork()) {
out.println(messages.next());
while (!corpus.isEmpty()) {
out.println(corpus.pop());
}
out.close();
......
......@@ -15,7 +15,6 @@ import datastructures.worklists.ArrayStack;
/**
* An executable that generates text in the style of the provided input file.
* You will need to modify this file.
*/
public class WordSuggestor {
private final int N, K;
......@@ -73,6 +72,12 @@ public class WordSuggestor {
words[i] = allWords.next();
i--;
}
for (i = 0; i < words.length; i++) {
if (words[i] == null) {
words[i] = "NULL";
}
}
return new NGram(words);
}
......
# Project 2 (uMessage) Write-Up #
--------
## Project Enjoyment ##
- How was your partnership?
<pre>TODO</pre>
- What was your favorite part of the project?
<pre>TODO</pre>
- What was your least favorite part of the project?
<pre>TODO</pre>
- How could the project be improved?
<pre>TODO</pre>
- Did you enjoy the project?
<pre>TODO</pre>
-----
## Experiments ##
Throughout p1 and p2, you have written (or used) several distinct implementations of the Dictionary interface:
- HashTrieMap
- MoveToFrontList
- BinarySearchTree
- AVLTree
- ChainingHashTable
In this Write-Up, you will compare various aspects of these data structures. This will take a significant amount of
time, and you should not leave it to the last minute. For each experiment, we expect you to:
- Explain how you constructed the inputs to make your conclusions
- Explain why your data supports your conclusions
- Explain your methodology in enough detail that we could re-run your experiment ourselves
- Include the inputs themselves in the experiments folder
- Include your data either directly in the write-up or in the experiments folder
- If you think it helps your explanation, you can include graphs of the outputs (we recommend that you do this for some of them)
- We recommend that you keep your "N" (as in "N-gram") constant throughout these experiments. (N = 2 and N = 3 are reasonable.)
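
A repeatable timing methodology could look something like the sketch below. It is only an illustration, not part of the provided code: the class name `TimingSketch`, the method `medianNanos`, and the warm-up and trial counts are all hypothetical choices. The idea is to run the task a few times untimed so the JIT has a chance to warm up, then report the median of several timed runs rather than a single measurement.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the project skeleton.
final class TimingSketch {
    /**
     * Runs the task `warmups` times untimed, then `trials` times timed,
     * and returns the median timed duration in nanoseconds.
     */
    static long medianNanos(Runnable task, int warmups, int trials) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // untimed: gives the JIT a chance to compile hot paths
        }
        long[] samples = new long[trials];
        for (int i = 0; i < trials; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[trials / 2]; // the median is less sensitive to outliers than the mean
    }

    public static void main(String[] args) {
        long ns = medianNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException(); // uses the result so the loop is not dead code
            }
        }, 5, 11);
        System.out.println("median ns: " + ns);
    }
}
```

In an actual experiment the `Runnable` would wrap whatever you are measuring (for example, building a dictionary from one input file), and you would record the warm-up and trial counts in your write-up so the run can be reproduced.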
### BST vs. AVLTree ###
Construct input files to NGramTester of your choosing to demonstrate that an AVL Tree is asymptotically better
than a Binary Search Tree.
<pre>TODO</pre>
### BST Worst Case vs. BST Best Case ###
We know that the worst case for a BST insertion is O(n) and the best case is O(lg n). Construct input
files of your choosing that demonstrate these best and worst cases for a large n. How big is the difference?
Is it surprising?
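
One way to think about constructing such inputs is sketched below. This is a standalone illustration under several assumptions: it uses a tiny hand-written unbalanced BST over `int` keys rather than the project's `BinarySearchTree`, and the names `BstOrderSketch`, `sortedOrder`, `balancedOrder`, and `heightAfterInserting` are hypothetical. Turning these key orders into word files for NGramTester is left to you.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for experimenting with insertion orders.
final class BstOrderSketch {
    private static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Inserts keys in order into an unbalanced BST and returns its height in nodes. */
    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        int height = 0;
        for (int key : keys) {
            int depth = 1;
            if (root == null) {
                root = new Node(key);
            } else {
                Node cur = root;
                while (true) {
                    depth++; // the new node would sit one level below cur
                    if (key < cur.key) {
                        if (cur.left == null) { cur.left = new Node(key); break; }
                        cur = cur.left;
                    } else if (key > cur.key) {
                        if (cur.right == null) { cur.right = new Node(key); break; }
                        cur = cur.right;
                    } else {
                        depth = 0; // duplicate key: nothing inserted
                        break;
                    }
                }
            }
            height = Math.max(height, depth);
        }
        return height;
    }

    /** Keys 0..n-1 in ascending order: every insert goes to the far right. */
    static List<Integer> sortedOrder(int n) {
        List<Integer> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) out.add(i);
        return out;
    }

    /** Keys 0..n-1 with the median of each range first, so the tree stays balanced. */
    static List<Integer> balancedOrder(int n) {
        List<Integer> out = new ArrayList<>(n);
        addMedianFirst(0, n - 1, out);
        return out;
    }

    private static void addMedianFirst(int lo, int hi, List<Integer> out) {
        if (lo > hi) return;
        int mid = lo + (hi - lo) / 2;
        out.add(mid);
        addMedianFirst(lo, mid - 1, out);
        addMedianFirst(mid + 1, hi, out);
    }

    public static void main(String[] args) {
        int n = 1023;
        System.out.println("sorted height:   " + heightAfterInserting(sortedOrder(n)));
        System.out.println("balanced height: " + heightAfterInserting(balancedOrder(n)));
    }
}
```

With n = 1023, the sorted order degenerates into a chain of height 1023, while the median-first order yields a perfectly balanced tree of height 10. The same two orderings, applied to distinct words instead of integers, are one plausible basis for worst-case and best-case input files.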
<pre>TODO</pre>
### ChainingHashTable ###
Your ChainingHashTable should take as an argument to its constructor the type of "chains" it uses. Determine
which type of chain is (on average) best: an MTFList, a BST, or an AVL Tree. Explain your intuition on why
the answer you got makes sense (or doesn't!).
<pre>TODO</pre>
### Hash Functions ###
Write a new hash function (it doesn't have to be any good, but remember to include the code in your repository).
Compare the runtime of your ChainingHashTable when the hash function is varied. How big of a difference can the
hash function make? (You should keep all other inputs (e.g., the chain type) constant.)
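
To make the comparison concrete, here is a rough sketch of two hash functions and a helper that measures the longest chain they would produce. Everything here is hypothetical and independent of the project's `ChainingHashTable`: `lengthHash` is a deliberately poor function, `polyHash` is a base-31 polynomial hash (the same scheme `String.hashCode` uses), and `longestChain` only counts bucket occupancy rather than timing anything.

```java
import java.util.function.ToIntFunction;

// Hypothetical illustration; not the hash function the project expects you to use.
final class HashSketch {
    /** Deliberately poor: only the length of the key decides the bucket. */
    static int lengthHash(String s) {
        return s.length();
    }

    /** Polynomial hash with base 31, evaluated with Horner's rule. */
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    /** Returns the size of the fullest bucket when keys are spread over `buckets` slots. */
    static int longestChain(String[] keys, int buckets, ToIntFunction<String> hash) {
        int[] counts = new int[buckets];
        int max = 0;
        for (String k : keys) {
            int idx = Math.floorMod(hash.applyAsInt(k), buckets); // floorMod handles negative hashes
            max = Math.max(max, ++counts[idx]);
        }
        return max;
    }

    public static void main(String[] args) {
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = String.format("k%04d", i); // all keys have the same length
        }
        System.out.println("lengthHash longest chain: " + longestChain(keys, 101, HashSketch::lengthHash));
        System.out.println("polyHash longest chain:   " + longestChain(keys, 101, HashSketch::polyHash));
    }
}
```

Because every key in this example has the same length, `lengthHash` sends all 1000 keys to a single bucket, which is the kind of behavior that should show up clearly in your runtime measurements.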
<pre>TODO</pre>
### General Purpose Dictionary ###
Compare all of the dictionaries (on their best settings, as determined above) on several large input files. Is
there a clear winner? Why or why not? Is the winner surprising to you?
<pre>TODO</pre>
### General Sorts ###
You have several general purpose sorts (InsertionSort, HeapSort, TopKSort). We would like you to compare these
sorts using *step counting*. That is, for all other experiments, you likely compared the time it took for the various
things to run, but for this one, we want you to (1) choose a definition of step, (2) modify the sorting algorithms to
calculate the number of steps, and (3) compare the results. In this case, there is a "good" definition of step, and
there are many bad ones. We expect you to justify your choice.
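
As an illustration of what instrumenting a sort might look like, the sketch below counts element comparisons in a plain insertion sort over an `int` array. Treat the choice of "step = one element comparison" as one candidate definition that you would still need to justify, not as the expected answer; the class and method names are hypothetical, and the project's own sorts work on generic arrays with a `Comparator`.

```java
// Hypothetical instrumented sort; the project's InsertionSort has a different signature.
final class StepCountSketch {
    /** Sorts `a` in place with insertion sort and returns the number of element comparisons. */
    static long insertionSortComparisons(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int x = a[i];
            int j = i - 1;
            while (j >= 0) {
                comparisons++; // one comparison of x against a[j]
                if (a[j] <= x) {
                    break;
                }
                a[j + 1] = a[j]; // shift the larger element right
                j--;
            }
            a[j + 1] = x;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println("sorted:   " + insertionSortComparisons(new int[] {1, 2, 3, 4, 5}));
        System.out.println("reversed: " + insertionSortComparisons(new int[] {5, 4, 3, 2, 1}));
    }
}
```

On five elements this counts 4 comparisons for already-sorted input and 10 for reversed input, matching the n - 1 and n(n - 1)/2 bounds for insertion sort. For generic sorts, wrapping the `Comparator` in one that increments a counter gives the same measurement without touching the algorithm's control flow.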
<pre>TODO</pre>
### Top K Sort ###
TopKSort should theoretically be better for small values of k. Determine (using timing or step-counting--your choice)
which n (input size) and k (number of elements sorted) make TopKSort worthwhile over your best sort from the previous
experiment.
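
The intuition behind this experiment can be sketched with a bounded min-heap, which does roughly O(n log k) work instead of O(n log n). The sketch below makes several assumptions: it treats "top k" as the k largest elements, it uses `java.util.PriorityQueue` as a stand-in for your own heap, and it counts steps through a counting comparator. The names `TopKSketch` and `topK` are hypothetical, and your TopKSort may define its output differently.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical stand-in; in the project you would use your own heap implementation.
final class TopKSketch {
    /**
     * Returns the k largest values of `data` in ascending order.
     * counter[0] is incremented once per element comparison.
     */
    static int[] topK(int[] data, int k, long[] counter) {
        Comparator<Integer> counting = (x, y) -> {
            counter[0]++;
            return Integer.compare(x, y);
        };
        // Min-heap holding at most k elements; its root is the smallest value kept so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Math.max(1, k), counting);
        for (int v : data) {
            if (heap.size() < k) {
                heap.add(v);
            } else if (k > 0 && counting.compare(v, heap.peek()) > 0) {
                heap.poll(); // evict the smallest kept value
                heap.add(v);
            }
        }
        int[] out = new int[heap.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = heap.poll();
        }
        return out;
    }

    public static void main(String[] args) {
        long[] counter = {0};
        int[] result = topK(new int[] {5, 1, 9, 3, 7, 8}, 3, counter);
        System.out.println(java.util.Arrays.toString(result) + " using " + counter[0] + " comparisons");
    }
}
```

Running the same counter against a full sort for a range of n and k values is one way to find where the two approaches cross over.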
<pre>TODO</pre>
### uMessage ###
Use uMessage to test out your implementations. Using N=3, uMessage should take less than a minute to load using
your best algorithms and data structures on a reasonable machine.
- How are the suggestions uMessage gives with the default corpus?
<pre>TODO</pre>
- Now, switch uMessage to use a corpus of YOUR OWN text. To do this, you will need a corpus.
You can use anything you like (Facebook, Google Talk, e-mails, etc.). We provide
instructions and a script to format Facebook data correctly as we expect it will be the most common
choice. If you are having problems getting data, please come to office hours and ask for help.
Alternatively, you can concatenate a bunch of English papers you've written together to get a corpus
of your writing. PLEASE DO NOT INCLUDE "me.txt" IN YOUR REPOSITORY. WE DO NOT WANT YOUR PRIVATE CONVERSATIONS.
* Follow these instructions to get your Facebook data: https://www.facebook.com/help/212802592074644
* Run the ParseFBMessages program in the main package.
* Use the output file "me.txt" as the corpus for uMessage.
- How are the suggestions uMessage gives with the new corpus?
<pre>TODO</pre>
-----
## Above and Beyond ##
- Did you do any Above and Beyond? Describe exactly what you implemented.
<pre>TODO</pre>
\ No newline at end of file
......@@ -23,7 +23,7 @@ public class CircularArrayComparatorTests extends TestsUtility {
test("test_a_aa");
test("test_equality_consistent_with_compare");
test("test_compare_transitive");
test("test_equals_doesnt_modify");
finish();
}
......@@ -110,4 +110,13 @@ public class CircularArrayComparatorTests extends TestsUtility {
l2.add("b");
return l1.equals(l2) && l1.compareTo(l2) == 0 ? 1 : 0;
}
public static int test_equals_doesnt_modify() {
CircularArrayFIFOQueue<String> l1 = init();
CircularArrayFIFOQueue<String> l2 = init();
l1.add("a");
l2.add("a");
l1.equals(l2);
return l1.size() == 1 ? 1 : 0;
}
}
......@@ -2,7 +2,7 @@ package tests.gitlab.ckpt1;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.TreeMap;
import java.util.Map;
import java.util.function.Supplier;
......@@ -10,16 +10,16 @@ import cse332.datastructures.containers.Item;
import cse332.interfaces.misc.Dictionary;
import cse332.types.AlphabeticString;
import cse332.types.NGram;
import datastructures.dictionaries.HashTrieMap;
import cse332.datastructures.trees.BinarySearchTree;
import p2.wordsuggestor.NGramToNextChoicesMap;
import tests.TestsUtility;
public class NGramToNextChoicesMapTests extends TestsUtility {
private static Supplier<Dictionary<NGram, Dictionary<AlphabeticString, Integer>>> newOuter =
() -> new HashTrieMap<String, NGram, Dictionary<AlphabeticString, Integer>>(NGram.class);
() -> new BinarySearchTree();
private static Supplier<Dictionary<AlphabeticString, Integer>> newInner =
() -> new HashTrieMap<Character, AlphabeticString, Integer>(AlphabeticString.class);
private static Supplier<Dictionary<AlphabeticString, Integer>> newInner =
() -> new BinarySearchTree();
public static void main(String[] args) {
new NGramToNextChoicesMapTests().run();
......@@ -95,7 +95,7 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
if (items.length != answer.length) return 0;
String[] itemsWithoutCounts = new String[items.length];
for (int j = 0; j < answer.length; j++) {
if (items[j].value != 1) return 0;
if (!items[j].value.equals(1)) return 0;
itemsWithoutCounts[j] = items[j].key;
}
Arrays.sort(itemsWithoutCounts);
......@@ -127,10 +127,10 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
return 1;
}
// TODO: Not finished yet
@SuppressWarnings("unchecked")
public static int testRepeatedWordsPerNGram() {
NGramToNextChoicesMap map = init();
// Creates Ngrams to test for with N = 3
NGram[] ngrams = new NGram[]{
new NGram(new String[]{"foo", "bar", "baz"}),
new NGram(new String[]{"fee", "fi", "fo"}),
......@@ -138,7 +138,7 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
new NGram(new String[]{"3", "2", "2"}),
new NGram(new String[]{"a", "s", "d"})
};
// Array of words seen after each Ngram with correlating index from above
String[][] words = new String[][] {
new String[]{"bop", "bip", "boop", "bop", "bop"},
new String[]{"fum", "giants", "giants"},
......@@ -148,7 +148,10 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
};
// yes this is awful, but i can't think of a better way to do it atm
Map<NGram, Item<String, Integer>[]> answers = new HashMap<>();
// Creates answers for getCountsAfter - Word seen after and count
// correlates with words and ngrams above
// Note that words after are in sorted order, not in order of array in words
Map<NGram, Item<String, Integer>[]> answers = new TreeMap<>();
answers.put(ngrams[0], (Item<String, Integer>[]) new Item[3]);
answers.get(ngrams[0])[0] = new Item<String, Integer>("bip", 1);
answers.get(ngrams[0])[1] = new Item<String, Integer>("boop", 1);
......@@ -167,12 +170,14 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
answers.get(ngrams[4])[1] = new Item<String, Integer>("for", 2);
answers.get(ngrams[4])[2] = new Item<String, Integer>("while", 2);
// Adds nGrams and words after to student's NGramToNextChoicesMap
for (int i = 0; i < ngrams.length; i++) {
for (int j = 0; j < words[i].length; j++) {
map.seenWordAfterNGram(ngrams[i], words[i][j]);
}
}
// checks to see if getCountsAfter returns correctly
for (int i = 0; i < ngrams.length; i++) {
NGram ngram = ngrams[i];
Item<String, Integer>[] results = map.getCountsAfter(ngram);
......@@ -187,12 +192,15 @@ public class NGramToNextChoicesMapTests extends TestsUtility {
});
Item<String, Integer>[] expected = answers.get(ngram);
// checks for correct number of unique words after
if (results.length != expected.length) return 0;
for (int j = 0; j < expected.length; j++) {
// checks if correct word after via sorted words
if (!expected[j].key.equals(results[j].key)) {
return 0;
}
if (expected[j].value != results[j].value) {
// checks if correct count for given word after
if (!expected[j].value.equals(results[j].value)) {
return 0;
}
}
......
......@@ -139,6 +139,7 @@ public class AVLTreeTests extends TestsUtility {
// Check for accuracy
passed &= totalCount == (n * (n + 1)) / 2 * 5;
passed &= tree.size() == n;
passed &= tree.find("00851") != null;
passed &= tree.find("00851") == 4260;
return passed ? 1 : 0;
......
......@@ -11,7 +11,10 @@ public class Ckpt2Tests extends GradingUtility {
return new Class<?>[] {
AVLTreeTests.class,
HashTableTests.class,
CircularArrayHashCodeTests.class
CircularArrayHashCodeTests.class,
QuickSortTests.class,
TopKSortTests.class,
HeapSortTests.class
};
}
}