Wednesday, September 29, 2010

Class Notes: 9/29/10

Ideas for strategies?

Elba: Put 2 lights overlapping with the collector between them to guide the mosquitoes
David: When we have sub-areas, put the collector right near the gap to bring them to it
Yang Jian: Put lines along each obstacle
Neerja: Also need to focus heavily on the light frequency, not just the placement
Dan: Once you get the mosquitos in one area, it's easy to move them as a bunch, rather than moving them individually. The obstacles can be used to keep them in the bunch
Andrew: What if you have lights near each other and flick them really really fast, so that they go to some average location instead of to the light directly
Eva: As we have more and more lights, create an abstraction of light groups
Pilunchana: What if you had only one light?
Neetha: Place the lights in places that are less obscured by obstacles
Neerja: A lot of how long things take to run depends on the initial placement of the mosquitos
Archana: Place the lights in a zig-zag pattern to maximize the board coverage
Pilunchana: When there are a lot of obstacles, a zig-zag may be very useful
Neha: This doesn't work when you have very few lights
Nitin: Never put the collector out of the way
Elba: When you do the zig zag, they should overlap 100%


Deliverables for Weds:
1 - Come up with at least 1, preferable 2 boards per group, and a target number of lights for that board
2 - A player, which doesn't have to do anything super useful. Modify dumb player somehow interesting

Teams for Project 2:
Dan Iter 1
Zach Sheppard 1
Daniel Wilkey 1
John Graham 2
Jesse Bentert 2
Elizabeth Kierstead 2
Lauren Pully 3
Neetha Sebastian 3
Eva Sitaridi 3
Archana Balakrishnan 4
Neerja Pancholi 4
Flavio Palandri Antonelli4
Dawei Shi 5
Neha Srivastava 5
Nitin Natarajan 5
Elba Garza 6
Andrew Shu 6
Pilunchana Kiatruangkrai 6
Blake Arnold 7
Yang Jian 7
Dan Federman 7

(Is there a bug causing the mosquitos to not fly straight? Can we add more contrast to the simulator?)

Monday, September 27, 2010

Class notes: 9/27/10

Tournament format:
Warning: If your player errors out during the tournament, it will be removed from the rest of the tournament

Do we include old players?
Neha: Include just G1
KR: Let's have 2 classes of tournaments - one with just new players, one with both
Daniel: Some have bugs
KR: OK, so we will use G1 as a representative of last year's players
KR: Note that we are optimizing for high score, rather than ranking

KR: Any ideas for a new tournament, not one that we did last year?
Nitin: Bring in more of LessStingyPlayer
DFed: Using LessStringyPlayer will hurt conservative players
KR: Is it interesting to run with 7 hidden letters?
Elba: Yeah, it's pretty interesting. People still get 1 letter. There is still some play
Nitin: With 7 hidden letters, LessStingPlayer should be there.
Yang Jian: 7 hidden letters means that we depend a lot more on luck
Eva: With 7 hidden letters, we should have more rounds
Daniel: Yes, we could do this and it wouldn't really take that long either
Neerja: Leave out group 1 when you start with 7 hidden letters
KR: But it actually shows an interesting baseline of what happens when you just use your 7 starting letters

Final format:
Number of rounds fix to 100
Vary number of letters 0-7 (for 7 hidden letters do 200)
We will include last year's Group 1 as a player in addition to this year's 7:
8 choose 2 2-player tournaments
8 player tournament
8 player tournament, with 2 of 4 players, done enough so that everyone plays as a duplicate player once
single player tournament, with 4, 8, and 12 of each player
8 choose 2 2-player tournaments, with 5 of each player, plus a "LessStingyPlayer" who always bids rand(0,3)

If we have time afterwords, run 8 players + LessStingyPlayer

Last year:
Vary number of letters 0-6
Vary number of players 2-14
Number of rounds fix to 50

5 choose 2 2-player tournaments
5 player tournament
5 player tournament, with 2 of each player
single player tournament, with 5, 10, and 15 of each player
5 choose 2 2-player tournaments, with 5 of each player, plus a "StingyPlayer" who always bids 0

I will make the results available only as a SQL database - no more text output. I'll send out a SQLite database, probably (might make a public MySQL instead)


We did a run with 7 letters...
What went well/wrong?
Daniel: We are bidding high on letters when we are close
Lauren: We didn't do so great - we were too conservative, which is a big problem on 7 hidden letters. When we get 7 letters for free, we should make sure that we're not that conservative

KR: What makes a successful player?
EPK: It's important to adjust your bidding range based on what everyone else is doing
Monica: It should, however, be largely based upon how you value the letters.
Elba: You need to STILL value the letters carefully on a low/medium/high range, but you must define what low/medium/high is from the market
Neha: Never bid on anything that doesn't help you (ie, don't try to bid to hurt people)

KR: How many of your players will really try to harm opponents in a small game?
(4-5 people raised their hands)

Blake: Being able to adapt to the game will be important.
Yang Jian: Same - having multiple strategies

KR: How many of you have multiple strategies that you switch between?
(about half)
Zach: Our player has 3 strategies, which it switches based upon how many letters we have. We switched based upon an early/middle/late game strategy
Dfed: We are focusing on getting one main strategy to work, otherwise things become much too complex
Neerja: Concentrate on the main strategy. Work on fine tuning various parameters
KR: Yes, for sure - work hard on figuring out how to set each parameter
David: Budget control in 2 player game - make sure that you always have points to rebound as necessary
Daniel: The bonus is very disproportionate, (jumping from 11-12 points to 60+), so it's sooo important to make sure you get a 7 letter word. Spending 30 points for a 6 letter is not OK, but 50 points for 7 letters is good. It's really, really hard to figure out all of the specific corner cases and account for them.

KR: What did you use from last year?
EPK: A priori algorithm
KR: Did anyone rely on last year's strategies, with or without using the code?
Zach: The early/middle/late idea
KR: How essential was it to use last year's players for background?
Eva: We were able to bootrstrap our own discussions
Dan: We already knew a lot of the issues
Zach: It was helpful to be able to play against last year's players as we were developing
Archana: I was in some ways constrained to thinking in previous ways
Dan: It was interesting to read from past group's mistakes

Wednesday, September 22, 2010

Class notes: 9/22/10

JB will be returning the score and/or word of each player at the end of the round

We did a run:
Dan (g6): We happened to bid 5 for a Q, which we think to be a relatively low bid, then 95 for an E. We are assuming that nobody else will bid high
Daniel: Highest scoring word without 7 letters is 32. 95 is way too much
Blake: Maybe getting something for 0 points and having a huge rack is a good thing this time

Flavio: We have a lot of letters right now, but we've been bidding pretty low. We plan to get as many letters as possible, hoping that we'll eventually get a 7 letter word.

KAR: So what have we learned so far?
Daniel: G1 from last year is really good.
Neha (g5): We have been bidding pretty close to the true value of the word, without thinking about the value to others (we're not getting things).
Zach: Hidden letters form a burden for last year's players, more so than for us
Neerja: We are trying to do more upfront
John: Everyone is driving up the cost of common letters more so than they should be
Eva: We should adjust bids based upon the frequency of letters remaining
KR: But a few people overbidding can still drive everything up

Monica: But some letters may not come up at all in the game
Daniel: An agressive/passive strategy will play differently in a game of different size. We need to take that into account.

KR: In a two player game, maybe you should never bid 0.

David: With 6 hidden letters, people should want different things, so maybe bids will be lower

Monica: Actually, you can pay a LOT now, because you only need 2 letters!

Deliverable for Monday

A player that is robust under these various kinds of configurations.

Final deliverable due at noon on Wednesday.

Monday, September 20, 2010

Class Notes: 9/20/10

KR: Last week we didn't really talk about HOW we should/will/are implementing our strategies. What kinds of data structures are appropriate? Are we happy with what happened last year? What kind of innovation can be made?

Jesse: For any hand of letters, calculate the expected value if you had these letters. You'd need the probability of getting each letter, and of what the cost may be. Use a linear scan, no special data structures
Nitin: In the beginning, compute the list of 7 letter words. Prune out the ones that you can't form with at least 50% of what's on your rack currently. Bid high, low, or medium based upon frequencies.

KR: Let's focus on how, rather than pure strategy.
Neeha: Need to adapt previous strategies to take into account how many bids are left.
EPK: We use the apriori algorithm. We can use this general algorithm to make strategy decisions. The interesting change is that we can look at subsets of our rack, because we can form a 7 letter word with more than 7 letters (rather than assume that all letters on the rack must be used to form a 7 letter word).
Daniel: We use an intersection to create a list of words with 1 A, 1 B, etc, 2 A's, 2B's, etc. We do a linear merge upon the sorted word list.

Zach: We take each word, and if it's possible, rebuilt them into a list of letters, sorted by the letters that they have.
DFed: Each node tells you how many words are reachable from each set of letters. (JB: Take a look at Group 2 from last year, we had this all in a SQLite database, plus some other stats too :P)
Eva: We do this all in a few hundred MB

Lauren: We are doing an intersection of both strategies. We have an alphabetical trie for word lookup at the end of the game, plus apriori to find what is reachable. We assume that we may get at least 8 letters, so that you can drop a letter if necessary.
Dan: Try to look at percentage of words that can be formed overall. Rather than counting the words that ARE possible, look also at the words that are form-able.

We ran a game and:
Andrew (g4): We bid 22 for an E. We did this because we are close to 7, and we really like E's.
John (g9): We bid 12 for a V, and that's because we can form a lot of 7 letter words with (A,I,R,V). Our max bid is 15
Aranchana (g1): Similar to John/G9's strategy here
Monica: We are planning to add more to infer others' bidding strategies
Group 6 bid 68 for an S (bug)

DFed: Everyone should be bidding high on S's because they should perceive that they form more 7 letter words.

KR: Did we learn anything about strategy from this?
Daniel: Everyone seems to be pretty conservative.
Monica: Group 3 took on way too many vowels. Only looked at letter frequencies, not looking at what words can be made.
Nitin: We were supposed to stop after a 7 letter word, but stopped at 7 letters. Try to figure out the value of a high bid for everyone before putting in a high bid.
Flavio: We computed the average score for a letter, but we should have come up with an expected score (using the probability of getting each letter)

KR: What's the next most important thing to get working?
DFed: Coming up with an approximation of a letter's value to other people, to adjust your bid.
Blake: If we do a strategy where we bid more than we really value it for, we can use a sliding "bank" for using to outbid others, limiting the damage.
Neeha: DFed & Blake's strategy sound good.

Deliverables, etc

For this Weds: Fix your bugs, etc. and improve your strategies! Let's see the implementation match the discussion, rather than a "we should be doing X but seem to be doing Y" discussion.

Note that the 27th will be the last discussion class for this project. Final code will be submitted on the 29th, and presentations/write ups on Oct 4.

Using my old SQLite database

If you want to play around with statistics, you should take a look at the seven.db file included in the simulator in src/g2/util. You can access it with the sqlite application (I believe one comes standard on Mac OS X, you can download one otherwise).

There are two tables:
lettercollections:
letters|oc_1|oc_2|oc_3|oc_4|oc_5|oc_6|oc_7|a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z

The first column is the set of letters, sorted alphabetically. Each oc_n column shows how many words occur with those letters of length n. The a-z columns show how many of each letter are in that word.

words:
word|letters|score

The first column is the actual word, the second the alphabetically sorted set of letters, and the 3rd the score you'll get for it.

SQLite supports standard SQL, ex, to find the most common letters in 7 letter words:
pazuzu:util jon$ sqlite3 seven.db
sqlite> select letters,oc_7 from lettercollections where length(letters)=1 order by oc_7 desc;
letters|oc_7
E|16068
S|13216
I|12154
A|12028
R|10727
N|9584
O|9426
L|8664
T|8382
U|6765
D|6627
C|5942
G|5338
M|4941
P|4711
H|4599
B|3996
Y|3298
K|2718
F|2563
W|2285
V|1623
Z|760
X|629
J|577
Q|360

Wednesday, September 15, 2010

Class notes: 9/15/10

We did a run, here's what happened:

Zach: Group 7 is bidding rand(0,20) for every letter.
Nitin (Group 4): Bids based upon the frequency of getting new 7 letter words
Shishir (Group 2): Calculate an average best-case points for using the letter, then bid it
Neetha (Group 8): Tries to bid a lot, but is underbidding relative to everyone else
Lauren (Group 3): Bid 1 by default. Bids more if it will give us a 7 letter word, plans to increase bidding to just overbid other people
EPK (Group 1): Multiply the face value of the letter by 3
Aditya (Group 8): Focus on letters that are frequent in words. When it's an uncommon letter, put out a random bid

KR: If you aren't bidding high and others are, should you be worried?
Daniel: No. The high bidding players will lose all of their points, while the low bidding players will do OK
Nitin: If there were very few players, the agressive bidder will prevail
Aniket: Conservative players will miss out on the letters they need
Dan: Eventually market values will go down, and the conservative players will prevail for their low investment

KR: What did we learn from this round?
Archie: It's OK to be offensive, as long as nobody else is.
Monica: There are other ways to be offensive, ex, bidding high early on OR just randomly bidding high to block others
John: You should decide when to be agressive based upon how many points are left/where we are in the round. If we are later in the game, then you'll probably pay less (via the Vickrey system)


KR: How does the Vickrey auction fit in with this all?
EPK: There's no advantage to bidding high if others are. Just let them pay as much as they want.
DFed: The value of a letter for each team is not really knowable if there are secret letters. In the relatively short period of time that this game runs on for, it's hard to guess what people might bid. Just bid whatever your own value is, plus some perturbation perhaps.
Neeha: How high you bid should also depend on your present score - you don't want to run out of points
Andrew: For a micro-strategy, still value letters after you have a 7 letter word, to bid low and prevent them from getting anything (or at least making them pay)
Archana: Should keep a history and infer the strategy of other players. Realize that if someone needs an A, avoid it, realizing that it will be cheaper
David: When we track bids, we're too conservative, because we are including stingy player.
Jesse: Nobody wants to waste the points to try to block someone else, because maybe someone else will. Limit blocking to small games only

KR: It might be a good idea to focus on our own strategies before working on counter and counter-counter strategies
Neeha: To really block other players, you might need to get 10-12-15 letters, and it will be costly
Aniket: Use a progressive bidding strategy, valuing letters more as they become less available
DFed: There will be deflation over the course of the game, so you should keep bidding something as you keep going

EPK: Should we bid based upon how many points we have?
Flavio: No, look only at what you spent so far on that word.
Monica: But if you have 60 points and everyone else has 100, it's worth trying to catch up.

KR: What about simulating your strategies... run them in parallel then decide which one to use

Neerja: Focusing on countering someone else always gets you into trouble
Daniel: If you are very close with someone else, you SHOULD try to kill them, if everyone else is far behind, because it won't really sacrifice your ranking
Archie: Separate tactics from strategy: Strategies are long term, but within a game you might have multiple tactics
Pilunchana: You might think that your bidding is high, but it might not actually be relative to others


Deliverable for Monday:
Revised player that is correct (doesn't claim it has letters it doesn't), and does something smart (doesn't just bid random, etc)

Monday, September 13, 2010

Class Notes: 9/13/10

Remember to register for the course if you haven't.

Project 1 Introduction

Q/A:
Daniel: Are letters carried over between rounds?
KR: No
Zach: If rounds are independent, do the number of secret letters remain constant?
KR: Yes
Jatin: How are starting points associated with the number of secret letters?
KR: You always start with 100 points no matter how many secret/starting letters you have, but in each game everyone starts with the same number of letters.

KR: What do you think were some important ideas identified last year?
Elizabeth: Using a strategy that values letters based upon how others are bidding as well.
Jesse: Valuing a letter more if it kept paths to getting 7 letters high
Aniket: Integrating the history - remembering that if it was hard to get an A before, it might be hard to get an A again
Monica: Microstrategies are helpful in creating refinements vs. a monolithic strategy

KR: Remember that you have access to last year's players. This can be good, or it can be bad, depending on how they were written and what you plan to do. Remember that if you take last year's player, tweak a few parameters, and submit it as yours, you will not be evaluated as highly as a team who brings in new, fresh ideas. Taking someone else's code as a starting base is fine, but make sure to implement NEW functionality as well, which you can clearly attribute as your own work.

KR: What else was interesting from last year?
John: Preprocessing over the wordset (offline)
Flavio: You also need to have something fast online (ex, the apriori algorithm)
(JB: paper here, Ben's original code here, original comments from Ben: "The ScrabbleMiner file has a demo/test program that builds the data structure and then looks up the set of letters passed on the command line--the other two have the actual logic.  The rest of the code has fewer comments than it ought to, but isn't *entirely* incomprehensible, I hope.  If you want to comprehend it in detail, the attached paper would probably be helpful, though.")
Yang-ziah (sp?): Some players gave a bonus for getting a letter that would complete a 7 letter word (typically 15 points). Also important to track letter frequency (as it decays from the starting scrabble frequencies)

KR: Remember, I strongly prefer that you are not using laptops during the class, unless you are, ex, submitting your code so that JB can run it, or doing something very very necessary (JB: Don't bother taking notes at all except for maybe jotting a few reminders to yourself, just read mine).

Monica: Data management was super necessary, ex, removing anagrams from the dictionary
Eva: Using a loss-based strategy - sometimes you should bid high just to block somebody else from getting something
KR: That wasn't that useful last year actually. Could it be more useful this year?
Neha: Since we have an unlimited rack size, even if we won a letter we didn't want, we won't get killed by it
Pranjal: If we've already gotten our 7 letter word, bid to block others (or make them pay) instead of bidding 0
Zach: You might still end up winning and losing points
Lauren: If you're in a small group of players it might be useful still, or if you are ahead by a lot.
Dan: Will we know how many rounds we are playing?
KR: Yes - it's 8p
Dan: Actually, if you are behind, you might want to take some risks. KR - Or, have a "meta strategy" to say "am I doing well?" and if not, "do something different
DFed: You might be losing a lot to get these letters, because if someone really needs them, they'll bid high
Nitin: If there are no hidden letters, you have perfect information, and you can stop it. And you should. If there ARE hidden letters, then you can only approximate if you are blocking a 7 letter word.
Elba: Alternatively, have a micro-strategy that changes when anyone gets 7 letters
Archy: Is "winning" by total points or total wins? KR - Total points. Archy - So don't always win, just do above average
Jatin: Maybe you should just always bid high at the start, since you'll be the only one taking this risk
Blake: This could be very, very bad if someone else does the same!
Aditya: In the first round, if people re bidding a lot, then hold out and wait to see who does what.

KR: What are some important differences we should look at?
Andrew: Sniping may not be as effective this year, since there will be more and more letters. The premium to get the 7th letter may be lower
Daniel: Hard to do a lot in 1 second
Bo: Pay more attention to other people's bidding and create strategies based upon that
Yang-ziah (sp?): Should focus more on ourselves

KR: How big a deal is it that there are more letters possibly?
Neerja: Do all of the letters have the same # of points? KR - No, they are scrabble scored.
Archana: These extra letters force you to bid more points
Claire: Different strategies for different games (2/3/4 etc person games)

Deliverable for Wednesday:
Submit a player. It doesn't have to be great. It doesn't have to win. It shouldn't crash, and it should have something that is building towards a final strategy.

Please make sure to name your players in the package:
seven.f10.g<groupnumber>.<yourplayername>

Project 1 Teams


Name Group
Elizabeth Kierstead 1
Archana Balakrishnan 1
Blake Arnold 1
Lauren Pully 3
Elba Garza 3
Dawei Shi 3
Andrew Shu 4
Neetha Sebastian 4
Flavio Palandri Antonelli4
Nitin Natarajan 4
Neerja Pancholi 5
Neha Srivastava 5
Jesse Bentert 5
Dan Iter 6
Pilunchana Kiatruangkrai 6
Yang Jian 6
Eva Sitaridi 7
Zach Sheppard 7
Dan Federman 7
John Graham 9
Daniel Wilkey 9
Monica Ramirez-Santana 9





Friday, September 10, 2010

Project 1 Simulator Available on CW

Please find the simulator for project 1 available on CourseWorks, with some brief notes on running it and implementing your own player on google code.

Tuesday, September 7, 2010

Getting started with SVN and Eclipse

Some people come into this class and haven't ever used SVN (or another version control system). With the super-fast deliverable schedule and intense team collaboration, you'll need to use a system for sharing your code with each other and managing revisions. If you've never done this before, take a look at this tutorial on using google code, eclipse, and svn for tips on getting started. If you get stuck, feel free to shoot me a note.

Update:
Some platforms don't have native SVN libraries available for subclipse to use:

If you are using a 64bit windows platform, you may need to install this:
http://www.sliksvn.com/en/download

If you are using a 64bit Mac platform, you may need to install this:
http://www.open.collab.net/downloads/community/

Welcome to PPS!

Welcome to the TA Page for the Fall 2010 edition of Programming and Problem Solving - Columbia COMS 4444.

First, about me: My name is Jonathan Bell ("Jon Bell" or "JB" most commonly). I'm a first year Master's student here, but did my undergraduate here and took PPS last year. As an undergrad, I specialized in the "Applications" track. As an MS student, I'm in the "Systems" track. I've interned at Sikorsky Aircraft (in CT), Codestreet (here in NYC), and have been working here for the Dean running and creating various software systems for the school. As of this summer, I have also become co-founder and CTO of RentPost, an online tenant-landlord relations system focusing on rent collection. As an undergrad I worked with Residential Programs for two years as a Community Adviser, creating programming for the 500 residents in the buildings that I oversaw. In my downtime, I enjoy cooking, riding my bike, and taking pictures. It's very poorly maintained, but I do have a flickr.

I'm very excited to be TA'ing this course: it was my favorite course by far last year, and probably top 2-3 courses ever. I've done my best this summer to get the simulators for this term's 4 projects (and make them as bug free as possible!) but please let me know immediately if you believe you have found any bugs or have any problems. I have created a Google Code as a repository for the simulators for your convenience, should we need to update the simulators. I will, of course, also be posting exports of the simulators on CourseWorks as well.

Contacting me:
I will be holding office hours tentatively on Mondays from 11:00-12:00 and Wednesday from 2:45-3:45 in 6LE1 CEPSR - the Programming Systems Laboratory. These hours are subject to revision within the next few weeks. I'm typically around campus a LOT though, so please feel free to ask to meet at another time. I'm available by email, at jbell at seas dot columbia, Jabber (GChat etc) at jon at jonbell (dot) net, or AIM at jonbellhasnoh. If you're stuck with something and want to just ask me a quick question, don't hesitate to ping me on GChat or AIM. If I'm busy I'll let you know, but more than likely I will be able to help you if I'm around.

I hope you enjoy your semester!
Jon