With over !NaN! hits and counting!

May 27, 2009

Trying to Game Swoopo, dagnabit

Filed under: data, statistics, R, kinda maybe funny — Dave @ 12:14 am

Casinos love guys with systems.

Jeff Atwood and Ted Dziuba both hate Swoopo, so it’s roughly as bad as PHP. A quick overview: “auctions” start at $0.00, and each bid raises the price by pennies, extends the time remaining in the auction by 10 seconds, and costs the bidder 75 cents to place.

If you can get the last bid in (and you only place a few), you can pick up a $1000 laptop for $30. I mostly ignored Swoopo until Joshua Stein tried to game it. He was thwarted by HTTP requests not being accurate to the sub-second (since Swoopo gives ties to the users who waste money on automatic bidding), and determined that bidding was indistinguishable from gambling.

But I’m not convinced it can’t be gamed. The key is that you want to come out ahead with high probability across many auctions, rather than win any one auction.

Just as a first pass, I think you want to find auctions where:

  • Several auctions are closing at the same time, so there’s less competition
  • The auction closes at a particular time of the day, for the same reason
  • The item is worth $500+ and is selling for more than 90% off, so any accidental purchases can be safely sold at a profit (I don’t want to bother reselling DVDs)
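The third criterion is the easiest one to write down as a filter. Here is a minimal sketch in R, assuming a table with one row per auction; the column names (`suggested_price`, `final_price`) and all of the rows are invented for illustration and are not Swoopo data.

```r
# Hypothetical auction table: column names and values are made up for this sketch.
auctions <- data.frame(
  item            = c("laptop", "dvd", "camera", "tv"),
  suggested_price = c(1000, 20, 600, 1500),
  final_price     = c(30, 5, 200, 120)
)

# Discount relative to the suggested price (0.97 means "97% off").
auctions$discount <- 1 - auctions$final_price / auctions$suggested_price

# Keep only $500+ items that sold for more than 90% off.
candidates <- subset(auctions, suggested_price >= 500 & discount > 0.90)
print(candidates)
```

The other two criteria (overlapping closing times and time of day) would need a closing timestamp per auction, which this sketch leaves out.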

So I used a Greasemonkey script to download the last 10,000 winners into a spreadsheet.

Quick facts:

  • 9904 auctions were won by 4217 distinct users (7 by phone)
  • The average savings (vs the suggested price) was 65%, although in 35 auctions the winner paid more than the suggested price
  • 2853 auctions were open only to manual bidders, rather than the automatic bidbutler (the difference in savings, 66% vs 66%, isn’t significant)
  • Wins are spaced fairly evenly throughout the 24 hour clock
  • The average winner placed ~95 bids. Thousands are not uncommon, and one “winner” placed 2623 bids
  • Roughly one in ten auction winners placed only 1 or 2 bids.

Clearly the last point hints that it’s possible to win by sniping at the last minute.

Roughly 1 in 8 auctions was for items valued at more than $500, and won for less than 20% of the suggested price. “Winners” used an average of 311 bids — that doesn’t look good.
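To see why 311 bids doesn’t look good, here is a rough back-of-the-envelope sketch in R. The 75-cent bid fee and the 311-bid average come from the numbers above; the $500 item value and the 20% final price are just the edges of this category, picked here as an illustrative case rather than measured from the data.

```r
# Figures from the post.
bid_fee  <- 0.75   # dollars charged per bid
avg_bids <- 311    # average bids placed by winners in this category

# Assumed for illustration: an item right at the edge of the category.
item_value     <- 500    # suggested price in dollars
final_fraction <- 0.20   # winning price as a share of the suggested price

fees_paid   <- bid_fee * avg_bids              # money spent on bids alone
final_price <- item_value * final_fraction     # price paid for the item itself
total_cost  <- fees_paid + final_price
real_saving <- 1 - total_cost / item_value     # saving once bid fees are counted

cat(sprintf("Bid fees:    $%.2f\n", fees_paid))
cat(sprintf("Total cost:  $%.2f\n", total_cost))
cat(sprintf("Real saving: %.0f%%\n", 100 * real_saving))
```

Under those assumptions the bid fees alone come to $233.25, more than twice the winning price, and that is before counting any bids spent on auctions that were lost.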

Next step, crack out the R.

Source: Swoopo dataset 3

June 2, 2007

Printing Photos

Filed under: statistics — Dave @ 9:29 pm

I’m looking to get a few hundred photos printed so I’ve been looking at prices.

PE Photo looks good, and I read good things about them.

They have a lot of different print sizes, and since I’m planning on covering a wall in them, I was curious whether there was a cheaper way to do it. (There isn’t.) Here’s a graph of the cost in dollars per foot for different photos. Strangely, the price seems to peak at the mid-size pictures rather than at either extreme.

costperfoot.png


Some other nice printing prices seem to be overnightprints.com’s 100 full colour business cards for $10 and the custom mugs at discountmugs.com.

April 7, 2007

Awksomeness (part 2)

Filed under: awk, programming, statistics, R — Dave @ 8:39 pm

I expanded the program from earlier to include special cases related to the data at Rate My Prof:

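BEGIN {
 # Treat each input line as a single field, then print the CSV header.
 s=""; FS="\n";
 print ("last,first,department,votes,quality,ease");
}
# Table cell: strip the tags and whitespace, then append the text to the current row.
/<td/ {
 str = $1;
 gsub(/<[^>]*>/, "",  str);
 gsub(/[\t ]/, "", str);
 if( length(str)<40 && length(str)>0 )s=(s str ",");
}
# Start of a new table row: tidy up and print the row collected so far.
/<tr|<TR/ {
 sub(/,$/, "", s);
 gsub(/&nbsp;/, "0", s);
 gsub(/,,/, ",", s);
 if(length(s)>0) print s; s=""
}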

In R:

> uw<-read.csv("c:/newsite/articles/ratemyprof/marksuw.txt")
> plot(uw$ease,uw$quality, xlim=c(1,5), ylim=c(1,5))

Quality vs Easiness

And the first result is that a professor’s quality and easiness aren’t strongly correlated.
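One way to put a number on “not strongly correlated” is `cor()`. The sketch below uses a tiny made-up data frame as a stand-in for `uw`, since the scraped file isn’t reproduced here; the values are invented and say nothing about the real data.

```r
# Made-up stand-in for the `uw` data frame; these are NOT Rate My Prof numbers.
uw_demo <- data.frame(
  ease    = c(2.1, 3.4, 4.0, 1.8, 3.0, 4.5),
  quality = c(3.9, 2.7, 4.2, 3.1, 4.8, 3.3)
)

# Pearson correlation: near 0 means little linear relationship,
# near +1 or -1 means a strong one.
r <- cor(uw_demo$ease, uw_demo$quality)
print(r)
```

On the real data the same call would be `cor(uw$ease, uw$quality)`.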

Actually, here’s a more honest graph:

> uw$quality2<-uw$quality+runif(length(uw$quality), min=-.05, max = .05)
> uw$ease2<-uw$ease+runif(length(uw$ease), min=-.05, max = .05)
> plot(uw$ease2,uw$quality2, xlim=c(1,5), ylim=c(1,5))

Quality vs Easiness 2
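The random nudge matters because many profs share exactly the same rating, so their points land on top of each other and a single dot can hide dozens. Base R also has `jitter()` for this; with `amount = 0.05` it adds uniform noise in the same ±0.05 range as the `runif()` version. A small sketch, with made-up ratings standing in for the real columns:

```r
set.seed(1)  # only so the demo is repeatable

# Made-up ratings with deliberate duplicates.
ratings <- c(3.0, 3.0, 3.0, 4.5, 4.5, 2.0)

# Manual version, as in the post: add uniform noise in [-0.05, 0.05].
manual <- ratings + runif(length(ratings), min = -0.05, max = 0.05)

# Built-in version: jitter() with an explicit amount uses the same range.
builtin <- jitter(ratings, amount = 0.05)

# Either way, no point moves by more than 0.05.
print(max(abs(manual - ratings)))
print(max(abs(builtin - ratings)))
```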

Looking at the distribution of “quality” marks:

Original Quality Distribution

The data isn’t normally distributed, not even close (the average is 3.4), and if a prof has only one vote then that vote skews them far more than it should (a prof with 50 votes averaging 4.5 is probably better than a prof with a single 5). So I’m going to multiply the distance from the mean by the square root of the number of votes divided by 10:

> uw$qual<-mean(uw$quality)+((uw$quality-mean(uw$quality))*(uw$votes/10)^.5)
> hist(uw$qual,breaks=20)

Modified Quality Distribution

Much nicer. Except there’s still one prof, originally rated 2.3 but by 198 voters, who gets slaughtered down to a -1.5. Maybe we don’t need to worry about a few edge conditions.
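As a sanity check, the adjustment can be wrapped in a small function and run on the cases mentioned above. This is a sketch that assumes the mean is exactly the 3.4 quoted earlier (the real code uses `mean(uw$quality)`); the name `adjust_quality` is made up for this example.

```r
# Stretch or shrink the distance from the mean by sqrt(votes / 10).
# `adjust_quality` is a hypothetical helper name, not part of the original code.
adjust_quality <- function(quality, votes, mean_quality) {
  mean_quality + (quality - mean_quality) * sqrt(votes / 10)
}

mean_quality <- 3.4  # the average quoted in the post, assumed exact here

edge_case   <- adjust_quality(2.3, 198, mean_quality)  # rated 2.3 by 198 voters
many_votes  <- adjust_quality(4.5, 50,  mean_quality)  # 50 votes averaging 4.5
single_vote <- adjust_quality(5.0, 1,   mean_quality)  # a single vote of 5

print(round(c(edge_case, many_votes, single_vote), 2))
# -1.49  5.86  3.91
```

So the prof with 50 votes now ranks well above the prof with a single 5, which is the point of the exercise, and anyone with more than 10 votes gets pushed away from the mean, which is how a rating can land outside the 1 to 5 scale.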