Aug. 12: New MLB Gameday data project.
I've been scratching my head a lot lately over how to improve my fantasy baseball free agent/draft picks, but aside from reading articles and looking at Yahoo's sortable stats, I feel like I'm just not getting any better. About a month ago I started looking into getting detailed at-bat statistics to run queries on, with the hope of predicting streaks better, and came across some incredible resources with more data than I honestly thought even existed.
Retrosheet, for one, has near-complete archived stats for each at-bat of every game going back to about 1953 (with some older years too), plus a ton of other data on players, parks, games, etc. While writing some scripts to parse that and put it in a working database, I found that MLB actually puts up XML files with all of their Gameday information. Not only are the at-bat and detailed game stats there, but since about mid-2007 they've included the incredibly detailed PITCHf/x data, with things like start and end velocity, break points, etc.
Anyway, I decided to go with Retrosheet for all the historic data and managed to put everything from 1957 to 2006 into a MySQL database. Now the hard part is writing a script to parse those XML files at mlb.com, update the database for 2007 and 2008 (and then nightly after that), and format everything like the Retrosheet data. I can't wait to start contributing to the awesome sabermetrics community, and once I get rolling on this I'll recount how I did everything and put up the scripts/database dumps for download. I'm actually surprised no one has released a Retrosheet-compatible database of all Gameday stats yet, but that's definitely the plan once I get the bugs worked out of this.
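To give a rough idea of what that parsing step could look like (a sketch only, not the actual script), here's a small PHP example using SimpleXML. The sample XML is made up for illustration, and the element and attribute names (inning, atbat, pitch, start_speed, end_speed) are assumptions to check against the real Gameday files. The function name gameday_pitch_rows is hypothetical too. It just flattens each pitch into a row that could later be inserted into MySQL.

```php
<?php
// Made-up fragment for illustration; element and attribute names are
// assumptions and should be checked against the real Gameday files.
$xml = <<<XML
<inning num="1">
  <top>
    <atbat num="1" batter="111111" pitcher="222222" event="Groundout">
      <pitch type="B" start_speed="91.4" end_speed="84.1" />
      <pitch type="X" start_speed="92.0" end_speed="85.3" />
    </atbat>
  </top>
</inning>
XML;

// Hypothetical helper: flatten every pitch into one associative array,
// carrying along the at-bat details it belongs to.
function gameday_pitch_rows($xmlString)
{
    $rows = array();
    $inning = simplexml_load_string($xmlString);
    if ($inning === false) {
        return $rows; // unparseable input: return no rows
    }
    foreach ($inning->xpath('//atbat') as $atbat) {
        foreach ($atbat->pitch as $pitch) {
            $rows[] = array(
                'inning'      => (int) $inning['num'],
                'atbat'       => (int) $atbat['num'],
                'batter'      => (string) $atbat['batter'],
                'pitcher'     => (string) $atbat['pitcher'],
                'event'       => (string) $atbat['event'],
                'start_speed' => (float) $pitch['start_speed'],
                'end_speed'   => (float) $pitch['end_speed'],
            );
        }
    }
    return $rows;
}

$rows = gameday_pitch_rows($xml);
echo count($rows), " pitch rows\n";
```

Mapping those rows onto Retrosheet-style columns and writing them to the database is the part that isn't shown here.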
Jan. 4: Magpie RSS for Flickr.
I've been messing around (poorly) with RSS feeds for a while, and last week I came across Magpie RSS, a super simple PHP class that handles pretty much any feed you can throw at it. Right now the Asides page uses Magpie for the Delicious bookmarks and Flickr photos; the bookmarks were pretty straightforward, but I was having some trouble displaying the photos correctly. This is pretty rough, but if you're looking for some sample code to plug in, here's what I'm currently using:
require_once 'magpierss/rss_fetch.inc'; // path is an assumption; point it at your Magpie install

if ($url) { // set $url to your Flickr RSS feed URL
    $rss = fetch_rss($url);
    $count = 0;
    foreach ($rss->items as $item) {
        if ($count == 14) { break; } // limit to 14 photos
        $image = $item['description'];
        $image_ary = explode("\n\n", $image);
        $image_img = str_replace("_m.jpg", ".jpg", $image_ary[0]);
        // keep whatever follows the first </p>: the linked thumbnail
        $image_img_ary = explode("</p>", $image_img);
        $image = $image_img_ary[1];
        $image = str_replace("<p>", "", $image);
        $image = str_replace(".jpg", "_s.jpg", $image); // switch to the small square size
        // drop the fixed dimensions so the square thumbnail isn't stretched
        $image = preg_replace('/width="\d+"/', '', $image);
        $image = preg_replace('/height="\d+"/', '', $image);
        echo $image;
        $count++;
    }
}
Set $url to your Flickr RSS feed, and make sure to include the magpierss file ;) I'm pretty sure I'm adding some unnecessary steps in there, but it gets the job done. Feel free to test it out or email me if you've got something.
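For anyone who wants to trim those steps, here's one possible shortcut, sketched under a few assumptions: the item description contains a link wrapping an img tag whose src ends in "_m.jpg", and swapping that suffix for "_s.jpg" gives the square thumbnail. The function name flickr_square_thumb is made up for this example (it isn't part of Magpie or Flickr), and the sample description is invented, so treat it as a starting point rather than a drop-in replacement.

```php
<?php
// Hypothetical helper, not part of Magpie or Flickr's API.
// Assumes the description contains <a href="..."><img src="..._m.jpg" ...></a>.
function flickr_square_thumb($description)
{
    // Capture the photo page link and the image URL up to the size suffix.
    $pattern = '/<a href="([^"]+)"[^>]*>\s*<img[^>]*src="([^"]+)_m\.jpg"/';
    if (!preg_match($pattern, $description, $m)) {
        return null; // description did not look the way we assumed
    }
    // Both captures come straight out of HTML attributes, so they are
    // reused as-is instead of being escaped a second time.
    return '<a href="' . $m[1] . '"><img src="' . $m[2] . '_s.jpg" alt="" /></a>';
}

// Invented sample description, for illustration only.
$sample = '<p><a href="http://www.flickr.com/people/someone/">someone</a> posted a photo:</p>'
        . "\n\n"
        . '<p><a href="http://www.flickr.com/photos/someone/123/" title="Test">'
        . '<img src="http://example.com/123_abc_m.jpg" width="240" height="180" alt="Test" /></a></p>';

echo flickr_square_thumb($sample), "\n";
```

Inside the foreach loop you would call it as flickr_square_thumb($item['description']) and skip any item where it returns null.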
Dec. 28: New site design finally.
I finally found some free time over Christmas break to give this thing a much-needed layout and brush up my neglected PHP skills. It's pretty trimmed down right now, but I'm definitely going to keep it bare bones this time. I also finally signed up for a del.icio.us account; I've been looking for a way to sync bookmarks across computers and I'm not sure why I waited so long on this, but it's definitely worth it. More coding!