Building a Retrosheet Database — Part 3 (The Easy/Mac Way)

I’ve worked in technology pretty much all my life, but my first job was on the support desk of a software company. It was consumer software, too, so anybody and everybody called in — professionals, novices, little kids, people who wanted to learn, people who wanted us to do their work for them, and people who didn’t understand how computer mice worked. It was challenging. But I think the hardest part was that our department didn’t have remote software. This meant that every time a person had a problem, they had to just explain it to us. We couldn’t see what was happening, so we had to trust what the person was saying was accurate. Everybody sees a computer screen differently. Very rarely did a customer see it in the same fashion I was used to seeing it.

When I published the first two parts of this Retrosheet series (Part 1 | Part 2), I did not monitor the comments well enough. I apologize for that. Some people had legitimate questions and I wasn’t there to answer them. I also learned that I was making all of you trust my explanations of things. I was making you see everything through my eyes, and some people got a little confused. This stuff happens. I’m hoping to fix that here.

I also promised a way for Mac people to have access to this, so I’m doing that as well. This article/video is for:

1. Mac (and Linux) users

2. Windows users who had trouble with the first two parts

Some caveats for Windows users looking to use this method:

1. The whole point of the first two parts was to show how the SQL files can get made using Chadwick. The idea being that you could grab the 2015 (and all subsequent season) data when that became available. The method mentioned in this article involves installing one flat file. That means that you’ll need to delete all the data and reload it for 2015 next season. I have no problem making this file for you folks, and will continue to do so as long as TechGraphs and I are around. It’s not a big deal, it’s just a different approach and I wanted to be transparent about that. Mac users basically have to do this method since the Chadwick tools aren’t available for OSX.

2. The video walks Mac users through installing and setting up a MySQL database. Your method differs, and it was explained in parts one and two linked above. I try to make it clear about when I’m dealing only with Mac users, but I wanted to give you a heads up on that.

Everyone using this method will need to download this file. Just save it to the desktop. The video will walk you through the rest.

You will also need these two lines of code. The video will tell you how to do it.

cd /usr/local/mysql/bin

./mysqladmin -u root password ‘password’

I will be monitoring the comments more closely, so let me know if you have any questions. Thanks, and good luck.





David G. Temple is the Managing Editor of TechGraphs and a contributor to FanGraphs, NotGraphs and The Hardball Times. He hosts the award-eligible podcast Stealing Home. Dayn Perry once called him a "Bible Made of Lasers." Follow him on Twitter @davidgtemple.

10 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
fromthemachine
9 years ago

David, are you planning on continuing this series with a basic search/terms tutorial for MySQL?

Thanks in advance.

Ed Miller
9 years ago

This series has been super helpful. Eagerly awaiting Part 4.

Michael Ruston
9 years ago

David, this is great, thanks!
Couple questions-
1) In Part 2 of this series did you mention to run partition.sql (in addition to events.sql, games.sql, and subs.sql?)
2) When I tried to run partition.sql, I got this: (for both the events table and games table)
“Error Code: 1526. Table has no partition for value 2010”
I think it’s because the schema for these tables only includes years up to 2009, as defined on lines 351 and 537 of retrosheet_table_schema.sql
Is that what’s causing that error code?
Thanks again!

Ryan J Lind
9 years ago

That is because the initial retrosheet table schema.sql has not been updated. You see lines like:

“PARTITION BY LIST (YEAR_ID) (PARTITION p1 VALUES IN (1921) ENGINE = MYISAM, PARTITION p2 VALUES IN (1922) ENGINE = MYISAM, …”

If you scroll to the very end you can see it only goes to 2009. Partitions for the years 2010,2011,2012,2013 and 2014 will need to be added manually. Update the sql file and then re run ONLY the parts that drop/create events, games, sub. Do not run the parts of the file that drop and create events_bck because that is where all your data lives until you run partitions.sql

Email me if you need help.
Ryan

Michael Ruston
9 years ago
Reply to  Ryan J Lind

Thanks Ryan, that was exactly it!

Dustin
9 years ago

For those of us that aren’t savvy enough to understand everything that’s happening in each of these files, a tutorial that shows how to update the database at the end of each season (when the new retrosheet events file is released) would be a godsend.

Dustin
9 years ago
Reply to  Dustin

Piggybacking on an earlier comment about partitions.sql, when is that sql file supposed to be executed?

Dustin
9 years ago
Reply to  Dustin

I was able to figure out how to update the database by looking through the existing files and modifying as necessary for future seasons.

One other question, though, what is the process for working with a partitioned table? For example, if you wanted to return all of the at-bats for Sammy Sosa during the 1998 HR record season, I would expect this would be the syntax:

SELECT * FROM events PARTITION (p49) WHERE BAT_ID = ‘sosasa01’

However, this result returns null (no records), regardless of what playerID you put into the BAT_ID. Is there something special that has to be done when using SELECT on a partitioned table?

Isaac Sasson
8 years ago

While connecting to the server on Sequel Pro I am getting a message saying “Access Denied”. Any help??