[personal profile] dr_whom
This year's was an odd Mystery Hunt for me in several ways, the most obvious of which is that I didn't actually solve very many puzzles. There are two different reasons for this, one of which is that I just felt kind of off my game solvingwise. A lot of times when I was working on a puzzle in a group I wouldn't feel like I was contributing very much; I'd miss ahas, or not be able to solve clues that I think I should have gotten; or I'd find people working on a puzzle I could have contributed to when they were almost done with it. I basically didn't look at a single meta. I'm not entirely sure why this is the case, and I could easily be overestimating the number of times this actually happened.

The second reason I didn't solve very many puzzles is a much more delightful one. Although I love solving Mystery Hunt puzzles, just about the only other thing I like as much as I like solving Mystery Hunt puzzles is writing and performing showtune parodies! The theme of the Hunt this year, as many people reading this will already be aware, was The Producers, and teams were called upon not only to solve puzzles but also to perform all of the lousy musicals that the Producers were producing. This obviously was not an opportunity that I was willing to pass up. So I really spent quite a lot of time this Hunt specifically skipping puzzles so that I could work on our lyrics or go perform our shows for Codex. Even though I didn't actually solve that many puzzles, I had a great time this Hunt, just not for the same reasons I usually have a great time at the Hunt. I contributed to the lyrics for all six of our productions, and performed in five of them, and, you know, though I say so myself, I thought our lyrics came out really well. (Our performances themselves would in general have been better if we had had more than ten minutes to rehearse, but c'est la vie.)

One reason I felt free to spend so much time on writing parodies was the other reason this was an odd Hunt for me: this was the first time we weren't actually trying to win. (Actually, we were trying not to win, having not yet quite recovered from the last time we won.) This comes with a whole different attitude to Hunting. If you're not trying to win, it's a lot easier to say "this puzzle doesn't look very interesting to me; I'll skip it and do something else" rather than "I guess someone's going to have to solve this puzzle and it might as well be me; oh well". ...And in my case, "do something else" was often "write parody lyrics" (at one point someone on the team made some very flattering wisecrack like 'obviously Codex knew we didn't want to win, and so they prevented us from winning by writing a Hunt that would distract [livejournal.com profile] dr_whom from solving puzzles and give him something else to do'), but also—atypically for me during the Hunt—involved things like going to sleep at a hotel room rather than crashing unauthorizedly on the couch in our solving space. That said, I didn't really get all that much sleep—although we didn't want to win, we did want to be able to complete the Hunt, and that meant for instance that I had to make sure to wake up early enough Sunday morning to be able to get in our last three performances before it would be too late to begin endgame. In the end, we found the coin location (not the coin itself, of course) at just about the last possible moment before Codex shut down Hunt operations (even though they forgot to send us the metameta for some time after we performed our last musical), which I suppose means we exerted just about the minimum possible amount of effort necessary to complete the Hunt. So... go us, I guess?

10:30pm Saturday is certainly the earliest the coin has been found at least in the modern era of the Hunt, beating the record of 12:30am set by the SPIES Hunt in 2006. Unsurprisingly, I guess, this was a wide-open Hunt with pretty rapid release of puzzles, won by a huge team, whereas SPIES was a fairly narrow-structured Hunt with (not very intentionally) a lot of puzzle-release bottlenecks and was won by a small team of power-solvers. Keeping Hunt operations running till 3pm Sunday no matter what has obviated the most serious problems associated with an early victory, in that non-winning teams can still have a full 51 hours of puzzling; but on the other hand, it seems to me that winning teams—or maybe more relevantly, second-place teams—might also want to have a full weekend of Hunting as well.

Codex did something right this year that we did quite badly in the SPIES Hunt (and therefore didn't do at all in last year's Mario Hunt). In SPIES, when you solved a meta, you had to come have some kind of encounter with a SPIES agent, as a result of which you got some kind of data which was going to be relevant in the endgame. The musical performances had the same structural function in this Hunt. The mistake we made in SPIES, though, was in not making most of the agent encounters very interesting—three or four of them in the second half of the Hunt had amusing skits, but early in the Hunt a lot of solving teams, especially non-leading ones, felt they were being dragged out of their headquarters to do something that wasn't very much fun and wouldn't be likely to ever be useful to them if they didn't get to endgame. The way this Hunt solved that problem was by making the solving teams themselves provide the "amusing skits" at these post-meta encounters, so they were exactly as interesting as the solving teams cared to make them. (Another probably-relevant factor was that in SPIES, when a team called in a meta answer we scheduled their agent encounter immediately. This year, after solving the relevant metas, teams had to take the initiative for themselves to schedule their performance appointment when they were ready to do so—meaning that if they didn't want to, because they weren't interested in performing a musical scene and didn't plan to make it to endgame, they weren't pressured to do so.)

Codex imitated from last year the structural role that group events played in the Hunt—i.e., attending events earns you points that can be redeemed for a free puzzle answer. Our implementation of this last year drew some criticism, chiefly in that each event wasn't worth a large enough fraction of a puzzle answer to seem like a productive use of time for teams that wanted to win. This year the values of the prizes for event participation were higher, and that seems to have solved the problem. The only event I attended this year was the "Name That Show Tune" event, which [livejournal.com profile] redcat9 and I were disappointed to find was about TV show theme songs rather than, you know, showtunes. I've seen elsewhere in the post-Hunt blog posts in the past week a suggestion that maybe it's time for Hunt events to start having more details announced about what they're going to entail, since the idea is that events should be a fun break from ordinary solving. As it is, if there's a party at which teams are going to have to solve a logic puzzle, but you just say "it's a party", you're not necessarily going to get people who actually like solving logic puzzles.

As an aside, I want to say that I really loved... well, I was going to say that I really loved the Hunt theming, but that's obvious already; the theme was The Producers and musicals in general. But I mean that it was in particular really well executed. "Chutzpah" and "bupkis" as names for the puzzle-unlocking and event-attending points were charming, the teaser poster was in retrospect brilliant, doing wrapup as an awards show was inspired (Kartik makes a great emcee), the theming of the meta mechanisms and answers to the rounds was well done, and the kickoff skit was hilarious. (Aside to the aside: the changing role of Mystery Hunt kickoff is kind of interesting. I think this was the first Hunt kickoff where nothing happened but the skit. Originally of course kickoff was when the actual puzzles were actually distributed; and for many years it was at kickoff that teams at least got their passwords and/or learned the URL for the website the puzzles would be on. This year it was just show up, skit, disperse to Hunt; all the administrative matters were taken care of ahead of time. I don't know if I have any comments in particular on this change, but I was struck by it and wanted to point it out.) I guess that's a pretty long aside.

The only thing about the theme/plot that I wasn't really satisfied by was the relationship of the critics to the musicals they were reviewing. Obviously the personalities and flavortext for the critics were determined by the structure of the critic metapuzzles; but I kind of feel like some kind of flavortext excuse could have been made for why this critic had wound up reviewing that show. ...Actually, I would have been satisfied if it had just been stated that this critic was reviewing that show. The most obvious interpretation—that the critic associated with each show was the critic whose round was unlocked immediately after the show's round—turned out to be correct, but I don't understand really why this wasn't stated anywhere (did I miss it?). I can totally imagine that there might have been some team that spent time frustrated because they couldn't solve the puzzle of how to figure out which critic would be reviewing which musical. (If there was no team that that happened to: great! But that's still something I would have been concerned about from a writing perspective.)

This Hunt did something interesting with the relationship between puzzles and metas that is maybe a counterexample to one of my theses of Hunt design: about half the puzzles in the Hunt fed two metas, one show's meta and one critic's meta (not necessarily the critic associated with that show), whereas the rest of the puzzles only fed one critic meta each. (I gather that it was a huge hassle to write metas with these properties; and having worked on the Civilization metas from last year I can believe it.) This system violates what I might call my third principle of Hunt structure, namely "all puzzles (at a given level of organization) should be roughly equally important". That's less important than the first two principles ("Every puzzle should be useful in getting you closer to finding the coin" and "No puzzle should be indispensable in the sense that if you can't solve it you can't find the coin"), but I sort of thought of it as important in giving the Hunt a symmetrical structure, and in treating fairly the fact that some teams will find puzzle X easier than puzzle Y but some teams will find the opposite. But in this Hunt the show puzzles appear to be twice as important as the critic puzzles, since show puzzles give you data for twice as many metas as critic puzzles do; and I don't feel like this really constitutes a Hunt-structure problem in any way. This might be because there are a large number of puzzles in each category, so variance between teams is probably evened out on average anyway (it's unlikely that one team found show puzzles significantly easier than critic puzzles while another team found the opposite); it may be that having show puzzles feed more metas makes them enough easier to backsolve to cancel out the fact that they're individually more important to solve; or it may just be that this principle is way less important than I kind of imagined it to be.

I did notice that some of the critic metas treated the show puzzles that fed them differently than the critic puzzles themselves, while other critic metas treated all the puzzles that fed them as an unsubdivided data set. In the former case, the show puzzle answers usually functioned, as it were, as training data, which solvers had to use to figure out how the meta and its answer extraction worked, while the critic puzzle answers were the data from which the answer was extracted. While not all the critic metas worked this way, this struck me as an extremely elegant and innovative method of constructing shell metas that have two distinct sets of puzzles that they draw on. It borrows a bit from the structure of the star metas in the Normalville Hunt in 2005, where generally the dot puzzle answers would act as a field of data from which the star puzzle answers would tell you what to extract, or vice versa; but this is a different model and a nifty contribution to the library of possible relationships between puzzles and metas.

The experience of being a previous Hunt writing team leader who's close friends with current Hunt writing team leaders is kind of amusing; [livejournal.com profile] dumble and [livejournal.com profile] brokenwndw did a great job all year at Not Telling Me things.

Next post: comments on specific puzzles!

Date: 2012-01-22 04:01 am (UTC)
From: [identity profile] jedusor.livejournal.com
I have a friend on Codex who opened every conversation we had in 2011 with a fake Hunt theme (e.g. "Mystery Hunt 2012: 120 'bring food to HQ' puzzles"). The first time we talked after the Hunt was over, he said, "Mystery Hunt 2013: Someone Else's Problem!"

Date: 2012-01-22 03:49 pm (UTC)
From: [identity profile] occultatio.livejournal.com
Look forward to a continuation of the running gag, because BOY do I have a lot of ideas!

Date: 2012-01-22 06:15 am (UTC)
From: [identity profile] dumble.livejournal.com
Re: "but on the other hand, it seems to me that winning teams—or maybe more relevantly, second-place teams—might also want to have a full weekend of Hunting as well."

The pack of 3 second place teams solved their last metas around 10:30pm on Saturday, and finished endgame in the wee hours of Sunday morning, so basically on par with winners of previous hunts.

As I know you know, there's a limit to how much puzzle content you can produce in a year. I don't think it's reasonable to expect winning teams to produce bigger and bigger hunts to keep up with growing team sizes (not to mention improving solving skills). Also, I hate to be self-congratulatory, but this year seems to be notable for a lack of puzzles that teams consistently got stuck on. Given a choice of a shorter, cleaner hunt, and a longer hunt with annoying sticking points, I'll choose the former!

Re: show/critic pairings

This was meant to be entirely straightforward. No team got stuck on it to my knowledge.

Re: events

This criticism (that we shouldn't keep their contents a secret) is very reasonable. I hope Manic Sages take note!

Re: the third principle of Hunt structure

I think your phrasing of the rule actually hints at a solution here, specifically the "at a given level of organization" bit. Show puzzles and critic puzzles were different levels of organization. Sure, they're more similar to each other than e.g. show puzzles and meta puzzles, but they were explicitly used/reused in different ways.

I'd rephrase your principle to say something like "Puzzles should not vary unsystematically in their importance", which I think gets at the core idea while making it more obvious why you're not bothered by the way we did things.

Anyway, thank you for the positive review, and I look forward to hearing your thoughts on individual puzzles!

Date: 2012-01-22 06:46 am (UTC)
From: [identity profile] brokenwndw.livejournal.com
Re. why certain critics are attached to certain shows: we basically punted on this. (And subsequently forgot to even state what the pairing was; that was just an oversight, although as you said it's pretty obvious.) I'll have more on this part of the writing process if I ever get around to writing my own Hunt review, but basically we had zero flexibility to choose which round got which critic for non-mechanical reasons. (Betsy Johnson, for example, is a little harder than I would have wanted 1C to be, but she had to go there because she needed very plain-English 1-word show answers, which were barely available in 2S and totally nonexistent in 3S and 4S.) And of course each critic's flavor was tailored to his or her meta mechanic.

Plus, as I pointed out in some long-forgotten list email, it's not like real-life critics only review the shows that they'll be most thematically appropriate for!

Date: 2012-01-22 06:46 am (UTC)
From: [identity profile] brokenwndw.livejournal.com
Re. whether all puzzles should be equally important: this is, in my opinion, probably the biggest weird point about our structure. (Our theme and structure selection process basically meant that this theme and structure were dropped in my lap as a single package. If I had it to do again I would probably have asked for theme proposals only, trusting a smaller group to tailor a good structure to the winner. But I digress.) Fortunately, the fact that only show answers are reused becomes pretty obvious once you hit 2C, so at least you know where to focus your efforts. We didn't make any effort to differentiate the two classes of puzzle in terms of difficulty or crowd appeal, however. In the end we had very high solve / backsolve rates anyway (Sages reached 107/107 not long after winning, with one bupkis purchase), so maybe it didn't matter that much.

The transition from metas that used the answers equally to metas that used them separately (there's a single line in this regard between 2C and 3C) was dictated by simple numbers. Early on there weren't that many answers being reused (4 each in 1C and 2C), so using them separately was just unlikely. Later on, conversely, we had a flood of reused answers, so trying to treat them the same as the new ones would totally marginalize the latter. (Not to mention that 20 is just not the sweet spot for meta size.) I am glad that it went off well, anyway; many of my teammates think that the two-stage critic metas are the cooler ones, and I would agree. I'm particularly happy that our shell metas presented immediate solving opportunities without feeling like standalone puzzles, since you felt like you'd done work to get the feeder data and yet it was instantly available at round release.

I do think that your "third principle" has been proven less important than the other two. Even in perfectly normal structures, some answers are just more important than others, and we're all used to it. I remember reading a writeup of this year's hunt, for example, that pointed out that the T in ALT F FOUR is way more important than its brothers (compare ?LTF???? to AL?FFO??; the former is possibly closer to resolving, despite having 3/8 instead of 5/8!). And one-letter-per-answer is a total staple of meta structures, even if [livejournal.com profile] dumble and I tried to squash it as much as we could this year...

Date: 2012-01-22 06:57 am (UTC)
From: [identity profile] brokenwndw.livejournal.com
Re. flavor: this was totally not my department, but I do think that wrapup-as-awards-show was the general consensus of the admin list way before we had many more important questions settled. In general I'm thrilled with how kickoff and wrapup went, and it's a testimony to how well Codex can do things when [livejournal.com profile] dumble and I aren't in charge of them. ;-) As for how thematic meta answers were, there's clearly a sliding scale here between how constrained your construction is and how thematic your answers can be. Because we needed all twelve answers to be things you could actually do in a skit, we had to err towards the latter; fortunately our critic metas were tending towards the "process arbitrary data" end anyway, thanks to answer reuse, and we just did what we could with show metas. But there was definitely no way we could have had an ultra-constrained construction like the Civ supermeta, since "hooray, I can spell ORANGE!" was just not good enough.

[Yes, I keep posting new bits; this was originally one post but I broke it up for threading reasons.]

Date: 2012-01-22 07:26 am (UTC)
From: [identity profile] brokenwndw.livejournal.com
Re. length: this was totally my department, I guess. My number crunch post already spawned a discussion on the subject, but I think that the biggest factors here were just that we had slightly fewer puzzles than usual and that it was Sages' year to shine (in terms of their drive and in terms of a good match for our content). I do wish that I'd better foreseen the fraction of puzzles that would be solved or backsolved (much higher than usual), because that would have caused me to tweak the constants downward a bit. Then again, if I'd done that we might have had a four-way logjam at endgame, so maybe not!

As I've said elsewhere, I stand entirely by the decision to have no hard gateways between releases. Even if it's a big-team bias on my part, I just loathe the idea of solvers being frustrated or bored, and the only way to reliably prevent that is to keep the puzzles flowing.

More broadly, I don't think it should be the job of release structure to keep a check on big teams powering through hunts. There has to be a better answer to "we're so big and well-organized that we can solve a modern Hunt in 28 hours" than "we're going to force you to commit 15 solvers to each puzzle; too bad if they're tripping over each other and half of them are frustrated". If team size is really becoming a hindrance to having a reasonable-sized Hunt that lasts the right time, I'd rather open a conversation about whether teams should be encouraged to be smaller than take steps that would make those teams less happy.

(FWIW, as long as we're talking records, our endgame was not meant to take the 4.5 hours it took Sages. If it had gone closer to plan, they would probably have found the coin around 9 pm! Oy.)

In the end I think the size question will resolve itself organically anyway. Codex is going to be smaller and less focused on winning in 2013 than it was in 2011, and I doubt it'll be in contention again for several years; I imagine Sages will go the same way in 2014. This kind of "we wrote the hunt and now we're in smaller pieces" effect is probably more likely to hit big teams than small ones-- all of the "dynasties" have been small teams, right? So I think that once the current group of large teams without victories (Codex, Sages, Too Big To Fail) have their days in the sun, we'll see a wave of smaller teams winning, and all the hubbub will prove unnecessary.

Whether teams in general are getting too good, however, is maybe a better question. I have a certain fondness for the more epic hunts, even if those specific ones didn't go very well for Codex, and so do wish that it were possible for a modern hunt to both be clean and let a competitive team be solving on Sunday. But while we produced a slightly shorter hunt than normal, it certainly wasn't that much shorter, which leads me to conclude that the only way any team could have pushed the ending into Sunday morning would have been to break the Hunt, or at least make the puzzles way harder. Our hunt was solvable in 34 hours by any competitive team, and I'm not sure how I feel about that.

Date: 2012-01-22 11:31 pm (UTC)
From: [identity profile] noahspuzzlelj.livejournal.com
Before this year the top team was in the 2.5-3 puzzles/hour range pretty consistently. This year the next clump of three teams was right around 3 puzzles/hour, but Sages were close to 4! That's really an outlier historically.

(I've said this elsewhere, but I totally agree with you that we just happen to have a backlog of large teams that are due, and this "problem" will work itself out. On the other hand, I think it's a shame that the ratio of work that the writing team has to do to puzzles that people get to solve is getting worse and worse.)

Date: 2012-01-22 01:59 pm (UTC)
From: [identity profile] devjoe.livejournal.com
Regarding the importance of show vs. critic puzzles: Luck had 10 puzzles unsolved at the end. Exactly one of these was a show puzzle (Snap Judgement from Mayan Fair Lady). This was largely because of a greater emphasis on solving show puzzles (because they impacted two metas). You'd think it was also due to greater chances for backsolving show puzzles (again because they appeared in two metas) but according to our records we backsolved 9 show puzzles and 9 critic puzzles.

We (and by we, I mean Thomas Snyder) solved the Bergman meta with 4 critic puzzles unsolved, and those puzzles were then marked low priority for us to solve. Snap Judgement was marked medium because we expected it would come up in a critic meta, but because we understood the way its group of answers worked in the Watson meta without it, it did not get elevated to high (and probably should have been pushed to low).

Date: 2012-01-23 03:05 am (UTC)
From: [identity profile] kaihuang.livejournal.com
> but on the other hand, it seems to me that winning teams—or maybe more relevantly, second-place teams—might also want to have a full weekend of Hunting as well.

These days, some teams are so ridiculously huge (yes, Codex included) that most participants only see a very small fraction of the puzzles. I don't think it's fair to ask hunt authors to put in so much work and write so many puzzles just to have most of the participants not even see them. If a participant complains about the hunt not lasting until Sunday 3pm for them, then they only have themselves to blame for not joining a smaller team. There are plenty of small teams willing to take on extra solvers (based on a poll we sent out this year).

Personally, I like being on a big team because I don't know pop culture and I hate data collection, so I just let teammates that like that kind of stuff handle those knowledge/data-heavy puzzles. Then, if they're stuck on the aha or on extraction, I try to come in and help crack it. Being on a big team is my own choice and preference. I would never fault the hunt authors if my being on a big team caused my hunt to end early.

Date: 2012-01-23 03:34 am (UTC)
From: [identity profile] kaihuang.livejournal.com
Also, note that only 5 teams finished the hunt out of 40. Personally, as an author, I was hoping that at least 10 teams would finish this hunt and at least 20 teams would get through half of it. I care a lot about the 10th, 20th, and 30th place teams, not just the top few.

For example, you mentioned in your separate post on specific puzzles that you really liked one of my puzzles appearing in the very last round. Well, only 6 teams solved it out of 40. Isn't that just sad? It is for me. I put so much work into it only to have 6 groups of solvers enjoy it, out of over a thousand participants?!

So in my mind, if anything, the hunt should be shortened, not lengthened, and should have more emphasis on enjoyable puzzles for small teams, not tough breaking points to slow down big teams. I think we definitely started moving in that direction this year, not with the number of puzzles, but with the difficulty level and cleanliness. I applaud the Codex admins for that.

Date: 2012-01-25 02:38 am (UTC)
From: [identity profile] oxeador.livejournal.com
I am entirely with you on this. On all the discussions about hunt length, this is often omitted, and it is a very important point.

Date: 2012-01-25 04:25 am (UTC)
From: [identity profile] ppaladin.livejournal.com
I believe all puzzles were released by 9am Sunday at the latest.

Date: 2012-01-23 06:05 am (UTC)
From: [personal profile] pastwatcher
Hey, can I tell the Manic Sages team about this post (or a couple of us)? I'm a blind testsolver, as I don't want to worry about the theme and structure, so I can't directly contribute.

Date: 2012-01-25 08:54 pm (UTC)
From: [identity profile] brokenwndw.livejournal.com
Well, we got the idea from you, and I'm pretty sure Sages got the idea from us, so...

Date: 2012-01-25 09:39 pm (UTC)
From: [identity profile] okosut.livejournal.com
I *think* we invented that. We were worried about Orange Star problems, right? Or at least, we almost certainly invented the term-- my impression is that other, smaller hunts are often "playtested" from beginning to end by people who weren't involved in writing, which is more or less the same as blind testing.

Date: 2012-01-26 01:05 am (UTC)
From: [identity profile] fisherama.livejournal.com
Re: show/critic pairing

We (Knights of the Random Table) got semi-stuck on this, but mostly due to a misunderstanding. One of our team members claimed that someone on Codex explicitly told them that we had to figure out how the shows and critics were paired. In retrospect there was clearly a miscommunication there, but it confused us for a while. When we solved our second show/critic meta pair, HQ basically told us that we should probably schedule our performances soon, and we got the hint.