#1 | ||
|
Dark Cloud
Join Date: Apr 2001
|
Playtesting Chaos (Viperball + O27 baseball)
Rather than put a bunch of box scores in the original thread, I wanted a cleaner, more "dynasty style" place to write about game action, now that I've had a pretty clean interface with shareable data for a few weeks.
I'll still weave in engine improvements and things I run across, but I wanted somewhere to actually "play" the game, even if I'm mostly doing playtests and sims to see how things work and, in many ways, trying to reverse engineer context from a sport we can't see. This isn't an unusual problem for any text sim, but at least we know what football or basketball or baseball look like, which makes those sims easier to make sense of. We don't know what Viperball actually looks like, so the imagination has to work harder to make sense of what's happening on the page. I've enjoyed the challenge of asking myself with each outcome, "is this plausible?"
The biggest issue for a while in the college simulator was too many playoff upsets for my tastes, but the problem was that the engine had created players that were all kind of even, so it made sense that no one team was dominant. I've since updated the engine to create far more distance between elite and doormat, but it has taken that too far, and now you get too many superplayers who kind of break the game and turn it into arena football but outdoors. I've worked pretty hard to get the game to look more like a football variant -- specifically improving defenses so it's not just an offensive game -- but I envision a free-flowing game that's fast and tactical.
So that's what you're walking into currently. I'll share out various games, seasons and box scores of interest as I continue to build on the engine; a lot has happened with it in the last month.
__________________
Current dynasty: Playtesting chaos (Viperball 26) | OOTP Mod: Managerial Strategy Files | GM Excel Competitive Balance Tax/Revenue Sharing Calc | FBCB Mods on Github Last edited by Young Drachma : 05-04-2026 at 01:59 PM. |
||
|
|
|
|
|
#2 |
|
Dark Cloud
Join Date: Apr 2001
|
I've already improved the engine around player generation to make things less supercharged. The issue was largely that the difference between players within a band was very low, so I expanded it to get wider differentiation between players, even on elite teams.
But before I push that change, I'm running a single-season campaign where Rutgers is somehow really good and the Jersey in me is hoping to see if they can pull off an unthinkable playoff run in a sport that does not reward higher seeds. |
|
|
|
|
|
#3 |
|
Dark Cloud
Join Date: Apr 2001
|
This Week 15 match between Montana (3-9) and South Dakota (0-12) isn't notable because of the talent, but because it was a shutout. I'm not sure I'd ever seen a shutout in this game before.
I've done a lot of work on producing an engine capable of letting a defensive team shine, but in this case it was two pretty inept teams coming together to produce a stinker where USD couldn't get the ball anywhere near the end zone. Some notables:
- Average start is where the team got the ball post-score. Because of the no-kickoffs thing, teams get the ball +/- their own 20 yard line based on the score differential. This seems like it'd make things unnecessarily unfair, but we track the differentials and good teams are able to overcome it. My thinking was always that if you bake "unfairness" into the game, teams are able to adjust and adapt to those conditions.
I think there's a lack of human element in a simulated sport that doesn't account for things like bad calls and blown calls that might impact the game, but I don't think it's anything more drastic than the ability of NBA refs, umpires or line judges to turn a game. Anyway, shutouts are not a common occurrence here, so I had to freeze one. Code:
|
|
|
|
|
|
#4 |
|
Dark Cloud
Join Date: Apr 2001
|
Playoff runthrough
This isn't the first post-season I've run; I've run lots of sims at this point, probably in the thousands. The tool itself is just Python with a GUI that lets me interact with all of it, so it's pretty speedy.
As I've said before, paying attention to the games helps me see what sort of problems are happening in the engine. Because I'm a sim player, I'm far more interested in the aggregated look of things than in overindexing on things like playbooks and controlling outcomes. I'd rather give the engine the tools to do all of that itself, then fix and tweak as I sim to see what's going on. So this post-season is interesting. We have a 32-team playoff, and which teams get left out (and sent to Bowls) versus make the field really comes down to being in an easy conference and having a strong season. There's a bug that lets playoff teams also play in bowls, which I'd never noticed before. But my changes have helped top-tier teams play for the title; I'd never seen two Top 5 teams in the title game before this year. Rutgers lost to Boise State in the Elite 8. Code:
Code:
|
|
|
|
|
|
#5 |
|
Dark Cloud
Join Date: Apr 2001
|
Code:
Last edited by Young Drachma : 03-25-2026 at 07:57 PM. |
|
|
|
|
|
#6 |
|
Dark Cloud
Join Date: Apr 2001
|
Now I'm about to push several updates that will improve the game logic on timeouts, talent, end of game logic -- which was lacking -- and a few new offensive playbooks to improve schematic diversity. It should result in some differences in how games play, though probably not a drastic change other than hopefully lowering scoring and some of the blowouts.
It'll probably still look like crazy arena football meets Aussie Rules, but I prefer scores in the 40-70 range, not 100+ scoring games. |
|
|
|
|
|
#7 |
|
Dark Cloud
Join Date: Apr 2001
|
Code:
Last edited by Young Drachma : 03-25-2026 at 09:50 PM. |
|
|
|
|
|
#8 |
|
Dark Cloud
Join Date: Apr 2001
|
I realized after the post-Rutgers run that we were still seeing far too many 99 OVR players and it's the main root of the insane scoring.
The game has a bunch of engine-related additions that skill can activate, which can "boost" your ratings in context, so there's no real need to have players coming in already maxed out. I lowered the baselines dramatically to see if that tamps down scoring. Also, I always think it's silly if you're operating on a 0-99 scale and the lower numbers aren't used. What was the point of the scale going that low? If you can rate someone on a scale of 1-5 and you only really use 4-5, what's the point? I haven't tested this out yet, but it should be closer to what we had before, without all of the narrow closeness within the player development windows. |
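As a rough illustration of the idea above (lower baselines, wider spread within a band, and a 0-99 scale whose low end actually gets used), here is a minimal Python sketch. It is not the engine's code: the tier names, the means and spreads in `TIER_PROFILES`, and the choice of a clamped normal distribution are all assumptions made for the example.

```python
import random

RATING_MIN, RATING_MAX = 0, 99

# Hypothetical (mean, standard deviation) per team tier. These numbers are
# illustrative assumptions, not values taken from the Viperball engine.
TIER_PROFILES = {
    "elite": (78, 9),
    "average": (60, 11),
    "doormat": (42, 12),
}


def generate_rating(tier: str, rng: random.Random) -> int:
    """Draw one overall rating for a player on a team of the given tier."""
    mean, spread = TIER_PROFILES[tier]
    raw = rng.gauss(mean, spread)
    # Clamp to the 0-99 scale so outliers cannot leave the rating range.
    return max(RATING_MIN, min(RATING_MAX, round(raw)))


def generate_roster(tier: str, size: int, seed=None) -> list:
    """Generate a roster of ratings; pass a seed for repeatable playtests."""
    rng = random.Random(seed)
    return [generate_rating(tier, rng) for _ in range(size)]


if __name__ == "__main__":
    for tier in TIER_PROFILES:
        roster = generate_roster(tier, size=36, seed=26)
        print(tier, min(roster), max(roster))
```

With a wider standard deviation, even an "elite" roster contains a real range of players, and a 99 becomes a rare tail outcome rather than the default. Any in-context "boosts" would then be applied on top of these lower baselines.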
|
|
|
|
|
#9 |
|
Dark Cloud
Join Date: Apr 2001
|
Another playtest
So the latest update was pretty big because there were lots of things I ended up adding, and in some cases fixing. The biggest change was to the player talent generation system, which took several PRs to get right. Even with that, it's still not perfect, but I like where we are now.
The thing is, removing a lot of the arbitrary guard rails around talent generation means far worse players -- and far better players -- get created now than when we were simming before. The earlier iterations kept scoring mostly out of control, but the game doesn't have much in the way of real blowout mechanics yet; I'll add that in the future. This Week 2 non-conference matchup got out of hand quickly. Code:
I'm going to work on improving the box score detail outputs, too. I think the most interesting additions I'm looking for are in the coaching decision logic. Because I always intended for this to be a sim where I just watch, I wanted schematic diversity, and I wanted coaches to be able to make dumb decisions just like real ones do.
One of the things I did in my last big update was add refs because I wanted to see how they'd impact games (I probably wrote this already), but I also wanted coach challenges. Initially, the challenge system didn't make any sense. The AI knows what the calls are, and plays are not really "plays" anyway; they're just dice rolls that get rendered as plays. Once I realized this, before we even had formal challenges, I went back and rebuilt it. Now the coaching system doesn't evaluate whether a "play" needs to be challenged. Instead, on decisions that might require review (a penalty or some close-call play), the coach decides whether to challenge based on who is refereeing the game, their ratings, and a bunch of other factors, including the score and when in the game the play happened. Video review and the refs are a separate system entirely. It won't have a major impact on games, and the failsafes are built to ensure that.
Speaking of, this Week 5 matchup was a much better representation of a close game. Code:
Last edited by Young Drachma : 03-26-2026 at 11:55 PM. |
|
|
|
|
|
#10 | |
|
Dark Cloud
Join Date: Apr 2001
|
Snapshot of an elite player, doesn't change her from being far and away better than everyone else.
Building a computer ranking using real-life computer football algos for a Viperball sim might be the nerdiest thing I've ever done, which is saying a lot. But I'm so jazzed.
Viperball KenPom Metrics, okay maybe this is nerdier.
Glossary: Quote:
|
|
|
|
|
|
|
#11 |
|
Dark Cloud
Join Date: Apr 2001
|
JK Sometimes people are just gonna drop 100 on you...
PROVIDENCE — By halftime it was 75½–30 and North Dakota State had three timeouts left and nowhere to use them.
Providence ran 75 times for 542 yards. Nine rushing touchdowns. Pınar Çelik had three of them. Amber Pham had two and 92 yards. Ayu Kaewprom scored and ran wild in the second quarter. The Friars didn't need their kick pass game — Maya Mendoza went 12-for-14, got her two touchdowns, and stepped aside while the ground game finished the job. North Dakota State turned the ball over nine times. Three fumbles, two interceptions, four lateral interceptions. Havana Robinson threw into coverage all afternoon, finished 17-for-31 with four picks, and watched her team gain minus-one rushing yards on 13 carries. The only thing that went right was a 10th-down touchdown in the first quarter — a lateral chain that survived long enough to find the end zone — and a garbage-time score in the fourth that made the margin 75½ before Providence kneeled it out. The Bison had 186 total yards. Providence had 693. "We've got some things to clean up," Providence coach Sarah Weber said, generously. North Dakota State falls to 0-1 and has a week to figure out whether this was a mismatch or a diagnosis. Code:
Last edited by Young Drachma : 03-27-2026 at 01:07 AM. |
|
|
|
|
|
#12 |
|
Dark Cloud
Join Date: Apr 2001
|
Another season run. Another close game.
Code:
Last edited by Young Drachma : 03-31-2026 at 03:32 PM. |
|
|
|
|
|
#13 |
|
Dark Cloud
Join Date: Apr 2001
|
Back at it again, this time...with another side quest.
I've long had this idea about baseball vs. cricket. Thanks to a head cold that knocked me out for a day, I couldn't leave the house, so in between doing my grading I built out a far cruder baseball sim called O27, essentially the Twenty20 cricket variant of baseball. It doesn't shorten baseball in length; it just reduces it to one inning with 27 outs top/bottom. Tactically it really does a number on the sport because everything is different, and yet the game resembles itself in spots.
I've long wanted to test this out in OOTP, but my assumption before was that 5-inning baseball would simulate the T20 ethos well enough. This test proved to me that's just not true. 5-inning baseball is kind of boring and barely baseball, whereas O27 feels like all of the tension of baseball gets amplified because of the strategy and stakes that come with the decisions you get to make. Obviously, you could also shorten O27 and you'd still get the same fundamental things that come from a cricket innings with the baseball scaffolding.
I'm still messing around with the details, but this one came together a lot less organically than the Viperball game, which I built over several weeks. Baseball being what it is required some surgery, and I think due to the overindexing on complexity that I did (a mistake), there are aspects that are just being worked out now in the engine. But I think there's a lexicon, the game operates, and I'm navigating the rest. |
|
|
|
|
|
#14 |
|
Dark Cloud
Join Date: Apr 2001
|
O27 AT A GLANCE
RULES
- One inning per game. Each side bats until they record 27 outs. Home team can stop early if they're ahead in the bottom half.
- 9-fielder lineup. The starting pitcher must bat; no DH replacing him. Lineup is ordered by talent, pitcher usually 9th unless he's one of the rare 5-10% of pitchers who can actually hit.
- 3 jokers per roster. Tactical pinch-hitter equivalents the manager can insert into any spot in the rotation. Each joker can be inserted once per cycle through the order. Joker insertions add an extra PA to that rotation: the joker bats, then returns to the bench. They don't take a roster slot or a field position. A manager who never uses his jokers is leaving offense on the table; a manager who burns them in low-leverage spots is leaving offense on the table differently.
- Pinch hitting still exists separately. A manager can pinch-hit for a regular (replacing him in the lineup AND in the field, permanently) like normal baseball. Jokers are a parallel option that doesn't cost a roster slot.
- Second-chance ABs. When a batter hits the ball into the field of play, he can choose to run or stay at the plate. If he stays, the runners advance as the play would have advanced them, the batter is credited with a hit, and he stays at the plate to await the next pitch. The contact event still counts as a strike, and the count carries normally. So a 1-1 second-chance hit becomes 1-2.
- Three contact events per AB max. Whether they end in hits or outs, three is the cap. A foul ball counts as a strike like in baseball, so three fouls is a foul-out (no infinite-foul protection). The maximum hits in one AB is 3, only achievable from a 0-0 start with no called or swinging strikes.
- Walks and HBP work normally. Four balls is a walk regardless of how many second-chance hits the batter has accumulated.
- Tied games go to super-innings. Each side fields 5 batters and bats until 5 dismissals or until they're ahead at the end of the round. Repeat until a winner.
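The second-chance count rule above can be sketched in a few lines of Python. This is one reading of the rules, not the O27 engine's code: the `AtBat` class and its method names are hypothetical, fouls and balls in play that end in outs are left out, and it assumes a batter cannot stay when the strike charged for staying would be strike three (which is what makes 3 hits reachable only from a clean 0-0 count).

```python
from dataclasses import dataclass

MAX_STRIKES = 3
MAX_CONTACT_EVENTS = 3  # cap on contact events per AB, per the rules above


@dataclass
class AtBat:
    """Tracks the count and hits inside a single O27 at-bat (hypothetical model)."""
    balls: int = 0
    strikes: int = 0
    contact_events: int = 0
    hits: int = 0

    def can_stay(self) -> bool:
        """True if a hit right now could be taken as a second chance."""
        # Staying costs a strike, so there must be room for one more strike,
        # and this contact event must not be the last one allowed.
        return (self.strikes + 1 < MAX_STRIKES
                and self.contact_events + 1 < MAX_CONTACT_EVENTS)

    def record_hit(self, stay: bool) -> bool:
        """Credit a hit. Returns True if the batter stays and the AB continues."""
        staying = stay and self.can_stay()
        self.contact_events += 1
        self.hits += 1
        if staying:
            self.strikes += 1  # the contact event is charged as a strike
        return staying


if __name__ == "__main__":
    ab = AtBat(balls=1, strikes=1)
    print(ab.record_hit(stay=True), (ab.balls, ab.strikes))
```

Starting from 1-1, the first second-chance hit moves the count to 1-2 and the next hit forces the batter to run, so that AB tops out at two hits. Only an AB that starts 0-0 and never takes a called or swinging strike can reach three.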
STAT METHODOLOGY
The denominator changes. Plate appearances (PA) and at-bats (AB) diverge in a way they don't in MLB, because a single AB can contain multiple PAs (each second-chance hit is its own PA within the same AB). A hitter might post 600 ABs and 750 PAs across a season. PAs > ABs structurally, not just because of walks.
That breaks traditional batting average. H/AB can exceed 1.000 (a multi-hit AB produces 2 or 3 hits in 1 AB), so it doesn't read as a rate stat anymore. So:
- AVG is renamed PAVG. Hits per plate appearance. Bounded 0.000-1.000. Reads like AVG used to. League average lands around .270-.320 depending on calibration.
- BAVG is the secondary stat, kept as H/AB. Can exceed 1.000. Reads as "second-chance productivity": a hitter with PAVG .380 and BAVG 1.150 is using second-chance ABs effectively. PAVG .380, BAVG 0.950 means he runs on contact and rarely stays.
- Δstay (or Δ2C) is BAVG minus PAVG. Quantifies how much value a hitter is generating from second-chance ABs specifically.
Other rate stats (SLG, OBP, OPS, ISO, BABIP, wOBA) all use PA as the denominator. The numbers calibrate to a higher run environment but the conceptual stat is the same.
For pitchers, the structure of the game requires different lenses:
- Innings pitched is gone. Replaced by OUT (the team's out count when the pitcher's last batter's PA ended) for game-level lines, and total outs recorded for season workload. The natural unit in O27 is outs, not innings.
- BF (batters faced) is the headline workload counter.
- OS% (outs share, percentage of the team's 27 outs the pitcher recorded) is the per-game role indicator. 80% is a workhorse outing; 25% is short relief.
- AOR (average out reached) is the season-long version. A pitcher's mean OUT across appearances. Tells you whether he's a workhorse (AOR ~22), a closer (~26), or long relief (~14).
- ERA is replaced by wERA (weighted ERA). Earned runs are weighted by where in the 27-out arc they were given up:
  - Outs 1-9 weighted at 0.85
  - Outs 10-18 weighted at 1.00
  - Outs 19-27 weighted at 1.20
  Runs given up early give the offense runway to respond; runs given up late are more damaging. wERA reflects that. League average tracks the run environment baseline (~11-12 in current calibration).
- FIP is broken in O27. Its small-sample behavior produces nonsense at the tails (negative xFIP values for K-heavy pitchers in low BB/HR samples). The replacement under consideration is xRA (expected runs allowed), which sums empirical run values per PA outcome and normalizes per 27 outs. Bounded, robust to small samples, methodologically consistent with wOBA on the offensive side.
- Decay is the genuinely O27-only stat. Measures how much a pitcher's K-rate falls between outs 1-9 and outs 19-27, restricted to appearances where he faced batters in both phases. 0 = perfectly durable. 30+ = significant late-arc fade. Negative = better late than early. MLB doesn't measure this because pitchers don't pitch long enough in single appearances for arc-degradation to be a stable skill. In O27 a starter routinely faces 25-35 batters in one continuous half, so the rate at which his stuff degrades is a real, measurable skill.
- GSc (Game Score) is the per-appearance summary number. Roughly Bill James' Game Score formula adapted for O27: bounded 0-100, includes a small bonus for foul-outs (the 3-foul-rule retirement, which is pitcher-credited). Quick read on whether a start was good without having to interpret the inflated O27 ERA-equivalent.
- WAR / pWAR rebased against wERA and the higher run environment. RPW (runs per win) recalibrated per season. In a 24+ R/G environment, RPW is closer to 16-18 than MLB's 10.
- K%, BB%, HR% (per PA) are kept as environment-neutral rate stats. K% includes foul-outs as Ks since they're a pitcher-induced retirement through pitch sequencing alone.
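A few of the stats above are simple enough to sketch in Python. The batting functions follow the definitions as stated (PAVG = H/PA, BAVG = H/AB, Δstay = BAVG - PAVG), and the arc weights are the ones listed for wERA. Two things are assumptions rather than anything the write-up confirms: wERA is scaled per 27 outs by analogy with ERA, and each earned run is tagged with the out number (1-27) that was in progress when it scored. All function names are hypothetical.

```python
OUTS_PER_HALF = 27


def pavg(hits: int, plate_appearances: int) -> float:
    """Hits per plate appearance; bounded between 0 and 1."""
    return hits / plate_appearances if plate_appearances else 0.0


def bavg(hits: int, at_bats: int) -> float:
    """Hits per at-bat; can exceed 1.000 because one AB may hold several hits."""
    return hits / at_bats if at_bats else 0.0


def delta_stay(hits: int, plate_appearances: int, at_bats: int) -> float:
    """BAVG minus PAVG."""
    return bavg(hits, at_bats) - pavg(hits, plate_appearances)


def arc_weight(out_number: int) -> float:
    """Weight for an earned run, by the out in progress (1-27)."""
    if not 1 <= out_number <= OUTS_PER_HALF:
        raise ValueError("out_number must be between 1 and 27")
    if out_number <= 9:
        return 0.85
    if out_number <= 18:
        return 1.00
    return 1.20


def weighted_era(earned_runs_by_out, outs_recorded: int) -> float:
    """wERA sketch: arc-weighted earned runs, scaled per 27 outs (assumed).

    earned_runs_by_out is a list of (out_number, earned_runs) pairs.
    """
    if outs_recorded <= 0:
        raise ValueError("outs_recorded must be positive")
    weighted = sum(arc_weight(out) * runs for out, runs in earned_runs_by_out)
    return OUTS_PER_HALF * weighted / outs_recorded


def outs_share(outs_recorded: int) -> float:
    """OS%: percentage of the team's 27 outs this pitcher recorded."""
    return 100.0 * outs_recorded / OUTS_PER_HALF


if __name__ == "__main__":
    # Hypothetical season line: 225 hits in 750 PAs across 600 ABs.
    print(pavg(225, 750), bavg(225, 600), delta_stay(225, 750, 600))
    # Hypothetical outing: 24 outs, runs allowed during outs 3, 12 and 20.
    print(weighted_era([(3, 2), (12, 1), (20, 3)], outs_recorded=24))
```

For the hypothetical hitter that works out to PAVG .300 and BAVG .375. For the hypothetical outing, the weighted run total is 6.3 over 24 outs, which scales to a wERA of about 7.09, with an OS% of roughly 89.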
WHAT THIS DOES TO THE SPORT
Run environment is structurally higher than MLB. Three reasons compound:
- 12 hitters per side instead of 9 (the 9 fielders + the 3 jokers when used). No pitcher-spot weak link in the order, plus the jokers are deliberately built as power, contact, and speed specialists deployed in leverage spots.
- Pitchers don't get inning resets. A starter is on the mound for one continuous half of up to 27 outs. Fatigue accumulates monotonically. The 22nd batter he faces is doing so against a more tired version of him than the 5th. There are no eight breaks across nine innings to reset between.
- Second-chance ABs remove the sacrifice from the sport. In MLB, advancing a runner often costs you an out (sac bunt, sac fly, productive ground out). In O27, the second-chance AB rule lets you advance runners without spending an out, in exchange for a strike. Outs are the precious resource (there are only 27), and the rule lets you advance runners without burning them.
Player archetypes shift. Workhorse starters (the structural concept, not a roster designation) become the most valuable arms: a B+ starter who can give you 24 outs is worth more than a lights-out reliever who can only give you 6, because there's no inning-by-inning bullpen ladder to deploy. Stamina is the most valuable pitching attribute. Contact hitters who can use second-chance ABs effectively outperform three-true-outcomes power bats relative to their MLB value, because every contact event in O27 costs a strike: the patient hitter who can keep ABs alive across multiple PAs produces real offense in a way he doesn't in MLB.
Manager decisions matter more than in MLB at the tactical level. Joker insertions are a real per-rotation decision (which joker, where in the order, or none at all). Pitching changes are weight-bearing because there are no inning resets to mask a tiring starter. Lineup construction has to think about which hitters get the most PAs across a 27-out arc: top-of-order hitters get 5-7 PAs per game in O27 vs MLB's 4-5, so where a power bat lands in the order has bigger downstream consequences.
Variance is lower per game than 5-inning baseball would produce, despite the higher run environment. 27 outs concentrated still produces ~38-45 PAs per side per game, more than enough sample to let true talent assert itself within a single game. 5-inning baseball produces ~15 PAs per side and is dominated by single-inning luck. O27's structural change is concentration, not compression: same total outs, different shape. |
|
|
|
|
|
#15 |
|
Dark Cloud
Join Date: Apr 2001
|
I'll probably fork this thread after this and do a separate O27 thread with simulation results. Unlike Viperball, there's no user intervention at all; I just simulate seasons and watch what happens. Fast-sim as a service!
Unlike Viperball, I had a mental model for what this was going to be, and that makes it easier to imagine how to make it work, because the whole time you're just trying to realize a vision of a cricket-y baseball. With Viperball I had to conjure it and try to reverse engineer it into a sport. Also, I've had this cricket/baseball hybrid idea since like COLLEGE, so maybe that's why it only took me about 4 days total to turn this into something very plausible, whereas Viperball took over a month of tinkering. The future threads will cover the sabermetrics I've added and other improvements. Right now it's just a web sim running off a deployment server, but it's fun to make a baseball sim after 30 years of playing these games, going back to FPS Baseball.
|
|
|