Shepherd's Dog: A Game by the Most Dangerous AI Model

(koenvangilst.nl)

89 points | by vnglst 4 hours ago

71 comments

  • jna_sh 4 hours ago

    “can it build a game idea I've had for years, in a single shot?”

    Do people do no research or introspection when they’ve had an “idea for years”? There are countless examples of this exact game. I played this on the Gameboy Advance! There’s like 50 of them on the App Store right now.

    The standard “this almost certainly exists wholesale in the training data” applies, but I’m also interested in how you carry an idea for years and don’t notice this, or whether the “idea” here was actually “using this thing that’s been remade thousands of times as an AI benchmark”.

    There’s nothing wrong with remaking an old classic formula, especially in game dev. It’s the describing it as “an idea I’ve had for years” that rings weird.

    • fennecbutt 3 hours ago

      I think that's exactly why AI is suited for 99% of stuff we do.

      I have pointed out on here before that instances of truly unique human ideas not grounded in nature or previous ideas from others is almost nil, there are not many examples that someone can give me. All human ideas and work is derivative.

      Elves? Humans with pointy ears. Werewolves? Humans mixed with wolves. Car tyre? Cart wheel...stone wheel/roller. Etc.

      • bxk76 2 hours ago

        Just because AI can give you a recipe for a sandwich doesn't mean everyone who sells, buys, or experiments with making sandwiches is going to stop.

      • jna_sh 3 hours ago

        I feel like prior to GenAI, you’d have had to reckon with the true originality of your idea in some form as you did the research. Creatives having to confront their own unoriginality is such a thing it itself is reflected in countless pieces of media.

        So it’s interesting to me that the creator here didn’t encounter the tens of physically published versions, or the hundreds of them shipped to digital app stores, or all the codebases on GitHub, in the course of making this. I’m sure they would have done naturally prior to GenAI. Is that good or bad? I don’t know! But it’s interesting to me.

        • NitpickLawyer 2 hours ago

          > the creator here didn’t encounter the tens of physically published versions

          The simplest counterargument: since there are already tens of similar games out there, why didn't the previous authors, supposedly grass-fed genuine checkmark blood-through-their-veins humans, notice the other 9-8-7-6-5... games, and why did they still release their own version? Maybe because they still wanted the game out there? Maybe because originality really isn't that common? Maybe because each individual had their own idea and spin to it? Maybe because they wanted the game out as they made it?

          Same for this author. How they made the game is irrelevant, and nitpicking the "originality" or anything else is silly. Something like this wasn't possible 3 years ago. Now it's possible. Deal with it, and stop trying to find ways to diminish it. It's a huge accomplishment any way you cut it.

          • jna_sh 2 hours ago

            My thoughts are less about the merits of creating something that already exists than they are about _knowing_ you are doing that. Which I think my post made very clear :)

            • NitpickLawyer 2 hours ago

              > I’m sure they would have done naturally prior to GenAI.

              I gave a simple counterargument to this. Since there are "countless" prior games, many of them released before genAI, your argument is pointless.

              • jna_sh an hour ago

                Do you think the only reaction to knowing you’re not the first to do something is not to do it? Do you think I said that?

                To spell it out in case it is still non-obvious: knowing this allows iteration. It allows remixing. It allows you to inspect what has come before and what it did well and where it succeeded and where it fell short and thus what you could _add_. It is an enabler of creativity! Thus I think it is interesting that GenAI may make it harder to have this experience.

          • customguy 2 hours ago

            They said they think they would have encountered those other games without GenAI, not that they or any of those other authors shouldn't have released the game.

        • dijksterhuis 12 minutes ago

          i had a boss. before he was my boss, he was a friend. he took me under his wing, musically speaking. he showed me new music. told me what gear he was interested in. we went to some gigs.

          he used to say “the best artists have the biggest record collections”.

          they’ve done their research. they developed taste. they’ve been in that battle with the unoriginality demon.

          as a result, they’ve figured out what “good artists copy, great artists steal” means.

          we take small bits. small ideas. small riffs. we turn them into our own. then we repeat that N times to create “a song”. we borrow. we revere. we obsess.

          this thing isn’t artistic stealing, it’s the most low-effort stealing possible. creativity, originality and, more importantly, taste appear nowhere here.

          so, is it bad? depends on your perspective on whether you have taste i guess.

      • ai_fry_ur_brain 2 hours ago

        I think this is false. New ideas are born every minute, and llms arent going to help people with those for the most part, they'll end up steering you back towards the gradient if you do.

        • 0xEF an hour ago

          Can you give us an example of a new idea that is not derivative of something that already exists? Should only take about a minute.

          Snark aside (and apologies), there's absolutely nothing wrong with the "no new ideas" take and nobody should think there is. Humans tend to work collectively, try as we might to do or appear otherwise, and often come to the same conclusions through reasoning and logic. No one person truly invented the light bulb, etc.; really, all inventive thought is branches of derivative thought as we build our collective knowledge base. A better question would be how many novel ideas are the logical conclusion of branches of derivative thought and how many are tangential, brought about by the injection of our irrationality.

    • vnglst 3 hours ago

      I also realized this, a quick Google search would’ve told me that this game has been made several times before, also way before I ever had this idea. Apparently it’s a pretty obvious game idea.

      Ah well, it’s still fun and it does appear to measure how good AI is in creating these kind of games.

      • dools 3 hours ago

        Well … it’s a measure of how good it is at reproducing a game that probably already exists in multiple forms in its training data.

        • puttycat 2 hours ago

          The question is more whether this game exists as open source somewhere in the training data (probably does).

          • sevenzero 2 hours ago

            You can't possibly think those models are only trained on open source data?

    • redrobein 3 hours ago

      While I agree that it isn't revolutionary that it could implement this from a single prompt, what's surprising to see is how well done this one is compared to the other tries. The controls and movement are smooth, the animations aren't jittery, the ui makes sense, there's a clear progression in difficulty. This model clearly "understands" the implementation of this game far better than the others did.

    • tripledry 31 minutes ago

      This is a thought I've had about genAI.

      In case it all just comes from training data, "one shotting" a game would be more comparable to "git pull" and changing some assets than "generating code".

      I'm not saying this is how it works, I'm trivializing LLMs with this statement, but when I see someone on linkedin excited about generating checkers and chess my first thought is "you could have done that with git pull for the past 20 years".

    • uludag 2 hours ago

      Same thoughts exactly. I personally started looking into indie game dev and I've just started to realize how naive I was and how hard just game design can be, and that I'll probably never be good at it, and that most of my ideas are pretty garbage (or incomplete at best).

      Even with the perfect AI to write, one would need to iterate through many different ideas, play testing constantly, getting people to play test and analyze what they found fun and where they got stuck. And to get the best ideas you'll need to be playing lots of different kinds of games.

    • Forgeties79 21 minutes ago

      Usually it’s an idea somebody had in a flight of fancy or inspiration but they haven’t really shown much interest in the actual medium prior, so they don’t really have any knowledge of its existence and then they also don’t go out of their way to confirm if it already exists.

      Like I remember in college I had something akin to the idea of “50 people 1 question.” I was starting to become interested in shooting my own documentaries and was particularly interested in man on the street style interviews. I pitched it to a friend who then told me about 50p1q, which baffled him because it was like the hot thing already a year or two prior haha.

      Anyway that’s just something I think happens a lot. And now with genAI people don’t throw the idea around even, they quickly do a crappy version of the thing, present it, then find out it exists. Which isn’t terrible I guess but it’s one less filter for my better or for worse.

    • neonstatic 2 hours ago

      Well, “an idea I’ve had for years” and “something that has never been done before” are not the same thing.

      • jna_sh 2 hours ago

        This is fair! I am possibly attaching some notion of originality to the word “idea” in the context of a project that others don’t.

  • ciscoriordan 3 hours ago

    My Belgian Tervuren and I have a basic herding title and about 4 years of herding experience.

    The sheep movement is excellent. You could make it even more realistic by having them favor lusher areas and by having one occasionally bolt spastically (hard mode?)

    A handler mode where you play as a human and shout commands at the dog could be cool too!

  • evilturnip 3 hours ago

    I think it’s impressive that an LLM can take you to a local maxima in one-shot.

    But once you start maintaining it, improving it and fixing bugs, you’ll eventually need to rip it apart and put it back together again while understanding how it all works.

    This is why I think the better approach isn’t to one-shot but to have the architecture in your head and build it up piece by piece, with the AI accelerating the code writing.

    • MrScruff an hour ago

      I think this is true for projects beyond a certain complexity. I have 100% vibe coded projects with tens of thousands LOC, and haven't seen any real issues with fully automated maintenance. Will that approach work in every scenario, absolutely not, but the size and complexity of projects where it does is growing with each new model release.

    • dools 3 hours ago

      I’ve found it very easy to maintain, add features to and fix bugs in software I’ve written entirely with LLMs, and in languages and frameworks with which I’m unfamiliar. You just ask the LLM to explain the code and then work with it to come up with the fix.

      • ai_fry_ur_brain 2 hours ago

        How big are those projects.. I dont think this is good for your mental health or physicaly your brains health. Problem solving keeps your brain strong. The laziness in us is inclined to take shortcuts, don't do it. Its like driving your car 3 blocks instead of walking, your physical health will suffer.

        • boredhedgehog 6 minutes ago

          > Its like driving your car 3 blocks instead of walking, your physical health will suffer.

          And be sure to only walk barefoot. Relying on artificial shoes weakens the muscles and the skin of your feet.

        • dools 2 hours ago

          > How big are those projects

          Define big I guess. They're non-trivial, mix of internal enterprise tools, a multiplatform app (android/ios/mac/windows/web currently headbutting its way through review), including a billing system for my small telecommunications business.

          > I dont think this is good for your mental health or physicaly your brains health

          I find the experience of doing it without writing the code to be intellectually pretty similar. I still solve a lot of problems, the LLM couldn't, for example, one shot the event sourcing model I built for synching data between devices. It took quite a few iterations and I had to define a lot of the architecture, but I did it at a level that wasn't "here is a class, here is a module, this module does XYZ", more at the "whitepaper" level or describing how specific bits of the app needed to work in order to solve some problem.

          It's also very similar to managing other developers.

          > Its like driving your car 3 blocks instead of walking, your physical health will suffer

          It's more similar to having staff rather than doing everything yourself. The problem solving just shifts to a different area, and you get more done.

        • matwood an hour ago

          > Problem solving keeps your brain strong.

          Coding is not the sole problem solving skill. In fact, coding may be one of the easier skills much of the time. Deciding what to build, where to focus efforts, understanding a customer's needs, could all be just as if not more challenging than the coding part.

          • dools an hour ago

            Also what the code should do and how it should do it. LLMs regularly cannot come up with the best way to approach something. Once those decisions are made, codifying them is kind of the least interesting part of the entire exercise.

    • hurtigioll 2 hours ago

      LLMs are good now at looking at existing project and suggesting big refactors for technical debt removal and new better architectures after the project grew organically for a while

  • raincole an hour ago

    There were dozens (if not hundreds) of more complex games made by Fable on Twitter the first day it was released. The only reason this is on HN frontpage is the stupid clickbait title.

    Some random examples:

    https://x.com/fe_yukichi/status/2064635098411180374
    https://x.com/akiraxtwo/status/2064780732082651402
    https://x.com/kieradev/status/2064482704763085202
    https://x.com/VincentLogic/status/2064699740936356065
    https://x.com/XiaohuiAI666/status/2064994538591223911

    • franze 40 minutes ago

      No, the reason is that it is a follow-up to (multiple) threads from March last year and shows the progress of AI.

  • da-x an hour ago

    Curious enough, I tried the same prompt with Qwen3.6-27B.

    One shot produced a game with no sheep. I then had to tell it to fix two bugs.

    Overall, the graphics and game seem good enough and better than most of the closed models that were shown. However, not surprisingly, it falls short of Fable.

    I've put the index.html and open code session here:

    https://github.com/da-x/when-ai-fails/tree/qwen3.6-27b/shepa...

  • _pdp_ an hour ago

    If you sit down and write that game by hand you will not only finish it in a week but also learn a lot of things along the way and perhaps even discover something about the game that you did not imagine. That is how programming works. It is a search problem.

    Also, this is a game with very simple mechanics; I am sure you can generate it just as easily with Cursor or some other tool.

    • thih9 an hour ago

      Cursor has access to the latest models so it should be equivalent, right?

      Or is there some other AI usage described in this article that is not supported by cursor?

      • _pdp_ 22 minutes ago

        I don't use Cursor so I mentioned it as slightly impartial suggestion but my point is broader. I hear and have seen results from others using Composer 2.5 which is only available in Cursor.

  • fennecbutt 3 hours ago

    Looks kinda like "Sheepherds" which came out recently.

    However as others have pointed out the idea is a common one, probably because many people are exposed to sheep and sheep dogs and farming. Which further reinforces a previous point I made that all human work is derivative and barely anything actually original.

    But that's why it doesn't matter! Make that game/app/website that someone else has made before, make your own interpretation! The beauty and uniqueness is in the skin not the flesh!

    • zkry an hour ago

      But isn't getting an LLM to n-shot something just going to produce non-unique, non-original interpretations of an idea?

    • totetsu 2 hours ago

      I’m sure I saw a blog post about this same mechanic being made by llms back a year or so ago too

  • momocowcow an hour ago

    So it created a trivial game that a teenager could’ve built as a part-time project while acquiring deep knowledge.

  • thih9 3 hours ago

    The article’s title seems needlessly dramatic, the article itself doesn’t reference the LLM’s danger.

    The title could have been just “Shepherd’s Dog: A game by Fable 5”.

    • vnglst 3 hours ago

      Not sure if it would've gone to the front page of Hackernews with that title! I was also trying to make a little fun about the drama around Mythos/Fable: Even though Fable did this really well, to me it does not appear to be fundamentally different from other top models.

      • dakolli 3 hours ago

        Yeah, fundamentally the same: Worthless.

        • perching_aix 41 minutes ago

          Bit of a funny thing to so proudly assert in your millionth "your favorite show is shit" type comment, don't you agree?

          In close lockstep with @ai_fry_your_brain, who at least makes it clear right on the tin that they're not here to engage in any earnest capacity whatsoever. Always a mixed feeling between being appreciative of that, and finding it blatant.

          Good thing it's AI ruining communities, a thought I have no doubt you also share in. If only people properly recognized the hard work of people like you in this.

        • hurtigioll 2 hours ago

          funny how a worthless LLM belongs to the fastest revenue growing company in the history of Capitalism

  • nickandbro 3 hours ago

    I sure do miss Fable. It just knew how to do things and do them well. Sad it’s now blocked.

    • willtemperley 2 hours ago

      I wonder if this is the real problem: it was too good, and a lobby of companies feeling threatened by the competition decided to push the jailbreak narrative as a scapegoat.

  • ernst_klim an hour ago

    When you say €20 worth of tokens is it fair direct API call price or subsidized claude code?

  • sixhobbits 4 hours ago

    Enjoyed playing it, here's the direct link to play as otherwise you have to click from the article to the GitHub and then find the correct demo link

    https://vnglst.github.io/when-ai-fails/shepards-dog/claude-f...

    • vnglst 4 hours ago

      Thanks for that, I messed up copying the links into the article!

  • tbreschi 3 hours ago

    Brilliant marketing here in the title

  • bloomark an hour ago

    > It's really fun and exactly how I imagined it.

    If this is what you imagined, you need to imagine better.

    * Pathfinding is terrible (if I end up inside the fenced area clicking outside doesn’t lead me out).
    * Forcing me to go landscape while not even filling the entire screen is terrible (where did you even test this).
    * Controls are disastrous (I’m either barking all the time or a bark makes my sprite ignore my movements).

    You one-shotted this, and I will admit it’s incredible that these agents can create something like this in minutes.

    But your statements along with the “most dangerous AI model” in the title are disingenuous. Please do better.

  • stephbook 3 hours ago

    Playing on iphone13 mini.

    It instructs me to rotate my phone. The pasture doesn't get any bigger, but now the top bar blocks half the screen. The tooltip about rotating stays in the middle of the screen. Unplayable. There's a music note indicating sound, but I never heard the dog bark.

    It's exactly the kind of unpolished slop I expected it to be.

  • PUSH_AX 2 hours ago

    In which harness?

  • andrepd 2 hours ago

    He should ask AI to tell him that #aaa text on #eee background is not acceptable.
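For anyone who wants to check the contrast complaint above, here is a minimal sketch of the WCAG 2.x contrast-ratio calculation applied to the two colors named in the comment. The function names (`hex_to_rgb`, `relative_luminance`, `contrast_ratio`) are made up for illustration, and the 4.5:1 threshold is the WCAG AA minimum for normal-size body text; whether the page really uses exactly #aaa on #eee is taken from the comment, not verified.

```python
def hex_to_rgb(color: str) -> tuple[float, float, float]:
    """Parse '#rgb' or '#rrggbb' into three channel values in the range 0..1."""
    digits = color.lstrip("#")
    if len(digits) == 3:
        # Expand CSS shorthand: 'aaa' -> 'aaaaaa'
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"unsupported color format: {color!r}")
    r, g, b = (int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return (r, g, b)


def relative_luminance(color: str) -> float:
    """Relative luminance as defined by WCAG 2.x for sRGB colors."""
    def linearize(c: float) -> float:
        # Undo the sRGB transfer curve (threshold as written in WCAG 2.x)
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in hex_to_rgb(color))
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), between 1 and 21."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    ratio = contrast_ratio("#aaa", "#eee")
    verdict = "passes" if ratio >= 4.5 else "fails"
    print(f"#aaa on #eee: {ratio:.2f}:1 ({verdict} WCAG AA for body text)")
```

Under these assumptions the ratio comes out at roughly 2:1, well below the 4.5:1 AA minimum, which is consistent with the complaint.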

  • CarRamrod 3 hours ago

    BAA VRAM EWE

  • ai_fry_ur_brain 2 hours ago

    Forces me to rotate to get warning message to disappear (works fine on portrait, but regardless forces me to play with two hands..), when rotate doesnt even fit on phone.

    fROnTEnD DeV Is DeAd

    DeSiGN Is DeAD

    Cool idea tho, could be a fun game if the UX wasn't so hostile.

  • wg0 2 hours ago

    Now next game - The Boy who cried wolf! Wolf!

  • hbarka 3 hours ago

    That’s one tired sheepdog.

    • vnglst 3 hours ago

      This was my second attempt, I'm still learning! Besides, the wolf was freaking me out.

  • esailija 4 hours ago

    I didn't even have to play. Immediately after opening, some notification about rotating my phone is obscuring the instructions and I cannot read them.

    • fennecbutt 3 hours ago

      Damn I couldn't load it on my Nokia n95 from 2007 either. Damn bruh, these silly devs should make this stuff work on everything.

      • esailija 3 hours ago

        I am on a flagship samsung that runs for example the Red Alert 2 browser port well.

        OP is just pushing slop, the 80% part anyone gets for free. (well 20 bucks)

  • isoprophlex 2 hours ago

    "a game idea I've had for years"

    Bruv, there are already countless games with this exact mechanic...

  • chvid 2 hours ago

    As far as I can tell it is possible to get this sort of quality game with a properly tuned harness out of one of the cheaper models.