I tried writing a short novel using Claude Opus 4.6. I gave it an outline and a raw draft, and the style is very similar to this writing.
I tried to steer it away from this kind of writing because it feels weird, but it always tries to output something similar to this. Or maybe I'm just not used to reading novels.
So I was curious: what kind of training data was Claude trained on, that it's so hard to steer it away from this style?
So I opened my Kindle and looked through the recommended popular novels, just reading through their free samples.
And the similarities are striking. Now, I don't know whether the recommended novels are the training data, or whether they were actually written by an LLM. Or maybe that's just how novelists write.
I even tried writing a full chapter from scratch and asked Claude to ghost-write the second chapter for me in my writing style. It still won't follow my style and keeps writing in this kind of style from the article.
I'm not accusing the article of using an LLM to ghost-write; even so, it's fine to use an LLM to ghost-write. It's just one anecdote from my side, on how an LLM fails to follow my writing style and keeps coming back to its training data.
> And the similarities are striking. Now, I don't know whether the recommended novels are the training data, or whether they were actually written by an LLM. Or maybe that's just how novelists write.
For traditionally published works, it's trivial to exclude LLM-written content: just look for anything published before Nov 30, 2022.
Which is also a good filter for web searches to exclude a lot of garbage results (if the specific search makes sense for non-recent results)
Except many search engines have a recency bias.
Recency was a sane default previously, since news and the status quo change, but now it makes you even more likely to encounter slop.
Why stop at traditionally published works? Before dead-internet-day, very nearly all forms of writing were guaranteed to be hand-crafted, organic, and made with 100% Natural Intelligence.
The artificial stuff often has an odd taste, but boy it sure is quick and convenient.
Don't you remember the endless SEO spam that swamped the Net even before GPT, allegedly written by real humans?
You joke, but I bet every person in this forum, when presented the choice between a bot-filled forum and a guaranteed human-only* forum, they'd go with the latter.
* this is a hypothetical scenario. I don't know any guaranteed human-only digital forums.
I agree with you, but as to your addendum:
Niche hobbyist forums are still safe, for now. There's just not enough commercial interest in petroleum lantern restoration to make it worth anyone's time to poison this particular well.
Even some larger niche hobbies, like the saltwater aquarium community, seem pretty safe for now (though it also helps that many forums have members who visit each other to trade corals and admire each other's tanks).
I converse with LLMs enough for research at this point that I feel I have a good enough structure to hop on and off them to primary sources and such, so I don't get annoyed with them too easily.
Whereas I haven't seriously reflected on my social media consumption habits in over 15 years, and over the years I've gotten more and more annoyed at social media.
At the risk of sounding misanthropic: there's something seriously wrong with my social media usage, especially when I know there's a real human on the other side, combined with ever-increasing annoyance at commenters and just the feelings I get after reading social media.
It may be dopamine- or self-help-related; no, actually, I think all of that is part of the issue (I discovered that in high school, when it was taking off). Something about the way I'm fundamentally interacting with the medium seems more horrible and icky the more I mature.
On the contrary! The dead-day theorem established earlier states that an 11/22 date filter is a necessary condition for verifiable human-only content, when filtered by content-creation date.
A weaker theorem can be postulated: any such filter provides a second-order sufficient condition.
This means we can filter content by account creation date, for example, by hiding all posts and comments from accounts created after the digital death event. This won’t always guarantee human-only content but certainly more than otherwise.
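A minimal sketch of that account-age heuristic, in Python (the post structure, field names, and cutoff date are all assumptions for illustration, not any real forum's API; and as noted above, filtering by account rather than by content is only the weaker guarantee):

```python
from datetime import date

# Hypothetical cutoff: ChatGPT's public launch, Nov 30, 2022.
DEAD_INTERNET_DAY = date(2022, 11, 30)

def likely_human(post):
    # Keep only posts whose author's account predates the cutoff.
    # `post` is a plain dict here; the `account_created` field is
    # an assumption made up for this sketch.
    return post["account_created"] < DEAD_INTERNET_DAY

posts = [
    {"author": "old_timer", "account_created": date(2015, 3, 1)},
    {"author": "fresh_account", "account_created": date(2024, 6, 12)},
]

# Hide everything posted by post-cutoff accounts.
kept = [p for p in posts if likely_human(p)]
```

Here `kept` retains only old_timer's post; a pre-cutoff account can of course still paste in generated text, which is exactly why this is a sufficient-ish filter and not a guarantee.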
But then we wouldn’t be having this most definitively human-to-human conversation, right?
Is the ChatGPT launch the "low background steel" date for writing?
What are the dates for images and video? Nano Banana Pro and Seedance 2.0?
And code? Opus 4.6?
It's not the launch of GPT, but probably around 4 or 4o, that it really became solid. I also don't think video is there just yet, at least for video over 10 seconds.
Is it "solid" if people can read it and instantly know it's generated content?
No. But you can easily make and post content that is not easily detectable as generated.
You only notice plastic surgery when it's bad, but that doesn't mean all plastic surgery looks bad...
Who's "people"? The bottom X% (40%?) of the population is already falling for AI slop video scams, but before that they were also falling for pig-butchering and Nigerian prince scams, so the "average person" benchmark has already been passed for text, photos, videos, etc. For more astute consumers, video isn't there yet.
There's also the question of whether people are even trying to disguise AI content, and how effective that disguise is. Are you or I missing the AI-generated text that just has a veneer of disguise on it?
>Who's "people"?
If you follow this thread up you will see the context is 'people who want to read content written by humans.'
Why does it matter when it "became solid"? There was plenty of slop generated with ChatGPT; that really was the turning point (because of public access).
4-5-word sentences, TED-talk style: yes. I hated it even when humans were doing it. It's like motivational speakers trying their hand at writing novels.
Pangram says this is 61% AI-generated. I'm thinking of paying for a Pangram subscription so I can spend the valuable time I have to read for pleasure on human-created pieces rather than AI slop.
I actually loved this, and felt moved. While reading, my mind fired rapidly through dozens of personal memes (i.e., tags for my regularly trod thought-paths) that I keep in my knowledge base. This is the 30 MB text corpus where I log all my work and peer conversations and thoughts, and (amongst other things) where I think through what I would consider my spiritual practices... my sensemaking around complex systems, including Daoist teachings. This text basically entangled itself with the work I am doing at the outer edges of my own knowing, where I am working on my rawest and most fragile but precious thoughts.
I don't think this is trite, I think there is something in this that is in contact with "living structure" (in the Christopher Alexander sense[1]), and much exists outside the edges of the text.
To those who dislike this, I am genuinely curious: Would you say you dislike metaphor? Do you tend to feel disconnected and lacking resonance with poetic writing?
[1]: https://dorian.substack.com/p/at-any-given-moment-in-a-proce...
EDIT: I experience this writing as giving me many quiet A's, or perhaps a smell of A's in a given direction of thought. I interpret others here as getting either B's or U's, in the sense of this A/B/U system: https://openresearchinstitute.org/onboarding/A_B_U.html
Do you mind me asking what type of system you use for keeping these notes, the 30 MB text corpus with conversations and journaling? Are you using plain txt, or an app like Logseq? I flip-flop between apps for this sort of thing, and then, annoyingly, the building of a "system" sucks up my time rather than the writing and logging and reflecting. It's a struggle for me; any advice would be much appreciated :)
AI;DR
Horrible soulless dross.
My humanities degree and all those years reading B's and D's of French philosophy come in extremely useful in strange places. Having had to write long essays sieving through mounds of seemingly near-impenetrable (and actually surprisingly banal, after you learn how to read it) prose of post-structuralist philosophy, I learned how to automatically look for the structure of the text first by skimming, starting from the end, creating a mental map of the text so that you can locate the main argument and the amino acids amongst the boilerplate and stock sentences.
Today it saves me time skimming a text, seeking out the main sentences by jumping around and quickly coming to the understanding of "Oh, hi ChatGPT." In the past it has saved me a lot of time not being tricked into reading SEO gurgle, ad copy, and just generally bad writing. If writing is really just editing, reading is mostly filtering: sieving the cereal from the chaff.
Why is it hijacking the back button in my mobile Safari browser? And why is the title different from the page?
Most amazing art isn't really a product of inspiration but of severe editing (or severe practice, if it's live).
Good writing needs a lot of "post-production" to get the ideas hammered out. Most of it is removing content that isn't central to what the writer wants.
This LLM trend is part of a larger historical pattern that shifts the work of editing away from thinking things through in our own heads:
A. At one time, the editing was mental load, since writing was tedious.
B. The typewriter made writing easy, but modifying it required lots of handwritten scrawling; the mental load was still in reviewing and rewriting the content.
C. By the end of the 20th century, editing and rewriting were a total breeze, but the mental load was still in handwritten note-taking.
D. Once we made a bazillion forms of productivity and note-taking software, the mental load was only in thinking the thought and getting it into a computer. Everything after that was massaging the idea.
E. Now, the regurgitation machine can get you 3/4 of the way to the finish line of your draft without even trying.
But I'm convinced we lost something in each of these transitions. There is more power in one well-placed sentence assembled over tremendous meditation than in 85 paragraphs of slop.
Paul Graham's essay on good writing (https://paulgraham.com/goodwriting.html) defines "right" written ideas as "developing them well — drawing the conclusions that matter most, and exploring each one to the right level of detail".
My opinion is that the absurd complexities of the Over-Information Age make the "right" level of detail the following:
1. An executive summary that children and dumb people can understand.
2. Tightly defined specifications for everyone who cares or needs to know.
3. Footnotes and background information that you can throw everything and the kitchen sink onto. This includes attempts to persuade, artful descriptions, feelings you had, associations to other things, and that general elegant "waxing on" that everyone takes a fancy to sometimes.
And, in this attitude, LLMs are only good for #3.
"I wish there was a way to know you're in the good old days before you've actually left them." — Andy Bernard
LLM or not, this is just terrible kitsch.