Why can't LLMs just be like a normal developer already?

npub1gm7tuvr9atc6u7q3gevjfeyfyvmrlul4y67k7u7hcxztz67ceexs078rf6

hex

ad48725fd5577f787fc9f042a6dbc9d93dbd10fb608ad11926e079c49f57d7be

nevent

nevent1qqs26jrjtl24wlmc0lylqs4xm0yaj0dazrakpzk3rynwq7wynata00sprpmhxue69uhhyetvv9ujuem4d36kwatvw5hx6mm9qgsydl97xpj74udw0qg5vkfyujyjxd3l706jd0t0w0turp93d0vvunggc82tj

Kind-1 (TextNote)

2026-04-08T07:50:27Z

Why can't LLMs just be like a normal developer already?

So I wanted to get a clear picture of the Great Consensus Cleanup, a Bitcoin soft fork proposal that's supposed to be as uncontroversial as it gets while fixing vulnerabilities.

The amount of tokens and time I threw at this problem is unreal!

I thought I could one-shot it with nostr:npub16g4umvwj2pduqc8kt2rv6heq2vhvtulyrsr2a20d4suldwnkl4hquekv4h's autoclaw via openclaw. Something like "Find the proposed activation client implementation for BIP 54, modify it to measure the rules instead of enforcing them, and log your findings. I want to know how common violations or near-violations were historically, and especially during the past year. Oh, and by the way, you only have 60GB, so you will have to prune and write down your findings as the blocks are downloaded."
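To make the "measure instead of enforce, and log as you go" idea concrete, here is a minimal sketch in Python. It is not the BIP 54 client code and not what the LLM produced; `Measurement`, `classify`, `record`, and the 90% "near" cutoff are all hypothetical names and values chosen for illustration, and the limit for each rule is assumed to be supplied by the caller.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Measurement:
    """One observed quantity for one transaction (hypothetical structure)."""
    height: int   # block height the transaction was seen at
    txid: str
    rule: str     # label for the rule being measured
    value: int    # observed quantity, e.g. a count or a size
    limit: int    # threshold the rule would enforce (assumed, caller-supplied)


def classify(value: int, limit: int, near_fraction: float = 0.9) -> str:
    """Return 'violation' above the limit, 'near' at or above near_fraction of it, else 'ok'."""
    if value > limit:
        return "violation"
    if value >= limit * near_fraction:
        return "near"
    return "ok"


def record(m: Measurement, log) -> str:
    """Append one JSON line for anything that is not 'ok' and return the verdict."""
    verdict = classify(m.value, m.limit)
    if verdict != "ok":
        log.write(json.dumps({**asdict(m), "verdict": verdict}) + "\n")
        log.flush()  # the finding has to outlive the pruned block data
    return verdict
```

The point of the design is that nothing is rejected: each block is processed once during the download, the verdict is appended to a log file, and the block itself can then be pruned without losing the finding.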

Countless times did the LLM try to guess the results to save time, probe only some blocks from blockchain info, or do other strange and unintuitive things. Countless times did it throw away perfectly fine IBD data. My concession for now: get 60GB of pruned data and then see what we find. We will have some data from the "stress test" periods and from isolated proof-of-concept transactions, but not the full historical picture.

Raw JSON

{
  "kind": 1,
  "id": "ad48725fd5577f787fc9f042a6dbc9d93dbd10fb608ad11926e079c49f57d7be",
  "pubkey": "46fcbe3065eaf1ae7811465924e48923363ff3f526bd6f73d7c184b16bd8ce4d",
  "created_at": 1775634627,
  "tags": [
    [
      "p",
      "d22bcdb1d2505bc060f65a86cd5f20532ec5f3e41c06aea9edac39f6ba76fd6e"
    ],
    [
      "client",
      "jumble"
    ]
  ],
  "content": "Why can't LLMs just be like a normal developer already?\n\nhttps://a.pinatafarm.com/450x452/f14b5f5dd3/why-cant-you-just-be-normal-blank.jpg\n\nSo I wanted to get a clear picture about the Great Consensus Cleanup, a Bitcoin soft fork proposal that's supposed to be as uncontroversial as it can get while fixing vulnerabilities.\n\nThe amount of tokens and time I threw at this problem is unreal!\n\nI thought I could one-shot it with nostr:npub16g4umvwj2pduqc8kt2rv6heq2vhvtulyrsr2a20d4suldwnkl4hquekv4h 's autoclaw via openclaw. Something like \"Find the proposed activation client implementation for BIP 54, modify it to measure instead of enforce the rules and log your findings. I want to know how common were violations or almost-violations historically and especially during the past year. Oh and by the way, you only have 60GB, so you will have to write down your findings as the blocks are downloaded as you will have to prune.\"\n\nCountless times did the LLM try to guess the results to save time, probe only some blocks from blockchain info or do whatever strange and unintuitive things. Countless times did it throw away perfectly fine IBD data. My concession now: Get 60GB of pruned data and then look what we find. We will have some data from the \"stress test\" times and isolated proof of concept transactions but not the full historic picture.",
  "sig": "6828a9db4709470150be1be4b952c069bfb8ccd9fe722b2fadc1ac3882648e9ec3721451fd9e61ab534e9da0d0cdc0f996533fd03cc9616be50470c74aa35eaf"
}