• 0 Posts
  • 9 Comments
Joined 1Y ago
Cake day: Jun 17, 2023


Ah, even then it could just be a consequence of training samples usually being chronological (most often the expected resolution for conflicting instructions is “whatever you heard last”, with some exceptions when explicitly stated), so it learns to think that way. I did find the pattern also applies to GPT trained on long articles, where you’d expect it not to, so I just wanted to explain why that might be.


Or I should explain better: most training samples will be cut off at the top, so the network sort of learns to ignore the start of its input a bit.


Yes, that’s by design. The networks work on one transcript per input, and it genuinely does get cut off eventually; usually an entire older line is purged when the tokens exceed a limit.
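To make that concrete, here is a rough sketch of that kind of line-based trimming. It is not how any particular chatbot actually does it: `count_tokens` and `trim_transcript` are made-up names, and counting whitespace-separated words is only a stand-in for a real tokenizer.

```python
from collections import deque


def count_tokens(line: str) -> int:
    # Hypothetical stand-in for a real tokenizer:
    # one "token" per whitespace-separated word.
    return len(line.split())


def trim_transcript(lines, max_tokens):
    """Drop whole lines from the top until the transcript fits the budget."""
    kept = deque(lines)
    total = sum(count_tokens(line) for line in kept)
    # Always keep at least the newest line, even if it alone is over budget.
    while len(kept) > 1 and total > max_tokens:
        total -= count_tokens(kept.popleft())
    return list(kept)


transcript = [
    "user: hello there",
    "bot: hi how can I help",
    "user: what is a transformer",
]
print(trim_transcript(transcript, max_tokens=11))
```

With a budget of 11 only the oldest line is purged; shrink the budget further and the next-oldest line goes too, which is why the model eventually "forgets" the top of a long conversation.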


I was a curious child, and things spiralled out of control from there…


Ah, that makes sense. Most cloud providers have the full nine yards with online hardware provisioning and imaging; I forgot you could still just rent a real machine.


Hmm, I wonder if there was some reason they didn’t just extract the original certificates from the VPS, if it was actually the hosting provider. I mean, even with mitigation they should be sitting in a temp folder somewhere, so surely they could? Issuing new ones seems like a surefire way to alert the operators, unless they already used Let’s Encrypt of course.


They previously did not use APEX, but that seems to have changed recently: https://github.com/GrapheneOS/grapheneos.org/commit/7bf9b2671667828d1553c92bf4f64cc749b74d0b Regardless, it seems it will need the verified boot keys, so Google can’t update them; likely the devs will take responsibility for updating the CAs. No idea if they will restore the user control though.


I feel like this is just describing the future of business processing consultants. Like there’s already a role for this, unless I’m missing something?


I think the part that annoys me the most is the hype around it, just like blockchain. People who don’t know any better claiming magic.

We’ve had a few sequence-specific architectures over the years: LSTM, GRU and now Transformers. Each was better than the last at the task of sequence-specific transformations, and at least for the last one the specific task was language translation. We eventually figured out these guys have a bit of clairvoyance too: they could make accurate predictions based on past data, or at least accurate enough to bet on, and you can bet traders of various stripes have already made billions off that fact. I’ve even seen a transformer-based weather model. It did OK, but transformers are better at language.

And that’s all it is! ChatGPT is a Transformer in the predictive stance. It looks at a transcript of a conversation and works out what a human is most likely to say next. It’s a very complex transformation of historical data. If you give it the exact same transcript, it gives the exact same answer. It is, in the literal, mathematically rigorous sense, entirely incapable of an original thought. Any perceived sentience is a shadow of OpenAI’s army of annotators or the corpus it was trained on, and I have a hard time assigning sentience to tomorrow’s forecast, which may well have used similar technology. It’s just an ultra fancy search engine index.
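A toy illustration of the "same transcript in, same answer out" point, under big assumptions: this is a bigram counter, nothing like a real Transformer, and it always picks the single most likely next word (greedy decoding). Deployed chatbots often add random sampling on top, so in practice the output can vary. `train_bigrams` and `predict_next` are made-up names.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which word follows which in the training sentences."""
    table = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            table[prev][nxt] += 1
    return table


def predict_next(table, transcript):
    """Return the most frequent follower of the transcript's last word."""
    last = transcript.split()[-1]
    candidates = table.get(last)
    if not candidates:
        return None  # never seen this word in training
    # Greedy choice: highest count, ties broken alphabetically,
    # so the result is a pure function of the transcript.
    return min(candidates, key=lambda w: (-candidates[w], w))


table = train_bigrams(["the cat sat", "the cat ran", "the cat sat down"])
print(predict_next(table, "I saw the cat"))
print(predict_next(table, "I saw the cat"))
```

Both calls print the same word, because the prediction is nothing more than a lookup into statistics gathered from the training data.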

Anyways, that’s my rant done I guess. Call it a cynical engineer’s opinion. To be clear, I think it’s a fantastic and useful technology, and it WILL change how we interact with machines. It can do fancy things with the combination of “shell” code driving its UI, like multi-step “agents” or running code, and I actually hope OpenAI extends it far into the future, but I sincerely think any form of AGI will be something entirely different to LLMs, or at least they’ll only form a small part of it as an encoder/decoder for its thoughts.

EDIT: Added some paragraph spacing. Sorry, went into a more broad AI rant rather than staying on topic about coding specifically lol