What goes on inside artificial neural networks is largely a mystery, even to their creators. But researchers from Anthropic have caught a glimpse.

archive.is link needed

This sounds promising, but I do wonder how much any progress they make will be undermined by:

  • the speed of advancements in AI
  • the fact that this research doesn’t necessarily apply to other LLMs
  • the fact that LLMs are being released/leaked to the public, so anyone who has access to them has the potential to jailbreak the AI and circumvent any safety precautions researchers implement as a result of this work
