Public code repositories like GitHub are currently being beset by a flood of LLM-generated contributions. It’s becoming a bit of a problem, and it is one facet of the Great Flood the web is currently experiencing.
What does it look like when we are able to use LLMs to handle the flood of contributions? What happens when we’re able to screen and adopt PRs effectively with little to no human intervention?
I use the Voice audiobook app to listen to my DRM-free books. The app has a configuration setting for auto-rewind: if you pause the book, it rewinds by X seconds when you resume. I didn’t like that behaviour. I wanted the number of seconds to rewind to depend on how long it has been since I paused. So if I resume within a minute, no rewind; within 5 minutes, a 10-second rewind; longer than that, a 30-second rewind.
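To make the rule concrete, here is a minimal sketch of that tiered rewind in Python. It is not Voice’s actual code (the app is an Android project with its own structure). The function names are hypothetical, and treating the one-minute and five-minute boundaries as inclusive is an assumption.

```python
def rewind_seconds(paused_for_seconds: float) -> int:
    """Return how many seconds to rewind after a pause of the given length.

    Hypothetical helper: the thresholds follow the rule described above,
    and the boundaries are assumed to be inclusive.
    """
    if paused_for_seconds < 0:
        raise ValueError("pause duration cannot be negative")
    if paused_for_seconds <= 60:       # resumed within a minute: no rewind
        return 0
    if paused_for_seconds <= 5 * 60:   # resumed within five minutes
        return 10
    return 30                          # any longer pause


def resume_position(position_seconds: float, paused_for_seconds: float) -> float:
    """Position to resume playback from, never earlier than the start of the book."""
    return max(0.0, position_seconds - rewind_seconds(paused_for_seconds))
```

For example, `resume_position(100, 120)` gives a position 10 seconds earlier, because a two-minute pause falls into the middle tier.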
I can do this because I’m part of a small percentage of people who can clone a repo for an Android app, modify it, rebuild it and push it to my phone. But I don’t want this power to be constrained to a priesthood who know the secret language of coding. I want everyone to be able to do stuff like that.
Imagine a world in which, as I use a specific piece of software, I can request modifications to its behaviour from an LLM-augmented system. That system will pull the open source code, make the necessary modifications (following the project’s contribution guidelines), build it and reload it on my device. Then I can use it and test it, and fix any problems that come along. That modification can then be uploaded to my own repo and made publicly available for anyone else who wants it, or it could even be pushed as a PR to the original project, which could screen it for usefulness, alignment, UX, etc., modify it if needed, and then merge it into the main branch.
This wonderful world of personal and communal computing would be unimaginable in a closed source world. No closed source system will allow an external AI to come in and read/modify it at will. This is why open source is more important than ever.
We need to build a Software Commons so that we can give everyone the ability to adapt their digital lives to their liking. So that these intimate, private devices to which we entrust most of our attention, these things which have great effects on our cognitive and emotional functions, remain ours in a real sense. And the way that we do this is to create the tools and processes to allow anyone to make modifications to their software by simply expressing that intent.
And what does communal software development look like? Let’s explore the space of social consensus mechanisms so we can find those that drive the creation of software which promotes culture, connection, compassion and empathy.
I want to see the promise of community made by the ’90s web survive the FAANG+ Megacorp Baronies and flourish into a great digital metropolis. The web can still be free to be weird, we just have to make it happen together.
I agree that with the current state of tools around LLMs, this is very inadvisable. But I think we can develop the right ones.
We can have tools that generate the context/info submitters need to understand what has been done, explain the choices they are making, discuss edge cases and so on. This includes taking screenshots as the submitter is using the app, and a testing period (require X amount of time of the submitter actually using their feature and smoothing out the experience).
We can have tools at the repo level that scan and analyze the effect of a submission. They can also isolate the different submitted features so that others can toggle them or modify them if they’re not to their liking. Similarly, you can have lots of LLMs impersonate typical users and try the modifications to make sure they work, putting humans in the loop at appropriate times.
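As a rough illustration of what isolating submitted features behind toggles could look like, here is a minimal Python sketch. Everything in it is hypothetical (`ContributedFeature`, `FeatureRegistry` and the feature name are made up for the example), and defaulting new features to off is an assumption, not something an existing tool does.

```python
from dataclasses import dataclass, field


@dataclass
class ContributedFeature:
    name: str
    author: str
    enabled: bool = False  # assumption: contributed features start switched off


@dataclass
class FeatureRegistry:
    """Hypothetical registry keeping each submitted feature separately toggleable."""
    features: dict[str, ContributedFeature] = field(default_factory=dict)

    def register(self, name: str, author: str) -> None:
        if name in self.features:
            raise ValueError(f"feature {name!r} is already registered")
        self.features[name] = ContributedFeature(name, author)

    def set_enabled(self, name: str, enabled: bool) -> None:
        # Raises KeyError for unknown features so typos do not pass silently.
        self.features[name].enabled = enabled

    def is_enabled(self, name: str) -> bool:
        feature = self.features.get(name)
        return feature is not None and feature.enabled


registry = FeatureRegistry()
registry.register("tiered_auto_rewind", author="some_submitter")
registry.set_enabled("tiered_auto_rewind", True)
```

The app would then check `registry.is_enabled("tiered_auto_rewind")` before running the contributed code path, so anyone who dislikes the change can turn it off without touching the rest of the app.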
People are submitting LLM-generated code they don’t understand right now. How do we protect repos? How do we welcome these contributions while lowering risk? I think with the right engineering effort, this can be done.
Why do the people using LLMs to modify a project need to make a PR back to the remote branch? Why can’t they keep their ‘weird’ contributions on their own personal fork and use as they like?
If the answer is that they don’t have the knowledge to build the app in order to test if the code works before submitting a PR, they shouldn’t be submitting a PR in the first place. Code contributions come with an expectation of due diligence on the part of the submitter, to ensure that their code is not breaking anything or introducing obvious bugs and vulns (and of course, that it even works at all).
Democratizing coding means making the knowledge of how to do it more readily and freely available, not having a computer spit out something that someone doesn’t understand, and then telling that person, “congratulations, you’re a code contributor”.
“How do we protect repos?” By not accepting PRs that do not properly meet contribution guidelines, like having tests that provide reasonable code coverage, etc.
“How do we welcome these contributions while lowering risk?” We don’t. These contributions should not be welcomed. At all. And they bring nothing BUT risk.