But AI isn’t all about generating creative works. It’s a store of information that I can query - a bit like searching Google, but one that understands semantics and is interactive. It can also translate my own text for me - in which case all the creativity comes from me, and I use it only for its knowledge of language. Many people use it to generate boilerplate code, which is pretty generic and wouldn’t usually be subject to copyright.
I disagree with the “limitations” they ascribe to the Turing test - if anything, they’re implementation issues. For example:
For instance, any of the games played during the test are imitation games designed to test whether or not a machine can imitate a human. The evaluators make decisions solely based on the language or tone of messages they receive.
There’s absolutely no reason why the evaluators shouldn’t take the content of the messages into account, and use it to judge the reasoning ability of whoever they’re chatting with.
No, I want a communal, collaboratively managed platform to recommend things to me based on an open source algorithm whose behavior I can adjust the way I want. Alas, this just isn’t a thing.
Just amongst the available options, the closed algorithm optimized for engagement has so far been better at showing me interesting things than an unfiltered chronological feed.
I don’t know why you would expect a pattern-recognition engine to generate pseudo-random seeds, but the reason OpenAI disliked the prompt is that it caused GPT to start repeating itself, which could in turn cause it to print training data verbatim.
I think the test for “free of copyrightable elements” is pretty simple - can you look at the new creation and recognize any copyrightable elements in it? The process by which it was created doesn’t matter. Maybe I made this post entirely by copy-pasting phrases from other people, who knows (well, I didn’t, only because it would be too much work), but it does not infringe either way…
From Wikipedia, “a derivative work is an expressive creation that includes major copyrightable elements of a first, previously created original work”.
You can probably call the output of an LLM ‘derived’, in the same way that if I counted the number of 'Q’s in Harry Potter, the result would be derived from Rowling’s work.
But it’s not ‘derivative’.
Technically it’s possible for an LLM to output a derivative work if you prompt it to do so. But most of its outputs aren’t.
On the contrary - the reason copyright is called that is that it started as the right to make copies. Since then it’s been expanded to cover more than just copies, such as distributing derivative works.
But the act of distribution is key. If I wanted to, I could write whatever derivative works I like in my personal diary.
I also have the right to count the number of occurrences of the letter ‘Q’ in Harry Potter without Rowling’s permission. Thus I can also post my count online for other lovers of ‘Q’, because it’s not derivative (it is ‘derived’, but ‘derivative’ is different - according to Wikipedia it means ‘includes major copyrightable elements’).
Or do more complex statistical analysis.
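The “derived but not derivative” distinction can be sketched in a few lines of Python. The sample string here is just a stand-in - imagine the full text of a novel in its place. Nothing copyrightable survives in the output:

```python
from collections import Counter

# Stand-in text; a real analysis would load the full novel here.
text = "Quidditch is played on broomsticks. Quaffles are red."

# Count occurrences of the letter 'Q' (case-insensitive).
# The count is *derived* from the text, but contains no
# copyrightable elements of it - so it isn't *derivative*.
q_count = sum(1 for c in text if c.lower() == "q")
print("Q count:", q_count)

# A slightly more complex statistical analysis: letter frequencies.
freq = Counter(c.lower() for c in text if c.isalpha())
print(freq.most_common(3))
```

However sophisticated the statistics get, the result is still just numbers about the work, not the work itself.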
I’m sick and tired of this “parrots the works of others” narrative. Here’s a challenge for you: go to https://huggingface.co/chat/, input some prompt (for example, “Write a three-paragraph scene about Jason and Carol playing hide and seek with some other kids. Jason gets injured, and Carol has to help him.”). And when you get the response, try to find the author it “parroted”. You won’t be able to - because it won’t just reproduce someone else’s ready-made scene. It’ll mesh maaany things from all over the training data in such a way that none of them will be even remotely recognizable.
They have the right to ingest data not because they’re “just learning like a human would”, but because I - a human - have a right to grab all the data that’s available on the public internet and process it however I want, including by training statistical models. The only thing I don’t have a right to do is distribute it (or works that resemble it too closely).
If you actually show me people who are extracting books from LLMs and reading them that way, then I’d agree that would be piracy - but it would be such a terrible experience, if it even works, that I can’t see it actually happening.
That same critique should apply to the LLM as well.
No, it shouldn’t. Instead, you should compare it to the alternatives you have on hand.
The fact is,
So, if I have to learn something, I have enough background to spot hallucinations, and I don’t have a teacher (having graduated college, that’s always true), I would consider using it, because it’s better than the alternatives.
I just would never fully trust knowledge I gained from an LLM
There are plenty of cases where you shouldn’t fully trust knowledge you gained from a human, too.
And there are, actually, cases where you can trust the knowledge gained from an LLM. Not because it sounds confident, but because you know how it behaves.
Why is that a problem?
For example, I’ve used it to learn the basics of Galois theory, and it worked pretty well.
So what if it doesn’t understand Galois theory - it could teach it to me well enough. Frankly if it did actually understand it, I’d be worried about slavery.
have no thoughts
True
know no information
False. There’s plenty of information stored in the models, and plenty of papers that delve into how it’s stored, or how to extract or modify it.
I guess you can nitpick over the word “know” and what it means, but as someone else pointed out, we don’t actually know what that means in humans anyway. But LLMs do use the information stored in context; they don’t simply regurgitate it verbatim. For example (from this article):
If you ask an LLM what’s near the Eiffel Tower, it’ll list locations in Paris. If you edit its stored information so that it thinks the Eiffel Tower is in Rome, it’ll actually start suggesting sights in Rome instead.
Yes, I know about the exploitation that happened during early industrialization, and it was horrible. But if people had just rejected and banned factories back then, we’d still be living in feudalism.
I know that I don’t want to work a job that can be easily automated, but intentionally isn’t just so I can “have a purpose”.
What would happen if AI were to automate all jobs? In the most extreme case, where literally everyone lost their job, nobody would be able to buy stuff - but then no company would be able to sell products and make a profit either. At that point, either capitalism would collapse, or - more likely - it would adapt by implementing some mechanism such as UBI. Of course, the real effect of AI won’t be quite that extreme, but it may well destabilize things.
That said, if you want to change the system, it’s exactly in periods of instability that it can be done. So I’m not going to try to stop progress and cling to the status quo out of fear of what those changes might be - instead, I’ll join a movement that tries to shape them.
we should at least regulate the tech.
Maybe. But generally on Lemmy I see sooo many articles about “Oh, no, AI bad” - yet no good suggestions on what regulations exactly we should want.
That seems a somewhat contrived example. Yes, it can theoretically happen - but in practice it would happen with a library, and most libraries are LGPL (or more permissive) anyway. By contrast, there have been plenty of stories lately of people who wrote MIT/BSD software, and then got upset when companies just took the code to add in their products, without offering much support in return.
Also, there’s a certain irony in saying what essentially amounts to, “Please license your code more permissively, because I want to license mine more restrictively”.
This isn’t a well-controlled comparison.
It is. If you’re going to virtualize a board game, there’s no need to stick to the limitations of a physical board game. So, once you make full use of the virtual environment, you get a video game. If you compare only to straight virtualizations of board games, you’re artificially disadvantaging the virtual side.
PS. I also added this significant edit to my last post (bad form for discussion, but it makes more sense there than here)
I think the point of the article is to show that the CEO’s empty words are empty
Maybe. To me it read more like: “According to Zoom’s CEO, Zoom can’t fully replace in-person interaction for work. Therefore, it’s bad/useless software - or the CEO is bullshitting.” Which is just bad reasoning. The conclusion doesn’t follow from the premises. Maybe I’m just taking it too literally, but I just don’t like when articles use such bad reasoning, even if I agree with their conclusion.
fail to account for spaces critical to trust-building such as water-cooler talk and outside of work events
What do you mean by that? If you’re fully virtual, there’s going to be no water-cooler talk - but that’s a legitimate difference between in-person and virtual work that should affect the results of the study. So it makes sense to me that the study shouldn’t try to control for it.
and fail to replicate virtual versions of predominantly in-person activities
I don’t think you can. Take board games as an example of an in-person activity. The virtual replacement would be video games. A video game can do everything a board game can (with some exceptions) - but it can do so much more. So, purely from a game-design perspective, video games should be much better. The main thing video games lack, and board games have, is the in-person interaction. Yet there are plenty of people who play board games but not video games. Clearly the in-person part is important.
It’s not an article about LLMs not using dialects. In fact, they have learned said dialects and will use them if asked.
What they did was ask the LLM to suggest adjectives associated with given sentences - and it associated more aggressive or negative adjectives with sentences written in African American dialect.
All (racial) bias in AI models is actually a reflection of the training data, not of the modelling.