Cake day: Oct 17, 2023


The problem is that the model is actually doing exactly what it’s supposed to; it’s just not what OpenAI wants it to do. The prompt extraction method works because the underlying statistical model gets shifted far outside the domain of “real” language. In that case the correct posterior-maximizing output becomes a sample from the prior (here that would be a sample from the dataset; this is combined with things like repetition penalties).

This is the correct way for a statistical estimator to work, just not the way you want it to work. That’s also why they can’t really fix this: there’s nothing broken to begin with (and “unbreaking” it would almost surely blow something else up).


You are vastly overestimating the amount of storage you need, since you are looking at a download that itself has to choose an encoding (which is independent of whatever YouTube does: YouTube absolutely crushes the quality).
Most estimates assume that YouTube has 1 exabyte of storage. Let’s say we buy this in bulk from retail (which we wouldn’t do: you wait as long as possible since storage prices are going down, and retail stores would give you the finger if you ordered an exabyte’s worth).
Let’s take that number and run with it:
Buying retail, you can get Seagate Exos X20 20TB drives for 280€. 1 exabyte is 1 million terabytes, so we have 1_000_000/20 * 280 = 14 million € (you’d need machines to put those into, but you also wouldn’t buy the entire thing upfront or pay retail prices).
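
To make the arithmetic easy to replay with other numbers, here is a minimal Python sketch. The drive size, price and 1 exabyte total are the figures quoted above; the function name `drive_cost` and the `copies` parameter (for redundancy) are my own additions for illustration and not part of the estimate.

```python
# Figures quoted in the comment above.
TOTAL_STORAGE_TB = 1_000_000  # 1 exabyte = 1,000,000 TB (decimal units)
DRIVE_CAPACITY_TB = 20        # Seagate Exos X20
DRIVE_PRICE_EUR = 280         # retail price per drive


def drive_cost(total_tb, capacity_tb, price_eur, copies=1):
    """Return (number of drives, total price in EUR).

    `copies` is a hypothetical redundancy factor; the estimate above uses 1.
    """
    needed_tb = total_tb * copies
    drives = (needed_tb + capacity_tb - 1) // capacity_tb  # round up to whole drives
    return drives, drives * price_eur


drives, cost = drive_cost(TOTAL_STORAGE_TB, DRIVE_CAPACITY_TB, DRIVE_PRICE_EUR)
print(f"{drives:,} drives, {cost:,} €")  # 50,000 drives, 14,000,000 €
```

Even with three full copies of everything (`copies=3`) the drives alone come to 42 million €, so storage stays in the "large but not insane" range.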

Compute also isn’t that big of a deal if you do it correctly: the expensive part in video hosting is usually video encoding since to get small video sizes you need to spend compute beforehand to compress it.
However, you can shift a significant part of this to the user by implementing the transcoding in WASM and running it client-side (see e.g. https://www.w3.org/2021/03/media-production-workshop/talks/qiang-fu-video-transcoding.html). In that case users would compress locally in the browser before uploading. This presumably wouldn’t even take longer than normal uploads for most people, since you trade off transcoding time against upload time.
There are still other compute expenses but those are much more limited.
These mechanisms don’t (at least to my knowledge) exist in PeerTube yet, but would be possible.
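
The transcode-versus-upload trade-off can be sketched as a back-of-the-envelope model. Everything in the example call (file size, uplink speed, compression ratio, transcoding time) is a made-up illustrative number, not a measurement, and the function names are hypothetical.

```python
def upload_seconds(size_mb, uplink_mbit_s):
    """Time to push `size_mb` megabytes through an uplink of `uplink_mbit_s` Mbit/s."""
    return size_mb * 8 / uplink_mbit_s


def client_side_seconds(raw_mb, compression_ratio, transcode_s, uplink_mbit_s):
    """Transcode locally first, then upload the smaller file."""
    return transcode_s + upload_seconds(raw_mb / compression_ratio, uplink_mbit_s)


# Illustrative assumptions only: 1000 MB raw file, 10 Mbit/s uplink,
# 4x smaller after transcoding, 300 s of in-browser transcoding.
raw = upload_seconds(1000, 10)                    # upload the raw file as-is
local = client_side_seconds(1000, 4, 300, 10)     # compress in the browser first
print(f"raw upload: {raw:.0f} s, transcode + upload: {local:.0f} s")
```

Under these assumed numbers the client-side route is faster as long as transcoding takes less than the upload time it saves; on a fast uplink or a slow device the balance can tip the other way.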

The actually expensive part is always the networking: it is one of the few things that actually gets more expensive at scale due to the complexity explosion, rather than cheaper (e.g. dedicated transcoding hardware drops in price per user since you have higher utilization).
Networking quickly runs into bottlenecks where you have to account for all the covariances between datasets in the network.
Basically, to increase the amount of e.g. storage available, everything in the network needs to be scaled up (from the local machine’s connections, over the cables and switches, up to routers and outgoing connections): because you increase the density at one point, you have to grow the network everywhere.
That’s why networking dwarfs everything: you just get crushed by networking being the bottleneck between your increasingly dense devices.

The trick behind PeerTube is that this effect is not as extreme, due to

  1. federation (certain connections just aren’t dense due to the overall network topology being distributed)
  2. torrents

The latter is the important part: instead of having network cost rise (super)linearly with the number of users, you have it rise linearly with the number of simultaneous unique videos.
This is a much smaller number, which means you do not need to compete in that space, which is the dominant cost factor. (If you have a method where one user can retain a video and share it without actively watching that same video, you can probably get real-world sublinear scaling.)
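
As a toy illustration of that scaling difference (a heavily idealised sketch: it assumes peers in a swarm cover all redistribution, which real torrents only approximate, and the function names and the example audience are invented):

```python
def origin_streams_client_server(watching):
    """Classic hosting: every viewer pulls their own stream from the server."""
    return len(watching)


def origin_streams_swarm(watching):
    """Idealised torrent-style delivery: the server seeds each distinct video
    once and the viewers of that video exchange the pieces among themselves."""
    return len(set(watching))


# Hypothetical snapshot: 1000 simultaneous viewers spread over 3 videos.
watching = ["video_a"] * 900 + ["video_b"] * 90 + ["video_c"] * 10

print(origin_streams_client_server(watching))  # 1000
print(origin_streams_swarm(watching))          # 3
```

Doubling the audience of `video_a` doubles the first number but leaves the second unchanged, which is the point being made above.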

Mind you, the costs involved here are still large, but not insurmountably large, especially considering that no single organisation would have to pay for the entire thing and it’s not an upfront expense. Fundamentally, though, the system is built such that it won’t be crushed as users flood into the network.