A 4 min 1080p 30 fps video taken with my phone camera is 518 MB, while a 12 min 1080p 30 fps video ripped from YouTube is 341 MB. Both are MP4 files using H.264 as the codec, and the YouTube one doesn’t look lower quality, so why the big difference?

I don’t know if this is in fact the case for you, but often codecs can provide better compression if they can spend more CPU time trying to find an optimal encoding.

Cameras have to do real-time encoding on a limited-power device. YouTube doesn’t have those constraints and may spend more computation time on encoding.

Here’s the FFmpeg documentation for x264, an open source h.264 encoder: https://trac.ffmpeg.org/wiki/Encode/H.264

So the encoding strategy is tunable within a single encoder, and there are also different encoder implementations that may perform differently. They all produce an H.264 video stream that’s decodable by any standard player.
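To make "tunable" a bit more concrete, here is a small sketch that only builds (and does not run) two FFmpeg command lines for libx264. The `-preset` and `-crf` options are the ones described on the wiki page linked above; the file names and the helper `x264_command` are made up for illustration.

```python
import shlex

def x264_command(src, dst, preset="medium", crf=23):
    """Build (but do not run) an FFmpeg command line for libx264."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",    # use the x264 encoder for video
        "-preset", preset,    # speed vs. compression-efficiency knob
        "-crf", str(crf),     # constant-quality target (lower = better)
        "-c:a", "copy",       # leave the audio stream untouched
        dst,
    ]

# Same quality target, very different amounts of CPU time spent searching.
fast = x264_command("phone.mp4", "fast.mp4", preset="ultrafast")
slow = x264_command("phone.mp4", "slow.mp4", preset="veryslow")
print(shlex.join(fast))
print(shlex.join(slow))
```

At the same CRF, the slower preset should generally give a smaller file at similar quality, at the price of a much longer encode. How much smaller depends heavily on the footage.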

Dept

Compression. YouTube videos usually have low bitrate too.
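To put rough numbers on that, here is a quick calculation using the sizes and durations from the question. It assumes 1 MB = 10^6 bytes and counts the whole file (audio and container overhead included), so treat the results as approximate.

```python
def average_bitrate_mbps(size_mb, duration_min):
    """Average bitrate in Mbit/s, taking 1 MB = 10**6 bytes (an assumption)."""
    bits = size_mb * 1_000_000 * 8
    seconds = duration_min * 60
    return bits / seconds / 1_000_000

phone = average_bitrate_mbps(518, 4)      # phone recording from the question
youtube = average_bitrate_mbps(341, 12)   # YouTube rip from the question
print(f"phone:   {phone:.1f} Mbit/s")
print(f"youtube: {youtube:.1f} Mbit/s")
print(f"ratio:   {phone / youtube:.1f}x")
```

So the phone file averages roughly 17 Mbit/s while the YouTube one is under 4 Mbit/s, about a 4.5x difference.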

@sarmale@lemmy.zip
creator

Well yes, they have a lower bitrate, but why do they look the same quality as a video with a higher bitrate? Same resolution, codec, frame rate…

It’s the individual frames that are compressed. Essentially, the encoder compares each frame with the frames around it and drops detail that can be predicted from them. So if the top of the video, for example the sky, doesn’t change, that part is stored once and kept static.
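A toy sketch of the "static sky" idea: store the first frame whole, then only the pixels that changed. This is a lossless simplification with made-up 6-pixel frames; real H.264 works on blocks with motion prediction and lossy transforms, so take it only as an illustration of the principle.

```python
def encode(frames):
    """Keep the first frame whole, then only (index, value) pairs that changed."""
    encoded = [("key", list(frames[0]))]
    previous = frames[0]
    for frame in frames[1:]:
        changes = [(i, v) for i, (p, v) in enumerate(zip(previous, frame)) if p != v]
        encoded.append(("delta", changes))
        previous = frame
    return encoded

def decode(encoded):
    """Rebuild every frame by applying each delta to the previous frame."""
    frames = []
    current = []
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)   # copy so earlier frames stay intact
            for i, v in payload:
                current[i] = v
        frames.append(current)
    return frames

# Three tiny 6-pixel "frames": the first four pixels (the "sky") never change.
frames = [
    [9, 9, 9, 9, 1, 2],
    [9, 9, 9, 9, 2, 3],
    [9, 9, 9, 9, 3, 4],
]
encoded = encode(frames)
assert decode(encoded) == frames
print(encoded[1])  # only the two changed pixels are stored
```

The more of the picture stays still between frames, the less each delta has to store, which is one reason a talking-head video compresses far better than handheld footage.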

It’s less about properties of the video as a whole than about properties of each frame. I can take a 1080p image and blow it up to 8K in GIMP, but it still only has the detail of a 1080p image.

Helix 🧬

If you do multiple passes you can alleviate some of the downsides of low bitrates. A low bitrate is always easy to spot in dark areas. I despise watching space movies or shows on streaming services because of the resulting excessive banding artifacts.

Video encoding has several tradeoffs:

  • Bitrate
  • Resolution/frame rate
  • Perceived quality
  • Computational complexity of encoding
  • Computational complexity of decoding

The video encoding chips in cell phones make sacrifices to keep encoding fast and preserve battery life (higher computational complexity costs more processing cycles and tends to use more power). So the encoding is simpler, in exchange for less efficient compression at a given quality.

YouTube (and all the social media sites) have huge server farms with highly specialized encoding chips for making the videos more efficient with bitrate for quality. That makes sense because videos designed to be watched millions of times could benefit from even a very slight improvement of bitrate in exchange for a one-time cost of complex encoding. It’s also why YouTube tends not to convert to AV1 (very efficient in bitrate for quality, but computationally complex to encode) until a video has a few hundred views, because it’s not clear whether that tradeoff is worth it until they know a lot of people will be watching it.
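The "one-time cost versus per-view saving" reasoning can be sketched as a break-even calculation. The function name and the numbers below are hypothetical and in arbitrary cost units; they are not real YouTube figures.

```python
def break_even_views(extra_encode_cost, saving_per_view):
    """Smallest view count at which total savings cover the extra encode cost.

    Both arguments are positive integers in the same arbitrary cost unit.
    """
    return -(-extra_encode_cost // saving_per_view)  # ceiling division

# Hypothetical numbers, purely for illustration.
print(break_even_views(extra_encode_cost=1000, saving_per_view=3))
```

Below that view count the expensive encode is a net loss, which fits the idea of waiting to see whether a video gets watched before spending the extra compute.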

Netflix customizes even further on a per-video basis and looks for even more specialized tricks on a scene-by-scene basis, because every single one of its videos only needs to be encoded once for each quality/format but will be watched millions of times.

In other words, it’s like any other engineering problem. The engineers choose different tradeoffs based on context, which means that the cell phone applies a different set of tradeoffs compared to the social media site’s server farm.

Because you can compress video without reducing the resolution, codec, or frame rate. When a camera records two green pixels, it records (green pixel) (green pixel). When the video is compressed, that becomes (two green pixels), which takes up less storage space but retains the same information. Thorough compression is computationally expensive, which is why cameras typically only do a lighter version of it on the fly.
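The "(two green pixels)" idea is basically run-length encoding. Here is a minimal sketch of it; video codecs use far more elaborate techniques, so this only illustrates how the same information can take less space.

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical pixels into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(runs):
    """Expand (value, count) pairs back into the original pixel list."""
    return [value for value, count in runs for _ in range(count)]

pixels = ["green", "green", "green", "blue", "green"]
runs = rle_encode(pixels)
print(runs)
assert rle_decode(runs) == pixels  # nothing was lost
```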

HeartyBeast

Because they aren’t the same quality? Lossy compression is lossy.

Gormadt

A lot of the data in the video file you take isn’t that visible, but it’s there for when you put it in editing software.

For example: if you took that smaller YouTube rip and threw it into editing software to tinker with the brightness (as a simple example), you would quickly start getting artifacts, whereas the video you took would handle the same edit much better because there’s more data to work with.

Basically, the YouTube video has had all the extra data it doesn’t need stripped out, because it’s not going to be edited, only viewed.
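A toy illustration of why brightening exposes the missing data. It assumes a very dark 16-value ramp and a made-up quantization step of 4; real codecs quantize transformed blocks rather than raw pixel values, so this is only a sketch of the effect.

```python
def quantize(values, step):
    """Lossy step: snap each value down to a multiple of `step`."""
    return [(v // step) * step for v in values]

def brighten(values, gain):
    """Multiply each value by `gain`, clipping to the 8-bit range."""
    return [min(255, v * gain) for v in values]

def largest_jump(values):
    """Biggest difference between neighbouring values (visible as a band edge)."""
    return max(b - a for a, b in zip(values, values[1:]))

dark_gradient = list(range(16))            # a smooth, very dark ramp
compressed = quantize(dark_gradient, 4)    # hypothetical coarse quantization

print(largest_jump(brighten(dark_gradient, 8)))  # steps in the original
print(largest_jump(brighten(compressed, 8)))     # steps after quantization
```

Before brightening, the coarse steps sit in near-black and are hard to see. After the edit they are four times larger than in the original, which is where the visible banding comes from.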

Resolution and quality are different things, especially when different codecs are used for data compression. An AVI I record for screen capture might hit 250 MB; running that through ffmpeg to get an MP4 might give 20 MB, and if I drop the quality but keep the same resolution, sometimes 7 MB.
