This page moves from practical explanation toward a more Wikipedia-like view of the moving parts inside modern codecs.
Spatial redundancy: nearby pixels are similar.
Temporal redundancy: nearby frames are similar.
Perceptual limits: humans do not notice every detail equally.
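The first two redundancies can be made concrete with a tiny numeric sketch. This toy example (all pixel values invented) measures how well a pixel is predicted by its left neighbour within a row (spatial) and by the same pixel one frame earlier (temporal):

```python
# Toy illustration of spatial and temporal redundancy on synthetic data.
# The values are invented; real frames behave similarly on average.

# A "scanline" whose pixels drift slowly: spatial redundancy.
row = [100, 102, 103, 103, 105, 106, 108, 109]

# The "same" scanline one frame later, changed only slightly: temporal redundancy.
row_next = [101, 102, 104, 104, 105, 107, 108, 110]

def mean_abs_diff(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Spatial: each pixel vs. its left neighbour.
spatial = mean_abs_diff(row[1:], row[:-1])

# Temporal: each pixel vs. the same pixel in the previous frame.
temporal = mean_abs_diff(row_next, row)

print(spatial, temporal)  # both tiny next to the 0-255 pixel range
```

Both averages come out near 1, while the raw samples span 0 to 255. That gap is the headroom every compression tool on this page exploits.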
This is where the site stops being mostly buying advice and starts becoming a technical reference: compression tools, trade-offs, and structure.
Video: prediction, transforms, GOPs, entropy coding.
Audio: psychoacoustics, masking, bit allocation.
Delivery: rate control, profiles, decoding limits.
Modern codecs differ in details, but many share the same broad pipeline: predict, transform, quantize, and entropy-code the remaining information.
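The four stages can be sketched end to end on a 1-D toy signal. Everything here is deliberately simplified: real codecs work on 2-D blocks, use DCT-like transforms, and use arithmetic coding rather than this crude bit-cost model.

```python
# Minimal sketch of the shared pipeline: predict, transform, quantize,
# entropy-code. All numbers and the cost model are invented for illustration.

signal = [50, 52, 51, 53, 80, 81, 80, 82]

# 1. Predict: each sample from its predecessor; keep only the residual.
residual = [signal[0]] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]

# 2. Transform: identity here (a real codec would apply a DCT to decorrelate).
coeffs = residual

# 3. Quantize: coarser steps shrink the values to code, at a quality cost.
step = 2
quantized = [round(c / step) for c in coeffs]

# 4. Entropy-code: stand in with a crude cost model where small values are cheap.
bits = sum(1 + abs(q).bit_length() for q in quantized)

# Decoder side: invert each stage. Note this toy predicts from the original
# samples, so quantization error accumulates slightly; real encoders predict
# from previously *reconstructed* samples to stop that drift.
dequant = [q * step for q in quantized]
recon = []
for d in dequant:
    recon.append(d if not recon else recon[-1] + d)
print(recon)  # close to `signal`, not identical: the loss came from step 3
```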
Instead of coding every pixel directly, codecs often predict a block from nearby pixels in the same frame. Only the prediction error needs to be stored.
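A hedged sketch of one such intra mode: "vertical" prediction, where each row of a 4x4 block is predicted by copying the reconstructed pixels directly above it (H.264/HEVC-style codecs choose among several modes like this). The pixel values below are invented.

```python
# Intra (spatial) prediction sketch: predict a 4x4 block from the row above.

above = [60, 62, 64, 66]            # reconstructed pixels above the block

block = [
    [61, 62, 63, 67],
    [60, 63, 65, 66],
    [59, 62, 64, 68],
    [61, 61, 64, 66],
]

# Prediction: every row is a copy of `above`; store only the residual.
residual = [[block[r][c] - above[c] for c in range(4)] for r in range(4)]

# Decoder side: prediction + residual reproduces the block exactly
# (this toy skips quantization, so the round trip is lossless).
recon = [[above[c] + residual[r][c] for c in range(4)] for r in range(4)]

print(residual)  # small numbers: much cheaper to code than raw pixels
```

The residual values are all within a couple of units of zero, which is exactly the property the later quantization and entropy-coding stages rely on.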
Video frames are usually similar over time. A codec can point to a block in a previous or future frame and store a motion vector plus the remaining difference.
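The motion-vector idea can be shown with a toy 1-D search (real codecs match 2-D blocks, often at sub-pixel precision, over much larger windows). A block in the current frame is compared against shifted positions in the previous frame; the best shift becomes the motion vector. The sample values are invented.

```python
# Toy 1-D motion estimation: find the shift that best predicts a block.

prev_frame = [10, 10, 10, 50, 60, 70, 80, 10, 10, 10]
curr_frame = [10, 10, 10, 10, 50, 60, 70, 80, 10, 10]  # object moved right by 1

def sad(a, b):
    """Sum of absolute differences: the usual block-matching cost."""
    return sum(abs(x - y) for x, y in zip(a, b))

block_start, block_len = 4, 4
block = curr_frame[block_start:block_start + block_len]

best_mv, best_cost = 0, float("inf")
for mv in range(-2, 3):                     # small search window
    ref_start = block_start + mv
    if 0 <= ref_start and ref_start + block_len <= len(prev_frame):
        cost = sad(block, prev_frame[ref_start:ref_start + block_len])
        if cost < best_cost:
            best_mv, best_cost = mv, cost

print(best_mv, best_cost)  # mv -1: the block came from one sample to the left
```

Here the match is perfect, so the residual is zero and only the motion vector needs to be stored; in real footage the residual is small but rarely zero.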
Once a prediction is made, the codec stores what is left over. Good prediction means a smaller residual and fewer bits spent.
When you see a technical page or spec, scan for these questions: