Important File Formats in 2020: What Every Creator, Developer, and Data Scientist Should Know

TL;DR – 2020 was the year file formats got leaner, smarter, and more open. Mobile‑first traffic, 5G, and cloud‑based collaboration pushed new compression formats and codecs (WebP, AVIF, AV1) and columnar data formats (Parquet, ORC). PDFs stayed king for static documents, while Markdown, JSON, and ONNX became the lingua franca for developers and AI pipelines.


Introduction

If you were still using the same file types you learned in 2010, 2020 probably felt like a seismic shift. Roughly half of global web traffic came from smartphones, 5G rollouts promised high‑resolution streaming on the move, and cloud suites turned “live‑edit” into a default workflow. All that pressure pushed the industry toward formats that are smaller, faster, and more interoperable. Below is a quick‑fire tour of the formats that defined the year, why they mattered, and where you’ll likely see them again in 2021‑24.


1. Document & Text Formats – From PDFs to Markdown

| Format | 2020 Status | Why It Mattered | Typical Use‑Cases |
|---|---|---|---|
| PDF (ISO 32000‑2 / PDF 2.0) | Still the de‑facto standard for printable, static docs. | Better accessibility, digital signatures, and support for embedded 3‑D, video, and interactive forms. | Contracts, e‑invoices, government forms, e‑books. |
| DOCX / ODT | DOCX dominates corporate environments; ODT holds ~5 % market share. | DOCX (Office Open XML) is a ZIP container of XML + media, enabling granular change‑tracking and macro‑free security. ODT is royalty‑free and favoured by open‑source suites. | Word processing, collaborative editing (OneDrive, Nextcloud). |
| EPUB 3.2 | 12 % rise in e‑book sales; EPUB 3.2 became the recommended standard. | Re‑uses HTML5, CSS3, SVG; supports audio, video, MathML; DRM‑agnostic. | E‑books, digital textbooks, interactive publications. |
| Markdown (.md) | Explosive growth in developer docs, static site generators (Jekyll, Hugo). | Plain‑text, human‑readable, easy conversion to HTML/PDF; extensible via GitHub‑Flavored Markdown (GFM). | README files, blogs, technical documentation. |
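
To make the "ZIP container of XML + media" point concrete, here is a minimal sketch in Python. It does not produce a valid Word file; it only builds a toy archive whose part names mirror the usual DOCX layout and then lists them. The helper names build_toy_docx and list_parts are made up for this illustration.

```python
import io
import zipfile


def build_toy_docx() -> bytes:
    """Build a tiny ZIP archive laid out like a DOCX package.

    Assumption: this is an illustration of the container layout only,
    not a document that Word would open.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document><body>Hello</body></document>")
        package.writestr("word/media/image1.png", b"placeholder image bytes")
    return buffer.getvalue()


def list_parts(package_bytes: bytes) -> list[str]:
    """Return the names of the parts stored inside a ZIP-based package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        return package.namelist()


toy_package = build_toy_docx()
parts = list_parts(toy_package)
for name in parts:
    print(name)
```

Pointing list_parts at the bytes of a real .docx (or .odt or .epub, which are also ZIP‑based) should show a similar tree of XML and media parts.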

Live‑edit formats (Google Docs, Office Online) still live as proprietary JSON blobs in the cloud, but they all export to PDF/DOCX for long‑term archiving.

Quick tip

If you need a document that will survive a decade of software changes, export to PDF 2.0. For collaborative writing, keep the source in Google Docs or Office Online, then archive the final version as PDF or DOCX.


2. Image, Video & Audio – The Compression Arms Race

Images

| Format | 2020 Relevance | Key Advantages |
|---|---|---|
| JPEG | > 80 % of web images. | Baseline lossy DCT compression, universal support. |
| PNG | Preferred for lossless UI assets. | Deflate compression, alpha channel, no patents. |
| WebP | Usage up ~30 % YoY; Safari 14 added support in 2020, completing major‑browser coverage. | Lossy WebP is 25‑34 % smaller than JPEG at comparable quality (Google's figures); supports animation & transparency. |
| HEIF/HEIC | Adopted by iOS 11+ and Android 9+. | Up to 50 % size reduction vs. JPEG; based on HEVC intra‑frame coding. |
| AVIF (emerging) | Chrome 85 shipped support in 2020; Firefox offered it behind a preference flag. | AV1‑based; typically smaller than WebP at comparable quality; HDR ready. |

Takeaway: The web is moving toward royalty‑free, web‑optimized formats—WebP is now mainstream, and AVIF is poised to replace JPEG for high‑quality, low‑bandwidth images.
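
With several image formats in circulation, file extensions are not always trustworthy. The sketch below guesses a format from the first bytes of a file. The signatures are the well‑known ones, but the function name sniff_image_format is invented here, and the check is deliberately simplified: for AVIF and HEIF it looks only at the major brand in the leading ftyp box, which will miss some valid files.

```python
def sniff_image_format(header: bytes) -> str:
    """Guess an image format from the first 12 bytes of a file.

    Simplified sketch: ISO-BMFF files (AVIF, HEIF) are classified by
    their major brand only; compatible brands are ignored.
    """
    if header.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    # WebP is a RIFF file whose form type (bytes 8..11) is "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    # AVIF and HEIF start with a box: 4-byte size, then "ftyp", then the brand.
    if header[4:8] == b"ftyp":
        brand = header[8:12]
        if brand in (b"avif", b"avis"):
            return "avif"
        if brand in (b"heic", b"heix", b"mif1"):
            return "heif"
    return "unknown"


samples = {
    "jpeg": b"\xff\xd8\xff\xe0" + b"\x00" * 8,
    "png": b"\x89PNG\r\n\x1a\n" + b"\x00" * 4,
    "webp": b"RIFF" + b"\x00" * 4 + b"WEBP",
    "avif": b"\x00\x00\x00\x1c" + b"ftyp" + b"avif",
}
for expected, data in samples.items():
    print(expected, "->", sniff_image_format(data))
```

In practice you would read the first 12 bytes with open(path, "rb").read(12) and pass them in.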

Video & Animation

| Format | 2020 Landscape | Highlights |
|---|---|---|
| MP4 (MPEG‑4 Part 14, built on the ISO Base Media File Format) | ≈ 95 % of streaming deliveries. | Supports H.264/AVC, H.265/HEVC, AAC; works with DASH & HLS. |
| MKV (Matroska) | Gaining traction for 4K/HDR content. | Unlimited tracks, subtitles, chapters; no licensing fees. |
| WebM | Widely used for HTML5 <video> in Chrome/Firefox. | VP8/VP9 video + Vorbis/Opus audio, royalty‑free, low‑bitrate streaming. |
| AV1 (inside .mkv/.mp4/.webm) | YouTube expands AV1 delivery; Netflix starts AV1 streams on Android. | Often cited as roughly 30 % more efficient than HEVC; designed to be royalty‑free. |
| HEVC (H.265) | Still dominant for 4K/UHD Blu‑ray and some OTT services. | Up to 50 % bitrate reduction vs. H.264; licensing complexity limits web use. |

Real‑world example: Netflix began streaming AV1‑encoded titles to its Android app in February 2020, reporting about 20 % better compression efficiency than its VP9 encodes.
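
To see what a codec's percentage saving means in data volume, the arithmetic can be sketched as below. The 15 Mbps baseline is a hypothetical bitrate chosen for illustration, not a figure from any streaming service, and the function names are invented for this example.

```python
def gigabytes_per_hour(bitrate_mbps: float) -> float:
    """Data transferred in one hour at a constant bitrate, in decimal GB."""
    seconds_per_hour = 3600
    megabits = bitrate_mbps * seconds_per_hour
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


def bitrate_after_saving(bitrate_mbps: float, saving: float) -> float:
    """Bitrate needed for the same quality if a codec saves `saving` (0..1)."""
    return bitrate_mbps * (1 - saving)


baseline_mbps = 15.0  # hypothetical 4K stream, for illustration only
for saving in (0.0, 0.3, 0.5):
    reduced = bitrate_after_saving(baseline_mbps, saving)
    print(f"saving {saving:.0%}: {reduced:.1f} Mbps, "
          f"{gigabytes_per_hour(reduced):.2f} GB per hour")
```

Multiplied across millions of viewing hours, even the lower end of that range is a large bandwidth bill.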

Audio

| Format | 2020 Position | Core Points |
|---|---|---|
| MP3 | > 70 % of consumer audio libraries (legacy). | 128‑320 kbps, universal hardware support. |
| AAC | Preferred for on‑demand streaming (Apple Music, YouTube). | Better quality than MP3 at the same bitrate. |
| Opus | Rapid adoption in WebRTC, Discord, podcasts. | Low‑latency, 6‑510 kbps variable bitrate; excels at speech & music. |
| FLAC | + 15 % YoY growth in high‑resolution audio market. | Lossless, open source, rich metadata. |
| ALAC | Niche, tied to Apple ecosystem. | Lossless, with compression comparable to FLAC; stored in an .m4a container. |

Bottom line: Opus is the go‑to for real‑time communication, AAC for streaming music, and FLAC/ALAC for archival‑grade audio.


3. Data & Interchange – From CSV to Columnar Lakes

| Format | Why It Matters in 2020 | Typical Scenarios |
|---|---|---|
| CSV | Still the simplest data‑exchange format; > 50 % of imports/exports. | Spreadsheet dumps, quick ETL jobs. |
| JSON | Dominates public web APIs (≈ 85 %). | RESTful services, config files, NoSQL (MongoDB). |
| XML | Declining for new APIs but entrenched in enterprise (SOAP, Office Open XML). | Legacy systems, industry standards (HL7, XBRL). |
| Parquet | Columnar storage for big‑data; 30 % size reduction vs. CSV. | Data lakes, Spark/Hive analytics pipelines. |
| ORC | Competes with Parquet; favoured by Hive/Presto. | Large‑scale batch processing. |
| Avro | Schema‑evolution friendly; used with Kafka. | Real‑time streaming, event sourcing. |
| Protocol Buffers | Compact binary format for gRPC. | High‑performance microservices. |
| GeoJSON | Standard for GIS data on the web. | Mapping apps, location‑based services. |

Key concepts to remember

  • Schema evolution – Avro and Parquet let you add fields without breaking downstream jobs.
  • Self‑describing vs. binary – JSON/XML are human‑readable text; Protobuf and Avro are compact binary encodings that need a schema to decode (Avro container files embed theirs).
  • Columnar layout – Great for analytical queries because only the needed columns are read from disk.
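
The concepts above can be sketched with nothing but the standard library. This is not how Parquet or Protobuf are implemented; it is a toy illustration of the two ideas, using made‑up records and a made‑up fixed binary layout. It assumes every record has the same keys.

```python
import json
import struct

# Hypothetical records in row layout (one dict per row).
rows = [
    {"user_id": 1, "country": "DE", "spend": 12.5},
    {"user_id": 2, "country": "US", "spend": 7.0},
    {"user_id": 3, "country": "DE", "spend": 30.25},
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Columnar layout: an aggregate touches only the column it needs.
columns = to_columns(rows)
total_spend = sum(columns["spend"])

# Self-describing text: field names travel with every record.
as_json = json.dumps(rows[0]).encode("utf-8")

# Schema-dependent binary: the layout "<I2sd" (uint32, 2 bytes, float64)
# is the schema, and the reader must know it to decode the bytes.
RECORD = struct.Struct("<I2sd")
as_binary = RECORD.pack(1, b"DE", 12.5)
decoded = RECORD.unpack(as_binary)

print("total spend:", total_spend)
print("JSON bytes:", len(as_json), "binary bytes:", len(as_binary))
print("decoded with schema:", decoded)
```

The binary record is much smaller, but without the layout string it is just opaque bytes, which is the trade‑off the second bullet describes.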

Pro tip: When building a data lake, store the raw ingest as Parquet (or ORC) and keep a JSON copy for quick inspection.


4. Emerging & Niche Formats Worth Watching

| Format | 2020 Highlight |
|---|---|
| ONNX | > 30 % of new deep‑learning models exported in 2020; enables cross‑framework portability. |
| Brotli (.br) | 70 % of Chrome traffic compressed with Brotli for HTML/CSS/JS. |
| SVG | Full browser support; the go‑to for responsive icons and data visualizations. |
| glTF/GLB | “JPEG of 3‑D”; gaining traction for web‑based AR/VR (Sketchfab, Babylon.js). |
| Zstandard (zstd) | Fast, high‑ratio compression; adopted for container images and Linux kernel image compression (kernel 5.9). |
| HEVC‑based media (HEIF/HEIC photos, HEVC video in MP4) | Still patent‑encumbered, but dominate mobile photo capture and 4K video. |

These formats are not yet universal, but they’re the early‑adopter playgrounds where the next big standards will emerge.
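
To get a feel for why general‑purpose text compression matters for HTML/CSS/JS, here is a rough sketch. Neither Brotli nor Zstandard is assumed to be available in the standard library, so zlib (DEFLATE) and lzma are swapped in as stand‑ins; the ratios they give are not Brotli or zstd ratios. The sample markup is invented and far more repetitive than a real page.

```python
import lzma
import zlib

# Invented, highly repetitive markup standing in for a real page.
html = ('<li class="item"><a href="/post">Read more</a></li>\n' * 200).encode("utf-8")

deflated = zlib.compress(html, level=9)  # DEFLATE, the algorithm behind gzip
xz_packed = lzma.compress(html)          # LZMA, a slower, denser stand-in


def ratio(original: bytes, compressed: bytes) -> float:
    """Compressed size as a fraction of the original size."""
    return len(compressed) / len(original)


print(f"original: {len(html)} bytes")
print(f"zlib:     {len(deflated)} bytes ({ratio(html, deflated):.1%})")
print(f"lzma:     {len(xz_packed)} bytes ({ratio(html, xz_packed):.1%})")
```

Real pages compress less dramatically than this sample, so treat the output as a demonstration of the mechanism rather than a benchmark.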


5. Cross‑Category Trends

  1. Open and royalty‑free by default – WebP and its successor AVIF, plus AV1, Opus, Brotli, and Parquet.
  2. Compression efficiency – 30‑50 % size reductions are now a competitive advantage for mobile and streaming.
  3. Metadata & accessibility – PDF 2.0, EPUB 3.2, and HEIF add richer tags, captions, and colour profiles.
  4. Cross‑platform interoperability – Cloud‑native JSON blobs (Google Docs) export to universally readable formats.
  5. Security & provenance – Digital signatures (e.g. PAdES for PDF), AES‑encrypted ZIP archives, and signed JWTs are increasingly required for compliance.
  6. AI‑ready data – Columnar, schema‑evolving formats (Parquet, ORC) and model exchange (ONNX) are core to modern data‑science pipelines.

Conclusion

2020 forced the file‑format ecosystem to evolve from “just get the job done” to “do it efficiently, securely, and in a future‑proof way.” Mobile‑first consumption, 5G bandwidth, and cloud collaboration made size, speed, and openness the new holy trinity. Whether you’re a marketer exporting a PDF, a developer writing Markdown docs, a data engineer building a lakehouse, or a video producer streaming 4K, the formats you pick today will dictate how much you pay for bandwidth, how easy it is to collaborate, and whether your assets survive the next five years.

Bottom line: Embrace the royalty‑free, compression‑smart formats (WebP, AVIF, AV1, Parquet, Opus) for new work, but keep a reliable export path to the tried‑and‑true standards (PDF, JPEG, MP4, CSV) for archival and compatibility.
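
For images on the web, "new format first, proven fallback second" can be sketched as a small content‑negotiation helper. The function name pick_image_format and the preference order are assumptions made for this example, and the parsing is simplified: it ignores q‑values and wildcards such as image/*, which a production server should handle.

```python
def pick_image_format(accept_header: str, available=("avif", "webp", "jpeg")) -> str:
    """Pick the first available format the client explicitly accepts.

    Simplified sketch: q-values and wildcards are ignored, and JPEG is
    returned as the universal fallback.
    """
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    for fmt in available:
        if f"image/{fmt}" in accepted:
            return fmt
    return "jpeg"


examples = [
    "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
    "image/webp,*/*",
    "*/*",
]
for header in examples:
    print(f"{header!r} -> {pick_image_format(header)}")
```

A server would call this with the request's Accept header and serve the matching pre‑encoded file, so older clients still receive a JPEG.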


Tags: file-formats 2020-tech-trends digital-media

Slug: important-file-formats-2020