Overfitted a 900KB Transformer to Compress a 100MB CSV into 7MB

I built an experiment that uses an overfitted transformer and arithmetic coding to compress individual files.

Instead of training the model to generalize, I train a 900KB transformer to memorize a single file and predict the next byte. Those predictions are fed into an arithmetic coder to produce the compressed output.
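To make the predict-then-code loop concrete, here is a minimal sketch of arithmetic coding driven by a next-byte predictor. It is not the code from the repo: the AdaptiveByteModel class is a hypothetical stand-in that only counts byte frequencies (the post uses a transformer here), and the coder uses exact fractions for clarity, where a practical coder would use fixed-precision integers. The names encode, decode and AdaptiveByteModel are illustrative assumptions.

```python
import math
from fractions import Fraction


class AdaptiveByteModel:
    """Hypothetical stand-in predictor: adaptive byte counts, NOT a transformer."""

    def __init__(self):
        self.counts = [1] * 256  # start at 1 so no byte ever has zero probability
        self.total = 256

    def update(self, byte):
        self.counts[byte] += 1
        self.total += 1


def encode(data: bytes):
    """Narrow [low, low + width) once per byte, then pick a short number inside it."""
    model = AdaptiveByteModel()
    low, width = Fraction(0), Fraction(1)
    for b in data:
        cum = sum(model.counts[:b])  # probability mass of all bytes below b
        low += width * Fraction(cum, model.total)
        width *= Fraction(model.counts[b], model.total)
        model.update(b)
    k = 0
    while True:  # smallest k such that some n / 2**k lies inside the final interval
        n = math.ceil(low * (1 << k))
        if Fraction(n, 1 << k) < low + width:
            return n, k  # the compressed message is n written with k bits
        k += 1


def decode(n: int, k: int, length: int) -> bytes:
    """Replay the same model updates to recover the bytes from n / 2**k."""
    model = AdaptiveByteModel()
    value = Fraction(n, 1 << k)
    low, width = Fraction(0), Fraction(1)
    out = bytearray()
    for _ in range(length):
        scaled = (value - low) / width * model.total  # position in count units
        cum = 0
        for b in range(256):  # find the byte whose sub-interval contains the value
            if cum + model.counts[b] > scaled:
                break
            cum += model.counts[b]
        low += width * Fraction(cum, model.total)
        width *= Fraction(model.counts[b], model.total)
        out.append(b)
        model.update(b)
    return bytes(out)


sample = b"abracadabra " * 20
code, nbits = encode(sample)
assert decode(code, nbits, len(sample)) == sample
print(f"{len(sample) * 8} bits raw -> {nbits} bits coded")
```

The sharper the predictor, the wider the interval left after each byte and the fewer bits needed to name a point inside it, which is why memorizing the file with a stronger model should pay off. The decoder has to reproduce the encoder's predictions exactly, so in this sketch it rebuilds the same counts as it goes.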

On a 100MB NYC taxi CSV, it compresses to about 7MB (~0.5 bits/byte). On a 100MB slice of enwik9, it compresses to about 21MB (~1.68 bits/byte).
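For reference, the bits/byte figures follow directly from the sizes. A quick check, assuming decimal megabytes and the rounded sizes quoted above (bits_per_byte is a hypothetical helper name):

```python
def bits_per_byte(compressed_bytes: int, original_bytes: int) -> float:
    """Average number of output bits spent on each input byte."""
    return 8 * compressed_bytes / original_bytes


MB = 1_000_000  # assumption: decimal megabytes

taxi = bits_per_byte(7 * MB, 100 * MB)     # NYC taxi CSV, rounded size from the post
enwik = bits_per_byte(21 * MB, 100 * MB)   # enwik9 slice, rounded size from the post
print(f"taxi: {taxi:.2f} bits/byte, enwik9 slice: {enwik:.2f} bits/byte")
```

With the rounded 7MB figure the CSV works out to about 0.56 bits/byte, so the quoted ~0.5 presumably reflects a slightly smaller exact size or looser rounding.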

It's pretty slow right now (roughly 20–30 minutes of training and 45 minutes each for compression and decompression on my AMD 7800XT).

Check out the repo: https://github.com/samyak112/pym-particles

8 points | by spidy__ 1 day ago

3 comments

  • purple-leafy 12 hours ago
    That’s so awesome! I want to try something similar. I’ve been going crazy with compression work. I reckon I can beat that prize link
  • 7373737373 1 day ago
    What does it compress the full 1GB file to? http://prize.hutter1.net/
    • spidy__ 1 day ago
      I tried it on a 100MB slice of enwik9 and was able to compress it to 20MB, plus the 900KB transformer, so about 21MB.

      I know the top submission was able to get it to 13MB.

      Still trying some ideas to get better compression.

  • xunevega 21 hours ago
    [flagged]