[TAG] compressed issues of LG
Ben Okopnik
ben at linuxgazette.net
Fri Nov 2 16:02:47 MSK 2007
On Fri, Nov 02, 2007 at 02:13:10PM +1100, Minh Nguyen wrote:
> So far, issues of LG have been compressed using tar and gzip. Is there
> any intention to use tar with bzip2 for future issues? Since most of
> the files in each issue are text files, bzip2 is more efficient (in
> terms of the size of the compressed file) than gzip. Here is a
> comparison of bzip2 and gzip using the current issue; i.e. November
> 2007 (#144):
>
> 1028042 lg-144.tar.bz2
> 1045337 lg-144.tar.gz
>
> IMHO, providing a bzip2 compressed format of LG issues would save some
> download time.
As I recall, we had a similar discussion here in TAG quite a while back
(digging through my 'Sent_mail' says 2002 - but I can't find it in LG.
Annoying, that.) In any case, here's the comparison that I ran then:
``
OK, I'm the curious type... Here's a bunch of files from many walks of
life; let's see who does what.
-rw-r--r-- 1 ben ben 1474560 May 20 05:51 test.bin
-rw-rw-r-- 1 ben ben 102970 Sep 19 2000 test.bmp
-rw-rw-r-- 1 ben ben 121880 Sep 19 2000 test.gif
-rw-rw---- 1 ben ben 939783 Jun 17 15:29 test.jpg
-rw-r--r-- 1 ben ben 1727320 Oct 6 15:51 test.mov
-rw-r--r-- 1 ben ben 1048576 Oct 16 20:59 test.nulls
-rw-r--r-- 1 ben ben 1048576 Oct 16 21:03 test.ones
-rw-r--r-- 1 ben ben 490765 Sep 1 2001 test.pbm
-rw-r--r-- 1 ben ben 197029 Oct 12 13:53 test.ps
-rw-rw-r-- 1 ben ben 1995119 May 29 2001 test.txt
-rw-r--r-- 1 ben ben 36354922 Oct 16 20:29 test.wav
# So then, I was like, "Dude, check out some of *this* stuff:"
rar a ../rar.rar * # Very slow
zip ../zip.zip *
tar czf ../tgz.tgz * # Uses gzip as compressor
tar cjf ../tbz2.tbz2 * # Uses bzip2 as compressor, slowest of all
tar cf - * | compress > ../Z.Z # Uses compress (LZW) as compressor
# And the winnah and champeen is...
-rw-r--r-- 1 ben ben 26653542 Oct 16 21:09 rar.rar
-rw-r--r-- 1 ben ben 33171830 Oct 16 21:26 tbz2.tbz2
-rw-r--r-- 1 ben ben 36128937 Oct 16 21:10 zip.zip
-rw-r--r-- 1 ben ben 36132733 Oct 16 21:14 tgz.tgz
-rw-r--r-- 1 ben ben 43458125 Oct 16 21:21 Z.Z
I'll be darned. Looks like "rar" is it. Whodathunk?
''
Unfortunately, the only method that shows appreciable savings in size
- 'rar', that is - uses a proprietary algorithm.
Given that there's no appreciable gain to be had by switching to bzip2
(going by your numbers, 17295 bytes on a roughly 1MB archive) - and that
a change may cause problems (e.g., it would break any automated
scripts that download and decompress the monthly archives), I don't see
it changing any time soon. I'm usually pretty reluctant to change things
like this without a really compelling reason.
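
For anyone who wants to put the sizes quoted in this thread on a common
scale, here is a rough sketch that turns two byte counts into a percentage
saved. The function name 'savings' is made up for illustration and is not
part of any LG tooling; the byte counts are the ones listed above.

```shell
# savings OLD NEW: print the percentage by which NEW is smaller than OLD.
# 'savings' is a hypothetical helper name, used here only for illustration.
savings() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (old - new) * 100 / old }'
}

lg144=$(savings 1045337 1028042)    # lg-144: tar.gz vs. tar.bz2
mixed=$(savings 36132733 33171830)  # 2002 test set: tgz vs. tbz2
rar=$(savings 36132733 26653542)    # 2002 test set: tgz vs. rar

echo "lg-144, bzip2 over gzip: ${lg144}%"
echo "2002 set, bzip2 over gzip: ${mixed}%"
echo "2002 set, rar over gzip: ${rar}%"
```

The 2002 set is dominated by a single 36MB WAV file, so its percentages say
more about that one file than about a mostly-text archive like an LG issue.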
--
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *