[TAG] 2 cent tip: reading Freelang dictionaries

Ben Okopnik ben at linuxgazette.net
Sun Sep 5 22:28:53 MSD 2010


On Sun, Sep 05, 2010 at 06:55:24PM +0100, Jimmy O'Regan wrote:
> 
> print $out "00-dummy-entry\n   For dictfmt\n\n";
> 
> here will get rid of the second bug I had

OK, so the "improved" version looks like this (I was trying to remember
what in Perl handles C strings... 'pack/unpack', of course):

```
#!/usr/bin/perl -w
# Created by Ben Okopnik on Sun Sep  5 12:11:02 EDT 2010
use strict;

die "Usage: ", $0 =~ /([^\/]+)$/, " <dict_file> [encoding]\n"
    unless @ARGV;

use open IN => ":encoding(" . (defined $ARGV[1]?$ARGV[1]:'utf8') . ")",
    OUT => ":utf8";

(my $dct = $ARGV[0]) =~ s/\.wb$//;
$dct =~ tr/_ A-Z/-_a-z/;
open my $in, $ARGV[0] or die "$ARGV[0]: $!\n";
open my $out, "|/usr/bin/dictfmt -f --utf8 $dct" or die "Pipe failure: $!\n";

my $src;
print $out "00-dummy-entry\n\tFor dictfmt\n\n";
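# Unpack each 84-byte record into two C strings and pipe it to dictfmt
printf $out "%s\n\t%s\n\n", unpack("Z31 Z53", $src) while read $in, $src, 84;
close $in;
# Close the pipe so dictfmt finishes writing $dct.dict before dictzip runs
close $out or die "dictfmt failure: $! $?\n";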

system ('dictzip', "$dct.dict");
print <<"+EOT+"

database $dct.dict.dz
{
    data  /usr/share/dictd/$dct.dict.dz
    index /usr/share/dictd/$dct.index
}
+EOT+
```

The amusing part is the amount of work done by that "printf" line. Real
workhorse, that thing. :)
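
For anyone who wants to see what that line does to a single record, here is a minimal sketch. It assumes each 84-byte record is a 31-byte headword field followed by a 53-byte definition field, both NUL-padded; the sample word and definition are made up for illustration and do not come from a real Freelang file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical 84-byte record: 31 bytes of headword, then 53 bytes of
# definition, each padded with NULs like a C string (assumed layout).
my $record = pack("a31 a53", "bonjour", "hello; good day");

# "Z31 Z53" takes each fixed-width field and stops at the first NUL,
# so the padding is dropped.
my ($word, $definition) = unpack("Z31 Z53", $record);

# Same layout the script builds for 'dictfmt -f': headword, a
# tab-indented definition, then a blank line.
my $entry = sprintf "%s\n\t%s\n\n", $word, $definition;
print $entry;
```

So 'read' grabs one record, 'unpack' splits and trims it, and the format string arranges the result for dictfmt, all in one statement.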


-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *


