Perlメモ

岡崎氏のPerlメモのパクリ?と見えるようですが、そうでもないかもです。汗

ほとんどが、PyukiWikiの開発から培ったことです。

URLエンコード

$encoded =~ tr/ /+/;
$encoded =~ s/(\W)/'%' . unpack('H2', $1)/eg;

URLデコード

$s =~ tr/+/ /;
$s =~ s/%([A-Fa-f0-9][A-Fa-f0-9])/pack("C", hex($1))/eg;

もありますが

$s =~ tr/+/ /;
$s =~ s/%([A-Fa-f0-9][A-Fa-f0-9])/chr(hex($1))/eg;

のが早いかな?

URLの正規表現

普通のもの

qq(s?(?:https?|ftp|news)://[-_.!~*'a-zA-Z0-9;/?:@&=+$,%#]+);

file://スキーマ付

q(s?(?:(?:(?:https?|ftp|news)://)|(?:file:[/\x5c][/\x5c]))(?:[-\x5c_.!~*'a-zA-Z0-9;/?:@&=+$,%#]+));

PunyCode?(いいかえれば日本語ドメイン)付

  • UTF8

q{(\b(?:https?|ftp|news)://(?:(?:[-_.!~*'()a-zA-Z0-9;:&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*@)?(?:(?:(?:[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[a-zA-Z0-9])?|[-_0-9a-zA-Z\x80-\xfd](?:[-_0-9a-zA-Z\x80-\xfd]*[-_0-9a-zA-Z\x80-\xfd])?)\.)*[a-zA-Z](?:[-a-zA-Z0-9]*[a-zA-Z0-9])?\.?|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(?::[0-9]*)?(?:/(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*(?:;(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)*(?:/(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*(?:;(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)*)*)?(?:\?(?:[-_.!~*'a-zA-Z0-9;/?:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)?(?:\x23(?:[-_.!~*'a-zA-Z0-9;/?:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)?)};

判別は

q{[\x80-\xfd]}

  • EUC

q{(\b(?:https?|ftp|news)://(?:(?:[-_.!~*'()a-zA-Z0-9;:&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*@)?(?:(?:(?:[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[a-zA-Z0-9])?|[-_0-9a-zA-Z\xa1-\xfe](?:[-_0-9a-zA-Z\xa1-\xfe]*[-_0-9a-zA-Z\xa1-\xfe])?)\.)*[a-zA-Z](?:[-a-zA-Z0-9]*[a-zA-Z0-9])?\.?|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)(?::[0-9]*)?(?:/(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*(?:;(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)*(?:/(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*(?:;(?:[-_.!~*'a-zA-Z0-9:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)*)*)?(?:\?(?:[-_.!~*'a-zA-Z0-9;/?:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)?(?:\x23(?:[-_.!~*'a-zA-Z0-9;/?:@&=+$,]|%[0-9A-Fa-f][0-9A-Fa-f])*)?)};

判別は

q{[\x81-\xfe]};

そのソースは(EUCですが)

$digit = q{[0-9]};
$alpha = q{[a-zA-Z]};
$alphanum = q{[a-zA-Z0-9]};
$hex = q{[0-9A-Fa-f]};
$escaped = qq{%$hex$hex};
$uric = q{(?:[-_.!~*'a-zA-Z0-9;/?:@&=+$,]} . qq{|$escaped)};
#$uric = q{(?:[-_.!~*'()a-zA-Z0-9;/?:@&=+$,]} . qq{|$escaped)};
$fragment = qq{$uric*};
$query = qq{$uric*};
$pchar = q{(?:[-_.!~*'a-zA-Z0-9:@&=+$,]} . qq{|$escaped)};
#$pchar = q{(?:[-_.!~*'()a-zA-Z0-9:@&=+$,]} . qq{|$escaped)};
$param = qq{$pchar*};
$segment = qq{$pchar*(?:;$param)*};
$path_segments = qq{$segment(?:/$segment)*};
$abs_path = qq{/$path_segments};
$port = qq{$digit*};
$IPv4address = qq{$digit+\\.$digit+\\.$digit+\\.$digit+};
$toplabel = qq{$alpha(?:} . q{[-a-zA-Z0-9]*} . qq{$alphanum)?};
$domainlabel = qq{$alphanum(?:} . q{[-_a-zA-Z0-9]*} . qq{$alphanum)?};
$domainlabel_rfc3490_class= q{[-_0-9a-zA-Z\xa1-\xfe]};
$domainlabel_rfc3490_punyonly_class= q{[\xa1-\xfe]};
$domainlabel_rfc3490=qq{$domainlabel_rfc3490_class(?:}
   . qq{$domainlabel_rfc3490_class*} . qq{$domainlabel_rfc3490_class)?};
$domainlabel_rfc3490_punyonly=
		qq{$domainlabel_rfc3490_class(?:}
		 . qq{$domainlabel_rfc3490_class*}
			 . qq{$domainlabel_rfc3490_punyonly_class)?} .
 '|' .
		qq{$domainlabel_rfc3490_punyonly_class(?:}
		 . qq{$domainlabel_rfc3490_class*}
			 . qq{$domainlabel_rfc3490_class)?};# . 
# '|' .
#		qq{$domainlabel_rfc3490_class(?:}
#		 . qq{$domainlabel_rfc3490_punyonly_class?};
#			 . qq{$domainlabel_rfc3490_class)*};
$hostname = qq{(?:(?:$domainlabel|$domainlabel_rfc3490)\\.)*$toplabel\\.?};
$hostname_punyonly = qq{(?:(?:$domainlabel_rfc3490_punyonly)\\.)+$toplabel\\.?}; 
#$hostname = qq{(?:$domainlabel\\.)*$toplabel\\.?};
$host = qq{(?:$hostname|$IPv4address)};
$host_punyonly = qq{(?:$hostname_punyonly)};
$hostport = qq{$host(?::$port)?};
$hostport_punyonly = qq{$host_punyonly(?::$port)?};
$userinfo = q{(?:[-_.!~*'()a-zA-Z0-9;:&=+$,]|} . qq{$escaped)*};
$server = qq{(?:$userinfo\@)?$hostport};
$server_punyonly = qq{(?:$userinfo\@)?$hostport_punyonly};
$authority = qq{$server};
$authority_punyonly = qq{$server_punyonly};
#$scheme = q{(?:https?|shttp)};
$scheme = q{(?:https?|ftp)};
$net_path = qq{//$authority(?:$abs_path)?};
$net_path_punyonly = qq{//$authority_punyonly(?:$abs_path)?};
$hier_part = qq{$net_path(?:\\?$query)?};
$hier_part_punyonly = qq{$net_path_punyonly(?:\\?$query)?};
$absoluteURI = qq{$scheme:$hier_part};
$absoluteURI_punyonly = qq{$scheme:$hier_part_punyonly};
$URI_reference = qq{$absoluteURI(?:\\x23$fragment)?};
$URI_reference_punyonly = qq{$absoluteURI_punyonly(?:\\x23$fragment)?};
$http_URL_regex = q{\b} . $URI_reference;
$http_URL_regex_punyonly = q{\b} . $URI_reference_punyonly;

######################################################################

$test=<<EOM;
http://pyukiwiki.sourceforge.jp/
http://てすと.jp/buying
http://wikiエンジン.jp/ahaha.html#abc
https://うぃきEngins.jp/ahaha.html#abc
https://192.168.0.1/
http://aiうえo.jp/
http://aiうえお.jp/
http://うえおai.jp/
http://あiうeお.jp/ 
http://東京駅.jp/
http://新大阪駅.jp/
http://京都駅.jp/

EOM

foreach($test=~/$http_URL_regex/g) {
	print STDERR "URL:$_\n";
}

print "\n";

foreach($test=~/$http_URL_regex_punyonly/g) {
	print STDERR "PUNY:$_\n";
}

print '$::isurl=q{(' . $::http_URL_regex . ")};\n";
print '$::isurl_puny=q{(' . $::http_URL_regex_punyonly . ")};\n";

メールアドレスの正規表現

なんか、足し算しててすみません。

$::ismail=q({?:[^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff](?:[^(\040)<>@,;:&#".\\\[\]\000-\037\x80-\xff])*(?![^(\040)<>@,;:&#".\\\[\]\000-\037\x80-\xff])|["'][^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*["'])(?:\.(?:[^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff](?:[^(\040)<>@,;:&#".\\\[\]\000-\037\x80-\xff])*(?![^(\040)<>@,;:&#".\\\[\]\000-\037\x80-\xff])|["'][^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*["']))*\.?@(?:[^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])(?:\.(?:[^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:&#"'.\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\]))};

$::IntraMailAddr に 1がセットしてあれば、 というメールアドレスも識別できる

$::ismail.=$::IntraMailAddr eq 0 ? '+' : '*';

そのソースは

$esc         = '\\\\';               $Period      = '\.';
$space       = '\040';
$OpenBR      = '\[';                 $CloseBR     = '\]';
$NonASCII    = '\x80-\xff';          $ctrl        = '\000-\037';
$CRlist      = '\n\015';
$qtext       = qq/[^$esc$NonASCII$CRlist\"]/;
$dtext       = qq/[^$esc$NonASCII$CRlist$OpenBR$CloseBR]/;
$quoted_pair = qq<${esc}[^$NonASCII]>;
$apos        = qq(');
#$atom_char   = qq/[^($space)<>\@,;:\".$esc$OpenBR$CloseBR$ctrl$NonASCII]/;
$atom_char   = qq/[^($space)<>\@,;:\&#".$esc$OpenBR$CloseBR$ctrl$NonASCII]/;
$atom_char2  = qq/[^($space)<>\@,;:\&#"$apos.$esc$OpenBR$CloseBR$ctrl$NonASCII]/;
#$atom        = qq<$atom_char+(?!$atom_char)>;
$atom        = qq<$atom_char2(?:$atom_char)*(?!$atom_char)>;
$atom2       = qq<$atom_char2+(?!$atom_char2)>;
#$quoted_str  = qq<\"$qtext*(?:$quoted_pair$qtext*)*\">;
$quoted_str  = qq<["']$qtext*(?:$quoted_pair$qtext*)*["']>;
$word        = qq<(?:$atom|$quoted_str)>;
#$domain_ref  = $atom;
$domain_ref  = $atom2;
$domain_lit  = qq<$OpenBR(?:$dtext|$quoted_pair)*$CloseBR>;
$sub_domain  = qq<(?:$domain_ref|$domain_lit)>;
#$domain      = qq<$sub_domain(?:$Period$sub_domain)*>;
$domain      = qq<$sub_domain(?:$Period$sub_domain)>;
							#  ↑イントラのドメインなしを自動判別する
#$local_part  = qq<$word(?:$Period$word)*>;
$local_part  = qq<$word(?:$Period$word)*$Period?>;
										#↑dot付を許す
$addr_spec   = qq<$local_part\@$domain>;
$mail_regex  = $addr_spec;

######################################################################
print '$::ismail=q(' . $::mail_regex . ");\n";
$test="example\@test";
print STDERR $test=~/$::mail_regex/g;
print "\n";
$test="example.\@test.com";
print STDERR $test=~/$::mail_regex/g;
print "\n";
$test="'examp'le.\@test.com'";
print STDERR $test=~/$::mail_regex/g;
print "\n";
$test="\@test.com";
print STDERR $test=~/$::mail_regex/g;
print "\n";
$test="お問い合わせはa\@test.com&subject=sendまで";
print STDERR $test=~/$::mail_regex/g;
print "\n";

とありますが・・・・・・

最新の正規表現のがよさそうな気がします。。。。

http://www.din.or.jp/~ohzaki/mail_regex.htm

my $quoted_pair = qq{\\\\[\\x09 -~]};

my $atext = qq{[-!#-'*+/-9=?A-Z^-~]};
my $dot_atom_text = qq{$atext+(?:\\.$atext+)*};
my $dot_atom = $dot_atom_text;

my $qtext = qq{[!#-\\[\\]-~]};
my $qcontent = qq{(?:$qtext|$quoted_pair)};
my $quoted_string = qq{"$qcontent*"};

my $local_part = qq{(?:$dot_atom|$quoted_string)};
my $domain = $dot_atom;
my $addr_spec = qq{$local_part\@$domain};
my $mail_regex = $addr_spec;

print $mail_regex;

(?:[-!#-'*+/-9=?A-Z^-~]+(?:\.[-!#-'*+/-9=?A-Z^-~]+)*|"(?:[!#-\[\]-~]|\\[\x09 -~])*")@[-!#-'*+/-9=?A-Z^-~]+(?:\.[-!#-'*+/-9=?A-Z^-~]+)*

不思議な関数の上書き

前もって、スクリプト上に
sub example { return 0; }

として定義してある所に、

require "example.pl";

として、その中に

sub example { return 1;}
とすると、requireされた側の example が実行され、実行すると1が返る

Base64もどきエンコード/デコードをする。

うちが「子供」の頃、comp.lang.perlに、以下のような投稿がありました。
Article 7465 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:7465
Path: feenix.metronet.com!news.ecn.bgu.edu!usenet.ins.cwru.edu!
   howland.reston.as.net!ee.und.ac.za!tplinfm
From: barrett@lucy.ee.und.ac.za (Alan Barrett)
Newsgroups: comp.lang.perl
Subject: Re: Base 64?
Date: 1 Nov 1993 14:16:53 +0200
Organization: Elec. Eng., Univ. Natal, Durban, S. Africa
Lines: 202
Message-ID: <2b2unl$ikf@lucy.ee.und.ac.za>
References: <uglykidCFrvtp.11z@netcom.com>
NNTP-Posting-Host: lucy.ee.und.ac.za

In article <uglykidCFrvtp.11z@netcom.com>,
uglykid@netcom.com (Joe McDonald) writes:
> Does anyone have a sub to convert from/to base 64? thanks.

Here's mine.  It's not very well tested.

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
RFC822: barrett@ee.und.ac.za

--------------------------------------------------------------
#!/usr/bin/perl
# base64.pl -- A perl package to handle MIME-style BASE64 encoding
# A. P. Barrett <barrett@ee.und.ac.za>, October 1993
# $Revision: 1.2 $$Date: 1993/11/01 12:12:29 $

package base64;

# Synopsis:
#       require 'base64.pl';
#
#       $uuencode_string = &base64'b64touu($base64_string);
#       $binary_string = &base64'b64decode($base64_string);
#       $base64_string = &base64'uutob64($uuencode_string);
#       $base64_string = &base64'b64encode($binary_string);
#       $uuencode_string = &base64'uuencode($binary_string);
#       $binary_string = &base64'uudecode($uuencode_string);
#
#       uuencode and base64 input strings may contain multiple lines,
#       but may not contain any headers or trailers.  (For uuencode,
#       remove the begin and end lines, and for base64, remove the MIME
#       headers and boundaries.)
#
#       uuencode and base64 output strings will be contain multiple
#       lines if appropriate, but will not contain any headers or
#       trailers.  (For uuencode, add the "begin" line and the
#       " \nend\n" afterwards, and for base64, add any MIME stuff
#       afterwards.)

####################

$base64_alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
                   'abcdefghijklmnopqrstuvwxyz'.
                   '0123456789+/';
$base64_pad = '=';

$uuencode_alphabet = q|`!"#$%&'()*+,-./0123456789:;<=>?|.
                      '@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_'; # double that '\\'!
$uuencode_pad = '`';

# Build some strings for use in tr/// commands.
# Some uuencodes use " " and some use "`", so we handle both.
# We also need to protect backslashes.
($tr_uuencode = " ".$uuencode_alphabet) =~ s/\\/\\\\/;
$tr_base64 = "A".$base64_alphabet;


sub b64touu
{
    local ($_) = @_;
    local ($result);
    
    # zap bad characters and translate others to uuencode alphabet
    eval qq{
	tr|$tr_base64||cd;
	tr|$tr_base64|$tr_uuencode|;
    };

    # break into lines of 60 encoded chars, prepending "M" for uuencode
    while (s/^(.{60})//) {
	$result .= "M" . $& . "\n";
    }

    # any leftover chars go onto a shorter line
    # with padding to the next multiple of 4 chars
    if ($_ ne "") {
	$result .= substr($uuencode_alphabet, length($_)*3/4, 1)
		   . $_
		   . ($uuencode_pad x ((60 - length($_)) % 4)) . "\n";
    }

    # return result
    $result;
}

sub b64decode
{
    local ($_) = @_;
    local ($result);
    
    # zap bad characters and translate others to uuencode alphabet
    eval qq{
	tr|$tr_base64||cd;
	tr|$tr_base64|$tr_uuencode|;
    };

    # break into lines of 60 encoded chars, prepending "M" for uuencode,
    # and then using perl's builtin uudecoder to convert to binary.
    while (s/^(.{60})//) {
	#warn "chunk :$&:\n";
	$result .= unpack("u", "M" . $&);
    }

    # also decode any leftover chars
    if ($_ ne "") {
	#warn "last chunk :$_:\n";
	$result .= unpack("u",
		    substr($uuencode_alphabet, length($_)*3/4, 1) . $_);
    }

    # return result
    $result;
}

sub uutob64
{
    local ($_) = @_;
    local ($result);
    
    # This is the most difficult, because some perverse uuencoder
    # might have made lines that do not describe multiples of 3 bytes.
    # I don't see any better method than uudecoding to binary and then
    # b64encoding the binary.

    &b64encode(&uudecode); # implicitly pass @_ to &uudecode
}

sub b64encode
{
    local ($_) = @_;
    local ($chunk);
    local ($result);
    
    # break into chunks of 45 input chars, use perl's builtin
    # uuencoder to convert each chunk to uuencode format,
    # then kill the leading "M", translate to the base64 alphabet,
    # and finally append a newline.
    while (s/^((.|\n){45})//) {
	#warn "in:$&:\n";
	$chunk = substr(pack("u", $&), $[+1, 60);
	#warn "packed    :$chunk:\n";
	eval qq{
	    \$chunk =~ tr|$tr_uuencode|$tr_base64|;
	};
	#warn "translated:$chunk:\n";
	$result .= $chunk . "\n";
    }

    # any leftover chars go onto a shorter line
    # with uuencode padding converted to base64 padding
    if ($_ ne "") {
	#warn "length ".length($_)." \$_:$_:\n";
	#warn "enclen ", int((length($_)+2)/3)*4 - (45-length($_))%3, "\n";
	$chunk = substr(pack("u", $_), $[+1,
			int((length($_)+2)/3)*4 - (45-length($_))%3);
	#warn "chunk:$chunk:\n";
	eval qq{
	    \$chunk =~ tr|$tr_uuencode|$tr_base64|;
	};
	#warn "translated:$chunk:\n";
	$result .= $chunk . ($base64_pad x ((60 - length($chunk)) % 4)) . "\n";
    }

    # return result
    $result;
}

sub uuencode
{
    local ($_) = @_;
    local ($result);
    
    # break into chunks of 45 input chars, and use perl's builtin
    # uuencoder to convert each chunk to uuencode format.
    # (newline is added by builtin uuencoder.)
    while (s/^((.|\n){45})//) {
	$result .= pack("u", $&);
    }

    # any leftover chars go onto a shorter line
    # with padding to the next multiple of 4 chars
    if ($_ ne "") {
	$result .= pack("u", $_);
    }

    # return result
    $result;
}

sub uudecode
{
    local ($_) = @_;
    local ($result);
    
    # use perl's builtin uudecoder to convert each line
    while (s/^([^\n]+\n?)//) {
	$result .= unpack("u", $&);
    }

    # return result
    $result;
}

うん。意外とコンパクトなコードなんですよね。

で、これを、自分用にいじったものが、こちらになります

#!/usr/bin/perl
$base64_alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
                   'abcdefghijklmnopqrstuvwxyz'.
                   '0123456789+/';
$base64_pad = '=';

$uuencode_alphabet = q|`!"#$%&'()*+,-./0123456789:;<=>?|.
                      '@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_'; # double that '\\'!
$uuencode_pad = '`';

# Build some strings for use in tr/// commands.
# Some uuencodes use " " and some use "`", so we handle both.
# We also need to protect backslashes.
($tr_uuencode = " ".$uuencode_alphabet) =~ s/\\/\\\\/;
$tr_base64 = "A".$base64_alphabet;

while(<STDIN>){
	$input.=$_;
}
if($ARGV[0] eq "b64encode") {
	print &b64encode($input);
}
if($ARGV[0] eq "b64decode") {
	print &b64decode($input);
}

#!/usr/bin/perl
#--------------------------------------------------------------
#Article 7465 of comp.lang.perl:
#Xref: feenix.metronet.com comp.lang.perl:7465
#Path: feenix.metronet.com!news.ecn.bgu.edu!usenet.ins.cwru.edu!
#   howland.reston.as.net!ee.und.ac.za!tplinfm
From: barrett@lucy.ee.und.ac.za (Alan Barrett)

#Xref: feenix.metronet.com comp.lang.perl:7465
#Path: feenix.metronet.com!news.ecn.bgu.edu!usenet.ins.cwru.edu!howland.reston.a#ns.net!ee.und.ac.za!tplinfm
#From: barrett@lucy.ee.und.ac.za (Alan Barrett)
#Newsgroups: comp.lang.perl
#Subject: Re: Base 64?
#Date: 1 Nov 1993 14:16:53 +0200
#Organization: Elec. Eng., Univ. Natal, Durban, S. Africa
#Lines: 202
#Message-ID: <2b2unl$ikf@lucy.ee.und.ac.za>
#References: <uglykidCFrvtp.11z@netcom.com>
#NNTP-Posting-Host: lucy.ee.und.ac.za
#
#In article <uglykidCFrvtp.11z@netcom.com>,
#<uglykid@netcom.com> (Joe McDonald) writes:
#> Does anyone have a sub to convert from/to base 64? thanks.
#
#Here's mine.  It's not very well tested.
#
#--apb
#Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
#RFC822: <barrett@ee.und.ac.za>
#
#--------------------------------------------------------------
#!/usr/bin/perl
# base64.pl -- A perl package to handle MIME-style BASE64 encoding
# A. P. Barrett <barrett@ee.und.ac.za>, October 1993
# $Revision: 1.2 $$Date: 1993/11/01 12:12:29 $


# Synopsis:
#       require 'base64.pl';
#
#       $uuencode_string = &base64'b64touu($base64_string);
#       $binary_string = &base64'b64decode($base64_string);
#       $base64_string = &base64'uutob64($uuencode_string);
#       $base64_string = &base64'b64encode($binary_string);
#       $uuencode_string = &base64'uuencode($binary_string);
#       $binary_string = &base64'uudecode($uuencode_string);
#
#       uuencode and base64 input strings may contain multiple lines,
#       but may not contain any headers or trailers.  (For uuencode,
#       remove the begin and end lines, and for base64, remove the MIME
#       headers and boundaries.)
#
#       uuencode and base64 output strings will be contain multiple
#       lines if appropriate, but will not contain any headers or
#       trailers.  (For uuencode, add the "begin" line and the
#       " \nend\n" afterwards, and for base64, add any MIME stuff
#       afterwards.)

####################

$base64_alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.
                   'abcdefghijklmnopqrstuvwxyz'.
                   '0123456789+/';
$base64_pad = '=';

$uuencode_alphabet = q|`!"#$%&'()*+,-./0123456789:;<=>?|.
                      '@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_'; # double that '\\'!
$uuencode_pad = '`';

# Build some strings for use in tr/// commands.
# Some uuencodes use " " and some use "`", so we handle both.
# We also need to protect backslashes.
($tr_uuencode = " ".$uuencode_alphabet) =~ s/\\/\\\\/;
$tr_base64 = "A".$base64_alphabet;


sub b64touu
{
    local ($_) = @_;
    local ($result);
    
    # zap bad characters and translate others to uuencode alphabet
    eval qq{
	tr|$tr_base64||cd;
	tr|$tr_base64|$tr_uuencode|;
    };

    # break into lines of 60 encoded chars, prepending "M" for uuencode
    while (s/^(.{60})//) {
	$result .= "M" . $& . "\n";
    }

    # any leftover chars go onto a shorter line
    # with padding to the next multiple of 4 chars
    if ($_ ne "") {
	$result .= substr($uuencode_alphabet, length($_)*3/4, 1)
		   . $_
		   . ($uuencode_pad x ((60 - length($_)) % 4)) . "\n";
    }

    # return result
    $result;
}

sub b64decode
{
    local ($_) = @_;
    local ($result);
    
    # zap bad characters and translate others to uuencode alphabet
    eval qq{
	tr|$tr_base64||cd;
	tr|$tr_base64|$tr_uuencode|;
    };

    # break into lines of 60 encoded chars, prepending "M" for uuencode,
    # and then using perl's builtin uudecoder to convert to binary.
    while (s/^(.{60})//) {
	#warn "chunk :$&:\n";
	$result .= unpack("u", "M" . $&);
    }

    # also decode any leftover chars
    if ($_ ne "") {
	#warn "last chunk :$_:\n";
	$result .= unpack("u",
		    substr($uuencode_alphabet, length($_)*3/4, 1) . $_);
    }

    # return result
    $result;
}

sub uutob64
{
    local ($_) = @_;
    local ($result);
    
    # This is the most difficult, because some perverse uuencoder
    # might have made lines that do not describe multiples of 3 bytes.
    # I don't see any better method than uudecoding to binary and then
    # b64encoding the binary.

    &b64encode(&uudecode); # implicitly pass @_ to &uudecode
}

sub b64encode
{
    local ($_) = @_;
    local ($chunk);
    local ($result);
    
    # break into chunks of 45 input chars, use perl's builtin
    # uuencoder to convert each chunk to uuencode format,
    # then kill the leading "M", translate to the base64 alphabet,
    # and finally append a newline.
    while (s/^((.|\n){45})//) {
	#warn "in:$&:\n";
	$chunk = substr(pack("u", $&), $[+1, 60);
	#warn "packed    :$chunk:\n";
	eval qq{
	    \$chunk =~ tr|$tr_uuencode|$tr_base64|;
	};
	#warn "translated:$chunk:\n";
	$result .= $chunk . "\n";
    }

    # any leftover chars go onto a shorter line
    # with uuencode padding converted to base64 padding
    if ($_ ne "") {
	#warn "length ".length($_)." \$_:$_:\n";
	#warn "enclen ", int((length($_)+2)/3)*4 - (45-length($_))%3, "\n";
	$chunk = substr(pack("u", $_), $[+1,
			int((length($_)+2)/3)*4 - (45-length($_))%3);
	#warn "chunk:$chunk:\n";
	eval qq{
	    \$chunk =~ tr|$tr_uuencode|$tr_base64|;
	};
	#warn "translated:$chunk:\n";
	$result .= $chunk . ($base64_pad x ((60 - length($chunk)) % 4)) . "\n";
    }

    # return result
    $result;
}

sub uuencode
{
    local ($_) = @_;
    local ($result);
    
    # break into chunks of 45 input chars, and use perl's builtin
    # uuencoder to convert each chunk to uuencode format.
    # (newline is added by builtin uuencoder.)
    while (s/^((.|\n){45})//) {
	$result .= pack("u", $&);
    }

    # any leftover chars go onto a shorter line
    # with padding to the next multiple of 4 chars
    if ($_ ne "") {
	$result .= pack("u", $_);
    }

    # return result
    $result;
}

sub uudecode
{
    local ($_) = @_;
    local ($result);
    
    # use perl's builtin uudecoder to convert each line
    while (s/^([^\n]+\n?)//) {
	$result .= unpack("u", $&);
    }

    # return result
    $result;
}

あれ、ここまでは、大した記事じゃないよ?と思うあなた・・・

ここまで、コンパクトにできます

while(<STDIN>){$i.=$_;}print &bd($i);sub bd{local($_)=@_;my $r;$a='ABCDEFGHIJK
LMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';$b=q|`!"#$%&'()*+,-./01
23456789:;<=>?|.'@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_';($c=" ".$b)=~s/\\/\\\\/;$d
="A".$a;eval qq{tr|$d||cd;tr|$d|$c|;};while(s/^(.{60})//){$r.=unpack("u","M".$
&);}if($_ ne ""){$r.=unpack("u",substr($b,length($_)*3/4,1).$_);}$r;}

これを、シェルスクリプトにかませると、こんなふうな感じで展開できるでしょう

cat <<EOF|perl -e 'while(<STDIN>){$z.=$_;}foreach my $i(0x00 .. 0xFF){$x{sprin
tf("%02X",$i)}=chr($i);}$z=~s/([0-9A-F][0-9A-F])/$x{$1}/g;print $z;'>/tmp/b64d
ecode.pl
7768696C65283C535444494E3E297B24692E3D245F3B7D7072696E7420266264282469293B7375
622062647B6C6F63616C28245F293D405F3B6D792024723B24613D274142434445464748494A4B
4C4D4E4F505152535455565758595A6162636465666768696A6B6C6D6E6F707172737475767778
797A303132333435363738392B2F273B24623D717C602122232425262728292A2B2C2D2E2F3031
32333435363738393A3B3C3D3E3F7C2E27404142434445464748494A4B4C4D4E4F505152535455
565758595A5B5C5C5D5E5F273B2824633D2220222E2462293D7E732F5C5C2F5C5C5C5C2F3B2464
3D2241222E24613B6576616C2071717B74727C24647C7C63643B74727C24647C24637C3B7D3B77
68696C6528732F5E282E7B36307D292F2F297B24722E3D756E7061636B282275222C224D222E24
26293B7D696628245F206E65202222297B24722E3D756E7061636B282275222C73756273747228
24622C6C656E67746828245F292A332F342C31292E245F293B7D24723B7D
EOF

アンケート

選択枝 投票
役に立った 0  

このことに関する話題

お名前:
題名:

このホームページは、[さくらのVPS ]で動いています。
www.daiba.cx
yahoo 出会い