Activity Log

2006-09-09

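■ [Plagger]Trying out a Plagger install

A bit late to the party, but I tried installing Plagger.

I first tried installing it from CPAN on a Debian sarge machine, but one module after another refused to install, and after two hours the end was still nowhere in sight. I gave up and installed it with apt on a sid environment instead, which took five minutes. I should have done that from the start instead of getting ideas.

Seeing nearly 170 dependency packages scroll by is quite a sight.

Reference: Installing Plagger on Debian

Actually, even after the install I got the YAML indentation wrong and spent about ten minutes puzzling over it.

Installed on sarge too

The sid environment was easy with apt, but in the end I made the effort and installed it on sarge as well.

I didn't want to pollute the global environment, so I installed it under my home directory, following http://www.otsune.com/bsd/tips/usercpaninstall.html.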

% mkdir ~/local
% echo no | cpan
% vi ~/.cpan/CPAN/MyConfig.pm
% vi ~/.zshenv
% cpan -i Bundle::CPAN
% cpan

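After that, you run test Plagger in the cpan shell and keep installing whatever modules it complains about. This is very tedious, but by patiently installing modules starting from the ones at the root of the dependency tree, I eventually got as far as installing Plagger itself.

I installed XML::RSS and HTML::RSSAutodiscovery with force.

Net::SMTP::TLS refused to install for a while, but it went in once I had installed libio-socket-ssl-perl and libnet-ssleay-perl with apt. (libssl-dev was already installed.)

You probably also need libxml2-dev and the like.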
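
The "install from the root of the dependency tree first" approach above is essentially a topological ordering. Here is a minimal sketch in Python, assuming a small hand-written dependency map. The Encode::Detect requirements come from the notes in this entry. The Test::Base to Spiffy edge is an assumption based on how the notes are grouped. `install_order` is a hypothetical helper, not part of CPAN or Plagger.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# module -> modules that must be installed before it
# (illustrative only; this is not Plagger's full dependency list)
DEPENDS_ON = {
    "Encode::Detect": {"ExtUtils::ParseXS", "Data::Dump"},
    "Test::Base": {"Spiffy"},  # assumed edge
}


def install_order(depends_on):
    """Return the modules ordered so each dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


order = install_order(DEPENDS_ON)
print(order)
```

The exact order among unrelated modules may vary; only the "dependency first" property is guaranteed.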

Notes

XML::Parser
Test::Base
Spiffy

YAML
Test::More
Test::Base
Class::Accessor::Fast
File::Find::Rule
UNIVERSAL::require
Template
Template::Provider::Encoding
YAML
Text::Tags
DateTime
DateTime::Format::Mail
DateTime::Format::W3CDTF
DateTime::Format::Strptime
Digest::MD5
LWP
HTML::Parser
URI::Fetch
Cache::Cache
Module::Pluggable::Fast
HTML::ResolveLink
Date::Parse
MIME::Types
Net::DNS
XML::Feed
XML::LibXML
XML::Atom
XML::RSS::LibXML
Encode
Term::Encoding
File::HomeDir

Required for Encode::Detect

ExtUtils::ParseXS
Data::Dump

Installed with apt

libxml2-dev
libexpat1-dev
libio-socket-ssl-perl


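2006-09-10

■ [Plagger]How can I restrict the part of the source page from which links are followed?

With config.feed.meta.follow_link in Subscription::Config, the links to follow can only be selected by a regular expression on the URL. CustomFeed::Simple does not seem to have a setting for this either, and Filter::EntryFullText also appears to select by URL only. I don't think this is an unusual requirement, so I would expect it to be implemented somewhere, but I could not find it.

Also, the items that can be set in YAML are documented in perldoc, but there seem to be gaps (meta in Subscription::Config, for example). I suppose the only option is to read the source.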


2006-09-11

■ [Plagger]Changing the User-Agent string

It can be set with global.user_agent.agent.

global:
  user_agent:
    agent: FooBar

■ [Plagger]Subscription::XPath

I received a comment from otsune. (I have changed the comment form to allow up to five URIs. Sorry for the trouble. As for Trackback... I wonder why?)

http://subtech.g.hatena.ne.jp/otsune/20060911/followlink

Subscription::XPath certainly looked like the right tool, so I tried it...

% w3m -dump_source http://shakenbu.org/yanagi/d/s/20060911/index.html
<html>
<head>
<title>index</title>
</head>
<body>
<div id="hoge">
[<a href="1.html">1</a>]
[<a href="2.html">2</a>]
</div>
</body>
</html>

% cat test2.yaml
plugins:
  - module: Subscription::XPath
    config:
      url: http://shakenbu.org/yanagi/d/s/20060911/index.html
      xpath: //div[@id='hoge']//a

  - module: CustomFeed::Simple

  - module: Publish::Debug


% plagger -c test2.yaml
Plagger [info] plugin Plagger::Plugin::Subscription::XPath loaded.
Plagger [info] plugin Plagger::Plugin::CustomFeed::Simple loaded.
Plagger [info] plugin Plagger::Plugin::Publish::Debug loaded.
Plagger::Util [debug] Fetch remote file from http://shakenbu.org/yanagi/d/s/20060911/index.html
Plagger::Cache [debug] Cache HIT: Subscription-XPath|http://shakenbu.org/yanagi/d/s/20060911/index.html
Plagger [info] plugin Plagger::Plugin::Aggregator::Simple loaded.
Plagger::Plugin::Aggregator::Simple [info] Fetch http://shakenbu.org/yanagi/d/s/20060911/1.html
Plagger::Cache [debug] Cache HIT: Aggregator-Simple|http://shakenbu.org/yanagi/d/s/20060911/1.html
Plagger::Plugin::Aggregator::Simple [debug] 304: http://shakenbu.org/yanagi/d/s/20060911/1.html
Plagger [error] http://shakenbu.org/yanagi/d/s/20060911/1.html is not aggregated by any aggregator
Plagger::Plugin::Aggregator::Simple [info] Fetch http://shakenbu.org/yanagi/d/s/20060911/2.html
Plagger::Cache [debug] Cache HIT: Aggregator-Simple|http://shakenbu.org/yanagi/d/s/20060911/2.html
Plagger::Plugin::Aggregator::Simple [debug] 304: http://shakenbu.org/yanagi/d/s/20060911/2.html
Plagger [error] http://shakenbu.org/yanagi/d/s/20060911/2.html is not aggregated by any aggregator

and it ended in an error.

It is the same error as when follow_link is left out of Subscription::Config.
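
To check what the xpath setting is meant to select, independently of Plagger, here is a minimal sketch in Python using the sample page above. It relies on the sample being well-formed XML, which real HTML often is not. ElementTree's limited XPath also needs a leading `.`, so the expression differs slightly from the one in test2.yaml. `links_under` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

BASE_URL = "http://shakenbu.org/yanagi/d/s/20060911/index.html"

# The sample page from the w3m dump above.
PAGE = """<html>
<head>
<title>index</title>
</head>
<body>
<div id="hoge">
[<a href="1.html">1</a>]
[<a href="2.html">2</a>]
</div>
</body>
</html>"""


def links_under(page, xpath, base_url):
    """Return absolute URLs of the <a> elements selected by xpath."""
    root = ET.fromstring(page)
    return [urljoin(base_url, a.get("href")) for a in root.findall(xpath)]


links = links_under(PAGE, ".//div[@id='hoge']//a", BASE_URL)
print(links)
```

The two URLs this yields are the same 1.html and 2.html that appear in the Aggregator::Simple log lines above.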

Looking through Plagger's sample YAML files, follow_link is sometimes present and sometimes not, and I cannot tell in which cases it is needed.

% cat test.yaml
plugins:
  - module: Subscription::Config
    config:
      feed:
        - url: http://shakenbu.org/yanagi/d/s/20060911/index.html
#          meta:
#            follow_link: .*

  - module: CustomFeed::Simple

  - module: Publish::Debug
% plagger -c test.yaml
Plagger [info] plugin Plagger::Plugin::Subscription::Config loaded.
Plagger [info] plugin Plagger::Plugin::CustomFeed::Simple loaded.
Plagger [info] plugin Plagger::Plugin::Publish::Debug loaded.
Plagger [info] plugin Plagger::Plugin::Aggregator::Simple loaded.
Plagger::Plugin::Aggregator::Simple [info] Fetch http://shakenbu.org/yanagi/d/s/20060911/index.html
Plagger::Cache [debug] Cache MISS: Aggregator-Simple|http://shakenbu.org/yanagi/d/s/20060911/index.html
Plagger::Plugin::Aggregator::Simple [debug] 200: http://shakenbu.org/yanagi/d/s/20060911/index.html
Plagger [error] http://shakenbu.org/yanagi/d/s/20060911/index.html is not aggregated by any aggregator

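I also tried tracing through the source, but I barely know Perl's syntax, so I got stuck.

Today's comments (1)

_ otsune [The failure to follow links may have nothing to do with Perl; the XPath may simply not be matching, so xpath: /..]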

2006-09-12

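■ Trackbacks could not be received

I looked into it and found that, because of a configuration mistake, it had been impossible to send trackbacks to this diary.

The problem has now been fixed.

I apologize for the inconvenience.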


2006-09-13

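■ [Plagger]A misunderstanding about assets_path

Extracting links from a restricted part of the page was solved with Subscription::Config and Subscription::XPath, thanks to advice on #plagger-ja.

However...

Before I got that working, I was badly stuck on assets_path.

"When assets_path is specified per plugin block, the module name is not appended."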

http://subtech.g.hatena.ne.jp/miyagawa/20060819/1155980135

■ [Plagger]Using lookahead in the extract of CustomFeed::Config

A regular expression given to extract in CustomFeed::Config did not work properly when it used a lookahead (?=...), so I patched it. For now, at least, it appears to work correctly.

--- Config.pm   2006-09-13 01:44:55.518269472 +0900
+++ local/plagger/lib/Plagger/Plugin/CustomFeed/Config.pm       2006-09-13 01:44:29.439234088 +0900
@@ -95,18 +95,23 @@
         Plagger->context->error($@) if $@;
     }

+    my $prev_pos = 0;
+    my $cur_pos = 0;
     while (1) {
         my $data;

         my $extract = decode_content($plugin->{extract});
         if ($content =~ /$extract/sg) {
-            if (my @match = $& =~ /$plugin->{extract}/s) {
+            $cur_pos = pos $content;
+            my $str = substr($content, $prev_pos, length($content));
+            if (my @match = $str =~ /$plugin->{extract}/s) {
                 my @capture = split /\s+/, $plugin->{extract_capture};
                 for my $m (@match) {
                     my $val = shift @capture;
                     $data->{$val} = $data->{$val} . $m;
                 }
             }
+            $prev_pos = $cur_pos;
         }

         if ($plugin->{extract_xpath}) {
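
The likely reason the original code fails is that `$&` holds only the matched text, while the text a lookahead inspects is not consumed. Re-matching the same pattern against `$&` therefore has nothing left for the lookahead to see. The following Python sketch reproduces that effect and the "match again from the previous position" idea of the patch. The sample content and pattern are made up for illustration, and the two function names are hypothetical.

```python
import re

# Made-up content: each dated block ends where the next date begins.
CONTENT = "<p>9/1</p>first<p>9/2</p>second<p>9/3</p>"
PATTERN = re.compile(r"<p>(\d+/\d+)</p>(.*?)(?=<p>\d+/\d+</p>)", re.S)


def rematch_on_matched_text(content):
    """Like the original code: re-run the pattern on the matched text only."""
    results = []
    for m in PATTERN.finditer(content):
        again = PATTERN.search(m.group(0))  # lookahead target is missing here
        results.append(again.groups() if again else None)
    return results


def rematch_from_previous_position(content):
    """Like the patch: re-run the pattern on the content from the last position."""
    results = []
    prev_pos = 0
    for m in PATTERN.finditer(content):
        again = PATTERN.search(content[prev_pos:])
        results.append(again.groups() if again else None)
        prev_pos = m.end()
    return results


print(rematch_on_matched_text(CONTENT))
print(rematch_from_previous_position(CONTENT))
```

The first function finds nothing on the second pass, while the second recovers both captures. The last block (9/3) is never matched in either case because nothing follows it for the lookahead.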


2006-09-15

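■ [Plagger]New releases on DLsite.com

Reading http://home.dlsite.com/new with Plagger.

The page has no links other than those to new releases, so strictly speaking

follow_link: /work/=/product_id/

would be enough, but I restrict the links with follow_xpath just to be safe.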

config.yaml

plugins:
  - module: Subscription::Config
    config:
      feed:
        - url: http://home.dlsite.com/new
          meta:
            follow_xpath: //div[@class='infomation_2']//a[contains(@href, 'http://home.dlsite.com/work/=/product_id/')]

  - module: CustomFeed::Simple
  - module: Filter::EntryFullText
  - module: Filter::FindEnclosures
  - module: Filter::Rule
    rule:
      module: Deduped

assets/plugins/Filter-EntryFullText/dlsite.yaml

author: Kouhei Yanagita
handle: http://home\.dlsite\.com/work/=/product_id/
extract: <div class="centerbar">.*?<div class="center_bar_messege">.*?</div>.*?</div>(.*?<TD valign="top" align="left" class="works_name1">(.*?)</TD>.*?)<table cellspacing="10" cellpadding="0" align="center">
extract_capture: body title

I made this, but I have never actually bought anything on DLsite.com.
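
For reference, the effect of the simpler follow_link setting can be sketched in Python, assuming it is applied as an unanchored regular-expression search on each link URL (my reading, not checked against the Plagger source). The candidate URLs other than http://home.dlsite.com/new are hypothetical, and `follow_links` is an illustrative helper.

```python
import re


def follow_links(links, follow_link):
    """Keep only the links whose URL matches the follow_link pattern."""
    pattern = re.compile(follow_link)
    return [url for url in links if pattern.search(url)]


candidates = [
    "http://home.dlsite.com/work/=/product_id/EXAMPLE0001",  # hypothetical
    "http://home.dlsite.com/new",
    "http://home.dlsite.com/help",  # hypothetical
]
selected = follow_links(candidates, r"/work/=/product_id/")
print(selected)
```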

■ [Plagger]こもれびのーと

config.yaml

plugins:
  - module: Subscription::Config
    config:
      feed:
        - url: http://komorebi-note.com/top/index.html

  - module: CustomFeed::Config
  - module: Filter::Rule
    rule:
      module: Deduped
      compare_body: 1
  - module: Filter::FindEnclosures
  - module: Filter::BreakEntriesToFeeds
  - module: Publish::Gmail
    config:
      mailto: ...

assets/plugins/CustomFeed-Config/komorebi-notebook.yaml

author: Kouhei Yanagita
match: http://komorebi-note.com/top/index.html
extract: <p>(?:<font size="2">)?(\d\d?/\d\d?)(?:</font>)?</p>(.*?)(?=<p>(?:<font size="2">)?\d\d?/\d\d?(?:</font>)?</p>)
extract_capture: title body
extract_after_hook: $data->{link} = 'http://komorebi-note.com/top/index.html#' . $data->{title}

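Because the extract above uses a lookahead, this needs the CustomFeed::Config lookahead patch from 2006-09-13.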


2006-09-16

■ [Plagger]Preprocess

To access pages that require a login from Plagger, prepare a cookie file and point to it like this:

global:
  user_agent:
    cookies: /path/to/cookie-file

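However, for pages whose login expires quickly, this file has to be regenerated frequently.

So I thought it would be nice to have a mechanism where you specify a command, something like

plugins:
  - module: Preprocess
    config:
      command: /path/to/make-cookie

and the command is run before the feeds are processed.

Of course, you could just run

% /path/to/make-cookie; plagger

but being able to do everything within plagger seems nice.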

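Reading cookies from a pipe

...While writing about Preprocess, it occurred to me that it would be just as good if cookies could be read from a pipe.

Here is a patch that suppresses the warning when cookies are read from a pipe.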

Index: lib/Plagger/Cookies.pm
===================================================================
--- lib/Plagger/Cookies.pm      (revision 1692)
+++ lib/Plagger/Cookies.pm      (working copy)
@@ -21,7 +21,7 @@
         Plagger->context->log(debug => "$conf->{file} => $impl");
         $impl->require or Plagger->context->error("Error loading $impl: $@");

-        if ($conf->{file} && !-e $conf->{file}) {
+        if ($conf->{file} && !-e $conf->{file} && $conf->{file} !~ /\|$/) {
             Plagger->context->log(warn => "$conf->{file}: $!");
         }

config.yaml

 global:
   user_agent:
     cookies:
       file: 'cat ~/.w3m/cookie |'
       type: w3m

(Of course, in practice you would put the command that generates the cookie there.)
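
The condition changed by the patch can be read as "warn only when the configured file is missing and the name does not end in a pipe character". Here is a minimal Python sketch of that check; `should_warn_missing` is a hypothetical name, and the sketch only mirrors the condition, not how Plagger or the cookie module actually opens the pipe.

```python
import os
import re


def should_warn_missing(cookie_file):
    """Mirror of the patched condition: warn for a missing, non-pipe file only."""
    if not cookie_file:
        return False  # nothing configured
    if os.path.exists(cookie_file):
        return False  # the file is there
    # A trailing "|" marks a command to read from, so a missing file is expected.
    return re.search(r"\|$", cookie_file) is None


print(should_warn_missing("cat ~/.w3m/cookie |"))
print(should_warn_missing("/no/such/plagger-cookie-file"))
```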

■ [Plagger]同人ど〜らく

config.yaml

plugins:
  - module: Subscription::Config
    config:
      feed:
        - url: http://www.doujingame.com/
  - module: CustomFeed::Config
  - module: Filter::Rule
    rule:
      module: Deduped
      compare_body: 1
  - module: Filter::FindEnclosures
  - module: Filter::BreakEntriesToFeeds

assets/plugins/CustomFeed-Config/doujingame.yaml

author: Kouhei Yanagita
match: http://www\.doujingame\.com/
extract: <tr>[ \r\n]*<td bgcolor=#c7e3ff align=left valign=top><img src="img/kabe03.gif" width=14 height=14></td>[ \r\n]*<td bgcolor=#c7e3ff align=left valign=top><font size=4 style="font-size:16px">[ \r\n]*<b>.*?<u>(\d\d?).*?(\d\d?).*?</u>.*?</b></font></td>[ \r\n]*<td bgcolor=#c7e3ff align=left valign=top><img src="img/kabe03.gif" width=14 height=14></td>[ \r\n]*</tr>(.*?)<tr>[ \r\n]*<td bgcolor=#c7e3ff align=center valign=top colspan=3><hr size=1 width=94%></td>[ \r\n]*</tr>(?=[ \r\n]*<tr>[ \r\n]*<td bgcolor=#c7e3ff align=left valign=top><img src="img/kabe03.gif" width=14 height=14></td>[ \r\n]*(?:<td bgcolor=#c7e3ff align=left valign=top><font size=4 style="font-size:16px">|<td bgcolor=#c7e3ff align=left valign=top>[ \r\n]*<table border=0 bgcolor=#c7e3ff width=100%><form action="./event/form/form_top.php" method="post" target="_blank">))
extract_capture: month day body
extract_after_hook: $data->{body} = '<table border=0 cellpadding=3 cellspacing=0>' . $data->{body} . '</table>'; $data->{title} = $data->{month} . '/' . $data->{day}; $data->{link} = 'http://www.doujingame.com/#' . $data->{title}

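Because the extract above uses a lookahead, this needs the CustomFeed::Config lookahead patch from 2006-09-13.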


2006-09-18

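■ [Plagger]Specifying the feed title with CustomFeed::Simple

When CustomFeed::Simple is used, the feed title is taken from the content of the page's <title> element.

There were cases where I wanted to set it myself, so I made that possible.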

Index: lib/Plagger/Plugin/CustomFeed/Simple.pm
===================================================================
--- lib/Plagger/Plugin/CustomFeed/Simple.pm     (revision 1694)
+++ lib/Plagger/Plugin/CustomFeed/Simple.pm     (working copy)
@@ -44,7 +44,7 @@
     }

     my $content = decode_content($res);
-    my $title   = extract_title($content);
+    my $title   = $self->conf->{title} || extract_title($content);

     my $feed = Plagger::Feed->new;
     $feed->title($title);

config.yaml

  - module: CustomFeed::Simple
    config:
      title: ほげほげ

Correction

Following mizzy's comment, I changed the patch so that the title given in Subscription::Config is used.

Index: lib/Plagger/Plugin/CustomFeed/Simple.pm
===================================================================
--- lib/Plagger/Plugin/CustomFeed/Simple.pm     (revision 1694)
+++ lib/Plagger/Plugin/CustomFeed/Simple.pm     (working copy)
@@ -44,7 +44,7 @@
     }

     my $content = decode_content($res);
-    my $title   = extract_title($content);
+    my $title   = $args->{feed}->title || extract_title($content);

     my $feed = Plagger::Feed->new;
     $feed->title($title);

config.yaml

  - module: Subscription::Config
    config:
      feed:
        - url: http://....
          meta:
            follow_link: ....
          title: ほげほげ
  - module: CustomFeed::Simple
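
Both versions of the patch use the same "configured value, else fall back to the page title" pattern; the difference is only where the configured value comes from (plugin-wide config versus each feed). Here is a small Python sketch of that fallback. `extract_title` below is a simplified stand-in written for this example, not Plagger's own function, and `feed_title` is a hypothetical name.

```python
import re


def extract_title(html):
    """Simplified stand-in: return the text of the first <title> element."""
    m = re.search(r"<title>\s*(.*?)\s*</title>", html, re.I | re.S)
    return m.group(1) if m else None


def feed_title(configured_title, html):
    """Prefer the per-feed configured title; otherwise use the page's <title>."""
    return configured_title or extract_title(html)


page = "<html><head><title>index</title></head><body></body></html>"
print(feed_title("ほげほげ", page))
print(feed_title(None, page))
```

Taking the title from each feed rather than from the plugin config lets several feeds in one Subscription::Config carry different titles.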

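■ [Plagger]Filter::Pipe cuts the body off partway

With Filter::Pipe, the body of an entry can be passed through a pipe for processing.

However, the body sometimes seems to be cut off partway through.

I tried to find out how to read to the end with IPC::Run, but could not work it out. (As an ad hoc workaround, calling pump_nb many times reads a fair amount, but the output still gets cut off.)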
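
For comparison only, here is how "write the whole body, then read until the child closes its output" looks in Python's subprocess module. This is a different library from IPC::Run and says nothing about why pump_nb truncates; it just shows the behaviour one would want from the filter. `filter_through` is a hypothetical helper, and the filter command is a stand-in that upper-cases its input.

```python
import subprocess
import sys


def filter_through(cmd, text):
    """Send text to cmd's stdin and return everything it writes to stdout."""
    result = subprocess.run(
        cmd,
        input=text.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise if the filter exits with an error
    )
    return result.stdout.decode("utf-8")


# Stand-in filter: a child Python process that upper-cases its input.
UPPER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]

body = "entry body line\n" * 50000  # large enough to exceed a pipe buffer
filtered = filter_through(UPPER, body)
print(len(body), len(filtered))
```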

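Today's comments (2)

_ mizzy [About specifying the title in CustomFeed::Simple: with this approach, even when there are multiple feeds, they would all end up with the same title..]

_ yanagi [Thank you. I had overlooked the case of multiple feeds. Your approach is indeed better.]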

2006-09-20

■ [Plagger]さざなみ壊変

assets/plugins/CustomFeed-Config/sazanami.yaml

author: Kouhei Yanagita
match: http://www5\.ocn\.ne\.jp/~yoc/gra\.html
extract: <TR>\s*<TD valign="top" bgcolor="#5111c8" width="100%" height="10" align="center"><A name="(\d+)" href="(#\d+)" target="_self"><FONT size="-1" color="#eeeeee">.*?</FONT></A><FONT size="-1" color="#eeeeee">(.*?)</FONT></TD>\s*</TR>\s*<TR valign="top">\s*<TD bgcolor="#f4f5ff" width="779" valign="top" align="left">(.*?)</TD>\s*</TR>
extract_capture: date link_anchor title body
extract_after_hook: |
  $data->{link} = 'http://www5.ocn.ne.jp/~yoc/gra.html' . $data->{link_anchor};
  $data->{author} = "\x{304b}\x{305a}\x{3074}\x{30fc}";


2006-09-23

■ [Plagger]The difference between follow_link in Subscription::Config and custom_feed_follow_* in Filter::EntryFullText

In Subscription::Config, follow_link sets which links are picked up as entries.

Filter::EntryFullText, meanwhile, has similar settings: custom_feed_follow_link / custom_feed_follow_xpath. http://wiki.shibuya.pl/?HowToEntryFullText

I read the source to work out the difference and reached the following conclusions.

The only plugin that looks at meta->{follow_link} from Subscription::Config is CustomFeed::Simple. So setting meta->{follow_link} has no effect unless CustomFeed::Simple is also used.

If custom_feed_handle and custom_feed_follow_link / custom_feed_follow_xpath are specified for Filter::EntryFullText, CustomFeed::Simple is not needed.

If both "meta->{follow_link} + CustomFeed::Simple" and "the custom_feed_* settings of Filter::EntryFullText" are configured, follow_link takes precedence.

■ [Plagger]A list of which hook each plugin runs in

I collected these with fgrep '=> \&' *, so some may be missing. They are listed roughly in execution order.

plugin.init
  Filter::Babelfish.pm
  Filter::GuessLanguage.pm
  Filter::POPFile.pm
  Filter::SpamAssassin.pm
  Notify::Balloon.pm
  Notify::Command.pm
  Notify::IRC.pm
  Publish::Delicious.pm
  Publish::Feed.pm
  Publish::HatenaBookmark.pm
  Publish::IMAP.pm
  Publish::Maildir.pm
  Publish::Serializer.pm
  Search::KinoSearch.pm
  Widget::Simple.pm

subscription.load
  CustomFeed::AmazonAssociateReportJP.pm
  CustomFeed::Debug.pm
  CustomFeed::FlickrSearch.pm
  CustomFeed::Frepa.pm
  CustomFeed::Mixi.pm
  CustomFeed::POP3.pm
  CustomFeed::SVNLog.pm
  CustomFeed::YouTube.pm
  CustomFeed::iTunesRecentPlay.pm
  Subscription::2chThreadList.pm
  Subscription::Bloglines.pm
  Subscription::Bloglines.pm
  Subscription::DBI.pm
  Subscription::FOAF.pm
  Subscription::Feed.pm
  Subscription::File.pm
  Subscription::HatenaGroup.pm
  Subscription::HatenaRSS.pm
  Subscription::LivedoorReader.pm
  Subscription::OPML.pm
  Subscription::Odeo.pm
  Subscription::PingServer.pm
  Subscription::PlanetINI.pm

customfeed.handle
  Aggregator::Null.pm
  Aggregator::Simple.pm
  Aggregator::Xango.pm
  CustomFeed::2chSearch.pm
  CustomFeed::BloglinesCitations.pm
  CustomFeed::GoogleNews.pm
  CustomFeed::Mailman.pm
  CustomFeed::MixiDiarySearch.pm
  CustomFeed::PerlMonks.pm
  CustomFeed::Simple.pm
  Filter::EntryFullText.pm

aggregator.finalize
  Aggregator::Xango.pm

update.entry.fixup
  Filter::2chNewsokuTitle.pm
  Filter::2chRSSContent.pm
  Filter::Babelfish.pm
  Filter::Base.pm
  Filter::BloglinesContentNormalize.pm
  Filter::BulkfeedsTerms.pm
  Filter::Delicious.pm
  Filter::DeliciousFeedTags.pm
  Filter::Emoticon.pm
  Filter::EntryFullText.pm
  Filter::FeedBurnerPermalink.pm
  Filter::FeedFlareStripper.pm
  Filter::FetchEnclosure.pm
  Filter::FindEnclosures.pm
  Filter::FloatingDateTime.pm
  Filter::GuessLanguage.pm
  Filter::HEADEnclosureMetadata.pm
  Filter::HTMLScrubber.pm
  Filter::HatenaBookmarkTag.pm
  Filter::HatenaDiaryKeywordLink.pm
  Filter::HatenaDiaryKeywordUnlink.pm
  Filter::HatenaFormat.pm
  Filter::HatenaKeywordTag.pm
  Filter::ImageInfo.pm
  Filter::LivedoorKeywordUnlink.pm
  Filter::Markdown.pm
  Filter::POPFile.pm
  Filter::Pipe.pm
  Filter::ResolveRelativeLink.pm
  Filter::RewriteEnclosureURL.pm
  Filter::SpamAssassin.pm
  Filter::StripRSSAd.pm
  Filter::StripTagsFromTitle.pm
  Filter::Thumbnail.pm
  Filter::TruePermalink.pm
  Filter::UnicodeNormalize.pm
update.feed.fixup
  Filter::BlogPet.pm
  Filter::BreakEntriesToFeeds.pm
  Filter::ExtractAuthorName.pm
  Filter::HatenaBookmarkUsersCount.pm
  Filter::ImageInfo.pm
  Filter::TagsToTitle.pm
  Filter::Thumbnail.pm
  Filter::tDiaryComment.pm
update.fixup
  Filter::POPFile.pm
  Filter::URLBL.pm

smartfeed.init
  Filter::CompositeFeed.pm
smartfeed.entry
  Filter::Rule.pm
smartfeed.feed
  Filter::CompositeFeed.pm
  Filter::Rule.pm
smartfeed.finalize
  Filter::CompositeFeed.pm

publish.init
  Notify::Campfire.pm
  Notify::Growl.pm
  Publish::Excel.pm
  Publish::Gmail.pm
  Publish::LivedoorClip.pm
  Publish::PowerPoint.pm
publish.entry.fixup
  Widget::BloglinesSubscription.pm
  Widget::BulkfeedsSpamReport.pm
  Widget::Delicious.pm
  Widget::HatenaBookmark.pm
  Widget::HatenaBookmarkUsersCount.pm
  Widget::Simple.pm
publish.feed
  Notify::Beep.pm
  Notify::Command.pm
  Notify::Eject.pm
  Notify::MSAgent.pm
  Notify::NetSend.pm
  Notify::SSTP.pm
  Notify::Tiarra.pm
  Notify::UpdatePing.pm
  Publish::2chdat.pm
  Publish::CHTML.pm
  Publish::CSV.pm
  Publish::Debug.pm
  Publish::Excel.pm
  Publish::Feed.pm
  Publish::Gmail.pm
  Publish::JSON.pm
  Publish::JavaScript.pm
  Publish::MT.pm
  Publish::MTWidget.pm
  Publish::OutlineText.pm
  Publish::PDF.pm
  Publish::PSP.pm
  Publish::PalmDoc.pm
  Publish::Pipe.pm
  Publish::Planet.pm
  Publish::PowerPoint.pm
  Publish::SWF.pm
  Publish::Serializer.pm
  Publish::Takahashi.pm
  Search::Namazu.pm
  Search::Rast.pm
publish.entry
  Notify::Audio.pm
  Notify::Balloon.pm
  Notify::Campfire.pm
  Notify::Growl.pm
  Notify::IRC.pm
  Publish::Delicious.pm
  Publish::HatenaBookmark.pm
  Publish::IMAP.pm
  Publish::LivedoorClip.pm
  Publish::Maildir.pm
  Publish::Playlog.pm
  Search::Estraier.pm
  Search::Grep.pm
  Search::KinoSearch.pm
  Search::Spotlight.pm
publish.finalize
  Notify::Audio.pm
  Notify::Beep.pm
  Notify::Command.pm
  Notify::Eject.pm
  Publish::2chdat.pm
  Publish::CHTML.pm
  Publish::FOAFRoll.pm
  Publish::IMAP.pm
  Publish::MTWidget.pm
  Publish::Maildir.pm
  Publish::OPML.pm
  Publish::OutlineText.pm
  Publish::PSP.pm
  Publish::PalmDoc.pm
  Publish::Takahashi.pm
  Search::Namazu.pm
  Search::Rast.pm

aggregator.entry.fixup
  AtomLinkRelated.pm
  FeedBurnerPermalink.pm
aggregator.filter.feed
  RSSLiberalDateTime.pm
  RSSTimeZoneString.pm

plugin.finalize
  Search::KinoSearch.pm

useragent.init
  UserAgent::AuthenRequest.pm
useragent.request
  UserAgent::RequestHeader.pm

searcher.search
  Search::Estraier.pm
  Search::Grep.pm
  Search::KinoSearch.pm
  Search::Rast.pm
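
The fgrep approach used for the list above can be turned into a small script that groups files by hook name. This is a sketch under assumptions: each registration is written as a quoted hook name followed by `=> \&`, and the sample sources below are made-up one-liners (the hook names match the list, the subroutine names are invented). `hooks_by_name` is a hypothetical helper.

```python
import re
from collections import defaultdict

# A quoted hook name followed by "=> \&", as in: 'publish.feed' => \&feed
HOOK_RE = re.compile(r"""['"]([\w.]+)['"]\s*=>\s*\\&""")


def hooks_by_name(sources):
    """Map each hook name to the sorted list of files that register it."""
    table = defaultdict(set)
    for filename, text in sources.items():
        for hook in HOOK_RE.findall(text):
            table[hook].add(filename)
    return {hook: sorted(files) for hook, files in table.items()}


# Made-up file contents, for illustration only.
SAMPLE_SOURCES = {
    "Publish/Debug.pm": "'publish.feed' => \\&feed,",
    "Filter/Rule.pm": "'smartfeed.entry' => \\&entry,\n'smartfeed.feed' => \\&feed,",
}

hooks = hooks_by_name(SAMPLE_SOURCES)
for name in sorted(hooks):
    print(name, hooks[name])
```

Like the fgrep, this misses hooks registered in any other form, so the caveat about omissions still applies.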

■ tDiary update

I upgraded tDiary.

I could not connect to CVS, and when I looked into it, the server name turned out to have changed. http://www.tdiary.org/ml/devel.rb?key=/mailarchive/forum.php%3Fthread_id%3D10315332%26forum_id%3D8349

I logged in with

cvs -d:pserver:anonymous@tdiary.cvs.sourceforge.net:/cvsroot/tdiary login

then rewrote the repository location in every Root file with

find . -name Root | xargs ruby -p -i.bak -e '$_.gsub!(/(cvs\.sourceforge\.net)/){ "tdiary.#{$1}" }'

after which cvs up worked.
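
The substitution performed by the Ruby one-liner can be sketched in Python as follows. The old Root line is an assumption inferred from the pattern being replaced. Unlike the one-liner, this version adds a lookbehind guard so that running it twice does not produce "tdiary.tdiary."; `rewrite_root` is a hypothetical name.

```python
import re

# Match the old host name unless it is already prefixed with "tdiary."
OLD_HOST = re.compile(r"(?<!tdiary\.)(cvs\.sourceforge\.net)")


def rewrite_root(line):
    """Prefix the SourceForge CVS host with the project name."""
    return OLD_HOST.sub(r"tdiary.\1", line)


old_root = ":pserver:anonymous@cvs.sourceforge.net:/cvsroot/tdiary\n"  # assumed
new_root = rewrite_root(old_root)
print(new_root, end="")
```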


2006-09-24

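■ [Plagger]Making URI::Fetch's NoNetwork configurable in EntryFullText

Filter::EntryFullText fetches the full text of each entry. It passes 3 hours as NoNetwork in the arguments to URI::Fetch, so as long as a cached copy exists, no request goes to the server for 3 hours.

Even after 3 hours have passed, the request is sent with If-Modified-Since, so the page is not downloaded every time. Still, with 100 entries, that means 100 requests.

If it is safe to assume that an entry never changes once it has been fetched, setting NoNetwork to a sufficiently long value would avoid those requests entirely.

So I made the NoNetwork value configurable.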

Index: lib/Plagger/Plugin/Filter/EntryFullText.pm
===================================================================
--- lib/Plagger/Plugin/Filter/EntryFullText.pm  (revision 1698)
+++ lib/Plagger/Plugin/Filter/EntryFullText.pm  (working copy)
@@ -106,7 +106,7 @@
     }

     # NoNetwork: don't connect for 3 hours
-    my $res = $self->{ua}->fetch( $args->{entry}->permalink, $self, { NoNetwork => 60 * 60 * 3 } );
+    my $res = $self->{ua}->fetch( $args->{entry}->permalink, $self, { NoNetwork => $self->conf->{no_network} || 60 * 60 * 3 } );
     if (!$res->status && $res->is_error) {
         $self->log(debug => "Fetch " . $args->{entry}->permalink . " failed");
         return;

Filter::Rule runs after Filter::EntryFullText, so it does nothing to reduce the number of requests.
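
The behaviour described above can be summarised as a three-way decision. The following Python sketch is a simplified model of that description, not of URI::Fetch's actual implementation. `fetch_plan` is a hypothetical function, and the exact boundary (strictly less than) is an assumption.

```python
DEFAULT_NO_NETWORK = 60 * 60 * 3  # 3 hours, the value hard-coded before the patch


def fetch_plan(cache_age, conf):
    """Decide how to fetch, given the cache age in seconds (None = not cached)."""
    # Same fallback as the patch: a missing (or zero) setting means 3 hours.
    no_network = conf.get("no_network") or DEFAULT_NO_NETWORK
    if cache_age is None:
        return "GET"
    if cache_age < no_network:
        return "use cache"  # no request is sent at all
    return "conditional GET"  # request with If-Modified-Since


print(fetch_plan(60 * 60, {}))
print(fetch_plan(60 * 60 * 4, {}))
print(fetch_plan(60 * 60 * 4, {"no_network": 60 * 60 * 24 * 30}))
```

One consequence of the `||` fallback in the patch is that setting no_network to 0 cannot disable the cache window; it falls back to 3 hours.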

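■ [Plagger]In RSS from Publish::Feed, the nobody@example.com in author apparently cannot be changed

A name can be added in parentheses, but the email address nobody@example.com is hard-coded in the source and apparently cannot be changed.

■ [Plagger]SmartFeed::All removes duplicates by each entry's permalink

So it does not work correctly unless every entry has its own unique permalink.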
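
Here is a minimal Python sketch of why unique permalinks matter, assuming the deduplication keeps the first entry seen for each permalink (which entry is kept is my assumption). The entries are made up for illustration and `dedupe_by_permalink` is a hypothetical helper.

```python
def dedupe_by_permalink(entries):
    """Keep the first entry for each permalink and drop later ones."""
    seen = set()
    kept = []
    for entry in entries:
        if entry["permalink"] in seen:
            continue
        seen.add(entry["permalink"])
        kept.append(entry)
    return kept


# Two distinct entries scraped from one page, both pointing at the page itself.
same_link = [
    {"title": "9/23", "permalink": "http://example.com/top/index.html"},
    {"title": "9/24", "permalink": "http://example.com/top/index.html"},
]
# The same entries with a per-entry fragment appended to the link.
with_fragment = [
    {"title": "9/23", "permalink": "http://example.com/top/index.html#9/23"},
    {"title": "9/24", "permalink": "http://example.com/top/index.html#9/24"},
]

print(len(dedupe_by_permalink(same_link)))
print(len(dedupe_by_permalink(with_fragment)))
```

This is why the extract_after_hook settings earlier in this diary append a fragment such as the date to each entry's link.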

■ [Plagger]CustomFeed::Config fix

http://plagger.g.hatena.ne.jp/Seacolor/20060921/1158820398

Matching failed when Japanese text was used inside the capturing parentheses, so I fixed it.

--- Config.pm   2006-09-23 16:28:40.079617696 +0900
+++ local/plagger/lib/Plagger/Plugin/CustomFeed/Config.pm       2006-09-24 19:41:15.876752960 +0900
@@ -131,7 +131,7 @@
             if ($content =~ /$extract/sg) {
                 $cur_pos = pos $content;
                 my $str = substr($content, $prev_pos, length($content));
-                if (my @match = $str =~ /$plugin->{extract}/s) {
+                if (my @match = $str =~ /$extract/s) {
                     my @capture = split /\s+/, $plugin->{extract_capture};
                     for my $m (@match) {
                         my $val = shift @capture;
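
The fix swaps the raw pattern for the decoded one, so the pattern and the content are both character strings. A rough Python analogue is shown below. It approximates Perl's handling of an undecoded byte string by decoding the bytes as Latin-1, which is an assumption about the cause, and the pattern and content are made up for illustration.

```python
import re

# The pattern as raw UTF-8 bytes, as it might arrive from a config file.
raw_pattern = "<b>(新着.*?)</b>".encode("utf-8")
# The page content, already decoded to characters.
content = "<p><b>新着: 3</b></p>"

# Undecoded: every byte is treated as one character (Latin-1 style).
undecoded = raw_pattern.decode("latin-1")
# Decoded: the bytes are interpreted as UTF-8, matching the content.
decoded = raw_pattern.decode("utf-8")

miss = re.search(undecoded, content)
hit = re.search(decoded, content)
print(miss)
print(hit.group(1))
```

ASCII-only patterns look identical either way, which would explain why the problem only showed up once Japanese appeared in the pattern.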


2006-09-30

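■ What I want in a keyboard

English layout
The key to the left of A, the Shift keys, the Enter key and so on must not be "cut down" (keyboards where the keytop does not cover the whole key are common)
A Windows key
Function keys

That is about all I can think of for now.

A keyboard that satisfies all of these is surprisingly hard to find.

Beyond that, of course, I also consider whether the actual typing feel suits me.

The one I use at home now was bought long ago (more than five years?) for less than 2,000 yen, and I like it very much.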

