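Because debian.org.tw, like many community sites, is maintained by a team in which many members have root access, poor coordination often means that one person changes a setting and the next person changes it back, which makes the configuration hard to manage. So we need a system that can track and record every configuration change.

One approach is to put the configuration directory under a version control system such as svk, bzr, or mercurial. But people often forget to commit the configuration files they changed, so later nobody remembers who made the change. I recently installed metche and found that it covers most of my needs. Its basic job is to monitor the configuration files under /etc and, when something changes, email a diff to a designated address. By default the diff only lists the files that changed, but metche can also be asked to show the full details of every modification. Showing the full changes, however, risks mailing out sensitive information such as passwords. To avoid leaking it, metche can be configured to encrypt the outgoing mail with PGP; you only need to make sure the recipients' PGP keys are in root's keyring.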
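To make the snapshot-and-diff idea concrete, here is a rough shell sketch. It is not metche's implementation or its configuration format; the function name report_changes and its two directory arguments are made up for illustration, and the mailing and PGP steps are left out.

```sh
#!/bin/sh
# Hypothetical helper, not part of metche.
# report_changes SNAPSHOT_DIR LIVE_DIR
# Prints "no changes" when both trees are identical, otherwise a
# unified diff. The full diff may contain secrets such as passwords,
# which is why encrypting the report before mailing it matters.
report_changes() (
    snapshot=$1
    live=$2
    if changes=$(diff -ru "$snapshot" "$live"); then
        echo "no changes"
    else
        printf '%s\n' "$changes"
    fi
)
```

A cron job could, for example, compare last night's copy of /etc with the live directory and pipe the result to a mailer.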


It has been almost a year since I last maintained the wiki.debian.org.tw web site. Since I joined my current company, I have spent all my time dealing with routine jobs every single day. After finishing work I don’t even want to touch my laptop at home.

Lately, wiki.debian.org.tw has become more and more unstable. Over the last couple of weeks people have often seen `Service is not available’ pages. One reason is that the disk is full; the other is that there are too many spam articles.

Finally, I spent a few hours on the site this weekend. The first thing I did was upgrade the server and the MediaWiki software. Frankly, it was not hard at all, since the wiki is installed in a Debian-based vserver. All I needed to do was run `aptitude dist-upgrade’ to upgrade the distribution from sarge to etch. Then I synced the MediaWiki source tree from 1.7.1 to 1.11.1. That was also very easy, since MediaWiki provides an upgrade script that checks and modifies the database schema.

The real problem was the thousands of spam articles. I had not dealt with the spam problem for a long time, and most of the wiki moderators do not check for spam frequently, so spammers could easily post lots of articles without supervision. Even if the moderators visited the wiki often, it would still be impossible to delete the spam through the web interface, because there are too many spammers.

Anyhow, the result was thousands of spam articles in the database. Most of them are advertisements for venereal disease treatments, offering to help you deal with syphilis, gonorrhea and herpes. I almost wanted to rename the wiki `SafeSexpedia’, since it had become such an informative knowledge base.

Still, I could not stand the spam situation. The first two things I did were to install the reCAPTCHA MediaWiki extension, so people need to pass a CAPTCHA when they register an account, and to enable $wgEmailConfirmToEdit, which allows only accounts with a confirmed email address to edit pages. These two measures should be good enough to stop new spammers. However, the real problem is the spam articles already in the database.

To clean up the database, I looked at several extensions such as Nuke, but found them inconvenient for cleaning up thousands of spam articles, so I decided to use the APIs. The good news is that there are two scripts in the mediawiki/maintenance folder, cleanupSpam.php and removeUnusedAccounts.php. cleanupSpam seems to fit my requirements: it takes a URL as its argument, finds every article that contains the URL, and removes it.

However, I didn’t want to check the articles one by one looking for URLs. Since most of the spammers on wiki.debian.org.tw are from China, most of them use email addresses at 163.com. The easiest way for me is to clean up all the accounts from 163.com and all the articles posted by those accounts. Of course, I cannot simply delete those articles, because spammers can modify any article they want; I might remove important articles that were merely modified by spammers.

So I needed a script that finds the accounts matching a particular email address or nickname, and then finds all the articles modified by each account. For each article:

  • If the account is not the latest editor, ignore the article, because someone may already have fixed the content manually.
  • If the account is the latest editor, created the article, and the article has a single version, simply delete it.
  • If the account is the latest editor and there are earlier versions, find the last version edited by a valid account and restore the article to that version, so the article gets back the correct content from before the spammer put links into it.
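The three rules can be sketched as a small shell function. This is only an illustration of the decision logic, not code from the actual PHP script: the name decide_action is made up, and it takes the spam account followed by the editor of each revision, oldest first. One assumption goes beyond the rules above: an article whose every revision was written by the spam account is treated as deletable even if it has more than one version.

```sh
#!/bin/sh
# Hypothetical helper illustrating the rules; not part of MediaWiki.
# decide_action SPAM_ACCOUNT EDITOR_REV1 EDITOR_REV2 ...
# Prints "ignore", "delete", or "restore N", where N is the 1-based
# position of the last revision made by a non-spam account.
decide_action() (
    spammer=$1
    shift
    last_valid=0   # position of the newest revision by a valid account
    latest=        # editor of the newest revision
    i=0
    for editor in "$@"; do
        i=$((i + 1))
        latest=$editor
        if [ "$editor" != "$spammer" ]; then
            last_valid=$i
        fi
    done
    if [ "$latest" != "$spammer" ]; then
        echo "ignore"                 # someone else edited after the spammer
    elif [ "$last_valid" -eq 0 ]; then
        echo "delete"                 # assumption: nothing but spam revisions
    else
        echo "restore $last_valid"    # roll back to the last clean revision
    fi
)
```

For example, `decide_action spam alice bob spam` prints `restore 2`, because bob's revision is the last one made before the spam edit.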

I created another script based on the maintenance samples; thanks to their developers. With the script, I deleted hundreds of accounts and more than two thousand articles in a few hours. If you are interested in the script, you can download it from here. Put it in your mediawiki/maintenance folder. The usage is very simple:

USAGE: php removeSpamAccountsAndPost.php [--delete] email

It takes only one parameter: you can find the articles by nickname or email. My database is MySQL, so you can use ‘%’ as a wildcard in the LIKE statement.

php removeSpamAccountsAndPost.php chihchun
php removeSpamAccountsAndPost.php chihchun%

By default the script only gives you a list to preview. If you are sure these accounts and articles should be deleted, add `--delete’ to let the script REALLY DELETE THE ACCOUNTS AND ARTICLES for you.

php removeSpamAccountsAndPost.php --delete chihchun

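In the middle of last year I set up SmokePing to monitor network latency for several services at a certain company. I chose SmokePing because it supports several protocols, so I can monitor DNS, the SSH daemon, RADIUS, web, SMTP and more in one go. SmokePing's architecture is also quite modular; modifying a few Perl scripts was enough to meet my needs quickly.

But since the network services are being probed all the time, email notification alone is not immediate enough. So I decided to add SMS notification. I looked at a few SMS providers and settled on the cheap PCHOME one-dollar SMS service as a stopgap. Thanks to SnowFLY (飄然似雪) for writing the Net-SMS-PChome Perl module for PCHOME SMS, which saved me a lot of work. As a result I am often woken up by text messages in the middle of the night.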

However, the version on CPAN dates from 2006 and is not quite compatible with the current PCHOME web pages. After a few small modifications, it looks like ...

Here is my little script for dumping svn revision trees `incrementally’. It checks every svn repository located under /home/svn and saves each revision as a separate file in /home/backup.

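#!/bin/bash
# bash is required for the C-style for loop below.

for dir in /home/svn/* ; do
    name=$(basename "${dir}")
    # Number of the most recent revision in this repository.
    version=$(svnlook youngest "${dir}")
    # Revision 0 is the empty initial state, so start at 1 and
    # include the youngest revision itself.
    for ((r=1;r<=version;r++)) ; do
	# Skip revisions already dumped by an earlier run.
	if [ ! -f "/home/backup/${name}-$r.gz" ] ; then
		svnadmin dump "${dir}" -r "$r" --incremental | \
			gzip -9 > "/home/backup/${name}-$r.gz"
	fi
    done
done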
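The post does not cover restoring, so here is a minimal sketch of how the per-revision dumps might be replayed. It assumes the files are named NAME-REV.gz as in the script above and that the target repository does not exist yet. restore_order and restore_repo are hypothetical helper names, not svn commands; only `svnadmin create` and `svnadmin load` are real.

```sh
#!/bin/sh
# restore_order BACKUP_DIR NAME
# Prints the dump files for repository NAME in ascending revision order.
restore_order() (
    backup_dir=$1
    name=$2
    for f in "$backup_dir/$name"-*.gz; do
        [ -f "$f" ] || continue
        rev=${f#"$backup_dir/$name-"}   # drop the directory and name prefix
        rev=${rev%.gz}                  # drop the .gz suffix
        case $rev in
            ''|*[!0-9]*) continue ;;    # not a plain revision number
        esac
        echo "$rev $f"
    done | sort -n | cut -d' ' -f2-
)

# restore_repo BACKUP_DIR NAME TARGET
# Creates TARGET and loads every dump into it, oldest revision first.
restore_repo() (
    svnadmin create "$3" || exit 1
    restore_order "$1" "$2" | while IFS= read -r f; do
        gunzip -c "$f" | svnadmin load "$3" > /dev/null || exit 1
    done
)
```

Sorting numerically matters: with plain glob order, revision 10 would be loaded before revision 2 and the load would fail.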

Have fun! This is a tip.

If you have read my blog entry about setting up Debian.org.tw, you probably already know that I love to put a reverse proxy in front of my web servers. This approach solves the problem of sharing a single IP address among multiple vservers, and it also provides a web cache that reduces the server load.

Since the proxy server (Squid) passes the HTTP session on to the real web servers, one problem is that my web servers always see a single source IP address, namely the proxy’s. Even though the proxy server puts the client’s IP address in the `X-Forwarded-For’ HTTP header, it is still painful to retrieve the correct IP address from the header in every web application.

Thanks to Thomas Eibner, who wrote the reverse proxy add forward module (mod_rpaf) for Apache. The module simply checks whether a request comes from the proxy server; if it does, it replaces the remote address with the client IP address from the `X-Forwarded-For’ header, and it can copy `X-Forwarded-Host’ or `X-Host’ into the `Host’ header. So you no longer need to worry about the wrong IP address, and you can track HTTP requests more easily.
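The underlying rule is easy to sketch in shell. This is not the module's code, only an illustration of the idea: trust `X-Forwarded-For’ solely when the connection really comes from the known proxy. The function name effective_ip and its arguments are made up, and the sketch takes the rightmost entry of the header, on the assumption that this is the address the trusted proxy itself appended.

```sh
#!/bin/sh
# Hypothetical helper, not part of Apache or mod_rpaf.
# effective_ip REMOTE_ADDR X_FORWARDED_FOR PROXY_IP
# Prints the address the application should treat as the client.
effective_ip() (
    remote_addr=$1
    xff=$2
    proxy_ip=$3
    if [ "$remote_addr" = "$proxy_ip" ] && [ -n "$xff" ]; then
        # The header is a comma-separated list; keep the last entry
        # and strip the surrounding whitespace.
        last=${xff##*,}
        printf '%s\n' "$last" | tr -d ' \t'
    else
        # Not from the trusted proxy: the header could be forged.
        printf '%s\n' "$remote_addr"
    fi
)
```

A request arriving directly from an unknown address keeps its own address, no matter what the header claims.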

The Debian package was made by Piotr Roszatycki, but it is still at the old 0.5 version. Since 0.6 is out, I filed a bug report to remind him. For my etch servers, I back-ported the package with the latest version. You can download it from my personal repository.

BTW, Piotr Roszatycki, who is also the maintainer of yada, uses yada for libapache2-mod-rpaf. After reading yada’s script file `debian/packages’, I really felt like I was back in my `good’ old days with RPM/specs. :p

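Over the Lunar New Year holiday, to relieve the tension of everyday work, I wanted to take a few novels to a quiet place, spend a few days there, and clear my head. This winter has been quite cold, so on weekends in Taipei I occasionally go to Beitou for the hot springs. On holidays I also like to soak in hot springs in the mountains or stay by the sea and watch the ocean. After spending quite some time researching guesthouses in eastern Taiwan, I finally picked one far from the bustle: Zhiben Mijing (知本秘境).

The guesthouse sits halfway up Leshan/Yaoshan in Zhiben, away from the annoying Zhiben hot spring street. If you go to Zhiben you will inevitably pass the guesthouse and hotel touts who never stop pestering people on that street. Once you have safely dodged the overly hospitable touts who like to flag people down from their scooters on the narrow street, drive on for about five minutes, and you can follow the uphill driveway below the Hotel Royal Chihpen (知本老爺) along a small lane to the guesthouse on the hillside.

The owner originally lived in Luye and, to fulfil a dream, moved to Zhiben once the children had grown up. At first the owner did not intend to build a guesthouse here, but rather a place for friends and relatives to drink tea; later, for various reasons, it was turned into a guesthouse open to the public. That is how I got the chance to enjoy the special pleasure of soaking in a hot spring while taking in the beautiful mountain scenery.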

The front gate of Zhiben Mijing

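What makes this place special is that it lies in the mountains with no other residents around, so it is very quiet. The hosts keep the surroundings beautifully tended, full of colourful flowers and plants. They are warm and friendly; even a Geek like me who is not good with words could chat happily with them. Staying here is very good for your health, and since the rooms have no TV or telephone, it is easy to cut off every source of information, which makes it ideal for relaxing completely and getting away from code for a while.

The bathroom in each room separates the wet and dry areas, and the bath is built from cypress wood and stone slabs, with room to spare even for two people. Each bath has floor-to-ceiling windows on two sides, so while soaking you can see the sky or the mountains across the Zhiben River. Unfortunately, during my four days and three nights in Zhiben the weather never cleared, so I had no chance to enjoy the starry night sky. The hot spring water is gentle and smooth, and after a soak I often fell asleep right away and slept until dawn. Unless the macaques are running wild (as they did one night), you can calm your mind on the balcony and hear the mountains, the birds and the insects. During my stay I also enjoyed healthy, light food, often mountain vegetables grown by the hosts themselves.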

A room at Zhiben Mijing; the bath at Zhiben Mijing

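On ordinary days the rate is NT$2,000 per night for a double room. Compared with the Hotel Royal Chihpen next to the hill, the facilities, environment and price are all better, so it is a real bargain. If you do not mind the gap in service between a guesthouse and a professional hotel, it is definitely a place worth going to for a holiday.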

Photo taken from the forest trail in Zhiben Forest Park: the detached building among the trees halfway up the mountain is Zhiben Mijing.

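Zhiben Mijing guesthouse (知本祕境民宿)

  • No. 58, Lane 113, Longquan Road, Wenquan Village, Beinan Township, Taitung County 954 (954台東縣卑南鄉溫泉村龍泉路113巷58號)
  • Take the small road beside the Hotel Royal Chihpen until you see a pavilion. There are no street lights at night and the road is narrow, so it is best to arrive in daylight.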


Update: 2008-09-05, revised and expanded the wording.