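The previous post mentioned that debconf's noninteractive frontend lets you install software without prompts. You can also use debconf to produce a configuration file for preseeding (the Chinese edition of the sarge manual covers preseeding as well). Preseeding is a way to build an "unattended" installation disc: you answer in advance every question the installer would ask, so you can prepare an installation disc or configuration file with a chosen set of packages and settings, then install machines automatically from it. With debconf's help you can first set up one Debian host as the "seed" and then copy its configuration to other newly installed machines. It is quite easy: use debconf-get-selections to fetch all of the seed machine's settings. With the option below, the program prints a sample that you can then edit to suit your needs.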

# debconf-get-selections --installer

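The syntax is as described in the manual. The finished preseed.cfg can be placed on the network or on a disk, and you then point a regular Debian installation disc at its path. The commands themselves are not hard, but tuning the file until it installs and configures everything smoothly still takes a fair amount of time. For details, see Automating new Debian installations with preseeding.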

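AndrewLee used preseeding to build Debian for Beginners, which bundles Chinese-language packages and commonly used software. He has also fixed quite a few problems along the way, so anyone interested in Debian may want to try installing it. If you just want to pick up the techniques, go digging in http://debian.org.tw/d-i/etch/preseed.cfg.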

Once you have many Debian servers, maintenance becomes a problem. I just counted more than 30 etch servers running across several vserver machines. Sometimes I want to install a Debian package on all of them, but it takes too much time to log in to every host over ssh or vserver enter and answer the installation questions one by one.

Thanks to debconf(1), non-interactive installation is quite easy, since debconf already provides a noninteractive frontend. All you need to do is set the configuration before you install the package, which can be done with debconf-set-selections.

First, install the package on one host. It is best to install and test it on the same distribution release and package version as the target servers. Here is an example using localepurge, a tool that removes superfluous locale data and so saves some disk space. As a Chinese speaker, I usually don't need Spanish, French, or the hundreds of other locales.

Once localepurge is installed, you can use debconf-get-selections, which is part of the debconf-utils package, to dump the configuration you chose. The command looks like this:

# debconf-get-selections |grep ^localepurge
localepurge	localepurge/quickndirtycalc	boolean	true
localepurge	localepurge/remove_no	note
localepurge	localepurge/mandelete	boolean	true
localepurge	localepurge/showfreedspace	boolean	true
localepurge	localepurge/verbose	boolean	false
localepurge	localepurge/nopurge	multiselect	en, en_US.UTF-8, zh, zh_TW, zh_TW.UTF-8
localepurge	localepurge/dontbothernew	boolean	false
localepurge	localepurge/none_selected	boolean	false

These are the questions debconf will ask you. (Because the questions have different priorities, you might not be asked all of them.) The localepurge/nopurge line lists the locales we want to keep, and we want the other servers to have the same setting. You can use debconf-set-selections to set the value on the other servers.

# echo "localepurge localepurge/nopurge multiselect en, en_US.UTF-8, zh, zh_TW, zh_TW.UTF-8"|debconf-set-selections

Now you can install the package, and it will use the values you just gave as defaults. If you need to install on many servers and do not want to see the question dialogs, you can use the noninteractive frontend to bypass the questions.

# DEBIAN_FRONTEND=noninteractive dpkg-reconfigure localepurge
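For a larger group of servers, the dump-and-replay steps above could be scripted. The following is a minimal Python sketch, not part of the original workflow. The function names `filter_selections` and `push_to_host` are hypothetical, and the sketch assumes root SSH access to each server and that the package is installed with `apt-get`.

```python
import subprocess

# Question types that carry a value worth replaying; "note" entries have none.
VALUE_TYPES = {"boolean", "multiselect", "select", "string"}


def filter_selections(dump, package):
    """Keep the value-carrying lines of a debconf-get-selections dump for one package."""
    kept = []
    for raw in dump.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # skip blank lines and comments
        # Fields are owner, question, type, value; the value may contain spaces.
        parts = raw.split(None, 3)
        if len(parts) < 3:
            continue
        owner, question, qtype = parts[:3]
        value = parts[3] if len(parts) == 4 else ""
        if owner == package and qtype in VALUE_TYPES:
            kept.append(f"{owner} {question} {qtype} {value}".rstrip())
    return kept


def push_to_host(host, selections, package):
    """Preseed the answers on a remote host, then install without prompts.

    Assumption: `host` accepts SSH logins with enough privilege to run
    debconf-set-selections and apt-get.
    """
    payload = "\n".join(selections) + "\n"
    # debconf-set-selections reads the answers from standard input.
    subprocess.run(["ssh", host, "debconf-set-selections"],
                   input=payload, text=True, check=True)
    subprocess.run(
        ["ssh", host,
         f"DEBIAN_FRONTEND=noninteractive apt-get install -y {package}"],
        check=True)
```

On the seed host you would feed the output of `debconf-get-selections` to `filter_selections(dump, "localepurge")` and then call `push_to_host` once per server. Filtering first keeps unrelated answers from the seed machine out of the other hosts' debconf databases.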

This is a tip for Debian systems.

Thanks to clkao (高大師) for the great svn-mirror tool. I am using svn-mirror 0.68-3 on Debian Etch to mirror an svn repository from a very distant and slow European svn server, so I can enjoy much faster checkouts and read the log messages on a local server.

However, the svn server is slow and the connection is unstable (otherwise I would not need svn-mirror at all), so the connection may drop or the process may be killed by accident during the long mirroring run. The problem is that once the program is killed (for example with Control+C), it is left in a deadlock: you keep seeing the following message, and the mirror never works again.

Waiting for sync lock on /mirror/remote: openwrt:25221.

To fix the problem, I wrote a simple script, svn-mirror-unlock.pl, which clears the stale lock.

$ svn-mirror-unlock.pl
svn-mirror-unlock.pl: unlock SVMREPOS path
$ perl svn-mirror-unlock.pl unlock /home/svn mirror/remote

This is a tip for Debian.

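As a network administrator I often need to calculate network addresses. The arithmetic is not complicated, and a pencil and a brain are enough to work out the network you need, but a tool makes it more convenient. I used to use ipcalc, but having to type a command for every calculation was a bit of a nuisance, so I recently switched to gip. gip is a GTK2-based graphical IPv4 address calculator, and it is quite handy as a small network planning tool. (In practice I usually only use the IPv4 Address Analyzer.)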

IPv4 Address Analyzer

IPv4 Range to Prefix Converter.
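The two calculations named above, address analysis and range-to-prefix conversion, can also be reproduced in a few lines of code. This is a small sketch using Python's standard `ipaddress` module rather than gip itself. The function names `analyze` and `range_to_prefixes` are illustrative only.

```python
import ipaddress


def analyze(cidr):
    """Return the basic facts about the network that contains `cidr`."""
    # strict=False lets us pass a host address such as 192.168.1.10/24.
    net = ipaddress.ip_network(cidr, strict=False)
    # For prefixes shorter than /31, the network and broadcast
    # addresses are not usable as host addresses.
    if net.prefixlen < 31:
        usable = net.num_addresses - 2
    else:
        usable = net.num_addresses
    return {
        "network": str(net.network_address),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        "usable_hosts": usable,
    }


def range_to_prefixes(first, last):
    """Cover the inclusive range first..last with the fewest CIDR prefixes."""
    blocks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last))
    return [str(block) for block in blocks]
```

For example, `analyze("192.168.1.10/24")` reports the 192.168.1.0 network with 254 usable hosts, and `range_to_prefixes("10.0.0.0", "10.0.0.2")` splits the range into a /31 and a /32.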

This is a Debian tip.

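debian.org.tw and many other community sites are systems maintained by teams, and many team members have root access. Without good coordination, one person changes a setting and the next changes it back, which makes the configuration hard to manage. So we need a system that can track and record every configuration change.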

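One approach is to put the configuration directory under a version control tool such as svk, bzr, or mercurial. But people often forget to commit the files they changed, and later nobody remembers who touched what. I recently installed metche and found that it largely meets my needs. Its basic function is to watch the configuration files under /etc and, when something changes, email a diff to a designated address. By default the report lists the changed files, and metche can also be asked to show the full details of every change. Showing the full changes, however, might mail out sensitive information such as passwords. To avoid leaks, metche can be configured to encrypt the outgoing mail with PGP; you only need to make sure the recipient's PGP key is in root's keyring.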


It has been almost a year since I last maintained the wiki.debian.org.tw web site. Since joining my current company, I have spent all my time on routine work every single day. I don't even want to touch my laptop at home after work.

Lately wiki.debian.org.tw has become more unstable. Over the last couple of weeks people have often seen "Service is not available" pages. One reason is that the disk was full; the other is that there were too many spam articles.

Finally, I spent a few hours on the site this weekend. The first thing I did was upgrade the server and the MediaWiki software. Frankly, that was not hard, since the wiki is installed in a Debian-based vserver. All I needed to do was run "aptitude dist-upgrade" to upgrade the distribution from sarge to etch. Then I synced the MediaWiki source tree from 1.7.1 to 1.11.1. That was also easy, since MediaWiki provides an upgrade script that checks and modifies the database schema.

The real problem was the thousands of spam articles. I had not dealt with spam for a long time, and most of the wiki moderators do not check for it frequently, so spammers could easily post many articles unsupervised. Even if the moderators visited the site often, it would still be impossible to delete the spam through the web interface, because there are too many spammers.

Anyhow, the result was thousands of spam articles in the database. Most of them advertise treatments for venereal diseases such as syphilis, gonorrhea, and herpes. I almost wanted to rename the wiki "SafeSexpedia", since it had become such an informative knowledge base.

Still, I could not put up with the spam. The first two things I did were to install the reCAPTCHA MediaWiki extension, so people must pass a CAPTCHA when they register an account, and to enable $wgEmailConfirmToEdit, which allows only accounts with a confirmed email address to edit pages. These two measures should be good enough to stop new spammers. However, the real problem was the spam already in the database.

To clean up the database, I looked at several extensions such as Nuke, but found them inconvenient for removing thousands of spam articles, so I decided to use the APIs. Helpfully, there are two scripts in the mediawiki/maintenance folder, cleanupSpam.php and removeUnusedAccounts.php. cleanupSpam seemed to fit my requirements: it takes a URL as its argument, finds every article that contains the URL, and removes it.

However, I did not want to go through the articles one by one looking for URLs. Most of the spammers on wiki.debian.org.tw are from China, and most of them use email addresses at 163.com. The easiest way for me was to clean up all the accounts from 163.com along with all the articles posted by those accounts. Of course I could not simply delete every article they touched: spammers can modify any article they want, so I might remove important articles that spammers had merely edited.

So I needed a script that finds the accounts with a given email address or nickname, and then finds all the articles modified by those accounts. For each article:

  • If the account is not the latest editor, we ignore the article, because someone may already have fixed the content manually.
  • If the account is the latest editor, created the article, and the article has a single version, we simply delete it.
  • If the account is the latest editor and there are earlier versions, we find the last version edited by a valid account and restore the article to that version, so the article gets back the content it had before the spammer put links into it.
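The three rules can be expressed as one small decision function. This is an illustrative Python sketch, not the PHP maintenance script itself. The function name `decide` and the representation of an article as a list of editor names are assumptions made for the example. It also treats an article whose every revision comes from a flagged account as a deletion, which generalizes the single-version rule.

```python
from typing import Optional, Sequence, Set, Tuple


def decide(editors: Sequence[str], flagged: Set[str]) -> Tuple[str, Optional[int]]:
    """Choose an action for one article.

    `editors` holds the editor of each revision, oldest first.
    `flagged` is the set of account names matched as spammers.
    Returns (action, revision_index); the index is set only for "restore".
    """
    # Rule 1: someone else edited last, so the page may already be fixed.
    if not editors or editors[-1] not in flagged:
        return ("ignore", None)
    # Rule 3: walk back to the most recent revision by a valid account.
    for index in range(len(editors) - 1, -1, -1):
        if editors[index] not in flagged:
            return ("restore", index)
    # Rule 2 (generalized, an assumption): no valid revision exists at all.
    return ("delete", None)
```

For instance, `decide(["alice", "bob", "spam1", "spam1"], {"spam1"})` points at revision index 1, the last edit by bob, as the version to restore.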

I wrote another script based on the maintenance samples; thanks to their developers. With the script, I deleted hundreds of accounts and more than two thousand articles in a few hours. If you are interested in the script, you can download it from here. Put it in your mediawiki/maintenance folder. The usage is very simple:

USAGE: php removeSpamAccountsAndPost.php [--delete] email

It takes only one parameter, which is matched against nickname or email. My database is MySQL, so you can use '%' as a wildcard in the LIKE pattern.

php removeSpamAccountsAndPost.php chihchun
php removeSpamAccountsAndPost.php chihchun%
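To see how the '%' wildcard changes what gets matched, here is a small self-contained Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL (both treat '%' in LIKE as "any sequence of characters"). The `accounts` table, its columns, and the sample rows are made up for the demonstration and are not the MediaWiki schema.

```python
import sqlite3


def find_accounts(conn, pattern):
    """Return the names of accounts whose name or email matches a LIKE pattern."""
    rows = conn.execute(
        "SELECT name FROM accounts "
        "WHERE name LIKE ? OR email LIKE ? ORDER BY name",
        (pattern, pattern),
    )
    return [name for (name,) in rows]


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)",
    [
        ("chihchun", "chihchun@example.org"),
        ("chihchun2", "other@example.org"),
        ("seller", "seller@163.com"),
    ],
)

print(find_accounts(conn, "chihchun"))    # exact match only
print(find_accounts(conn, "chihchun%"))   # anything starting with "chihchun"
print(find_accounts(conn, "%163.com"))    # any email ending in 163.com
```

Without a wildcard the pattern behaves like an exact match, so a trailing or leading '%' is what widens the search to a whole family of accounts.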

By default the script only prints a list for preview. If you are sure these accounts and articles should be deleted, add "--delete" to make the script ACTUALLY DELETE THE ACCOUNTS AND ARTICLES.

php removeSpamAccountsAndPost.php --delete chihchun