Sunday, November 28, 2010

How to extract/back up/migrate your Google group

Just like John Resig said, "google group is dead". Google Groups has very limited functionality and has not been improved for a very long time. People in mainland China in particular even need special workarounds (SSH tunnel, VPN, Tor, etc.) to visit Google Groups.

 

I maintained a Google group for our baseball club for about 5 years. But since last year our club members have not been able to visit the web interface of our Google group, and newcomers cannot even sign in to the group. Most of our club members prefer to discuss in a web browser rather than by email. So I started to plan a migration.

 

Like what Resig did for the jQuery forum, I searched for a way to back up the posts in our Google group so I could import them into our new forum system. However, Google does not provide such a function. If I had an email account that only received mail from our Google group, I think I could extract the data from the mailbox. But I don't have such an email account.

 

It also took me a long time to choose the forum system. I considered many open source forum systems, such as phpBB, but most of them do not match the habits of Chinese users, until I found Discuz. It is written in PHP and uses MySQL as its backend database, it has all the features I can imagine, and most importantly, it is written by Chinese developers. So I decided to migrate our Google group to DiscuzX.

 

It took me about one week to complete my extract-google-group tool. I then successfully extracted all the posts of our Google group to SQL format and imported them into our new DiscuzX forum. So it is a tool that really works. However, it is a one-time-use tool, so I won't spend any more time on it.

 

Here is the code location: https://github.com/tangramor/extract-google-group

  • ExtractGoogleGroup.py: the core class file; you need to use the methods it provides to implement your own extraction script

     

  • GoogleGroupToDiscuzSql.py: my script to extract Google group data and transform it into DiscuzX SQL. I have used it to import more than 4,000 posts into our new DiscuzX forum

     

  • UTF8CSV.py: copied from the Python documentation; it reads CSV files in UTF-8 format

     

    To use ExtractGoogleGroup.py, you need to export the member information from your Google group (you will get a groupName_group_members.csv). The file is used to recover user names and full email addresses from the masked format user...@gmail.com (Google uses this to deter robots). There is probably a bug here, because two users can end up with the same masked address, but it was good enough for my Google group :D
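The lookup against the exported member list can be sketched roughly as follows. This is a minimal illustration and not the code from the repository: it assumes the masked form is a truncated local part followed by `...`, and that the CSV has a column named `email` (the column name and both function names are hypothetical). It returns every candidate, so the ambiguous case of two members sharing one masked address is visible rather than silently resolved.

```python
import csv
import io


def load_member_emails(csv_text, column="email"):
    """Read full addresses from the exported members CSV (column name is assumed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[column].strip() for row in reader if row.get(column)]


def resolve_masked(masked, members):
    """Return all member addresses compatible with a masked address."""
    local, _, domain = masked.partition("@")
    is_masked = local.endswith("...")
    prefix = local[:-3] if is_masked else local
    matches = []
    for email in members:
        m_local, _, m_domain = email.partition("@")
        if m_domain.lower() != domain.lower():
            continue
        if is_masked:
            # Only the beginning of the local part is visible.
            same_local = m_local.lower().startswith(prefix.lower())
        else:
            same_local = m_local.lower() == prefix.lower()
        if same_local:
            matches.append(email)
    return matches
```

A result list with more than one entry means the masked address is ambiguous and the post's author has to be decided some other way, for example from the display name.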

     

    Then you can create your own transformation script by referring to GoogleGroupToDiscuzSql.py
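As a rough idea of what such a transformation script does, here is a minimal sketch that turns already-extracted posts into SQL INSERT statements. It is not taken from GoogleGroupToDiscuzSql.py: the table name `example_forum_post`, the column names, and the post dictionary keys are all hypothetical placeholders, and the real DiscuzX schema has to be looked up before adapting it.

```python
def sql_quote(value):
    """Quote a string as a MySQL literal: escape backslashes first, then single quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def posts_to_sql(posts, table="example_forum_post"):
    """Build one INSERT statement per post (table and columns are placeholders)."""
    statements = []
    for post in posts:
        statements.append(
            "INSERT INTO {} (author, subject, message, dateline) "
            "VALUES ({}, {}, {}, {});".format(
                table,
                sql_quote(post["author"]),
                sql_quote(post["subject"]),
                sql_quote(post["message"]),
                int(post["timestamp"]),  # Unix timestamp of the post
            )
        )
    return "\n".join(statements)
```

Writing the statements to a file and loading that file with the mysql client keeps the extraction and the import as two separate steps, which makes it easier to inspect the SQL before touching the forum database.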

     

    I hope this tool is useful for forum administrators or Google group owners who want to back up the data of their groups.



  • Wednesday, November 24, 2010

    Problems installing Confluence 3.4 and how to solve them

    While installing Confluence 3.4.2 on Ubuntu Server 10.10, on top of an existing Tomcat6 server, I ran into the following problems and found solutions for most of them:

    1. Error message:

    You cannot access Confluence at present. Look at the table below to identify the reasons

    Type: bootstrap

    Description: Could not load bootstrap from environment. No server id found.

    Exception: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'bootstrapPluginManager' defined in class path resource [setupContext.xml]: Cannot resolve reference to bean 'bootstrapBundledPluginLoader' while setting constructor argument with key [1]; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'bootstrapBundledPluginLoader': FactoryBean threw exception on object creation; nested exception is java.lang.NullPointerException

     

    On Ubuntu this problem is most likely caused by using OpenJDK, because Confluence does not currently support OpenJDK. You need to install Sun JDK 1.6 and some Xorg packages.

    1.1. Sun JDK:

    Edit /etc/apt/sources.list, find the two lines containing "partner", uncomment them, and run:

    $ sudo apt-get update

    $ sudo apt-get install sun-java6-jdk

    Then set the system default JDK to the Sun JDK:

    $ sudo update-alternatives --config java
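To confirm which JVM ends up as the default after this step, the output of `java -version` can be checked. The following is a small sketch under the assumption that OpenJDK builds identify themselves with the word "OpenJDK" in that output; the function names are made up for illustration.

```python
import subprocess


def default_java_version_output():
    """Return the text printed by `java -version`, or "" if java is not on PATH."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return ""
    # The JVM traditionally prints its version banner to stderr.
    return result.stdout + result.stderr


def is_openjdk(version_output):
    """Heuristic: OpenJDK banners contain the word 'OpenJDK'."""
    return "openjdk" in version_output.lower()
```

Calling `is_openjdk(default_java_version_output())` should return False once the Sun JDK has been selected as the default.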

     

    1.2. Xorg packages:

    $ sudo apt-get install libice-dev libsm-dev libx11-dev libxext-dev libxp-dev libxt-dev libxtst-dev

     

    2. Create the PostgreSQL database

    $ sudo su - postgres

    $ psql

    postgres=# create user confluence password 'XXXXXX' CREATEDB;
    CREATE ROLE

    postgres=# set role confluence;
    SET

    postgres=> create database wiki;

    postgres=> \q

     

    3. java.lang.OutOfMemoryError: Java Heap Space

    This problem is more troublesome, but it is a known issue with a ready-made solution. On Ubuntu Server, edit /etc/default/tomcat6 and change the line:

    JAVA_OPTS="-Djava.awt.headless=true -Xmx128m"

    to:

    JAVA_OPTS="-Djava.awt.headless=true -Xms128m -Xmx1024m -XX:MaxPermSize=256m"
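If the same change has to be applied on more than one server, the edit can be scripted. Below is a minimal sketch that works on the file contents as a string; the function name is hypothetical, and reading and writing /etc/default/tomcat6 (which needs root) is left to the caller.

```python
import re

NEW_JAVA_OPTS = (
    'JAVA_OPTS="-Djava.awt.headless=true -Xms128m -Xmx1024m '
    '-XX:MaxPermSize=256m"'
)


def set_java_opts(config_text, new_line=NEW_JAVA_OPTS):
    """Replace the first uncommented JAVA_OPTS= line, or append one if missing."""
    pattern = re.compile(r"^JAVA_OPTS=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # A function replacement avoids backslash handling in the new text.
        return pattern.sub(lambda match: new_line, config_text, count=1)
    separator = "" if not config_text or config_text.endswith("\n") else "\n"
    return config_text + separator + new_line + "\n"
```

Tomcat has to be restarted afterwards for the new heap and PermGen limits to take effect.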

     

    4. After installation the pages do not render correctly; some CSS and other resource files, such as batch.css, cannot be found

    I have not found a solution to this problem yet...