1. Basic Info 

1.1. Usage 

echo ./path/to/file.ext | wiki_import.sh --header --link --footer --sect='Customer Support' -R

1.2. Info 

wiki_import.sh — mediawiki automatic file import script

1.3. Source 

http://xpt.sourceforge.net/tools/

https://sourceforge.net/project/showfiles.php?group_id=163815

1.4. Description 

The script is designed to import a whole folder of files into mediawiki, with the folder directory tree mapped as wiki category hierarchy.

1.5. Features 

2. Help 

2.1. Quick Help 

wiki_import.sh $ $Revision: 1.1 $

mediawiki automatic file import script

 Usage: wiki_import.sh [OPTIONS]...

The script is designed to import a whole folder of files into mediawiki, with
the folder directory tree mapped as wiki category hierarchy.

The specification of the file-to-import is passed from standard input.

Options:
  -s, --sect=n     the root category section of the wiki
                     of the imported article (mandatory)
  -1, --header     include standard header (category hierarchy path & notice)
  -l, --link       link to actual file on the web site
  -f, --footer     include standard footer (article category)
  -R, --res[=p]    add restricted tag in the footer
                     as '{{<Res Param|Root Category> Restricted}}'
                     (default=`$_opt_sect')

Configuration Options:
  -p, --php=fn     mediawiki import php script specification
  -r, --root=n     the root category name for the whole wiki site
  -m, --max=n      max_allowed_packet for mysqld to import
  -u, --user=n     wiki user used for the import
  -a, --arch=p     the root url that linked-to archive files based on

Examples:

  echo ./path/to/file.ext | wiki_import.sh -1 -l -f -s 'Customer Support' -R

3. Version 1.1 

3.1. Files 

wiki_import.sh

wiki_import.sh.clp

wiki_import.sh.hlp

wiki_import.sh.ini

3.2. Installation 

make
make install

4. Usage 

4.1. Preface: what problem is the script meant to solve 

The script is useful for the following scenario.

A wiki is a good solution for building a collaborative knowledge base, and wiki_import.sh is a way to jump-start such a wiki from your existing document collections.

You want to map the existing folders into a tidy hierarchy, much like the Yahoo directory, so that every file underneath a given folder goes into the corresponding wiki "directory" (category). This is exactly what the script does.

4.2. Using the script 

The script reads the list of files to process from standard input and imports them into the wiki. Many command line parameters help guide the import, but only one is mandatory: the section name, i.e., the root category section of the wiki that the article should be imported into. The simplest way to invoke the script is:

echo ./path/to/file.ext | wiki_import.sh -s 'Example Wiki Section Name'

'wiki_import.sh' determines the file type from the file extension and imports the given file accordingly. That is, whether the extension is txt, rtf, doc, xls, html, pdf, etc., 'wiki_import.sh' takes the appropriate action to import the file's content correctly.
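
Dispatching on the file extension can be pictured as a simple case statement. The following is only a sketch of the idea, not the script's actual code: the helper name file_kind and the labels plain/convert/unknown are hypothetical, and the extension list is just the one mentioned above.

```shell
# file_kind PATH: classify a file by its lowercased extension.
# Hypothetical helper; the labels are illustrative assumptions.
file_kind() {
    ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        txt)                  echo plain ;;    # can be imported as-is
        rtf|doc|xls|html|pdf) echo convert ;;  # needs text extraction first
        *)                    echo unknown ;;  # not a recognised type
    esac
}
```

For example, file_kind "./path/to/Report.PDF" prints "convert".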

To import every file under the current directory into the wiki:

find . -type f | wiki_import.sh ...
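
Because the file list arrives on standard input, it can be filtered and reviewed before it reaches the script. A minimal sketch, assuming a hypothetical helper list_importable (not part of wiki_import.sh) and an arbitrary choice of extensions; it only prints candidate paths and imports nothing by itself:

```shell
# list_importable DIR: print the regular files under DIR whose extension
# is one of a few document types, sorted for easy review.
# Hypothetical helper; the extension list is an assumption, adjust as needed.
list_importable() {
    find "$1" -type f \( -name '*.txt' -o -name '*.html' -o -name '*.pdf' \) | sort
}

# After checking the list, pipe it to the importer, e.g.:
#   list_importable . | wiki_import.sh -s 'Example Wiki Section Name'
```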

4.3. Controlling the script behavior 

4.3.1. simplest form 

The behavior of the script is controlled by its command line parameters. For a file named "./path/to/Test Wiki Entry.txt" with the content "Test wiki content.", the following command:

echo "./path/to/Test Wiki Entry.txt" | wiki_import.sh -s 'Example Wiki Section Name'

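will import the text file, i.e., create a wiki entry named "Test Wiki Entry" whose content is "Test wiki content." That's it. In this simplest form the mandatory section parameter has no visible effect on the imported entry; it comes into play with the '--header' and '--res' parameters described below.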
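
The entry title in this example is the file name with the directory and the extension removed. A sketch of that mapping, assuming a hypothetical helper entry_title (the script's own implementation may differ):

```shell
# entry_title PATH: file name without its directory and last extension.
# Hypothetical helper illustrating the path-to-title mapping.
entry_title() {
    name=$(basename "$1")
    printf '%s\n' "${name%.*}"
}
```

For example, entry_title "./path/to/Test Wiki Entry.txt" prints "Test Wiki Entry".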

4.3.2. footer 

To define which category the imported entry belongs to, use the '--footer/-f' parameter:

echo "./path/to/Sample Wiki Category/Test Wiki Entry.txt" | wiki_import.sh -s 'Example Wiki Section Name' --footer

it will import the following content into the wiki entry:

 Test wiki content.
 [[Category:Sample Wiki Category]]
 ----
 ~~~~
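
The category in the footer is the name of the directory that directly contains the file. A minimal sketch of how such a footer could be assembled, assuming hypothetical helpers entry_category and footer_text (illustrative only, not taken from the script):

```shell
# entry_category PATH: name of the directory that directly contains PATH.
entry_category() {
    basename "$(dirname "$1")"
}

# footer_text PATH: category tag, horizontal rule, and wiki signature.
footer_text() {
    printf '[[Category:%s]]\n----\n~~~~\n' "$(entry_category "$1")"
}
```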

4.3.3. header 

To show the category path in which the imported entry resides, use the '--header/-1' parameter:

echo "Computers and Internet/Communications and Networking/Email/Protocols/SMTP/Simple Mail Transfer Protocol Introduction.txt" | wiki_import.sh -s 'Example Wiki Section Name' --footer --header

it will import the following content into the wiki entry:

 [[:Category:Knowledge Base]]/[[:Category:Example Wiki Section Name]]/[[:Category:Computers and Internet]]/[[:Category:Communications and Networking]]/[[:Category:Email]]/[[:Category:Protocols]]/[[:Category:SMTP]]
 ----
 {{Import Disclaim}}
 ----
 '''Extracted Content'''

 Test wiki content.
 [[Category:SMTP]]
 ----
 ~~~~

It imports the given file's content as the wiki entry's content under the title 'Simple Mail Transfer Protocol Introduction', and uses the section name from the command line as the base category name for the imported entry. The mandatory section parameter makes it possible to import content from different teams, departments, or topic areas.

'{{Import Disclaim}}' is a macro (a MediaWiki template) indicating that the content was imported automatically and might need manual editing.
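
The category path in the header consists of the root category, the section name, and then one link per directory component of the file's path. The sketch below reproduces that line under those assumptions; the helper name category_path is hypothetical and this is not the script's actual code:

```shell
# category_path ROOT SECTION PATH: build the header's category path line.
# Hypothetical helper; ROOT corresponds to --root, SECTION to --sect.
category_path() {
    out="[[:Category:$1]]/[[:Category:$2]]"
    rest=$(dirname "$3")
    # Walk the directory part one "/"-separated component at a time.
    while [ -n "$rest" ]; do
        part=${rest%%/*}
        case $rest in
            */*) rest=${rest#*/} ;;
            *)   rest= ;;
        esac
        # Skip empty components and a leading "." (as in "./path/to").
        case $part in
            ''|.) ;;
            *) out="$out/[[:Category:$part]]" ;;
        esac
    done
    printf '%s\n' "$out"
}
```

Called with 'Knowledge Base', 'Example Wiki Section Name' and the path from the example, it prints the first line of the header shown above.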

4.3.4. link back 

Because the content is imported automatically, its formatting will most likely need some manual editing. To make it easy to refer back to the original article, mirror the imported directory on the web server and use the '--link/-l' parameter to link back to the original file. The following command:

echo "Computers and Internet/Communications and Networking/Email/Protocols/SMTP/Simple Mail Transfer Protocol Introduction.txt" | wiki_import.sh -s 'Example Wiki Section Name' --footer --header --link

will produce:

 [[:Category:Knowledge Base]]/[[:Category:Example Wiki Section Name]]/[[:Category:Computers and Internet]]/[[:Category:Communications and Networking]]/[[:Category:Email]]/[[:Category:Protocols]]/[[:Category:SMTP]]
 ----
 {{Import Disclaim}}
 ----
 '''Original article'''
 [http://{{Kb Archive}}/Computers%20and%20Internet/Communications%20and%20Networking/Email/Protocols/SMTP/Simple%20Mail%20Transfer%20Protocol%20Introduction.txt]
 ----
 '''Extracted Content'''

 Test wiki content.
 [[Category:SMTP]]
 ----
 ~~~~

'{{Kb Archive}}' is a macro that defines the root URL on which the links to the archived files are based.
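
In the link, the file path is appended to the archive root with its spaces percent-encoded. A minimal sketch, assuming a hypothetical helper archive_link that encodes only "%" and spaces (a real URL encoder would handle more characters; the script's own behavior is not shown here):

```shell
# archive_link ARCH PATH: external wiki link to the archived original.
# Hypothetical helper; only "%" and " " are percent-encoded (assumption).
archive_link() {
    rel=${2#./}   # drop a leading "./" if present
    encoded=$(printf '%s' "$rel" | sed 's/%/%25/g; s/ /%20/g')
    printf '[http://%s/%s]\n' "$1" "$encoded"
}
```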

4.4. Configuring the script 

The 'wiki_import.sh' takes the following configuration parameters:

-p, --php=fn     mediawiki import php script specification
-r, --root=n     the root category name for the whole wiki site
-u, --user=n     wiki user used for the import
-a, --arch=p     the root url that linked-to archive files based on

The "--php" parameter defines the command the script uses to import the wiki entry. It is normally

php /path/to/importTextFile.php

The "--root" parameter changes the root category name for the whole wiki, i.e., the "Knowledge Base" shown above.

The "--user" parameter defines which wiki user the script uses for the import.

The "--arch" parameter changes the root URL of the linked-to archive, i.e., the "{{Kb Archive}}" shown above.

4.5. Save the script's configuration 

The best approach is not to give the above configuration parameters each time the script is invoked, but rather to save them as default settings. The script reads the file 'wiki_import.sh.ini', located in the same directory that the script runs from. Configure it to your needs. Sample file:

File wiki_import.sh.ini
# -*- shell-script -*-

# == wiki configuration area

# mediawiki import php script specification
_opt_php='php ../kb/maintenance/importTextFile.php'
# max_allowed_packet for mysqld to import, in bytes
# ref http://dev.mysql.com/doc/refman/5.0/en/packet-too-large.html
_opt_max=1048576
# the root category name for the whole wiki site
_opt_root='Knowledge Base'
# wiki user used for the import
_opt_user=auto_import
# the root url that linked-to archive files based on
_opt_arch='{{Kb Archive}}'

# == script customization area

echo='echo -e'
# the tmp dir
ttd=/tmp
# the tmp file
ttf=/tmp/tmpf

Of course, configuration parameters given on the command line override the settings in the .ini file.
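
The precedence can be pictured as three layers: built-in defaults, then the .ini file, then the command line. The following is only a sketch of that pattern; the function load_config is hypothetical, it handles just two of the options in their long form, and it is not the script's real option parser:

```shell
# Built-in defaults (values taken from the sample .ini above).
_opt_root='Knowledge Base'
_opt_user=auto_import

# load_config INI [OPTIONS...]: source INI if readable, then let the
# given options override it. Hypothetical helper. INI should be given
# with a path containing "/", since "." may otherwise search $PATH.
load_config() {
    ini=$1
    shift
    [ -r "$ini" ] && . "$ini"
    while [ $# -gt 0 ]; do
        case $1 in
            --root=*) _opt_root=${1#--root=} ;;
            --user=*) _opt_user=${1#--user=} ;;
        esac
        shift
    done
}
```

With an .ini file that sets both variables, load_config ./wiki_import.sh.ini --user=alice keeps the root category from the file but uses "alice" as the import user.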

4.6. Restricting the wiki access 

Under certain circumstances, it may be desirable to restrict access to certain wiki pages, or enable them for viewing/editing only to certain groups.

If so, we can use MediaWiki's PageSecurity extension, which implements page access control. It can enforce read and write protection for selected pages or for sets of pages. User access is controlled by membership in user groups.

The "--res/-R" parameter of 'wiki_import.sh' helps place the page restriction/security tags into the imported content:

echo "Sample Wiki Category/Test Wiki Entry.txt" | wiki_import.sh -s Payroll --footer -R

Will yield:

Test wiki content.
{{Payroll Restricted}}
[[Category:Sample Wiki Category]]
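
According to the help text, the tag has the form '{{<Res Param|Root Category> Restricted}}' and the parameter defaults to the section name. A sketch of that fallback, using a hypothetical helper restricted_tag (illustrative only):

```shell
# restricted_tag SECTION [RES]: restriction tag for the footer.
# RES falls back to SECTION when it is empty or not given.
restricted_tag() {
    printf '{{%s Restricted}}\n' "${2:-$1}"
}
```

For example, restricted_tag Payroll prints "{{Payroll Restricted}}", while restricted_tag Payroll HR prints "{{HR Restricted}}".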

Please refer to the documentation of MediaWiki's PageSecurity extension for details on how to set up and use it.

5. Preparation 

5.1. Create auto_import user 

'wiki_import.sh' uses the user 'auto_import' as the wiki import user by default. Here is how to create the user (the last argument, 'p', is the password; choose your own):

php maintenance/createAndPromote.php auto_import p

5.2. Build category hierarchy 

When the '--header/-1' parameter is used, the 'wiki_import.sh' script builds a descending category path for the imported entry. However, it is also necessary to build the category hierarchy within mediawiki so that people can browse the wiki by category. Here is how:

find . -type d -print0 | xargs -0 -I '{}' echo "echo \"[[Category:\$(basename \"\$(dirname '{}')\")]]\"" \> "\"Category:\$(basename '{}').txt\";" php path/to/maintenance/importTextFile.php --user auto_import "\"Category:\$(basename '{}').txt\"" | sh -x
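
The one-liner is dense, so it may help to preview what it will create before running it. The sketch below is a dry run only: the helper category_pages is hypothetical, it writes no files and calls no php script, and it simply lists each category page title together with the parent-category tag that would become its content.

```shell
# category_pages DIR: for every directory below DIR, print
# "<category page title><TAB><page content>", one pair per line.
# Hypothetical dry-run helper; -mindepth is a GNU/BSD find extension.
category_pages() {
    find "$1" -mindepth 1 -type d | while IFS= read -r dir; do
        name=$(basename "$dir")
        parent=$(basename "$(dirname "$dir")")
        printf 'Category:%s\t[[Category:%s]]\n' "$name" "$parent"
    done
}
```

For a tree containing Email/SMTP, the output includes a line pairing "Category:SMTP" with "[[Category:Email]]". Directories directly below DIR get the name of DIR itself as their parent, so check those lines in particular.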

documented on: 2007.06.25