DO NOT USE THIS SCRIPT.

It was written to convert my Drupal database to text files for DokuWiki.

It is specific to my setup. I never used revisions, so each node has exactly one row in node_revisions (a 1:1 mapping of nodes to revisions). Each file is named in the format ./files/<YEAR>/<YYYYmmdd>-<title>.txt. I've done some basic substitutions to whip the data into shape, from Markdown and HTML syntax to DokuWiki syntax.

I repeat, this script will probably not work for you. If you really intend to use it, make sure you check every line. You've been warned.
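
As a rough illustration of the naming scheme described above, here is a minimal sketch that builds the output path for one post without touching a database. The helper name entry_path and the sample title are made up for this example, and it collapses underscores slightly more aggressively than the script below, so treat it as an approximation rather than the exact behaviour.

```ruby
# Hypothetical helper (not part of the script below): builds the output
# path for one post from its title and the parts of its creation date.
def entry_path(title, year, month, day)
  slug = title.downcase
  slug = slug.gsub(%r{[/ -]}, '_')    # slashes, spaces and hyphens become underscores
  slug = slug.gsub(/[^a-z0-9_]/, '')  # drop every other character
  slug = slug.gsub(/_+/, '_')         # collapse runs of underscores
  "files/#{year}/#{year}#{month}#{day}-#{slug}.txt"
end

# Made-up sample title and date
puts entry_path('Hello, World - Part 2/3', '2008', '03', '12')
```

Running it on a handful of real titles first is a cheap way to spot collisions, since two posts from the same day with similar titles would end up in the same file.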

#!/usr/bin/ruby

require 'mysql'
require 'fileutils'

mysql = Mysql.init()
# replace the quoted placeholders with your own credentials and database name
mysql.connect('localhost', '<username>', '<password>', '<database>')

results = mysql.query(
    "select node.nid, " \
    "date_format(from_unixtime(node.created),'%Y') as 'year', " \
    "date_format(from_unixtime(node.created),'%m') as 'month', " \
    "date_format(from_unixtime(node.created),'%d') as 'day', " \
    "node.title, node_revisions.body " \
    "from node, node_revisions " \
    "where node.nid = node_revisions.nid and node.type ='story';")

results.each do |row|
    # create a new file and write to it
    name = row[4]

    # build a filesystem-safe name from the title
    name = name.downcase
    name = name.gsub(/[\/ -]/, '_')      # slashes, spaces and hyphens become underscores
    name = name.gsub(/_+/, '_')          # collapse runs of underscores
    name = name.gsub(/[^a-z0-9_]/, '')   # drop everything else

    # I want to store entries in the format files/2008/20080312-title.txt
    name = 'files/' + row[1] + '/' + row[1] + row[2] + row[3] + '-' + name + '.txt'
    
    # create the year directory
    FileUtils.mkdir_p 'files/' + row[1]

    File.open(name, 'w') do |f1|

        body = row[5]

        # convert Markdown links [text](url) to DokuWiki links [[url|text]]
        body = body.gsub(/\[(.*?)\]\((.*?)\)/, '[[\2|\1]]')

        # get rid of Drupal's teaser break markers
        body = body.gsub(/<!--break-->/,'')

        # convert <fn>...</fn> style footnotes
        body = body.gsub(/<fn>/,'((')
        body = body.gsub(/<\/fn>/,'))')
        # turn ')))' into ').))' so a footnote ending in ')' stays distinct from the closing '))'
        body = body.gsub(/\)\)\)/, ').))')

        # convert <a> links to DokuWiki links
        body = body.gsub(/<a href="(.*?)"(.*?)>/, '[[\1|')
        body = body.gsub(/<\/a>/,']]')

        # convert <em> to //emphasis// (DokuWiki italics use double slashes)
        body = body.gsub(/<em>/,'//')
        body = body.gsub(/<\/em>/,'//')

        # fix site-relative links by prepending the site name
        body = body.gsub(/\[\[\//, '[[http://nic.suzor.com/')

        # change <blockquote> to '>'
        body = body.gsub(/<blockquote>/,'>')

        # create lists
        body = body.gsub(/<li>/, '  * ')

        # remove remaining tags
        body = body.gsub(/<(.*?)>/, '')

        # add tags
        tags = mysql.query("select name from term_data, term_node where term_data.tid = term_node.tid and nid = " + row[0])
        stags = "{{tag>"
        tags.each do |tag|
            stags = stags + '"' + tag[0] + '" '
        end
        stags = stags + "}}"
        f1.puts(stags)

        # add title
        title = '======' + row[4] + '======'
        f1.puts(title)

        f1.puts(body)

    end

end

mysql.close()
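
If you do want to check the substitutions line by line, it may be easier to try them on a plain string first. The following is a minimal sketch, not the script itself: the function name to_dokuwiki and the sample text are invented for illustration, it covers only some of the substitutions, and it writes emphasis as //...//, which is DokuWiki's italic markup.

```ruby
# Hypothetical helper: applies a subset of the script's substitutions to a
# single string so they can be checked without a database connection.
def to_dokuwiki(body)
  body = body.gsub(/\[(.*?)\]\((.*?)\)/, '[[\2|\1]]')   # [text](url) -> [[url|text]]
  body = body.gsub(/<a href="(.*?)"(.*?)>/, '[[\1|')     # <a href="url"> -> [[url|
  body = body.gsub(%r{</a>}, ']]')                       # </a> -> ]]
  body = body.gsub(%r{</?em>}, '//')                     # <em>, </em> -> //
  body = body.gsub(/<fn>/, '((')                         # footnote start
  body = body.gsub(%r{</fn>}, '))')                      # footnote end
  body.gsub(/<(.*?)>/, '')                               # strip any remaining tags
end

# Made-up sample input mixing Markdown and HTML
sample = 'See [my site](http://example.com/) and ' \
         '<a href="http://example.org/">this <em>other</em> page</a>.<fn>A note</fn>'
puts to_dokuwiki(sample)
```

The order matters: the catch-all tag removal has to run last, otherwise the <a>, <em> and <fn> tags are gone before they can be converted.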