Master Regular Expressions

Regular Expressions Memo

Regex to strip MediaWiki markup

Here are the step-by-step actions, based on Rhino, the JavaScript engine built into Java 6:

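// remove HTML tags
/<[^>]*>/g, ""
// remove the HTML attributes used in wiki markup
// ([^"]* stops at the closing quote; a greedy .+ would run on to the last quote on the line)
/(class|id|style)="[^"]*"/g, ""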
// replace line break with '\xB6'
/(\r?\n)/g, "\xB6"
// Extract the template
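// remove template {{ }}
// (lazy .+? stops at the first }}; a greedy .+ would delete everything up to the last }}.
//  Nested templates are not handled.)
/\{\{.+?\}\}/g, ""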
// extract wiki title
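The steps above could be chained into a single function, roughly as sketched below. This is only an illustration: the function name stripMediaWiki and the sample input are assumptions, not part of the original memo. It uses the stricter attribute pattern ([^"]*) and the lazy template pattern (.+?), and it does not handle nested templates. The title extraction step is left out because the memo does not give a pattern for it.

```javascript
// Hypothetical helper (the name is illustrative, not from the memo).
// Applies the memo's replacements in the same order as listed above.
function stripMediaWiki(text) {
  return text
    // 1. remove HTML tags
    .replace(/<[^>]*>/g, "")
    // 2. remove leftover class/id/style attributes
    .replace(/(class|id|style)="[^"]*"/g, "")
    // 3. replace line breaks with the pilcrow '\xB6',
    //    so that '.' in later patterns can span former line breaks
    .replace(/\r?\n/g, "\xB6")
    // 4. remove (non-nested) templates {{ ... }}
    .replace(/\{\{.+?\}\}/g, "");
}

// Sample input invented for this sketch.
var sample = '<b>Bold</b> text {{Infobox|a=1}}\nstyle="color:red" next {{stub}}';
var result = stripMediaWiki(sample);
// result is "Bold text \xB6 next "
```

The code sticks to ES3-style syntax (var, function declarations, no arrow functions) so that it should also run on the older Rhino bundled with Java 6.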


