While writing a script for the ZSH shell, I wanted to extract some pieces of a string. I looked for a regex solution using capture groups. I could not figure out how to do it with sed, but I found that the [[ ]] form of the test command allows this with the =~ operator. If the test returns true, the captured groups are stored in the $match array and can be accessed as $match[1], $match[2], and so on.
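As a rough sketch of what that looks like in practice, the snippet below pulls three numbers out of a version string. The input string, the pattern, and the variable names (version_string, pattern, major, minor, patch) are all made up for illustration. The one bash-specific line is only there so the same snippet also runs under bash, which keeps its groups in BASH_REMATCH instead; in plain zsh it does nothing.

```shell
#!/usr/bin/env zsh
# Hypothetical input, chosen only for illustration.
version_string="release-3.14.2"

# Three capture groups: major, minor, patch.
pattern='^release-([0-9]+)\.([0-9]+)\.([0-9]+)$'

if [[ $version_string =~ $pattern ]]; then
  # zsh fills the $match array with the capture groups (1-indexed).
  # Assumption for portability only: bash uses BASH_REMATCH instead,
  # so copy it over when this is run under bash.
  if [[ -n ${BASH_VERSION-} ]]; then
    match=("${BASH_REMATCH[@]}")
  fi

  major=${match[1]}   # zsh also accepts the shorter $match[1]
  minor=${match[2]}
  patch=${match[3]}
  printf 'major=%s minor=%s patch=%s\n' "$major" "$minor" "$patch"
else
  printf 'no match\n' >&2
fi
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of =~ avoids most of the quoting headaches that come with writing the regex inline.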
Quick regex to strip html tags
Recently, I needed to strip some HTML tags from some data. The goal was to turn a database field that was filled in from a WYSIWYG text area into plain text content that could go inside a link. I did it using a simple regex, /<\/?[^>]+>/, to find the tags so I could replace them with an empty string. In PHP, this looked like:
$string = preg_replace('/<\/?[^>]+>/', '', $string);
This is perhaps a naïve implementation (for example, it would also remove any text that happens to sit between a literal < and a later >), but it served my purposes fine. Of course, I had totally forgotten about PHP’s built-in strip_tags() function, but when I compared the two, it did not seem to do exactly what I wanted either. For instance, it seemed to get rid of the content of <a> tags.