regex posts

ZSH regex capture groups

While writing a script for the ZSH shell, I wanted to extract a few parts of a string, so I looked for a regex solution using capture groups. I could not figure out how to do it with sed, but I found that the [[ ]] form of the test command supports this with the =~ operator. If the test returns true, the captured values are stored in the $match array and can be accessed as $match[1], $match[2], and so on.
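As a rough sketch of the idea, here is how that might look when pulling version numbers out of a tag. The tag value, the pattern, and the variable names (tag, pattern, major, minor, patch) are made up for illustration and are not from the original script. The bash branch is an addition for comparison only: bash keeps its captures in BASH_REMATCH rather than $match.

```shell
#!/usr/bin/env zsh
# Hypothetical input: a tag of the form "v<major>.<minor>.<patch>".
tag="v5.9.1"

# Keeping the pattern in a variable avoids quoting differences between shells.
pattern='^v([0-9]+)\.([0-9]+)\.([0-9]+)$'

if [[ $tag =~ $pattern ]]; then
  if [[ -n ${ZSH_VERSION:-} ]]; then
    # zsh: on a successful match, capture groups land in $match, indexed from 1.
    major=$match[1] minor=$match[2] patch=$match[3]
  else
    # bash (added for comparison, not part of the post): groups are in BASH_REMATCH.
    major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]} patch=${BASH_REMATCH[3]}
  fi
  printf 'major=%s minor=%s patch=%s\n' "$major" "$minor" "$patch"
else
  printf 'no match for: %s\n' "$tag" >&2
fi
```

This assumes default zsh options. With KSH_ARRAYS or BASH_REMATCH set, the indexing or the array name would differ.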


Quick regex to strip html tags

Recently, I needed to strip HTML tags from some data. The goal was to turn a database field populated from a WYSIWYG text area into plain text that could go inside a link. I used a simple regex, /<\/?[^>]+>/, to find the tags so I could replace them with an empty string. In PHP, this looked like:

$string = preg_replace('/<\/?[^>]+>/', '', $string);

This is perhaps a naïve implementation (a > inside an attribute value would end the match early, for example), but it served my purposes fine. Of course, I had totally forgotten about PHP’s built-in strip_tags() function, but when I compared the two, it did not seem to do exactly what I wanted either. For instance, it seemed to get rid of the content of <a> tags.