Skip navigation

On how to delete a chunk of text contained in multiple lines: use sed to catch the range (/a/,/b/)

There were some javascript google ads calls from a page I downloaded via wget, doing:

 wget www.someSite.com/somePage.html > toCleanItUp.htlm

After highlighting all the page C-x h I did a M-x shell-command-on-region and used the following:

sed "/<script .*>/,/<\/script>/d" > nowItIsClean.html

If feeling lazy, (or if I don’t have the file opened already in an emacs buffer), there’s the straightforward way of using cat:

cat toCleanItUp.html | sed "/<script .*>/,/<\/script>/d" > nowItIsClean.html

Ah, the beauty of unix tools!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: