Re: extracting data from html files

Subject: Re: extracting data from html files
From: Bob Crosby (rcrosby@alaska.net)
Date: Fri Jun 28 2002 - 10:58:17 AKDT

Next message: Peter Q. Olsson: "Re: sed problem"
Previous message: Christopher E. Brown: "Re: apache sec hole.."
In reply to: Christopher Swingley: "Re: extracting data from html files"
Next in thread: Christopher Swingley: "Re: extracting data from html files"
Reply: Bob Crosby: "Re: extracting data from html files"
Reply: Bob Crosby: "Re: extracting data from html files"

Thanks, Christopher,

That works great. It returns just what I asked for. Now I'm wondering whether
sed can do even more sophisticated editing. For example, given a bunch of
files, each containing multiple blocks of text like the following:

<TD width="300" valign="top">$48.00 
 <a href="JavaScript: funcname('../doit.asp', '', '', '',
'100');">Widgetname</a> 
 Product_name 
 Product_ID

could I generate a csv file consisting of lines containing
Price,Widgetname,Product_name,Product_ID?
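
One possible approach, offered only as a rough sketch: it uses awk rather than sed, since carrying four fields across several lines is awkward in sed. It assumes every block looks exactly like the sample above (the <TD ...> tag with the price, an anchor that ends in </a>, then the product name and product ID each on their own line) and that no field contains a comma. The names to_csv, sample_block.html and products.csv are made up for illustration.

```shell
# Hypothetical sample file holding one block in the layout shown above.
cat > sample_block.html <<'EOF'
<TD width="300" valign="top">$48.00 
 <a href="JavaScript: funcname('../doit.asp', '', '', '',
'100');">Widgetname</a> 
 Product_name 
 Product_ID
EOF

# to_csv FILE...: print one Price,Widgetname,Product_name,Product_ID line per block.
to_csv() {
  awk '
    # Strip leading and trailing whitespace from a field.
    function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }

    # Ignore blank lines so they are not mistaken for fields.
    /^[ \t\r]*$/ { next }

    # A block starts at the <TD ...> line; the price follows the tag.
    /<TD width="300" valign="top">/ {
      price = $0; sub(/^.*<TD[^>]*>/, "", price); price = trim(price)
      state = 1; next
    }
    # The link text sits between ">" and "</a>" on the line that closes the anchor.
    state == 1 && /<\/a>/ {
      widget = $0; sub(/<\/a>.*$/, "", widget); sub(/^.*>/, "", widget)
      state = 2; next
    }
    # The next two non-blank lines are taken as product name and product ID.
    state == 2 { name = trim($0); state = 3; next }
    state == 3 { print price "," trim(widget) "," name "," trim($0); state = 0 }
  ' "$@"
}

to_csv sample_block.html > products.csv
cat products.csv
```

On real files this would be run as something like to_csv *.html > products.csv. Blocks that deviate from the assumed layout would be skipped or mis-parsed, so the output is worth spot-checking.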

>* Bob Crosby <rcrosby@alaska.net> [2002-Jun-27 08:26 AKDT]:
> > What I'd like to do is step through each file searching for a particular
> > string, then whenever a match is found, write the remainder of the line
> > where it is found, to another file.
>
>Sounds like:
>
> $ grep 'string' *.html > string_match
>
>might be the ticket. This will give you each line that contains the
>string. If you really want to put only the remainder of the line in the
>file you might pipe this through sed:
>
> $ grep 'string' *.html | sed 's/.*string\(.*\)$/\1/' > string_match
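
The sed step in the quoted pipeline relies on a captured group: in a basic regular expression, \(...\) captures text and \1 refers back to it. A minimal sketch on a made-up file follows; sample.html and its contents are assumptions for illustration only.

```shell
# Hypothetical sample input: two lines contain "string", one does not.
printf '%s\n' \
  'junk before string the rest of line one' \
  'no match here' \
  'x string tail two' > sample.html

# grep keeps the matching lines; sed then replaces each whole line with
# the captured text that follows "string". Because .* is greedy, the
# capture starts after the LAST occurrence of "string" on the line.
grep 'string' sample.html | sed 's/.*string\(.*\)$/\1/' > string_match

cat string_match
# Prints (note the leading space kept from the original lines):
#  the rest of line one
#  tail two
```

With several input files grep prefixes each line with the file name, but the leading .* in the sed expression discards that prefix along with everything else before the match.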


This archive was generated by hypermail 2a23 : Fri Jun 28 2002 - 10:59:45 AKDT