Re: extracting data from html files


Subject: Re: extracting data from html files
From: Bob Crosby (rcrosby@alaska.net)
Date: Fri Jun 28 2002 - 10:58:17 AKDT


Thanks, Christopher,

That works great. It returns just what I asked for. Now I'm wondering
whether sed can handle even more sophisticated editing. For example, given
a bunch of files, each containing multiple blocks of text like the following:

     <TD width="300" valign="top"><FONT class=Price>$48.00</FONT><BR>
     <a href="JavaScript: funcname('../doit.asp', '', '', '', '100');">Widgetname</a><BR>
     Product_name<BR>
     Product_ID<BR>

could I generate a csv file consisting of lines containing
Price,Widgetname,Product_name,Product_ID?
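
One possible approach, as a rough sketch only: have sed start at each
line containing the Price tag, pull the following lines into the pattern
space with N, and then rewrite the whole block with a single substitution.
The function name extract_csv is made up for illustration. The sketch
assumes every block has exactly the layout shown above (the anchor may sit
on one line or be wrapped over two), that no field contains a comma or
a "<", and that the files use Unix line endings.

```shell
#!/bin/sh
# Hypothetical helper: print one "Price,Widgetname,Product_name,Product_ID"
# line for every block found in the files given as arguments.
extract_csv() {
    sed -n '
    # A block starts on the line that carries the price.
    /<FONT class=Price>/ {
        # Append lines until the closing anchor tag has been read.
        :more
        /<\/a>/!{
            N
            b more
        }
        # Two more lines: Product_name and Product_ID.
        N
        N
        # Capture the four fields and print them comma-separated.
        s/.*<FONT class=Price>\([^<]*\)<\/FONT>.*>\([^<]*\)<\/a><BR>[[:space:]]*\([^<]*\)<BR>[[:space:]]*\([^<]*\)<BR>.*/\1,\2,\3,\4/p
    }' "$@"
}

# Example call:
#   extract_csv *.html > products.csv
```

If the real files deviate from that layout (extra attributes, different
tag case, fields containing commas), the pattern would need adjusting, and
a tool such as awk or perl may turn out to be easier to maintain.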

>* Bob Crosby <rcrosby@alaska.net> [2002-Jun-27 08:26 AKDT]:
> > What I'd like to do is step through each file searching for a particular
> > string, then whenever a match is found, write the remainder of the line
> > where it is found, to another file.
>
>Sounds like:
>
> $ grep 'string' *.html > string_match
>
>might be the ticket. This will give you each line that contains the
>string. If you really want to put only the remainder of the line in the
>file you might pipe this through sed:
>
> $ grep 'string' *.html | sed 's/.*string\(.*\)$/\1/' > string_match
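
For what it's worth, the same result can probably be had from sed alone,
using -n together with the p flag so that only lines where the
substitution succeeded are printed. The sketch below wraps that in a
function whose name, remainder_after_string, is made up for illustration.
Note that ".*" is greedy, so if 'string' occurs more than once on a line,
only the text after the last occurrence is kept.

```shell
#!/bin/sh
# Hypothetical helper: for every input line containing "string", print
# only the text that follows its last occurrence on that line.
# Reads the named files, or standard input when no files are given.
remainder_after_string() {
    sed -n 's/.*string\(.*\)$/\1/p' "$@"
}

# Example call:
#   remainder_after_string *.html > string_match
```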




This archive was generated by hypermail 2a23 : Fri Jun 28 2002 - 10:59:45 AKDT