[aklug] Why printed - Re: Re: Anyone interested in a job importing 24k printed emails in Juneau/Anchorage into a database?

From: Jason McEachen <jason@brightshinyobject.com>
Date: Wed Jun 08 2011 - 15:38:20 AKDT

Yeah, this would be asking the people who needed two years to get them
ready to be printed. All my guesses are too thick with cynicism to put
in writing.

I'm told that other news orgs (big ones) are flying people and machines
in from the lower 48 to do the same thing, so I don't think it's because
we forgot to check the "hand us in electronic form" box :-)

--Jason

On 06/08/2011 03:31 PM, Joshua J. Kugler wrote:
> First question: WHY ON EARTH are they printed? Why can't they give them
> to you on a CD or DVD?
>
> j
>
> On Wednesday 08 June 2011, Jason McEachen elucidated thus:
>> This Friday at 9am the State of Alaska is going to have a couple
>> boxes of printed emails in Juneau for me to have, and a hand truck to
>> help carry them. We could also pick them up at the Anchorage Airport
>> at 3pm.
>>
>> What I'd like to do is somehow import them into a database and set up
>> a quick and easy web-based interface to allow searches.
>>
>> The problem is my first child is coming into this world that morning
>> at Providence. My wife doesn't like the idea of me either being in
>> Juneau to receive and scan/process these docs, or me sitting at a
>> machine that morning to write up a script to pull scans, parse them,
>> and populate some tables.
>>
>> So we (AlaskaDispatch.com) are possibly interested in hiring someone
>> to help us with this project.
>>
>> If you think this is a neat intellectual exercise, please respond to
>> the group with your ideas or suggestions.
>>
>> If you're interested in doing this professionally (or can recommend
>> someone), please contact me directly and let me know how you'd
>> propose to do it and what you'd bill.
>>
>> My first thought is to find someone in Juneau (fedex/kinkos) with a
>> big copier/scanner who can convert paper to PDF really quickly (there
>> are, after all, ~24 thousand pages) and ftp/sftp them up to a server
>> that already has some nice pdf->text tools (maybe pdftohtml?) and
>> "your favorite script language interpreter" to parse them into a
>> table (probably only need fields like index, datetime, from, to, cc,
>> bcc, subject, body, attachments) and a little web front end waiting
>> for search/display.
>>
>> Has anyone on the list handled a similar task and can share what
>> worked and what didn't?
>>
>> Thanks for your help,
>>
>> --Jason
>> ---------
>> To unsubscribe, send email to <aklug-request@aklug.org>
>> with 'unsubscribe' in the message body.
>
>
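
For anyone who wants to play with the parse-and-search idea quoted above,
here is a minimal sketch in Python. It is only an illustration under some
loud assumptions: the pages have already been scanned and run through OCR
so that each message is available as plain text, each message starts with
"From:/To:/Sent:/Subject:"-style header lines (the real printouts may look
quite different), and attachments are ignored. The function names
(parse_email, open_db, add_email, search), the header labels, and the
table and column names are all hypothetical.

```python
import re
import sqlite3

# Header labels are an assumption about how the printed pages look.
HEADER_RE = re.compile(
    r"^(From|To|Cc|Bcc|Sent|Date|Subject):\s*(.*)$", re.IGNORECASE
)


def parse_email(text):
    """Split one message's extracted text into header fields and a body."""
    fields = {"from": "", "to": "", "cc": "", "bcc": "",
              "datetime": "", "subject": ""}
    lines = text.strip().splitlines()
    body_start = len(lines)
    for i, line in enumerate(lines):
        match = HEADER_RE.match(line.strip())
        if not match:
            # The first line that is not a header ends the header block.
            body_start = i
            break
        key = match.group(1).lower()
        if key in ("sent", "date"):
            key = "datetime"
        fields[key] = match.group(2).strip()
    fields["body"] = "\n".join(lines[body_start:]).strip()
    return fields


def open_db(path=":memory:"):
    """Open (or create) the SQLite database holding the parsed messages."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emails ("
        "id INTEGER PRIMARY KEY, sent TEXT, sender TEXT, recipient TEXT, "
        "cc TEXT, bcc TEXT, subject TEXT, body TEXT)"
    )
    return conn


def add_email(conn, fields):
    """Insert one parsed message; the date is stored as the raw string."""
    conn.execute(
        "INSERT INTO emails (sent, sender, recipient, cc, bcc, subject, body) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (fields["datetime"], fields["from"], fields["to"], fields["cc"],
         fields["bcc"], fields["subject"], fields["body"]),
    )
    conn.commit()


def search(conn, term):
    """Return (id, sent, sender, subject) rows whose subject or body match."""
    like = f"%{term}%"
    rows = conn.execute(
        "SELECT id, sent, sender, subject FROM emails "
        "WHERE subject LIKE ? OR body LIKE ? ORDER BY id",
        (like, like),
    )
    return rows.fetchall()
```

A plain LIKE search is the simplest thing that could work; with ~24
thousand pages a full-text index would probably be the next step, and
OCR errors in the header lines would need more forgiving matching than
this sketch attempts.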

Received on Wed Jun 8 15:39:02 2011
