[aklug] Re: Anyone interested in a job importing 24k printed emails in Juneau/Anchorage into a database?

From: Joshua J. Kugler <joshua@eeinternet.com>
Date: Wed Jun 08 2011 - 15:31:06 AKDT

First question: WHY ON EARTH are they printed? Why can't they give them
to you on a CD or DVD?

j

On Wednesday 08 June 2011, Jason McEachen elucidated thus:
> This Friday at 9am the State of Alaska is going to have a couple
> boxes of printed emails in Juneau for me to have, and a hand truck to
> help carry them. We could also pick them up at the Anchorage Airport
> at 3pm.
>
> What I'd like to do is somehow import them into a database and set up
> a quick and easy web-based interface to allow searches.
>
> The problem is that my first child is coming into this world that
> morning at Providence. My wife doesn't like the idea of me either
> being in Juneau to receive and scan/process these docs or sitting at
> a machine that morning to write up a script to pull scans, parse
> them, and populate some tables.
>
> So we (AlaskaDispatch.com) are possibly interested in hiring someone
> to help us with this project.
>
> If you think this is a neat intellectual exercise, please respond to
> the group with your ideas or suggestions.
>
> If you're interested in doing this professionally (or can recommend
> someone), please contact me directly and let me know how you'd
> propose to do it and what you'd bill.
>
> My first thought is to find someone in Juneau (FedEx/Kinko's) with a
> big copier/scanner who can convert paper to PDF really quickly (there
> are, after all, ~24 thousand pages) and ftp/sftp them up to a server
> that already has good pdf->text tools (maybe pdftohtml?) and "your
> favorite scripting language interpreter" to parse them into a table
> (probably only need fields like index, datetime, from, to, cc, bcc,
> subject, body, attachments), with a little web front end waiting for
> search/display.
>
> Has anyone on the list handled a similar task and can share what
> worked and what didn't?
>
> Thanks for your help,
>
> --Jason
> ---------
> To unsubscribe, send email to <aklug-request@aklug.org>
> with 'unsubscribe' in the message body.
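
The parse-and-load step described in the quoted message could be
sketched roughly as follows. This is only an illustration under
stated assumptions, not a tested solution for the actual documents:
it assumes each email has already been turned into plain text (pages
scanned from paper are images, so an OCR pass would likely be needed
unless the scanner embeds a text layer), and that each email starts
with "Label: value" header lines followed by the body. The function
names, the table layout, and the column names "id", "sender", and
"recipients" (used because "index", "from", and "to" are SQL
keywords) are all hypothetical.

```python
import re
import sqlite3

# Header labels assumed to start a line in the extracted text (an assumption
# about the printouts, not something the original message specifies).
HEADER_RE = re.compile(
    r"^(From|To|Cc|Bcc|Subject|Date|Sent|Attachments):\s*(.*)$", re.IGNORECASE
)


def parse_email_text(text):
    """Split one email's text into header fields and a body."""
    fields = {k: "" for k in
              ("from", "to", "cc", "bcc", "subject", "datetime", "attachments")}
    lines = text.strip().splitlines()
    body_start = len(lines)
    for i, line in enumerate(lines):
        match = HEADER_RE.match(line.strip())
        if not match:
            # First non-header line: everything from here on is the body.
            body_start = i
            break
        key = match.group(1).lower()
        if key in ("date", "sent"):
            key = "datetime"
        fields[key] = match.group(2).strip()
    fields["body"] = "\n".join(lines[body_start:]).strip()
    return fields


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS emails (
               id INTEGER PRIMARY KEY,
               datetime TEXT, sender TEXT, recipients TEXT,
               cc TEXT, bcc TEXT, subject TEXT, body TEXT, attachments TEXT)"""
    )
    return conn


def add_email(conn, fields):
    """Insert one parsed email and return its row id."""
    cur = conn.execute(
        "INSERT INTO emails (datetime, sender, recipients, cc, bcc,"
        " subject, body, attachments) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (fields["datetime"], fields["from"], fields["to"], fields["cc"],
         fields["bcc"], fields["subject"], fields["body"],
         fields["attachments"]),
    )
    conn.commit()
    return cur.lastrowid


def search(conn, term):
    """Substring search over sender, subject, and body."""
    like = f"%{term}%"
    return conn.execute(
        "SELECT id, subject FROM emails"
        " WHERE subject LIKE ? OR body LIKE ? OR sender LIKE ? ORDER BY id",
        (like, like, like),
    ).fetchall()
```

A web front end would only need to call something like search() and
render the rows. OCR output is rarely this clean, so a real run would
probably need fuzzier header matching and a manual review pass.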

-- 
Joshua Kugler
Part-Time System Admin/Programmer
http://www.eeinternet.com - Fairbanks, AK
PGP Key: http://pgp.mit.edu/  ID 0x73B13B6A
Received on Wed Jun 8 15:31:16 2011
