Comments on: Geographic Analysis + Text Mining + Big, Messy Data
http://bayarea2010.thatcamp.org/10/06/geographic-analysis-text-mining-big-messy-data/
The Humanities and Technology Camp
Fri, 14 Jan 2011 18:44:08 +0000

By: Aditi Muralidharan
http://bayarea2010.thatcamp.org/10/06/geographic-analysis-text-mining-big-messy-data/#comment-44
Sat, 09 Oct 2010 00:19:52 +0000

This sounds like a fun session, Cameron. I look forward to it.

By: George Oates
http://bayarea2010.thatcamp.org/10/06/geographic-analysis-text-mining-big-messy-data/#comment-43
Wed, 06 Oct 2010 20:42:39 +0000

Hi Cameron,

I, too, am interested in mining big ol’ uncontrolled place-related fields to expose information in a more better way.

On openlibrary.org, we have 2 place-related fields that I’m particularly interested in:

– the place a book was published
– place as subject heading

It’s remarkably messy in there, so I’d love to brainstorm ideas about how to keep the mess, but increase the signal. Perhaps some sort of “merge” of place names, and the nomination of a master.

Try a search for any place here to see what I mean:
openlibrary.org/search/subjects

I’d love to see if we could mush together this uncontrolled data with, say, Freebase’s Gridworks, and then perhaps the Yahoo! Geo toolset to see what we could do…
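
As a rough illustration of the “merge” idea above (keep every messy variant, but nominate one master per place), here is a minimal Python sketch. It is not Open Library or Gridworks code: the function names `normalize` and `merge_places` and the sample values are hypothetical, and picking the most frequent spelling as the master is just one assumed rule among many possible ones.

```python
import re
import unicodedata
from collections import Counter, defaultdict


def normalize(name: str) -> str:
    """Build a crude matching key: drop accents, case, and punctuation."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def merge_places(raw_values):
    """Group raw strings by key and nominate the most frequent spelling as master.

    Every original variant is kept (with its count), so the mess survives
    underneath the nominated master.
    """
    groups = defaultdict(Counter)
    for value in raw_values:
        groups[normalize(value)][value.strip()] += 1

    merged = {}
    for variants in groups.values():
        master = variants.most_common(1)[0][0]  # assumed rule: most frequent wins
        merged[master] = dict(variants)
    return merged


# Invented sample values, standing in for an uncontrolled "place" field.
sample = ["London", "London.", "LONDON", "london ", "London",
          "Paris", "Paris :", "Paris"]
result = merge_places(sample)

for master, variants in result.items():
    print(master, variants)
```

A key this simple only catches differences in case, accents, and punctuation. Variants such as “Londres” or “London, Eng.” would need a gazetteer or fuzzy clustering on top, which is where tools like Gridworks or a geocoding service could come in.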
