Folks are worried about creating a mess of useless and undifferentiated information with GeoURLs. To solve the problem, people are proposing controlled vocabularies, or auto-categorization approaches (I’ll talk about auto-categorization in a future post).
Prentiss Riddle articulately expresses the concern here:
I think GeoURLs are delightful but I don’t understand how people can proceed down this path without addressing questions of scale and of additional metadata (e.g., a taxonomy with an associated controlled vocabulary) to permit useful lookups.
What happens if every business, blog, or blog entry — in the standard metaphor, every lightbulb — has a GeoURL? Aside from the question of whether a central GeoURL server can handle the load, won’t the concept soon cease to be useful if every GeoURL report consists of a jumble of 500 “things” in the immediate neighborhood?
Josh Schachter, who created the GeoURL service, writes (in a comment thread with Prentiss on the GeoURL site):
obviously, geourl is just a first pass at the idea. the problem with, as you mention, letting people choose where into the “controlled vocabulary” they fit is that it will be abused. you’ll note that the meta keywords tags are basically unused due to spamming abuse. so basically you’d need google-like powers to sort through it all. perhaps we need some sort of zen of what should be geourl’d and what shouldn’t? some sort of policing mechanism? obviously this becomes complicated very quickly indeed.
Maybe it’s not necessary to worry so much.
With classic transactional applications, it’s imperative to have distinct categories. In a hospital database, it is important to clearly distinguish which user has the role of “Doctor” and which has the role of “Patient.”
But GeoURL and metablog applications are just content. A little bit of ambiguity won’t cause the wrong person’s appendix to be removed.
At the Austin Metablog, we’re starting with no categories. Human editors will categorize posts. We’ll see which categories emerge, and then maybe decide how to automate them. The categorization emerges from the application as it matures.
The meta tag system for web sites failed miserably, since people learned to game and spam the system. But web site meta tags were intended to be a general-purpose system. There was no domain or community to confine the use of meta tags. So the system got out of control.
But GeoURL and metablog applications are built around specific communities with specific applications.
On another community metablog project we’re working on (that we’ll talk about soon!), we’re planning to use a defined set of categories. We think this will work because it’s a specific application, with a particular relevant set of categories.
Communities can add policing mechanisms when they are necessary. If people start to spam Austin Metablog, we can start thinking about human moderation, or automated moderation. There’s no need to develop strict security policies in advance for every possible misbehavior.
For people who’ve done content repositories for many years in other domains, please talk to me about which wheels we are reinventing!
via conversations with Greg Elin
I’ve corresponded with JS about the possibility of looking at a list of neighboring GeoURLs and consolidating based on repeated keywords in them. In other words, auto-categorization.
This kind of approach can run into problems. Think about library filters that ban breast cancer support sites along with pornography. How would we avoid that sort of miscategorization?
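To make the keyword idea concrete, here is a minimal sketch of one way it might work. Everything in it is an assumption for illustration: the example URLs, the descriptions, the stopword list, and the function names are hypothetical, and this is not how GeoURL itself works. It groups neighboring sites under any keyword that at least two of them share.

```python
from collections import defaultdict
import re

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "for", "to", "on", "at"}

def keywords(text):
    """Return the lowercase words of three or more letters, minus stopwords."""
    return {w for w in re.findall(r"[a-z]{3,}", text.lower()) if w not in STOPWORDS}

def group_by_repeated_keywords(neighbors, min_sites=2):
    """Map each keyword used by at least `min_sites` neighbors to their URLs."""
    sites_by_keyword = defaultdict(list)
    for url, description in neighbors:
        for word in keywords(description):
            sites_by_keyword[word].append(url)
    return {w: urls for w, urls in sites_by_keyword.items() if len(urls) >= min_sites}

# Made-up neighbors: (URL, short description) pairs.
neighbors = [
    ("http://example.org/a", "Austin live music reviews"),
    ("http://example.org/b", "Live music calendar for Austin clubs"),
    ("http://example.org/c", "Breast cancer support group"),
]
groups = group_by_repeated_keywords(neighbors)
```

With this sample data, the two music sites end up grouped under "austin", "live", and "music", and the support group stays unlisted because nothing repeats its keywords. Note that plain keyword matching like this is exactly the kind of thing that trips up the library filters, so it doesn't answer the miscategorization question. It only shows how small the first pass could be.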
Synthetic metadata is a bad idea. While it’s a wonderful idea to think the machines will do it all for us, they more often end up making us not trust the judgments made. Google’s synthetic indexing is among the best. And it’s not available out there on our level. Take care that projecting what “looks like” a good idea is actually something that can be done in the field.
But for the folks that would be ‘worried’ about the downsides of filtering, is the idea then to do nothing at all?
austinbloggers.org
austinbloggers.org is a new metablog which combines Austin-originated weblog entries from bloggers around town. Anyone can contribute using TrackBack.