When will open ebooks cross the chasm?

Over the weekend, Cory Doctorow wrote and then updated a blog post referring to new and simpler ways of removing DRM from Kindle ebooks.

Cory Doctorow’s BoingBoing readers are no strangers to digital rights. In the comments section, many readers explained why they were thankful for the new tools. They want to be able to read ebooks they buy on any device they own. They want to protect their investment in case they buy a different ereader in the future. They understand that the policies of publishers and device makers are hostile to customers: they try to lock customers into a single device platform and force them to buy new media when they switch devices.

Doctorow’s readers are also more tech-savvy than the average ereader user. They are able to use command-line scripts and geeky tools, and to follow detailed instructions for multi-step processes. BoingBoing readers are technology early adopters. They understand the limitations of ereaders, and are willing to go through substantial hassle in order to turn ebook files into assets they can use in the future.

Twenty years ago, Geoffrey Moore wrote “Crossing the Chasm,” the technology marketing classic that described the gulf between technology early adopters and mainstream buyers. It can take a long time to bridge that gap. Most buyers of e-readers probably don’t yet know that if they want to buy a different brand, their investment in the ebooks they purchased will be worthless.

It took seven years between Apple’s launch of iTunes and the iPod, the first massively popular tools for digital music, and Amazon’s sale of unencumbered digital music. Only one year later, Apple started offering DRM-free music as standard. In the ebook market, Amazon and Apple are the market leaders, and they have mutually incompatible, DRM-encumbered book formats.

How long will it take for knowledge about the limitations of ereader technology to reach the majority of ebook buyers? What story will finally break into mainstream media and result in mass awareness that ereader lock-in is bad for buyers? What surge in customer demand, and what competitive trend, will cause ebook providers to finally stop inconveniencing customers in the vain hope of long-term lock-in?

The Kindle debacle, DRM and SaaS

A blogger who’s a library professional gives more wonderful examples of a subscriber losing access to digital content:

  • Due to an oversight, a bill for an e-book service was paid one day after the due date. As a result, access to about 1000 titles was denied for the entire calendar month.
  • The Library subscribed to an e-journal for a few years, then cancelled the subscription. The publisher removed access to the entire journal; the Library could no longer access even the volumes that it had paid for.
  • An e-book publisher went out of business; the Library lost access to hundreds of titles at once.
  • Sometimes, technical or connection problems occur that make hundreds of titles (they are usually bought in packages) temporarily unavailable.

However, these examples conflate several related issues with digital content: DRM, and the software-as-a-service model. Part of the problem with Amazon’s 1984 incident is DRM and the associated metadata. When I purchase mp3s from Amazon, I can back them up and play them in different players. Because the file is a generic file without source metadata or locking capability, Amazon couldn’t take the files back if it tried.

This issue is related to, but different from, purchasing content as a service. A better analogy to the library’s digital subscriptions is the “web songs” from lala.com. The online music service lets you access music in three formats: CDs, mp3s, and “web songs,” which cost 10 cents and can only be streamed. I expect that web songs will not outlast Lala’s corporate lifespan, and might go away at any moment. Digital subscriptions are more like web songs: they are perpetually dependent on the existence of the provider and the terms of service.

The final issue is contract terms. There is a long tradition of contracts that give you temporary access to things you don’t own. This is called “renting” or “leasing.” The library contracts are clearly rental contracts, where the agreement is that service will continue as long as the library pays its bill. Rental contracts, of course, can apply to physical objects. You can lease a car, and the leasing company expects the car back when the lease is done, or if you stop paying your bill. If the lease terms are “in perpetuity,” this practically means “for the lifetime of the provider.” And even perpetual lease terms typically allow the provider to change the terms.

With “content as a service,” buyers need to be especially aware of whether the service holds content they themselves contributed. When I upload content to a photo sharing service, for example, I explicitly want the right to get my content back at any time.

The issues of DRM and SaaS go together in that DRM makes it easier to turn contracts that look like purchases at first glance into conditional rentals. This is what Amazon appears to have done with the Kindle. When a contract is explicitly a rental contract, the subscriber should expect to be tied to the lifetime and the changing discretion of the provider. When a contract is for downloaded, non-DRM content, it’s at least possible to create a traditional agreement of sale.

When I’ll get a Kindle

The current Kindle debacle, in which Amazon deleted copies of George Orwell’s 1984 from the devices of people who thought they bought the book, highlights one of the three main reasons that I haven’t bought a Kindle yet.

DRM is first. The 1984 scenario shows dramatically that when you “buy” a DRM’d product, you don’t own it. DRM fragments the bundle of rights you have when you buy a paper book – you can lend it, sell it, read it out loud, take it wherever you go. DRM lets the provider set the terms, for example restricting the use of an audio feature that reads the book aloud. DRM blocks one of the main reasons I buy books in the first place – the ability to share books with friends. The 1984 example shows the limitation clearly. In a DRM world, users do not have rights to things that historical experience leads them to mistakenly think they own. I’m not at all fooling about staying away from DRM: I avoided digital music until the industry walked away from DRM as the norm.

The second reason is social. If I’m getting books online, I want to be able to choose to share them. The internet makes it possible to create wonderful social applications for reading books together, commenting, annotating, creating clubs and discussion groups, and discovering other people’s collections. LibraryThing goes a little way there. The Kindle experience is isolated – it’s even less social than physical books, which you can at least lend to a friend. It’s less social than the bookshelves that disclose the history of your reading interests to your friends. After (and only after) the DRM is gone, good social capabilities would make it much more compelling to use a Kindle-like device – paired with a service for sharing.

The third reason is inventory. One of the big benefits of digital music is that publishers have digitized large portions of their back catalogs. This means that one can search for and acquire a wide variety of music, ranging from the hyper-popular to the moderately obscure. Almost everything I want to listen to, with a small number of exceptions, is available digitally. This isn’t the case for books. Kindle inventory is clustered at two ends of the spectrum. New popular books are all on Kindle. Old, classic, public domain works are on Kindle. But there’s a large number of moderately obscure, somewhat older books that aren’t, and this sort of book represents a good portion of the books I buy. The last two books I bought: a biography of an author, published in the early 90s, and a music tutorial (thanks Tracy for the recommendation). Turning around to look at my bookcase: “Merchants of Desire,” a superb history of retail and mass merchandising, published in ’94. “More Work for Mother,” Ruth Schwartz Cowan’s brilliant classic on the history of household technology, published in the mid-80s. Not on Kindle. Until the large majority of books I want to buy are on Kindle, it’s not so helpful for me.

So, can I see having a Kindle-like device? Eventually. After DRM is gone – and I believe it eventually will be, just as it’s gone for music. When the social experience is better than reading printed books. And when the majority of books I want to read, including a couple of decades of back catalog, are online. These things will happen eventually, and I can wait.

Update. A blogger who’s a library professional gives wonderful examples of losing access to digital content. But these examples conflate the issues with DRM and with Software as a Service content. I unpack these issues in a separate post.

Update 2. The same reasoning applies for when I’ll get an iPad. The iPad will have a greater variety of content, but I’m not much of a gamer. It seems like a nice platform for graphic novels, but an expensive investment just to read comic books. I’m not as opposed to consumption devices as, say, Cory Doctorow – I’ll get one when DRM is gone and when the experience can be social.

Google book settlement – what can we do?

The Google book settlement may be the most important under-covered tech policy and digital rights issue in play today. And because it deals with the trailing edge of the long tail, it isn’t getting the attention it deserves.

What is this about? The Google Books project was sued by representatives of the rights-holders of a small fraction of the millions of out-of-print books that it scanned. Instead of taking the case to trial, Google hammered out a settlement, which now needs to be approved by a judge.

The problem is that the settlement, as written, has major antitrust and anti-consumer implications. It gives Google the exclusive right to scan out-of-print books and make them available. Anyone else – a competitor like Amazon, an academic institution, or a person – would risk getting sued if they also wanted to digitize books. Legal scholar Pamela Samuelson called out the problems on the O’Reilly Radar blog.

So is this a done deal? Not necessarily. The judge has the power to accept the agreement as written, require modifications, or reject the agreement. There are strong arguments to keep it but fix it.

Professor James Grimmelmann of New York Law School has written a clear paper explaining how to fix the settlement: remove the monopoly aspect by giving the same terms to future participants, and put the public at the table by mandating library and reader representation on the registry board. A more detailed summary of Grimmelmann’s recommendations can be found at the Law Librarian Blog. Months before the #amazonfail debacle, Grimmelmann recommended a provision that would prevent secret censorship by silently delisting books.

The US Justice Department is investigating anti-trust implications of the settlement, and the judge extended the deadline for parties to file briefs in the case by five months, from July to September.

So what can we citizens do? While individuals can’t lobby directly — this isn’t a legislative issue where you can contact your congress critter — parties with an interest in the case can express opinions to a judge.

The Electronic Frontier Foundation is recommending modifications, including putting book scans in escrow. Public Knowledge also plans to file comments advocating keeping the settlement but modifying it to address problems with competition and access to orphan works. The American Library Association has weighed in. The Internet Archive and Consumer Watchdog have registered comments with the Justice Department, as reported by the New York Times.

But this issue isn’t getting as much attention as it deserves. It deals with the trailing edge of the long tail – works that are out of print, whose authors can’t be found. Issues that deal with new technology and consumer access – net neutrality, social media privacy – seem sexier. But over time, access to the scholarly knowledge and cultural resources locked in “out of print” works has cumulative value. Granting monopoly privileges is a slow drain on freedom, rights, and knowledge.

Some of the larger organizations that often get engaged to protect consumer rights and digital freedoms — including Free Press, Consumers Union, and the Center for Democracy and Technology — had not yet gotten involved when I last checked (before the original deadline). They may have gotten involved since — I’ll check, and would love to hear from folks in the know.

If you are a supporter of EFF, Public Knowledge, or the Internet Archive, thank them for their support on this issue. If you are a supporter of other groups that advocate digital rights, consumer protection, and academic freedom, please let them know that you care about this issue, and ask them to weigh in.

Netizen ghosts, or what makes the internet “real”

It reads like a Cory Doctorow satire, but it’s true. Bruce Sterling, the eminent science fiction author, and his wife of four years, Jasmina Tesanovic, received an INS notification of pending deportation for Jasmina. A globetrotting couple who organize most of their lives online, they don’t jointly own a house, didn’t go for traditional paraphernalia like wedding china, and have separate bank accounts. Where would one find evidence of their lives together? Flickr photos, YouTube videos, a BoingBoing wedding announcement. Bruce needed to make a special Wired Magazine plea for people who know them personally to write the INS before April 15 and testify that they are in fact married. I’ve met Bruce, but don’t know them well enough for that INS form; if you do know them personally, please stop reading this right now, tell the INS that they’re for real, and then come back.

Once the bureaucratic nightmare for Bruce and Jasmina is fixed, what remains interesting is the difference of opinion about what counts as “evidence” and “real.” The INS is still stuck with an old-fashioned definition of evidence, even though courtrooms have been using email as evidence for a while. The US Federal Rules of Civil Procedure were updated in 2006 with detailed guidelines on how to use email and other electronic information in court.

The epistemological conflict doesn’t just pertain to the dusty bureaucrats at the INS. Even Wikipedia has trouble with online sources, as can be seen in this dispute about whether to keep a Wikipedia page on RecentChangesCamp. The event, a regular gathering for a distributed tribe of wiki-keepers, is well documented in blog posts, online photos, a Twitter stream, and so on. But what eventually persuaded the Wikipedia editors was an article in the Portland, Oregon newsprint business paper. The most chilling aspect of the Wikipedia policy is that blogs don’t count as evidence of notability. In other words, evidence in the endangered Boston Globe counts, and evidence in the prospering and clearly journalistic Talking Points Memo apparently doesn’t. Another problematic piece of Wikipedia’s policy is the requirement for secondary sources. An event like TransparencyCamp or EqualityCamp is documented by numerous attendees, but unless the San Francisco Chronicle sends a reporter, EqualityCamp doesn’t exist. Attacked by curmudgeons as “unreliable,” Wikipedia ironically places excessive credence in offline sources. As more traditional papers go extinct, and more reporting is provided by online media and peer media, what on earth will Wikipedia do to prove that things are real?

The answer, of course, is that stronger norms will develop about what makes internet evidence valid. Of course there are many internet sources that are bogus, just as there are forged documents and lies. But there are also plenty of techniques for evaluating the authenticity and reliability of electronic sources. We use them in a common-sense manner every day when reading email, evaluating blog comments, and rejecting fraudsters and spammers.

Surely there are other government agencies that have developed guidelines the INS could use to update its policies. If you know of any, here is the contact information for Janet Napolitano’s office at the Department of Homeland Security. Do any Wikipedia community members know of efforts to update the notability policy to accept Talking Points Memo and primary event coverage by numerous blogs and other online sources as evidence of notability?

The Bruce and Jasmina INS jam and the RecentChangesCamp kerfuffle show that policy rules and norms haven’t yet caught up with internet reality.

Database journalism – a different definition of “news” and “reader”

Politifact is an innovative journalism project built by Matt Waite as a project of the St. Petersburg Times, inspired by Adrian Holovaty’s 2006 manifesto on “database journalism.” Waite and Holovaty both focus on the “shape” of the information presented by database journalism – stories that have a consistent set of data elements that can be gathered, presented, sliced, and re-used. This structure is foreign to traditional journalism, which thinks of its form as the story, with title, date, byline, lede, and body.

The Politifact site started by fact-checking politicians’ statements during the 2008 political campaign. Each statement is rated on a five-point scale: True, Mostly True, Half True, Barely True, or False. Today, the most compelling piece on the site is the “Obameter,” tracking the performance of the president against over 500 campaign promises. Examples include: No. 513: Reverse restrictions on stem cell research – Promise Kept; No. 464: Reduce energy consumption in federal buildings – In the Works; and No. 446: Enact windfall profits tax for oil companies – Stalled.
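To make the “shape” concrete, here is a minimal Python sketch of what a consistently structured promise record looks like, using the three Obameter examples above. The field names are my own illustration; the post doesn’t describe PolitiFact’s actual schema.

```python
from dataclasses import dataclass

@dataclass
class Promise:
    """One tracked campaign promise; every record has the same fields."""
    number: int   # e.g. 513
    text: str     # the promise itself
    status: str   # "Promise Kept", "In the Works", "Stalled", ...

promises = [
    Promise(513, "Reverse restrictions on stem cell research", "Promise Kept"),
    Promise(464, "Reduce energy consumption in federal buildings", "In the Works"),
    Promise(446, "Enact windfall profits tax for oil companies", "Stalled"),
]

# Because the shape is consistent, the records can be sliced and re-used
# in ways a title/byline/body story cannot:
stalled = [p for p in promises if p.status == "Stalled"]
print(f"{len(stalled)} of {len(promises)} promises are stalled")
```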

The shape of the data is part of the picture. It’s certainly the biggest day-to-day difference if you’re composing news, or tools for news. But I don’t think it’s the lede. What’s different here is a different conception of what’s “new” and what a “reader” does.

What’s new: Traditional journalism is based on a “man bites dog” algorithm. What’s newsworthy is the dramatic reversal of expectations. Slow, gradual changes are not newsworthy. Large static patterns are not newsworthy. I suspect that this is part genre and part technology. The technology limitation is space: there isn’t room to publish many stats in a newsprint paper, and there are minimal affordances for navigation.

The emphasis on concise and dramatic “news” leaves our society vulnerable to “frog boiling,” the urban legend in which a frog in gradually heated water gets accustomed to the change, doesn’t jump out, and boils to death. The decline of the North Atlantic cod fishery or the Sacramento-San Joaquin delta is not newsworthy until the cod and the salmon are gone. Wage stagnation isn’t newsworthy until the middle class is gone. Tens of thousands dead on US highways each year isn’t newsworthy, though a traffic jam caused by a fatal accident is news. Many eyes hunting through financial data may find dramatic scandals, to be sure. With database journalism, perilous or hopeful trends and conditions can become worthy of storytelling and comment.

What a reader does: The rise of the internet has made reader participation a much greater part of news than the limited “letters to the editor” section. Dan Gillmor, former editor of the “ur-blog” Good Morning Silicon Valley, liked to say “my readers are smarter than me” because of the high-quality corrections and tips he’d get from his readers. Database journalism takes the trend a few steps further. Where a traditional news reader consumes the news, a database user interacts with it, looking for information and patterns. The “news” itself may be found by readers doing queries and analysis of the database, such as the database of Prop 8 contributions published by the San Francisco Chronicle.
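To illustrate the reader-as-analyst pattern, here is a minimal sketch of the kind of query a reader might run against such a contributions database. The table and column names are hypothetical; the Chronicle’s actual schema isn’t described in the post.

```python
import sqlite3

# Hypothetical schema for illustration only; not the Chronicle's actual database.
conn = sqlite3.connect("prop8_contributions.db")
conn.execute("""CREATE TABLE IF NOT EXISTS contributions
                (contributor TEXT, city TEXT, employer TEXT,
                 position TEXT, amount REAL)""")

# A reader's question, not an editor's: which cities gave the most in support?
rows = conn.execute("""
    SELECT city, SUM(amount) AS total
    FROM contributions
    WHERE position = 'support'
    GROUP BY city
    ORDER BY total DESC
    LIMIT 10
""").fetchall()

for city, total in rows:
    print(f"{city}: ${total:,.0f}")
```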

So database journalism isn’t just about having some fields that are different from “title” and “body”. It’s about different conceptions of time, space, and participation.

Spec work isn’t crowdsourcing

The term “crowdsourcing” is being borrowed by services that solicit design work “on spec.” Design services like CrowdSpring and 99designs run “contests” where everyone does the work for the client, but only one designer gets paid.

The familiar use of “crowdsourcing” is for services where people contribute freely to something of mutual benefit – Wikipedia, a support FAQ, an open source project, tips for an investigative story. In some uses of crowdsourcing, there is a commercial provider that aggregates the benefit of free labor – technology companies gain when their customers add FAQ entries, and a newspaper or commercial blog benefits when readers submit tips for a published story. But in all of these uses, everyone who contributes benefits too.

With spec work contests, many people do contract design work for no charge and only one of them gets paid. Spec work has long been common during recessions. More buyers are looking to save money, and more contractors are underemployed and willing to put in time doing work that they may not get paid for. These services take this pattern to the extreme by soliciting dozens or hundreds of free contributions. It’s unpaid labor plus a lottery ticket.

Call it unethical, call it lottery labor, but don’t call it crowdsourcing. And if this practice does become called crowdsourcing, we need another sort of term for freely contributed work that benefits everyone doing the contributing.

Journalists and bloggers – twitter for tips

One of the reasons the innovative political news blog Talking Points Memo is great is Josh Marshall’s practice of soliciting tips and research from the reading community. Since @joshtpm is now on Twitter, this could be a great new source for quick tips. One of the common and powerful uses of Twitter is to ask a question of one’s readers. It’s a quick way to learn from the collective knowledge of the community.

When TalkingPointsMemo solicits tips, they can list the Twitter address among the ways to reach TPM. Journalists new to Twitter should know that you don’t need to follow people to see their tips. All the TPM team would need to do is go to Twitter search, enter @joshtpm, and see mentions of that Twitter handle. TPM could even set up a public tip line by sharing a “hashtag” – a word that starts with a hash mark – like #tpmtips. Then they can use a persistent search to see all the tweets with #tpmtips. Of course, this works only for public tips – but there is plenty of information about our political system that hides in plain sight – for example, what a politician is saying in his own district.
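As a rough sketch of what such a “persistent search” tip line could look like, the snippet below polls for the hashtag and handle mentions. It assumes the Twitter Search API of the time (search.twitter.com/search.json), which has since been retired, so the endpoint and response fields here are assumptions rather than a current recipe.

```python
import json
import time
import urllib.request

# Assumption: the old search.twitter.com JSON endpoint and its "results"
# field; this API has since been retired and the details may differ.
QUERY = "%23tpmtips+OR+%40joshtpm"   # #tpmtips hashtag or @joshtpm mentions
URL = "http://search.twitter.com/search.json?q=" + QUERY

seen = set()
while True:
    with urllib.request.urlopen(URL) as response:
        data = json.load(response)
    for tweet in data.get("results", []):
        if tweet["id"] not in seen:          # only show tips we haven't seen
            seen.add(tweet["id"])
            print(tweet["from_user"], ":", tweet["text"])
    time.sleep(60)                           # re-run the search once a minute
```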

The fact that WSJ reporter Julia Angwin sees Twitter as primarily a broadcast medium – even though she learned about it by following colleagues – suggests that using Twitter as a means of learning from your audience hasn’t yet gotten good adoption in the journalist community.

Sydney software dev faces lawsuit for iPhone app

ZDNet Australia reports that a software developer in Sydney, Alvin Singh, is being threatened with a lawsuit for writing an iPhone app that is the second most popular item in Australia’s iPhone app store. Rail Corporation of New South Wales, the government body that administers the railroad, charges the developer with violating copyright, but offers no way to authorize developers to access the schedule data.

TransparencyCamp was all about getting social benefit from publishing and reusing government data. Bay Area TransitCamp was about ways of creating tools and services with transit information to improve transit service. There’s a movement around providing more public access to government data so that citizens like Alvin Singh can provide useful services.

Developers who write useful applications with government data should get awards not lawsuits.

What’s different about Enterprise Twitter

Twitter has taken off on the public web, and a variety of vendors are offering “Twitter for the Enterprise.” As with social networking, it’s not enough to simply clone Twitter and deploy it for business users. Here are some of the key ways that enterprise microblogging is different. This is part two of a series on what’s different about enterprise software; Part 1 is crossposted here and here.

With public Twitter, people use nicknames. Many people add a profile link that identifies who they are in the real world. Many do not, and tweet pseudonymously. In a business setting, the signal is tied to the user’s real-world identity, derived from their company directory entry and business activities. You can navigate from a signal to a profile, and discover a lot about the person in their work context. A significant part of the value people get from enterprise social software is finding the smart and plugged-in people in their organization. Microblogging helps discover the interesting people, and the links to rich work-context profiles reveal more about what each person does and knows.

With public Twitter, one of the common usage patterns is to share links. Well-informed, insightful people scan the news and share interesting tidbits with their followers. This valuable pattern on the public net gains power inside an organization. People can share links and commentary about documents they are working on – for example, a marketing plan or a budget. And they can share private commentary about public links; for example, there can be a company-private discussion about a move by a competitor. Enterprise microblogging allows users to share links to private content, and to share private discussion about public content.

Confidentiality

The main difference between Twitter and enterprise microblogging is confidentiality. You’re not sharing information with the big wide world, only with your colleagues. As in personal life, confidentiality frees people to share more openly about nonpublic topics. Of course, people still need to be cognizant of what they share, as they are in meeting rooms or around water coolers.

Inside an enterprise, microblogging has a different balance of transparency and privacy than email. With email, your message is visible only to the people you choose to send it to.  With enterprise microblogging, the recipient chooses who to follow, and whose messages to see. This provides useful “ambient transparency” in an organization, for example spreading useful knowledge about products in development and customer relationships.  Enterprise microblogging is more private than public Twitter, and more transparent than email.

The Art of Enterprise Social Software

As you can see, it’s not enough to take an existing piece of social software and run it behind the firewall. Adapting social software to the enterprise requires considering how business and social environments differ, and how social software can be used to provide business value.