
Google News RSS opens up a whole set of rights issues

As you will no doubt be aware, Google have released a service that outputs any Google News search or subject listing in RSS or Atom format. This is achieved by adding &output=rss or &output=atom to the URL of the search or listing.
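As a rough illustration of the parameter described above, here is a minimal Python sketch that appends output=rss or output=atom to an existing search URL. The host and path in the example are placeholders, not Google's actual address, and the helper name with_feed_output is made up for this sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_feed_output(url: str, fmt: str = "rss") -> str:
    """Return url with an output=<fmt> query parameter added or replaced."""
    if fmt not in ("rss", "atom"):
        raise ValueError("fmt must be 'rss' or 'atom'")
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any earlier output= value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "output"
    ]
    params.append(("output", fmt))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL; only the output parameter comes from the post.
print(with_feed_output("http://news.example.com/news?q=space+shuttle", "atom"))
```

Replacing an earlier output value rather than appending a second one keeps the URL unambiguous if someone converts an RSS link to an Atom one.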

The feeds themselves include a Google-produced summary (usually ignoring the provider’s summary) and the image associated with that story, not forgetting the obligatory headline and link to the story.

In addition to making this service available for newsreaders, Google have also created a licence to allow end-users to reproduce the RSS and Atom feeds on their own websites.

All well and good I guess. Except there are two potential issues here:

  • Most news providers have their own RSS terms and conditions, which may differ from the Google News RSS licence
  • The images included with the stories are usually agency stills (AP, Reuters, Getty, etc.). These agencies exercise strict controls over the reproduction of their images and it’s not clear how Google can be in a position to sub-license their reproduction on third party sites.

Contradicting Licence
So, a news provider might release a licence to accompany the RSS feeds made available from their site.

That licence may include a definition of what types of sites can and can’t reproduce the news contained within the feeds. Political, pro-terrorism, and pornographic websites are all examples one could easily expect to fall in this category.

Google News offers the functionality to search articles from just one provider by using the source: provider modifier in the query string (eg “Space Shuttle” from CNN: Web, RSS).

An end user could use that modifier to create a feed of stories from a single news provider, via the Google News RSS service, and include it on their site under the authority of the Google News RSS licence.

That licence is going to be different to the original news provider’s licence which would have come into play if those same stories were sourced from the news provider directly.

In fact the Google licence will probably be less restrictive than the original news provider’s licence because Google are working to a different dynamic to most news providers. Google want as many people as possible to see their feed, whereas a news agency may wish to protect its brand by restricting who can take its feed.

And if individual news providers were able to successfully negotiate with Google that their own terms and conditions must prevail in any resyndication of their stories, how would that work in a normal situation where multiple news stories are returned in a single feed?

Finally, there’s an issue of warranty and feed integrity. It’s likely Google will include advertising in their Google News RSS/Atom feeds at some point in the future (to pay for the service).

Regardless of whether the original news provider is interested in also providing advertising in their direct feeds or not, this is an issue. Commercial news providers will be missing out on potential advertising revenue being leveraged on their content, and non-commercial news providers might feel their position is being compromised.

Agency photo reproduction

Perhaps the biggest issue of all, however, is the images contained in the Google News RSS feed.

Most news providers don’t actually own the stills they use in their articles. In most cases they have deals with the agencies such as AP, Reuters and Getty, to reproduce stills on their respective sites (and only their site).

When Google News first went live, many were surprised that Google were reproducing licensed images on their site – without holding a licence directly with the agencies. I guess Google’s interpretation is that it’s “fair use”. AFP doesn’t agree, however.

The big issue here is that those images are now being redistributed by Google for reproduction on non-commercial third party websites.

It would be impossible for Google to have an agreement in place with every picture agency and so it can only be assumed that, once again, Google are arguing that this is fair use of the images.

It’s likely that the picture agencies themselves will think otherwise – they make money from licensing stills. (I would argue it’s unlikely they would be missing out on much business from “non-commercial” use, but they would probably feel their position is being compromised nonetheless.)

There is another issue here, too: exclusives.

What if a rights holder signs an exclusive, ‘one-time use’ licence with a newspaper for a set of photographs? That newspaper might publish the photos to its website as part of the deal, but that might be the only place on the Internet licensed to reproduce the photos.

The Google News spider can’t know the rights on a given picture, so it would pick up the image in the same way it does for all images, making it available for reproduction on the Google News website.

Google would also make that image available for reproduction on third party sites via their RSS and Atom feed – potentially leaving the end-site owner liable for action to be taken against them by the newspaper, and the original rights holder of the photo. The claim would be that they had reproduced a still that Google wasn’t licensed to give them nor were they licensed to publish directly.

Who knows where this is all going to lead.

I would like to think that the reproduction of agency stills on “non-commercial” websites would be classed as ‘fair use’. The news provider who used it will have paid the agency for the image to be used in association with that story. Assuming that each image is being reproduced as part of the representation of the original news item it was syndicated with on the non-commercial third party site, then I feel that should constitute ‘fair use’.

(To give an example, I would hope a rundown of the current news from CNN, with images, would be considered ‘fair use’ on a non-commercial website. But a picture gallery of otherwise unconnected images, with no reference or link to the news stories they originally accompanied, could not be considered fair use.)

But that’s just my own view – as an individual, interested in promoting “open media” – and not in any other capacity or on behalf of any company or organisation. As always, the views I express on my blog are my own and not necessarily shared by my employer.

Published in News Thoughts and Rants

8 Comments

  1. Crikey. This legal argument seems to almost go as far as the one over web server caching. If Google is inserting its own summaries, and is simply re-using the headline and a link, then Google could probably claim that they own the copyright to that feed, and can thus sub-license it. Yes, this means that any website could then contain a BBC news RSS feed, but then, there’s nothing to stop a website from manually linking headlines to BBC news stories anyway.

    The images, as you rightly point out, are a different issue though.

    Google could argue that news websites could, if they felt so inclined, stop Google from aggregating its image by simply using an appropriate robots.txt, which Google obeys. That could even allow or disallow certain images if necessary.

  2. Crap, can you fix the html in that above comment?

    Ben: Sure, done.

    (And delete this one to save my embarrassment)

    Ben: Er, no! 🙂

  3. Ben

    Frankie Robero said:

    Google could argue that news websites could, if they felt so inclined, stop Google from aggregating its image by simply using an appropriate robots.txt, which Google obeys. That could even allow or disallow certain images if necessary.

    This wouldn’t work because Google parse and spider the HTML pages to obtain the img source urls, not the img binary files themselves.

    So disallowing an /img/ directory in robots.txt wouldn’t stop this.
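The point in this comment can be sketched with Python's standard library. This is only an illustration of the argument, assuming a crawler that honours robots.txt but reads image URLs out of the HTML it is allowed to fetch. The robots.txt content, page markup, host name and the "ExampleBot" user agent are all invented for the sketch, and it says nothing about how Google's spider actually behaves.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the image directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /img/
"""

# Hypothetical story page that embeds an image from that directory.
PAGE_HTML = '<html><body><h1>Headline</h1><img src="/img/photo.jpg"></body></html>'


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# The story page itself is not covered by the Disallow rule...
page_allowed = rules.can_fetch("ExampleBot", "http://news.example.com/story.html")
# ...while fetching the image file is.
image_allowed = rules.can_fetch("ExampleBot", "http://news.example.com/img/photo.jpg")

# A crawler that may read the page still learns the image URL from the markup.
collector = ImgSrcCollector()
collector.feed(PAGE_HTML)

print(page_allowed, image_allowed, collector.sources)
```

Under these assumptions the rule stops the image file being downloaded, but the image URL is still visible to anything that parses the permitted HTML page.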

  4. Hi Ben

    First off, let me say I don’t know the answer either, but this certainly opens up a can of worms. Reading through the Atom 1.0 spec and the RSS 2.0 spec, it seems Atom 1.0 may have some of the answers here regarding rights and re-use.

    1. Atom also has support for aggregated feeds, where entries from multiple different feeds are combined, with pointers back to the feed they came from.

    2. Below is an AtomEntry and it has elements for , and most importantly as an example of a pointer. I think this may be similar to the Trackback feature for Posts?

    atomEntry =
      element atom:entry {
        atomCommonAttributes,
        atomAuthor
        & atomCategory
        & atomContent
        & atomContributor
        & atomId
        & atomLink
        & atomPublished
        & atomRights
        & atomSource
        & atomSummary
        & atomTitle
        & atomUpdated
        & extensionElement
      }

    Atom has the ability to reference entry content by URI. The listing below, for instance, illustrates how an Atom feed for a photo weblog might appear. The content element references each individual photograph in the blog. The summary element provides a caption for the image. The element could also be a reference pointer back to the originator, just as newspapers and BBC websites do with their images.

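    <feed xmlns="http://www.w3.org/2005/Atom">
      <id>http://www.example.org/pictures</id>
      <title>My Picture Gallery</title>
      <updated>2005-08-10T15:00:00Z</updated>
      <author>
        <name>Sam K Sethi</name>
      </author>
      <entry>
        <id>http://www.example.org/entries/1</id>
        <title>Trip to San Francisco</title>
        <updated>2005-08-10T15:00:00Z</updated>
        <summary>A picture of my hotel room in San Francisco</summary>
      </entry>
      <entry>
        <id>http://www.example.org/entries/2</id>
        <title>My new car</title>
        <updated>2005-08-10T12:00:00Z</updated>
        <summary>A picture of my new car</summary>
      </entry>
    </feed>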
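To make the idea of referencing entry content by URI a little more concrete, here is a small Python sketch that reads a photo-weblog style Atom feed and lists each photograph's title, image URI and caption. The src and type values on the content element are hypothetical stand-ins, and list_photos is just a name chosen for this sketch.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Sample feed; the content src and type values are invented for illustration.
FEED_XML = """<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://www.example.org/pictures</id>
  <title>My Picture Gallery</title>
  <updated>2005-08-10T15:00:00Z</updated>
  <entry>
    <id>http://www.example.org/entries/1</id>
    <title>Trip to San Francisco</title>
    <updated>2005-08-10T15:00:00Z</updated>
    <summary>A picture of my hotel room in San Francisco</summary>
    <content type="image/png" src="http://www.example.org/images/hotel.png"/>
  </entry>
</feed>"""


def list_photos(feed_xml):
    """Return (title, image URI, caption) for entries whose content is out of line."""
    feed = ET.fromstring(feed_xml)
    photos = []
    for entry in feed.findall(ATOM + "entry"):
        content = entry.find(ATOM + "content")
        # Skip entries with no content element or with inline content only.
        if content is None or content.get("src") is None:
            continue
        photos.append(
            (
                entry.findtext(ATOM + "title"),
                content.get("src"),
                entry.findtext(ATOM + "summary"),
            )
        )
    return photos


for title, src, caption in list_photos(FEED_XML):
    print(title, src, caption)
```

Because the feed carries only a URI, the image itself stays on the originating server, which is the property that matters for the rights discussion here.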

    3. Also included in Atom 1.0 is support for XML Digital Signature on entries. It might be possible to place a digital signature on the image object in the original feed. Again, would this stop the image being repurposed in another feed? I do not know, as I have not tried this.

    4. Atom can also use the rel attribute, which is becoming very popular for many things, including XFN social networking. Atom has rel support for:

    * alternate: Identifies an alternate version of the feed or entry, e.g. the originating webpage?
    * related: Identifies a resource that is described in some way by the content of the entry
    * via: Identifies a resource that provided the information contained in the feed or entry; for example, if the entry was distributed through an online aggregation service, i.e. Google News

    It is this last one that might once again be a way to provide a reference pointer back to the originator.
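As a hedged sketch of how a consumer might use such a pointer, the Python below groups an entry's links by their rel value, treating a missing rel as "alternate" as Atom 1.0 specifies. The entry, its URLs and the links_by_rel helper are all invented for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical aggregated entry: one link to the story, one back to the aggregator.
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://aggregator.example.com/items/1</id>
  <title>Example headline</title>
  <updated>2005-08-10T15:00:00Z</updated>
  <link href="http://publisher.example.org/story/1"/>
  <link rel="via" href="http://aggregator.example.com/news"/>
</entry>"""


def links_by_rel(entry):
    """Group an entry's link hrefs by rel; a missing rel counts as 'alternate'."""
    links = {}
    for link in entry.findall(ATOM + "link"):
        rel = link.get("rel", "alternate")
        links.setdefault(rel, []).append(link.get("href"))
    return links


entry = ET.fromstring(ENTRY_XML)
print(links_by_rel(entry))
```

A site republishing the entry could use the result to show where the item came from, although nothing in the format forces it to do so.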

    Or maybe, like Direct Marketing with its default opt-out clause, you could also create a default option using rel. So just as we have rel=”nofollow” we could have rel=”norights”.

    Then the user would have to explicitly create a relationship with the originating source, i.e. tick a box or accept some form of license/cookie. I doubt this would ever work, as we would have empty image boxes in feeds everywhere. But I think people would never click on the image.

    6. Extended Atom might fix the problem. Namespace extensions involve mixing new XML elements and attributes with the core Atom elements. For example, Atom defines elements that describe the moment when an entry was created and when the entry was published. However, imagine an application that produces entries whose content must expire at a given point in time (for example, a feed representing special sale offers or a weekly top-ten list). Atom does not provide any core elements that can be used to specify an expiration date. It is possible, however, to declare such an element in a separate namespace and include it in the Atom feed. Consumers of the feed who are not aware of the expiration extension element can simply choose to ignore it. But this might let the originator have an expiration on their content thus preventing its reuse beyond a certain period.
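A minimal Python sketch of the expiration idea follows. The namespace URI http://example.org/ns/expiry and the expires element are hypothetical, since Atom itself defines no such element. A consumer that does not know the extension simply never looks for it.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
# Hypothetical extension namespace; not part of Atom or any published extension.
EXP = "{http://example.org/ns/expiry}"

ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:exp="http://example.org/ns/expiry">
  <id>http://www.example.org/offers/1</id>
  <title>Weekly top ten</title>
  <updated>2005-08-10T15:00:00Z</updated>
  <exp:expires>2005-08-17T15:00:00Z</exp:expires>
</entry>"""


def parse_timestamp(text):
    """Parse a timestamp such as 2005-08-17T15:00:00Z into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def is_expired(entry, now):
    """True if the entry carries the extension element and its time has passed."""
    node = entry.find(EXP + "expires")
    if node is None or not node.text:
        return False  # No expiry information: treat the entry as still valid.
    return now >= parse_timestamp(node.text)


entry = ET.fromstring(ENTRY_XML)
print(is_expired(entry, datetime(2005, 8, 20, tzinfo=timezone.utc)))
```

Note that this only works if the republishing site chooses to honour the element, so it is a signal of the originator's wishes rather than an enforcement mechanism.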

    7. Also, there are two media extensions that I know very little about: mediaRSS from Yahoo and the Apple Extended Chapter extensions. Apple’s extensions are great but at the moment have nothing to do with the issue of rights we are discussing here, other than that in the future the chapters might show the originator or copyright next to each element of an aggregated RSS feed. The mediaRSS extension might come up with something in regard to licensing of content.

    8. And finally and sadly Google News is only being published with Atom 0.3 feeds at the moment. I’m surprised they didn’t go ahead and use Atom 1.0 instead of Atom 0.3, but I guess they will soon.

    What was funnier was Dave Winer’s reaction, which was quite predictable. He attempted to chide Google for using the generic term “News feeds” rather than “RSS”.

    In Dave’s words, “Like it or not… the technology is called RSS…. Like it or not … the format is RSS 2.0”. This, of course, is incorrect. RSS is one very popular way of publishing syndicated feeds, but it is not the only way and it’s not the only format. The technology is called syndicated feeds. The format is RSS and/or Atom. Based on my research and work with both, I think Atom will win out as people start to ask questions like yours, Ben, and find RSS 2.0 is not up to the mark.

    Sam

    P.S. Found this reference below, which made me smile. Please delete out the elements of this comment as it is far too long. I will probably write the same post on my site.

    Aug 2005 Kembrew McLeod is a self-professed prankster. In 1998 he trademarked the phrase “Freedom of Expression®” as a comment on how the intellectual property law is being used to fence off culture and restrict the way in which people can express their ideas. He is the author of two books: “Owning Culture” and, most recently, “Freedom of Expression®: Overzealous Copyright Bozos and Other Enemies of Creativity”.

    Ben, maybe that is the problem in the first place!

  5. Finally! Having news alerts e-mailed to me never really interested me but having them as an rss feed! Ahhh, thank you Google! And thank you for bringing them to my attention – going to add them to my feed reader now.

  6. Nick

    “Goole News RSS opens up a whole set of rights issues” — would this be news feeds exclusively for East Yorkshire coastal towns, or just a slip of the keybaord?

Comments are closed.