Pro (Web 2.0) Mashup Development: Remixing data and services (forthcoming book from Apress). (http://blog.mashupguide.net)

Introduction

How many times have you seen a website and said, "This would be exactly what I wanted, if only..."? If only I could combine the statistics here with data from my company's earnings projections. If only I could take the addresses for these restaurants and plot them on one map. How often have you wished you could enter the date of a concert into your calendar with a single click instead of retyping it? How often do you wish that you could make all the different parts of your digital world (your email, your word processor documents, your photos, your search results, your maps, your presentations) work together more seamlessly? (After all, it's all digital and malleable information; shouldn't it all just fit together?)

In fact, below the surface, all the data, websites, and applications you use do fit together. This book teaches you how to forge those latent connections, to make the web your own, by remixing information to create your own mashups. A mashup, in the words of Wikipedia, is a "website or web application that seamlessly combines content from more than one source into an integrated experience."1 Learning how to draw content from the Web together into new integrated interfaces and applications, whether for yourself or for others, is the central concern of this book.

Let's look at a few examples to see how people are remixing data and services to make something new and useful:

* Housingmaps.com brings together the housing and rental listings from craigslist.com with Google Maps. Note that it was invented by neither Google nor Craigslist but by an individual programmer, Paul Rademacher. Housingmaps.com adds to the functionality of Craigslist, which will show you on a map where a specific listing is located but not all the rentals or houses in an area.2

* Google Maps in Flickr (GMiF) brings together Flickr pictures, Google Maps, Google Earth, and the Firefox browser via Greasemonkey.3

* The LibraryLookup bookmarklet is a JavaScript bookmarklet that connects Amazon.com and your local library catalog.4
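
To make the LibraryLookup idea concrete, here is a minimal Python sketch of the same kind of lookup done outside the browser. It is not the bookmarklet's actual code. It assumes that the product URL carries a 10-character ISBN after a path segment such as /dp/, and the catalog address catalog.example.org and its isbn query parameter are hypothetical placeholders for whatever search URL your own library uses.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Hypothetical catalog endpoint; a real library's search URL will differ.
CATALOG_SEARCH_URL = "https://catalog.example.org/search"

# Assumption: the ISBN-10 appears in the URL path after /dp/, /gp/product/, or /ASIN/.
ISBN10_IN_PATH = re.compile(r"/(?:dp|gp/product|ASIN)/(\d{9}[\dXx])(?:[/?]|$)")


def extract_isbn(product_url: str) -> Optional[str]:
    """Return the ISBN-10 found in the URL path, or None if there is none."""
    match = ISBN10_IN_PATH.search(product_url)
    return match.group(1).upper() if match else None


def library_lookup_url(product_url: str) -> Optional[str]:
    """Build a (hypothetical) library catalog search URL for the book on the page."""
    isbn = extract_isbn(product_url)
    if isbn is None:
        return None
    return CATALOG_SEARCH_URL + "?" + urlencode({"isbn": isbn})
```

Passing a bookstore product URL to library_lookup_url returns a catalog search link for the same ISBN, and a URL with no recognizable ISBN returns None. The whole mashup is just this: pull one identifier out of the first site and hand it to the second.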

In order to create our own mashups and customize the web, we will look at these examples in greater detail, in addition to many other large and small examples. There are countless specific problems you can solve through remixing information. Here are some examples of strategies we will learn:

* quickly extract bookmarks in your web browser and post them to your weblog, or format them for an email or word-processing document

* display your photos in Google Maps and Google Earth

* create a “dream journey” map of locations that you want to visit based on addresses that you find on various websites. It is especially interesting to make a map of more than one address

* synchronize your calendars and share them with other people

* take book information from diverse sources (Amazon.com, Barnes & Noble, and library catalogs) and format it into a common bibliography to include in a Word manuscript.

* republish Word documents that are custom-formatted for your website

* take a book that you found on Amazon.com and instantly locate it in your local library

* take data from the Web and turn it automatically into a polished word-processing document

* make it easier to add tags to anything on your computer.

* add an event listed on the Web to your calendar and e-mail it to other people with one mouse click.

Mashups are certainly hot right now, which is interesting because it makes you part of a shared undertaking, a movement. Mashups are fun and often educational. There's delight in seeing familiar things brought together to create something new that is greater than the sum of its parts. They don't necessarily ask to be taken that seriously. And yet they are also powerful: you can get a lot of functionality without a lot of effort. They may not be built to last forever, but you can often get what you need from them without investing more effort than you want to in the first place.

The Web 2.0 movement

The Web 2.0 bandwagon is an important reason why mashups are popular now. Tim O'Reilly explicitly discusses mashups (under the phrases "remixable data source" and "the right to remix") in "What is Web 2.0?"5 What matters is not only the attention generated by the hype but also the development of what might accurately be thought of as "Web 2.0 technologies and mindsets" for remixing and reusing data, web services, and micro-applications to create hybrid applications. Recent developments bring us closer to enabling users to recombine digital content and services:

* increasing availability of XML data sources and data formats in business, personal, and consumer applications (including office suites)

* wide deployment of XML web services

* widespread current interest in data remixing or mashups

* AJAX and the availability of JavaScript-based widgets and micro-applications

* evolution of web browsers to enable greater extensibility (e.g., Firefox extensions and Greasemonkey scripts)

* explosive growth in "user-generated content" or "lead-user innovation"

* wider conceptualization of the Internet as a platform ("Web 2.0")

* increased broadband access

These developments have transformed creating mashups from being technically challenging to nearly mainstream. It is not that difficult to get going, but you need to know a bit about a fair number of things and you need to be playful and somewhat adventurous.

Will mashups remain hot forever? Undoubtedly not. They will not be forgotten, however; rather, I would argue that the functionality we see in mashups will eventually be subsumed into the ordinary what-we-expect-and-think-has-always-been-there functionality of our electronic society.

Moreover, mashups reflect deeper trends, even the deepest trends of human desire. As the quality, quantity, and diversity of information grow, users long for tools to access and manage this bewildering array of information. Many users will ultimately be satisfied by nothing less than an information environment that gives them seamless access to any digital content source, handles any content type, and applies any software service to this content. Consider, for example, what a collection of bloggers expressed as their desires for next generation blogging tools6:

Bloggers want tools that are utterly simple, and allow them to blog everything that they can think, in any format, from any tool, from anywhere. Text is just the beginning: Bloggers want to branch out to multiple media types including rich and intelligent use of audio, photos, and video. With input, having a dialog box is also seen as just a starting place for some bloggers: everything from a visual tool to easy capture of things a blogger sees, hears or reads point to desirable future user interfaces for new generations of blogging tools.

Overall Flow of the Book

A central question of this book is: "How can both non-technical end-users and developers recombine data and internet services to create something new for their own use and for others?" Although this book focuses primarily on XML and web services and the wide variety of web applications, I also examine the role played by desktop applications and operating systems.

Mashups combine data from more than one source into a coherent whole. You can get to an understanding of a mashup by asking a number of fundamental questions:

* What is being combined?

* Why are these elements being combined?

* How are they being combined, both in the interface and behind the scenes in the technical machinery?

* How can the mashup be extended or elaborated upon?

The overall flow of the book is: what can be done with no programming -> programming one system (through its API) -> figuring out how to combine two or more systems -> creating "service composition frameworks" for combining arbitrary systems.

It would be easy to veer off into heavy-duty theory in this book. Instead, we will keep grounded in "practical interoperability" (a grab-what-we-can-from-wherever approach) while dipping into the deeper pools of grand unification efforts (such as the full semantic web vision) that have so far not come to full fruition.

The book's structure

The following is a breakdown of the parts and chapters of the book.

Part I: "Remixing without programming" introduces mashups without demanding programming skills from the reader and teaches skills for deconstructing applications for their remix potential.

* Chapter 1: "Learning from a Study of Specific Mashups" analyzes in detail a selection of mashups and remixes (specifically, housingmaps.com, Google Maps in Flickr, and the LibraryLookup bookmarklet) to orient readers to mashups in general and to some general themes we will revisit throughout the book.

* Chapter 2: "Looking at Flickr, Delicious, Google Maps, and Amazon.com as end-user tools" analyzes Flickr, our primary extended example, to show what makes it the remix platform par excellence and how to exploit the many features that make it so remixable. The chapter compares and contrasts Flickr with other remixable platforms: del.icio.us, Google Maps, and Amazon.com.

* Chapter 3: "Tagging and Folksonomies." Tagging, which allows users to attach words to pictures, websites, people, and almost anything else on the Web, is the glue that holds many things together, both within and across websites. This chapter illustrates how tags are used in Flickr, Delicious, and Technorati and discusses how to create interesting tag-centric mashups, how people are "hacking" the tagging system to create ad hoc databases, and how tags relate to other classification systems.

* Chapter 4: "RSS and Atom and syndication; integration with news readers" presents RSS, perhaps the most widespread dialect of XML, both as a potent technology for remixing in its own right and as a specific way to learn about XML more generally. Not to be missed is the section on the various RSS- and Atom-related formats and their significance for information remix.

* Chapter 5: "Integration with Weblogs and Wikis" uses Flickr's integration with weblogs as a jumping-off point for an exploration of weblogs and wikis and their programmability. Integration with weblogging is an important topic because blogs represent a type of remixing in a narrative, as opposed to the "data-oriented" remixing via tags and straight RSS discussed so far. Integrating with wikis, particularly Wikipedia, is also covered in this chapter.

Part II: "Remixing a single web application using its API" concentrates on teaching the broad classes of web-based APIs by studying exemplars of each class.

* Chapter 6: "Learning XML Web services APIs through Flickr." In addition to being an exemplar for a range of non-programming remixing techniques in Part I, Flickr is also an excellent playground for learning XML web services. This chapter explains REST vs. XML-RPC vs. SOAP; teaches the basics of web programming, including CGI, static vs. dynamic content, and HTTP; and emphasizes what one can learn by spotting correlates between the UI and the API. Though Flickr is the central example in this chapter, I will strive to make the focus not Flickr but XML web services.

* Chapter 7: "Other XML Web Services APIs" explicates commonalities and contrasts among various API providers, specifically those between Flickr and other systems, and surveys the types of services available and how to think about the sheer range of APIs. This chapter looks at sites such as programmableweb.com that document these various APIs and at the challenges faced in doing so.

* Chapter 8: "Learning AJAX/JavaScript widgets and their APIs" describes the other large class of web application remixability: that of JavaScript-based widgets, many of which are AJAX applications. This chapter contrasts "old-style" web applications with AJAX approaches through specific examples in Flickr and other applications, introduces the various JavaScript widget libraries (e.g., Yahoo UI, script.aculo.us, Prototype, and Dojo), and uses one of these libraries to demonstrate how to program widgets.

Part III: "Remixing several applications" builds upon the techniques explicated in Parts I and II to discuss techniques for remixing several applications.

* Chapter 9: "Dissecting mashups and remixes" turns to a study of how to combine two or more services. This chapter dissects one or two specific examples in detail but also tries to draw out general mashup design patterns to look for when studying a range of mashups. The Google Maps in Flickr (GMiF) Greasemonkey script (http://webdev.yuan.cc/gmif/), which mashes up Flickr, Google Maps, and Google Earth in the Firefox browser via Greasemonkey, will be dissected in detail.

* Chapter 10: "Creating Mashups of Several Services" teaches how to write mashups by providing a detailed example that we build from the ground up: a stripped-down version of (what used to be) geobloggers.com functionality (Flickr pictures + map).

Part IV: "Remixing arbitrary applications" explores the process of remixing large numbers of services and data. Since there are no clear-cut answers yet and numerous challenging issues, we will focus on the practical and doable while sketching out the challenges and articulating some abstract principles.

* Chapter 11: "Service Composition Frameworks" introduces the notion of a "service composition framework," which allows for facile integration of services and data with minimal custom work. The chapter explores frameworks for end-users looking to reuse data with a minimal level of programming and customization, as well as frameworks for those who can and are willing to do some programming. The chapter also surveys emerging products for generating "situational software."

* Chapter 12: "The Quality and Remixability of APIs" shifts the focus of the book briefly from the consumption to the production of APIs. The chapter discusses the characteristics of high quality APIs, what makes services and data more or less mixable, and appropriate boundaries to set for how data should be recombined.

Part V: "Special topics" covers how to remix and integrate specific classes of applications, using the core conceptual framework of Parts I-IV to guide the discussion.

* Chapter 13: "Online Maps and Google Earth" covers popular online maps and "virtual globes," offering examples of map-based mashups. We look at making maps without programming and at data exchange formats (GeoRSS, KML), then turn to the various APIs: Google Maps, Yahoo Maps, Microsoft's maps, and MapQuest maps. We also take up geocoding and JSON output. Google Earth shows up in the culminating examples: geocoding a few addresses to generate online maps and KML, and showing Flickr pictures in Google Earth.

* Chapter 14: "Social Bookmarking and bibliographic systems" covers how social bookmarking responds to a fundamental challenge, the job of keeping found things found on the Web: at a basic level, URLs, but also other digital content such as images and data sets. Social bookmarking is interesting not only for the extensibility and remixability being built into these systems but also for the insight it offers into other systems. Delicious, the granddaddy of social bookmarking sites, is generally credited with kicking off the latest wave of social bookmarking. This chapter walks through a select set of social bookmarking systems and their APIs, as well as interoperability challenges among these systems.

* Chapter 15: "Online calendars and Event Aggregators" shows what data you can get in and out of calendars without programming (using iCalendar, XML feeds, hCalendar microformats), how to program individual calendars (GCal, Yahoo calendar (some day), 30boxes), how to move data from one calendar to another and move event data into calendars, and republish calendars into public event data.

* Chapter 16: "Online storage" surveys a potentially important and growing area, reviews online storage solutions, shows the basics of using Amazon S3 and box.net and compares the two systems.

* Chapter 17: "Office Suites (ODF vs Microsoft OpenXML)" shows how to do some simple parsing in ODF and OpenXML, demonstrates how to create a simple document in both ODF and OpenXML, explains some simple scripting of Microsoft Office and OpenOffice.org, and surveys the world of possibilities unleashed by manipulating open document formats.

* Chapter 18: "XML, RDF, microformats" outlines the XML vs. RDF debate and presents practical examples of how RDF can be used, as well as simple examples of microformats as used in popular services.

* Chapter 19: "Search" shows how to use the Google and Yahoo search APIs and compares them to the search-like APIs in services such as Amazon.com and to library-oriented search protocols.

Part VI: Appendices, including a how-to on setting up your computer to create mashups and a background tutorial on HTML, CSS, and XML.

Intended Audience

This book will be accessible to a wide range of readers, including those who are curious about Web 2.0 applications and who want to know more about the technical underpinnings. The technical prerequisites are a good understanding of HTML, basic CSS, and basic JavaScript. References to appropriate background materials will be provided.

At the same time, the experienced developer will also be able to learn much from the book. Although there will be a breadth of coverage, I will strive to state deep, essential things about the technologies in question (with respect to their applicability to remix), things that might not be obvious at first glance.

Information remix can easily come across as a confusing grab bag of techniques. Beginners have a hard time understanding the significance of XML, web services, AJAX, COM, and metadata for remixing data. It is not that difficult to get going, but you need to know a bit about a fair number of things and you need to be playful and somewhat adventurous. The answers that students of this topic are looking for are mostly to be found scattered throughout a large selection of books, but they need a guide to know where to begin. My book is aimed at serving as that guide.

Note on Platform

Ideally, this book will have examples that will be useful to readers regardless of the platforms and languages they use or are familiar with. The reality is that there are limitations on both the writer's and the readers' parts. On the writer's part, while I have a sense of some general principles, I have implemented them in specific ways (languages, platforms) and not in all possible ways, or even in a representative range of techniques.

In this book, most of the server-side code will be presented in PHP. Some code will be in Python; other examples will be worked out within the .NET and Java frameworks. (What I hope to do is expand my knowledge of specific platforms, draw in contributors to offer examples in other languages, and certainly write about how to work in a multi-language world.)

1 http://en.wikipedia.org/wiki/Mashup_(web_application_hybrid), accessed as http://en.wikipedia.org/w/index.php?title=Mashup_%28web_application_hybrid%29&oldid=98002063

2 http://housingmaps.com

3 http://webdev.yuan.cc/gmif/

4 http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookup.html

5 http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

6 http://www.cadence90.com/blogs/2004_03_01_nixon_archives.html#107902918872392913

DRAFT Version: 2007-04-10 21:32:03 Copyright Raymond Yee.

