I've noticed in the Yahoo! Pipes forums (http://discuss.pipes.yahoo.com/) a steady stream of people asking essentially the same question:- how to combine information from two feeds so that:
a) Fields are taken from both feeds, and combined into a single output feed.
b) Items are retained from the first feed only if they have a corresponding matching item in the second feed.
In a nutshell, what they want to do is to join the two feeds in the same way that they would join two tables using a SQL statement like this:
SELECT * FROM Table1, Table2
WHERE Table1.keyValue = Table2.keyValue
Here I present a Pipes example where I do just that.
The tables
In this example I have a PUBLISHERS table stored in a CSV file at http://www.btinternet.com/~lotontech/pipes/data/Publishers.csv, and a BOOKS table stored as a CSV file at http://www.btinternet.com/~lotontech/pipes/data/Books.csv.The PUBLISHERS table contains:
PUBLISHER,URL
LOTONtech,http://www.lotontech.com
Wrox,http://www.wrox.com
Wiley,http://www.wiley.com
SAMS,http://www.sams.com
The BOOKS table contains:
PUBLISHER,BOOK,URL
LOTONtech,Mashup Case Studies with Yahoo! Pipes,http://www.lulu.com/content/2403205
LOTONtech,Creating Google Mashups...,http://search.barnesandnoble.com...
LOTONtech,Working with Yahoo! Pipes,http://www.amazon.com...
LOTONtech,Introduction to Microsoft Popfly,http://www.amazon.com...
Wrox,Professional Visual Studio 2005 Team System,http://www.amazon.com...
Wrox,Professional UML with Visual Studio .NET,http://www.amazon.com...
Wiley,Web Content Mining with Java,http://www.amazon.com...
Apress,Foundations of Popfly,http://www.amazon.com...
The results
My Pipe at http://pipes.yahoo.com/lotontech/cs_selectandjoin shows the result, in which:- The item Title is taken from BOOKS.Book
- The hyperlink (if you click on the title) is the book's URL taken from BOOKS.URL
- The description is the publisher's web site URL taken from PUBLISHERS.URL
Items are only included where the key values BOOKS.PUBLISHER and PUBLISHERS.PUBLISHER match, so the results exclude publisher SAMS (which appears in PUBLISHERS but not in BOOKS) and Apress (which appears in BOOKS but not in PUBLISHERS).
The Pipes
The basic pattern is to have a Pipe which traverses one of the feeds / tables, which — for each entry — invokes a sub Pipe that traverses the other feed / table.My Pipe LOTONtech-CS-SelectPublishersJoinBooks at http://pipes.yahoo.com/pipes/pipe.edit?_id=ehxw9rcl3RGkb6_InkartA reads the PUBLISHERS CSV file and, for each entry, passes the two fields PUBLISHER and URL into the sub Pipe LOTONtech-CS-SelectBooksJoinPublisher.
My Pipe LOTONtech-CS-SelectBooksJoinPublisher at http://pipes.yahoo.com/pipes/pipe.edit?_id=gPCyKrcl3RG9ih2gyp1_DQ reads the BOOKS CSV file and combines into every entry the two fields (from PUBLISHERS) that were passed in; thus forming a cross-product of all possible combinations. Since we don't want all possible combinations, it then filters to retain only those where the PUBLISHER field in BOOKS matches the PUBLISHER field passed in from PUBLISHERS.
0 comments:
Post a Comment