As I’ve written about previously, although I’m not directly on Twitter at the moment, I am still curating information using FlipBoard Magazines and currently “publishing” it weekly as the Weekly RiczWest. This is basically a static page on WordPress with some links to my FlipBoard magazines; it is available as a menu item at the top and is also tweeted using IFTTT (If This Then That).
This is not really a satisfactory solution, as I’d actually like to produce a “custom” magazine which contains only the content of the past week, the same idea I had with the Daily RiczWest. For all my gripes against Twitter, at least they make their content available via IFTTT (it wasn’t for a while, but now it’s back). Unfortunately, FlipBoard seems to be a rather closed system. This has obviously offended the “hacker” in me (and I mean that in a good way, not the common debased usage), so I’m looking to open it up and continue work on the core concept of taking a set of “highlighted posts” (from FlipBoard Magazines, Twitter Favourites, …) and putting them into a custom stream and newspaper.
The first step is to free the information from FlipBoard into a neutral format that can be worked with. As usual, “the web” can help rescue us. FlipBoard have (marginally) “opened up” their magazines by giving each one a URL which is kept updated and is viewable by others in a browser. This gives us just enough room to actually capture the information, but how?
There are essentially three problems here:
1. Sense changes to the web page
2. Capture those changes
3. Save them somewhere
Problem 3 will be solved by our old friend DropBox, along with IFTTT for the high-level orchestration, but it really could be anything.
For 1, before I found out I could only get my magazines “externally” from FlipBoard, I was hoping they would be exposed as RSS; unfortunately not :-( It’s a good idea though, so this led me to search for some way to convert web pages to RSS feeds. Luckily, there are a number of possible solutions, but I’ve chosen page2rss, which is very easy to use (i.e. just give it the URL and it “does” the rest).
For 2, you simply set up an IFTTT Recipe that is triggered by your RSS feed and saves the “changes” to the web page from 1 to a file in DropBox.
Once you have this set up, you’ll get way too much information in your file, as it seems to be a JSON object. I’ve given a stripped-down example below:
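To make the sense / capture / save idea a little more concrete, here is a rough Python sketch of the same three steps done by hand. It is only an illustration and not how page2rss or IFTTT work internally: the sample feed, the function names `new_items` and `append_items`, and the output path are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS snapshot, standing in for the feed a page-to-RSS service might produce.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example magazine</title>
    <item>
      <title>First article</title>
      <link>http://example.com/1</link>
      <guid>http://example.com/1</guid>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/2</link>
      <guid>http://example.com/2</guid>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_ids):
    """Sense and capture: return (title, link) pairs for items not seen before.

    seen_ids is a set of guids that is updated in place, so calling this
    again with the same feed returns nothing new.
    """
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item carries no guid.
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_ids:
            seen_ids.add(guid)
            fresh.append((item.findtext("title", default=""),
                          item.findtext("link", default="")))
    return fresh


def append_items(items, path):
    """Save: append one tab-separated line per captured item to a text file."""
    with open(path, "a", encoding="utf-8") as out:
        for title, link in items:
            out.write(f"{title}\t{link}\n")
```

Calling `new_items(SAMPLE_FEED, seen)` with an empty set returns both articles, and a second call with the same set returns an empty list, which is the “only trigger on changes” behaviour the Recipe relies on.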
"title":"Z Motherboards For Mini-ITX Builds - Reviews - Tom’s Hardware",
"excerptText":"Previous<p>The Mini-ITX Market Is Small, But Growing...<p>
ASRock ZE-ITX<p>ZE-ITX Software<p>ZE-ITX Firmware<p>Asus ZI-Deluxe<p>
ZI-Deluxe Software<p>ZI-Deluxe Firmware<p>EVGA Z Stinger<p>Z Stinger Software<p>Z …",
In the next post I’ll be looking at parsing this for enough information to produce a Weekly RiczWest newspaper. I’m hoping that we may even be able to use some of FlipBoard’s summary extraction features…
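As a small taste of that parsing, here is a minimal Python sketch. It assumes each saved entry is one complete JSON object with `title` and `excerptText` keys, as in the fragment above, and that `<p>` acts as a simple separator inside the excerpt; the real file may well differ. The sample record and the `summarise` function are placeholders for illustration only.

```python
import json

# Placeholder record shaped like the fragment above (values are invented).
SAMPLE = (
    '{"title": "Example article - Example Site", '
    '"excerptText": "Previous<p>First heading<p>Second heading<p>Third heading"}'
)


def summarise(raw):
    """Pull out the title and split the excerpt into its <p>-separated pieces."""
    record = json.loads(raw)
    excerpt = record.get("excerptText", "")
    # Drop empty pieces and surrounding whitespace left over from the split.
    pieces = [piece.strip() for piece in excerpt.split("<p>") if piece.strip()]
    return {"title": record.get("title", ""), "excerpt": pieces}
```

Using `.get` with a default keeps the sketch from falling over when an entry is missing one of the two keys, which seems likely given how much else is in the file.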