Right now, you’ll have to know a little Python to get it going, but if it proves useful it could grow into something for anyone to use via a web interface. The GIF at the top of this post was made like so:
Credit for the idea goes to PastPages users who impressed me with GIFs of their own, including Jeremy Singer-Vine, Andrei Scheinkman, and Zachary M. Seward.
And please keep hacking on that new API!
I’m happy to announce the launch of the PastPages API, which offers a machine-readable version of the site that programmers can use to mine our homepage archive.
While the API is currently free and requires no registration, access is throttled and the system’s structure is likely to change in the future. It was developed using django-tastypie and follows its common conventions.
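For readers new to django-tastypie, here is a minimal sketch of what its conventions look like from the client side: list endpoints accept `format`, `limit`, and `offset` query parameters, and JSON list responses put paging details under `meta` and the records under `objects`. The base URL, the `screenshots` resource name, and the `site` filter below are placeholders for illustration, not the real PastPages endpoints.

```python
from urllib.parse import urlencode

# Hypothetical API root; substitute the address given in the API documentation.
API_ROOT = "http://www.example.com/api/beta"


def build_list_url(resource, limit=20, offset=0, **filters):
    """Build a tastypie-style list URL with paging and filter parameters."""
    params = {"format": "json", "limit": limit, "offset": offset}
    params.update(filters)
    # Sort the parameters so the resulting URL is stable and easy to cache.
    query = urlencode(sorted(params.items()))
    return "{}/{}/?{}".format(API_ROOT, resource, query)


def has_next_page(payload):
    """Return True if a tastypie list response points to another page."""
    return payload["meta"].get("next") is not None


def extract_objects(payload):
    """Return the records carried in a tastypie list response."""
    return payload["objects"]
```

Because access is throttled, a client built on this would do well to request modest page sizes and pause between calls.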
Today marks the release of the second generation of PastPages' code base, nicknamed “bradlee.” The screenshotting system has been rewritten to make it faster and cheaper by shedding dependencies and introducing a task queue. Here's a quick rundown:
- Firefox -> Webkit
- Selenium -> PhantomJS
- Xvfb headless server -> Nothing!
- One-by-one screenshot script -> Concurrent Celery queue
- Memcached -> Varnish
The result is that a significantly less powerful server now completes a screenshotting run in half the time the old server needed. That saves money as well as time.
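To illustrate the move from a one-by-one script to a concurrent queue, here is a rough sketch. It is not the bradlee code: it swaps in the standard library's `ThreadPoolExecutor` for Celery so it runs without a broker, and `capture` is a hypothetical stand-in for the real screenshot step.

```python
from concurrent.futures import ThreadPoolExecutor


def capture(url):
    """Hypothetical placeholder for the real screenshot call.

    It only derives an output filename from the URL.
    """
    return "{}.png".format(url.replace("://", "_").replace("/", "_"))


def capture_one_by_one(urls):
    """The old approach: each site waits for the previous one to finish."""
    return [capture(url) for url in urls]


def capture_concurrently(urls, workers=4):
    """The new approach: several captures are in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(capture, urls))
```

With a real capture step that spends most of its time waiting on the network, the concurrent version finishes a run sooner because slow sites no longer hold up the rest.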
For a few minutes this morning, CNN ran an incorrect headline declaring that a significant part of President Barack Obama’s healthcare law had been struck down by the Supreme Court.
I captured the image above manually several minutes ago. CNN has already corrected the page with a new headline.
The error was missed by PastPages’ script, which visits CNN once per hour. The last visit happened at 10:02 AM EDT, before CNN made a judgement. When it visits again in the next hour, the error will almost certainly be gone.
This shows just how quickly news sites can change the framing of stories and proves that even PastPages’ hourly screenshot is wanting. One of my goals with the future development of the site is to increase how frequently it captures data. Al Shaw has suggested we allow for instant on-demand archival when a human spots an error that ought to be captured.
@palewire PastPages totally needs a GO NOW button that you can mash when shit gets crazy— Al Shaw (@A_L) June 28, 2012
If you’re a developer and you’d like to help make this happen, all of the code is open on GitHub and I’d welcome your contributions.
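The case for more frequent captures can be put in rough numbers. The sketch below assumes an error appears at a random moment relative to the capture schedule, in which case the chance that a fixed-interval capture lands on it is the error's lifetime divided by the interval, capped at 1. The six-minute lifetime in the usage note is a made-up figure for illustration, not a measurement of the CNN headline.

```python
def chance_of_capture(error_minutes, interval_minutes):
    """Chance that a capture every `interval_minutes` catches an error
    that stays online for `error_minutes`, assuming the error starts at
    a uniformly random point in the capture cycle.
    """
    if error_minutes <= 0 or interval_minutes <= 0:
        raise ValueError("both durations must be positive")
    return min(1.0, error_minutes / interval_minutes)
```

Under that assumption, a hypothetical six-minute error has about a 1-in-10 chance of being caught by an hourly capture, and is caught every time once captures run at least every six minutes.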
This morning the United States Supreme Court issued a split decision on the legality of a hardline immigration law adopted by the state of Arizona. Four of the law’s provisions were reviewed, but only three were struck down, according to Kevin Russell at SCOTUSblog.
English-language news outlets in the U.S. and Britain jumped on the news, but disagreed on how to frame the results. Some emphasized that much of the law went down. Others emphasized the survival of a part of the law that, according to the Los Angeles Times, will allow “state officials to begin enforcing a provision that calls on police, when making lawful stops, to check the immigration status of people who may be in the country illegally.”
Fox News and the Los Angeles Times are examples of a “glass three-quarters empty” frame.
Reuters and BBC are examples of the “glass quarter full” frame, presenting the ruling as good news for the law’s supporters.
You can review all of the homepages archived by PastPages for that same hour right here.
Also, the Los Angeles Times is my employer, but it is in no way associated with PastPages, which I maintain on my own time with the support of a network of individual donors. Read all about it.
Update: Soon after, Reuters changed its play, opting for a more ambiguous frame with this revised headline.