Posts tagged development
I’m happy to announce the launch of the PastPages API, which offers a machine-readable version of the site that programmers can use to mine our homepage archive.
While the API is currently free and requires no registration, access is throttled and the system’s structure is likely to change in the future. It was developed using django-tastypie and follows its common conventions.
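For anyone who wants to start mining, here is a minimal sketch of paging through a tastypie-style list endpoint politely. The host and the resource name (`API_ROOT`, `RESOURCE`) are placeholders I made up for illustration, not the real addresses. The only things assumed about the API are tastypie's usual conventions: a JSON body with an `objects` list and a `meta.next` link that is null on the last page.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical values: the real host and resource names are not given here.
API_ROOT = "http://example.com/api/v1/"
RESOURCE = "screenshots/"


def fetch_json(url):
    """Download one URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_objects(first_url, fetch=fetch_json, max_pages=5):
    """Yield records from a tastypie-style list endpoint, page by page.

    Follows meta["next"] until it is null or max_pages is reached.
    The page cap keeps a script from hammering a throttled API.
    """
    url = first_url
    for _ in range(max_pages):
        page = fetch(url)
        yield from page["objects"]
        next_path = page["meta"].get("next")
        if not next_path:
            break
        # tastypie returns "next" as a path, so resolve it against the current URL.
        url = urljoin(url, next_path)
```

A call would look something like `iter_objects(API_ROOT + RESOURCE + "?format=json&limit=20")`. Since the structure is likely to change, treat the field names as things to check against a live response rather than as guarantees.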
Today marks the release of the second generation of PastPages' code base, nicknamed “bradlee.” The screenshotting system has been rewritten to make it faster and cheaper by shedding dependencies and introducing a task queue. Here's a quick rundown:
- Firefox -> Webkit
- Selenium -> PhantomJS
- Xvfb headless server -> Nothing!
- One-by-one screenshot script -> Concurrent Celery queue
- Memcached -> Varnish
The result is that a significantly less powerful server now completes a screenshotting run in half the time the old one took. That saves money as well as time.
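To illustrate why the move from a one-by-one script to a concurrent queue matters, here is a rough sketch. It is not the site's code and it does not use Celery: a standard-library thread pool stands in for the task queue, and `take_screenshot` is a hypothetical placeholder that just sleeps to imitate waiting on a page to render.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def take_screenshot(url):
    """Hypothetical placeholder for the slow capture step."""
    time.sleep(0.1)  # simulate waiting on a page to load and render
    return f"{url}.png"


def run_one_by_one(urls):
    """Old approach: each capture waits for the previous one to finish."""
    return [take_screenshot(url) for url in urls]


def run_concurrently(urls, workers=4):
    """Queue-like approach: several captures are in flight at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(take_screenshot, urls))
```

Because most of a capture is spent waiting, overlapping the waits shortens the whole run even on modest hardware. A real task queue adds things this sketch leaves out, such as retries and workers spread across processes or machines.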
PastPages now publishes automatically generated citations for every screenshot. Visit a screenshot’s detail page and simply click on the new “citations” button to see a popup like the one pictured above.
It currently provides drafted citations in MLA, APA and Chicago styles. It also provides Wikipedia citation markup that can be immediately pasted into an entry and used as a reference.
My hope is that this will make it easier for scholars, both professional and amateur, to use PastPages. The style above was copied from instructions at Purdue University’s Online Writing Lab. If you see any errors, please let me know or file a ticket on GitHub. If you’re interested in how it’s implemented, you can see the code that makes this work here.
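For readers curious about the general idea without digging through the repository, a citation generator can be little more than a template filled in from a screenshot's metadata. The sketch below is an illustration rather than the site's actual code: the function name, the argument names and the wording of the title are my own assumptions. The `{{cite web}}` template and its `url`, `title`, `website`, `date` and `access-date` parameters are standard Wikipedia markup.

```python
from datetime import datetime, timezone


def wikipedia_citation(site_name, screenshot_url, captured_at, accessed_at):
    """Build {{cite web}} markup for one archived homepage screenshot."""
    fields = [
        ("url", screenshot_url),
        # The title wording is an assumption made for this example.
        ("title", f"{site_name} homepage at {captured_at:%H:%M} UTC"),
        ("website", "PastPages"),
        ("date", f"{captured_at:%Y-%m-%d}"),
        ("access-date", f"{accessed_at:%Y-%m-%d}"),
    ]
    body = " ".join(f"|{name}={value}" for name, value in fields)
    return "{{cite web " + body + "}}"


# Example with made-up values.
captured = datetime(2012, 6, 1, 15, 0, tzinfo=timezone.utc)
accessed = datetime(2012, 6, 2, 9, 30, tzinfo=timezone.utc)
print(wikipedia_citation("Example Times", "http://example.com/screenshots/1/", captured, accessed))
```

The MLA, APA and Chicago styles could follow the same pattern with different templates, though each has its own rules for ordering and punctuation.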
The first versions of PastPages had only one user: Me. So I printed all the timestamps in Los Angeles time, since that’s where I live.
Now that approximately 50 percent of PastPages visitors come from outside the United States, that doesn’t make sense anymore.
In response, I’ve tried to globalize how the site reports the time.
Where appropriate, the site now prints a relative timestamp that will be correct wherever it’s viewed. For example, the homepage now reports:
In other locations, the site now defaults to Coordinated Universal Time, abbreviated UTC and informally known as Greenwich Mean Time. It’s the international standard for this sort of thing. If you’re not familiar with it, it’s roughly the current time in London, though it does not shift in the summer for daylight saving time.
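Both kinds of timestamp described above can be derived from a single timezone-aware value. Here is a small sketch of the idea. The function names, the wording of the relative phrases and the output format are my own choices for illustration, not necessarily what the site prints.

```python
from datetime import datetime, timedelta, timezone


def relative_timestamp(then, now=None):
    """Describe how long ago `then` was, e.g. "3 hours ago".

    Both arguments must be timezone-aware, so the answer is the same
    no matter where the reader is.
    """
    now = now or datetime.now(timezone.utc)
    seconds = int((now - then).total_seconds())
    if seconds < 60:
        return "just now"
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            plural = "" if count == 1 else "s"
            return f"{count} {unit}{plural} ago"


def utc_timestamp(moment):
    """Format any timezone-aware datetime as an explicit UTC string."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
```

The relative form sidesteps time zones entirely, which is why it works for readers anywhere. The UTC form is for places where an exact time is needed.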
This may prove a little awkward at first, especially for U.S. users accustomed to the Internet catering to our vantage on the world. But I’ve tried to make it a little easier to swallow by also providing a publication’s local time, where appropriate.
You can see this on site detail pages:
And on screenshot detail pages:
Also, wherever screenshots are grouped by date, each day now begins and ends at midnight UTC. This seems like the slipperiest part to me, and I’d be interested to hear opinions on the best way to present it. Should these also be grouped by a publication’s local time?
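To make the trade-off concrete, here is a sketch of how the same screenshots fall into different days depending on which clock draws the midnight line. The function name and the sample times are invented for illustration. The fixed UTC-8 offset is a simplified stand-in for a Los Angeles publication that ignores daylight saving time, so real code would want a named time zone instead.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def group_by_day(timestamps, tz=timezone.utc):
    """Bucket timezone-aware datetimes by calendar date as seen in `tz`."""
    groups = defaultdict(list)
    for moment in timestamps:
        groups[moment.astimezone(tz).date()].append(moment)
    return dict(groups)


# Simplified stand-in for Los Angeles (assumption: no daylight saving).
pacific = timezone(timedelta(hours=-8))

shots = [
    datetime(2012, 12, 1, 23, 0, tzinfo=timezone.utc),
    datetime(2012, 12, 2, 1, 0, tzinfo=timezone.utc),
    datetime(2012, 12, 2, 9, 0, tzinfo=timezone.utc),
]

by_utc = group_by_day(shots)             # Dec. 1: one shot, Dec. 2: two
by_local = group_by_day(shots, pacific)  # Dec. 1: two shots, Dec. 2: one
```

The 01:00 UTC screenshot belongs to Dec. 2 on the UTC calendar but to the evening of Dec. 1 for a West Coast newsroom, which is exactly the ambiguity in question.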
This is new ground for me as a developer, so there’s a good chance I’ve overlooked a better solution or created a new problem. But I want to get this right, so if you have advice, please contact me or file a bug report on GitHub.