Data Sources
This is a list of historical datasets, APIs and other data resources that I may use on this site. I created some of them, or worked on the projects that produced them, and have either already used these for research or plan to do so. But I know very little about most of the others; the list is a resource to dip into, to try out new things with a variety of data. There is also a list of resources for code, tools, methodologies etc.
The vast majority of the datasets are English-language textual data and focus on British, North American or Australian sources and history. Further information should be available at the links (and via Google searches). Many will be licensed as open data, but this should be verified before use.
I’ll add more as I find them. In part it’s a reference for my own use - to give me ideas and draw together scattered bookmarks and links in a place where I might actually be able to find them when I want them. But others may also find it useful.
Reflecting the diversity of historical sources and the priorities of the projects that have digitised them, the datasets can be expected to range from relatively small ‘boutique’ data to large text corpora. They might have been transcribed by hand or using OCR. They’ll undoubtedly vary in complexity, structure, cleanliness/tidiness and ease of use.
My data projects
APIs
- The National Archives (UK)
- Trove (National Library of Australia)
- Science Museum (UK)
- Library of Congress APIs
- Europeana Labs
- Digital Public Library of America
- HathiTrust Digital Library
- Old Bailey Online
- Locating London’s Past
- Manuscripts Online
- Caselaw Access Project
- NY Times Archive
- The Keep (East Sussex Archives)
- V&A API
Literary/linguistic text corpora
Women and gender
Political/diplomatic/military
Economic/environmental/material culture
- Feeding the City II : Demesne Agriculture in the London Region, 1375-1400
- What’s on the Menu?
- Trading Consequences
- African Commodity Trade Database, 1730-2010
- International Currencies 1890-1910
- Southeastern Australian rescued observational climate network, 1788-1859
- RICardo dataset (trade flows 1787-1938)
Social history