Methodology · v1.0 · updated 2026-05-16
How records are sourced
Open UAP is a navigable index of 476 primary-source records on unidentified anomalous phenomena, drawn from public government archives and a handful of established mirrors. The corpus began with United States material and now includes official records from France, the United Kingdom, and other national programmes. The archive does not host original interpretation. Every page links back to the canonical file so a reader can verify what is here against the source.
Sources
- Primary US archives: FBI Vault, NARA, the AARO public release tranche, war.gov UFO release, NASA UAP IST publications, CIA Reading Room (CREST), Air Force historical research portals.
- International government archives: GEIPAN / CNES (France), The National Archives (UK), and national air-force releases mirrored on the Internet Archive (New Zealand, Australia, Denmark, Italy).
- Stable mirrors for records that have aged off agency websites or whose agency portal blocks scripted fetches: archive.org, Wikimedia Commons, The Black Vault, NICAP, CUFON, Project 1947.
- External-only entries (no inline file) are marked with a "source only" badge. These exist because the record matters and the source page is stable, but a clean digital copy was not retrievable at acquisition time.
Provenance taxonomy
- Primary agency · who originated the document: FBI, USAF, CIA, DOD, AARO, NASA, State, Navy. Usually also where the document lives today.
- Repository · where the bytes were actually fetched. When the agency hosts the file directly, repository equals primary agency. When the original release has aged off the agency site, the repository field names the stable mirror we linked instead.
- Original file · the URL of the source page or PDF as found at acquisition. Linked from every record page so a reader can verify.
- Rights · per-item field. US-federal materials typically carry "no known restrictions"; mirrors and reconstructions carry their own attribution.
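The four fields above can be pictured as one small record. The sketch below is illustrative only: the class name, field names, and example URLs are assumptions, not the archive's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical shape for the provenance fields described above."""
    primary_agency: str  # who originated the document, e.g. "FBI"
    repository: str      # where the bytes were actually fetched
    original_file: str   # URL of the source page or PDF at acquisition
    rights: str          # per-item rights statement

    @property
    def is_mirrored(self) -> bool:
        # The file came from a mirror when repository and agency differ.
        return self.repository != self.primary_agency


# Placeholder values; example.org URLs are not real records.
direct = Provenance("FBI", "FBI", "https://example.org/a.pdf", "no known restrictions")
mirrored = Provenance("USAF", "archive.org", "https://example.org/b.pdf", "mirror attribution")

print(direct.is_mirrored, mirrored.is_mirrored)
```

Deriving the mirror flag from the two fields keeps it from drifting out of sync with them.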
AI summaries
Records carry a short summary generated by a language model. There are three grounding modes, surfaced on the in-canvas lightbox:
- Vision · the model was given page images of the source PDF and produced a structured digest (tldr, page-cited highlights, outliers). 91 records are vision-grounded.
- Text · the model worked from extracted PDF text. Used when OCR is clean but page images would be wasteful (long monochrome reports).
- Metadata · for records with little or no source text (heavily-redacted FBI memos, reference-only placeholders), the summary is generated from title and metadata alone. Treat as a caption, not a citation.
Generated summaries are checked for hallucinated numbers: every integer in the output must appear in the source. They are not substitutes for reading the original. Where the underlying file is ambiguous or contradictory, the summary stays restrained rather than resolving the ambiguity.
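A minimal sketch of the integer check, assuming integers are compared as whole runs of digits. The function name and the matching rule are assumptions; the real validator may normalise thousands separators or spelled-out numbers differently.

```python
import re

_INT = re.compile(r"\d+")  # one token per unbroken run of digits


def unsupported_integers(summary: str, source: str) -> set[str]:
    """Return integers that appear in the summary but not in the source."""
    source_ints = set(_INT.findall(source))
    return {n for n in _INT.findall(summary) if n not in source_ints}


# Invented sample text, not taken from any record in the archive.
source = "On 8 July 1947 the field office forwarded 3 pages."
summary = "A 3-page memo dated 1947, citing 12 witnesses."

print(unsupported_integers(summary, source))
```

Here only 12 is flagged, since 3 and 1947 both occur in the source. Comparing whole digit runs means a summary's "194" is not accepted just because the source contains "1947".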
Updates
New records are added irregularly. The archive timestamp on every page reflects the last build, not the historical document date. Sitemap lastmod follows the same convention so search engines treat re-indexed records correctly.
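As an illustration of that convention, a sitemap entry can take its lastmod from the build time rather than from the date of the historical document. This is a sketch with assumed names and a placeholder URL, not the site's actual build code.

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def sitemap_entry(url: str, build_time: datetime) -> str:
    """Render one sitemap <url> element stamped with the build date."""
    # lastmod reflects the last build, expressed as a UTC calendar date.
    lastmod = build_time.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return f"<url><loc>{escape(url)}</loc><lastmod>{lastmod}</lastmod></url>"


# Hypothetical build time and record URL.
build = datetime(2026, 5, 16, 12, 0, tzinfo=timezone.utc)
entry = sitemap_entry("https://example.org/records/1?a=1&b=2", build)
print(entry)
```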
Curator
Open UAP is curated by @dominikmartn. Notice a missing record or a wrong attribution? Email hi@dominikmart.in.





