The original dataset of PDFs that Facebook released to Congress was processed to extract the relevant metadata and the image for each post. The code for this step is available on GitHub if you would like to reuse it. The resulting CSV data was then loaded into the Omeka instance you are currently viewing. Students then tagged each Facebook post using a codebook developed by the IRAds team.

The resulting augmented dataset is made available for download here under a CC-BY license.

The zip file includes items.csv, which lists all the items and their respective metadata. It also includes tag-matrix.csv, a matrix of tag combinations that shows which tags co-occur most often.
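
As a rough illustration of how tag-matrix.csv might be used, the Python sketch below ranks tag pairs by how often they co-occur. It assumes the first row and first column of the file hold tag names and every other cell holds an integer count; the actual layout may differ, so check the file before relying on it. The function name and the sample tags and counts are made up for illustration.

```python
import csv
import io


def top_cooccurring_tags(matrix_file, n=5):
    """Return the n most frequent tag pairs from a co-occurrence matrix.

    Assumption: the first row and first column hold tag names and every
    other cell holds an integer count. The real file may be laid out
    differently.
    """
    rows = list(csv.reader(matrix_file))
    column_tags = rows[0][1:]
    pairs = {}
    for row in rows[1:]:
        row_tag = row[0]
        for col_tag, cell in zip(column_tags, row[1:]):
            if row_tag == col_tag or not cell.strip():
                continue  # skip the diagonal and empty cells
            key = tuple(sorted((row_tag, col_tag)))
            # A symmetric matrix lists each pair twice; keep the larger count.
            pairs[key] = max(pairs.get(key, 0), int(cell))
    # Highest count first; ties are broken alphabetically by tag pair.
    return sorted(pairs.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Tiny in-memory stand-in with invented tags and counts.
sample = io.StringIO(
    ",tag_a,tag_b,tag_c\n"
    "tag_a,10,4,1\n"
    "tag_b,4,7,2\n"
    "tag_c,1,2,5\n"
)
print(top_cooccurring_tags(sample, n=2))
```

To run this against the downloaded file, open it with open("tag-matrix.csv", newline="") and pass the file object to the function in place of the sample.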

In addition, the data is available through a REST API endpoint. For more information about the API, please consult the Omeka documentation.
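
A minimal Python sketch of paging through items over the REST API follows. The base URL is a placeholder, since the endpoint address is not given here, and the /api/items path with page and per_page query parameters is an assumption based on how Omeka installations commonly expose items. The function names are hypothetical; confirm the details against the Omeka documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder only: substitute the address of this Omeka site.
BASE_URL = "https://example.org"


def items_url(base_url, page=1, per_page=50):
    """Build the URL for one page of items.

    Assumption: items are served at <base>/api/items and paginated with
    the `page` and `per_page` query parameters.
    """
    query = urlencode({"page": page, "per_page": per_page})
    return f"{base_url.rstrip('/')}/api/items?{query}"


def fetch_items(base_url, page=1, per_page=50):
    """Request one page of items and decode the JSON response body."""
    with urlopen(items_url(base_url, page, per_page)) as response:
        return json.load(response)


# Show the URL that would be requested, without touching the network.
print(items_url(BASE_URL, page=2, per_page=10))
```

Calling fetch_items with the real site address in a loop, incrementing page until an empty list comes back, would retrieve every item. The site may cap per_page at a lower value than requested.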

If you have any questions about the data or ideas for improving it, please contact:

Damien Pfister
Department of Communication
University of Maryland

Cite as:

Lindblad, P., Murphy, N., Pfister, D.S., Styer, M., Summers, E., and Yang, M. Internet Research Agency Ads Dataset. [data file]. Retrieved from