The recently launched Tweets of Congress is collecting and publishing daily archives of tweets by congressional representatives, caucuses, and committees. The site only got up and running last week, so the daily archives begin on June 21, 2017. There’s also the Trump Twitter Archive, which has collected more than 30,000 of @realDonaldTrump’s tweets and makes them searchable and downloadable in bulk.
But this points to a larger issue: the US government's use of commercial social media sites and tools to communicate with the public. This time around, the 2016 End of Term crawl included 9,000+ social media accounts (scraped from the .gov social media registry API), of which 44% were Facebook, 37% Twitter, and 10% YouTube accounts. We also collected ~130 TB of data from .gov FTP sites that agencies use to serve their collected data sets.
Tweets of Congress is my attempt to collate the entirety of Congress’ daily Twitter output using an automated process that checks Twitter on a fixed interval. Archives are available on this site and in JSON form. You can find JSON datasets linked in posts or in this site’s GitHub repo. Due to size constraints, archives will be limited at some TBD point. This site is open-source, so feel free to fork or whatever to your heart’s content. For any issues or other feedback, file an issue in the repo or send me an email.
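For readers curious what "checks Twitter on a fixed interval" and "daily archives in JSON form" might look like in practice, here is a minimal sketch. It is not the site's actual code: the record fields (`id`, `screen_name`, `time`, `text`), the function names, and the `fake_fetch` stand-in for a real Twitter API call are all assumptions made for illustration.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def collect_once(fetch_latest, seen_ids, archive):
    """Run one polling pass and add unseen tweets to the per-day archive."""
    added = 0
    for tweet in fetch_latest():
        if tweet["id"] in seen_ids:
            continue  # already archived on an earlier pass
        seen_ids.add(tweet["id"])
        # Bucket each tweet by the UTC calendar day of its timestamp.
        day = datetime.fromtimestamp(tweet["time"], tz=timezone.utc).strftime("%Y-%m-%d")
        archive[day].append(tweet)
        added += 1
    return added


def dump_day(archive, day):
    """Serialize one day's tweets as a JSON array, oldest first."""
    return json.dumps(sorted(archive[day], key=lambda t: t["time"]), indent=2)


def fake_fetch():
    # Hypothetical stand-in for a real Twitter API call; returns sample records.
    return [
        {"id": 1, "screen_name": "example_member", "time": 1498003200, "text": "first"},
        {"id": 2, "screen_name": "example_committee", "time": 1498089600, "text": "second"},
    ]


seen_ids = set()
archive = defaultdict(list)
first_pass = collect_once(fake_fetch, seen_ids, archive)
second_pass = collect_once(fake_fetch, seen_ids, archive)  # same tweets, nothing new
```

A real collector would presumably call something like `collect_once` in a loop with a sleep between passes, then write the output of `dump_day` to one file per day. Tracking seen IDs is what keeps overlapping polls from duplicating tweets in the archive.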