TWITTER REPORT

Dataset contains external users as mentions and replies.

Input data: Twitter JSON all_filtered

Start time: Fri Jun 8 11:22:29 2018

Data Description

Import Summary

This is an overview of the import process that was used to create the dataset.

Twitter file(s) imported: /Users/justinlittman/Data/usher/beltway_reporters/tweets_feb_to_march_2018/all_filtered.json

Twitter file format: Twitter JSON

Dynamic meta-network? No, all tweets are in one meta-network
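
For readers who want to inspect a Twitter JSON file directly, the following is a minimal sketch of how basic tallies might be derived. It assumes one tweet object per line and the Twitter API v1.1 field names (retweeted_status, entities, coordinates, user.screen_name). The function name tally_tweets and the three sample tweets are hypothetical, and the counting rules ORA actually applies are not stated in this report, so the results need not match its figures.

```python
import json
from collections import Counter


def tally_tweets(lines):
    """Tally basic counts from an iterable of JSON strings, one tweet per string."""
    counts = Counter()
    tweeters = set()
    hashtags = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        entities = tweet.get("entities") or {}
        counts["tweets"] += 1
        if "retweeted_status" in tweet:  # v1.1 marks retweets with this key
            counts["retweets"] += 1
        if entities.get("urls"):
            counts["with_url"] += 1
        if tweet.get("coordinates"):  # null unless the tweet carries a point
            counts["with_geotag"] += 1
        counts["mentions"] += len(entities.get("user_mentions", []))
        for tag in entities.get("hashtags", []):
            hashtags[tag["text"].lower()] += 1  # assumption: case-insensitive tags
        tweeters.add(tweet["user"]["screen_name"])
    counts["tweeters"] = len(tweeters)
    counts["distinct_hashtags"] = len(hashtags)
    return counts


# Hypothetical sample data, not taken from the imported file.
sample = [
    json.dumps({
        "user": {"screen_name": "alice"},
        "coordinates": None,
        "entities": {
            "urls": [{"expanded_url": "https://example.com"}],
            "hashtags": [{"text": "News"}],
            "user_mentions": [{"screen_name": "bob"}],
        },
    }),
    json.dumps({
        "user": {"screen_name": "bob"},
        "coordinates": None,
        "retweeted_status": {"id": 1},
        "entities": {"urls": [], "hashtags": [{"text": "news"}], "user_mentions": []},
    }),
    json.dumps({
        "user": {"screen_name": "alice"},
        "coordinates": {"type": "Point", "coordinates": [-77.0, 38.9]},
        "entities": {},
    }),
]

result = tally_tweets(sample)
print(dict(result))
```

To run the same tally on a file, an open file handle can be passed in place of the sample list, since iterating over a text file yields one line at a time.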

Import Data Statistics

This is an overview of the tweet activity in the dataset. The dataset contains only one meta-network and all tweets are analyzed.

Network: Twitter JSON all_filtered

First tweet date: 2006-04-16 21:19:49-04

Last tweet date: 2018-03-31 23:58:57-04

Number of tweets: 851296

Number of tweets with geotag: 469

Number of tweets with URL: 498399

Number of retweets: 290609

Number of tweeters: 2259

Number of verified tweeters: 1278

Number of news agency tweeters: 50

Number of mentions: 146265

Number of distinct hashtags: 27288

Number of distinct hashtags used more than once: 19859

Number of distinct words: 0

Number of distinct words used more than once: 0

Number of distinct locations: 256
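
The raw counts above are easier to compare as proportions. The sketch below derives a few shares from the reported figures; the dictionary keys and variable names are illustrative and not part of the report. By this arithmetic roughly 34% of tweets are retweets, roughly 59% contain a URL, and roughly 57% of tweeters are verified.

```python
# Counts copied from the import data statistics; key names are illustrative.
stats = {
    "tweets": 851296,
    "tweets_with_geotag": 469,
    "tweets_with_url": 498399,
    "retweets": 290609,
    "tweeters": 2259,
    "verified_tweeters": 1278,
    "news_agency_tweeters": 50,
}

# Shares of all tweets.
retweet_share = stats["retweets"] / stats["tweets"]
url_share = stats["tweets_with_url"] / stats["tweets"]
geotag_share = stats["tweets_with_geotag"] / stats["tweets"]

# Shares of all tweeters.
verified_share = stats["verified_tweeters"] / stats["tweeters"]
news_agency_share = stats["news_agency_tweeters"] / stats["tweeters"]

# Average activity per account.
tweets_per_tweeter = stats["tweets"] / stats["tweeters"]

for label, value in [
    ("Retweets", retweet_share),
    ("Tweets with URL", url_share),
    ("Tweets with geotag", geotag_share),
    ("Verified tweeters", verified_share),
    ("News agency tweeters", news_agency_share),
]:
    print(f"{label}: {value:.2%}")
print(f"Tweets per tweeter: {tweets_per_tweeter:.1f}")
```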

More detailed statistics are available for the following categories.

Tweet Statistics

All Tweeters Statistics

Verified Tweeter Statistics

News Agency Tweeter Statistics

Available Analyses

Detailed results are available for each of the following analyses.

Analysis of Tweeters - Super Spreaders

Analysis of Tweeters - Super Friends

Analysis of Tweeters - Other Influencers

Analysis of Tweeters - Attributes

Analysis of Hashtags

Analysis of Words

Analysis of Locations

Analysis of Tweets

Analysis of Tweets - Attributes

Research Notes

Paths in retweet networks tend to be short and hierarchical, which makes many traditional social network measures empirically uninteresting.

Closeness is not calculated for Twitter data because: a) most nodes are not reachable from one another, and b) it is expensive to calculate given the size of the data.
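
The reachability point can be illustrated on a toy example. The sketch below uses a hypothetical five-node retweet network (edges point from the retweeter to the original author) and counts how many ordered pairs of nodes are connected by a directed path. The network and all names are invented for illustration and are not drawn from the dataset.

```python
from collections import deque

# Hypothetical retweet edges: (retweeter, original author).
edges = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "b")]


def build_adjacency(edge_list):
    """Map each node to the set of nodes it points to."""
    adjacency = {}
    for source, target in edge_list:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())
    return adjacency


def reachable_from(adjacency, start):
    """Return every node reachable from start via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)  # a node does not count as reaching itself
    return seen


adjacency = build_adjacency(edges)
node_count = len(adjacency)
reachable_pairs = sum(len(reachable_from(adjacency, n)) for n in adjacency)
total_pairs = node_count * (node_count - 1)

print(f"{reachable_pairs} of {total_pairs} ordered pairs are connected")
```

In this toy network only 5 of the 20 ordered pairs are connected, so closeness would be undefined or degenerate for most pairs.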

Traditional betweenness is not calculated because: a) given the short length of chains, most nodes have only a small value, so the measure does not discriminate among nodes, and b) it is expensive to calculate given the size of the data.
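
To see why betweenness discriminates poorly on short chains, the sketch below computes unnormalized directed betweenness with Brandes's algorithm on a hypothetical five-node retweet network. The network is invented for illustration, and Brandes's algorithm is used here only as a standard way to compute the measure, not as a description of how ORA works.

```python
from collections import deque

# Hypothetical retweet network: each node maps to the authors it retweeted.
adjacency = {
    "a": set(),
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
    "e": {"b"},
}


def betweenness(graph):
    """Unnormalized betweenness for a directed, unweighted graph (Brandes)."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        order = []  # nodes in order of non-decreasing distance from source
        predecessors = {node: [] for node in graph}
        path_counts = {node: 0 for node in graph}
        path_counts[source] = 1
        distance = {node: -1 for node in graph}
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in graph[node]:
                if distance[neighbor] < 0:  # first visit
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_counts[neighbor] += path_counts[node]
                    predecessors[neighbor].append(node)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in graph}
        while order:
            node = order.pop()
            for pred in predecessors[node]:
                ratio = path_counts[pred] / path_counts[node]
                dependency[pred] += ratio * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    return scores


scores = betweenness(adjacency)
print(scores)
```

Only node b, which sits between e and a, receives a nonzero score; every other node scores zero, so the measure separates almost nothing.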

Eigenvector centrality is not calculated because: a) mutual retweeting is rare, so the dense, interconnected networks that would lead to high values seldom form, and b) it is expensive to calculate given the size of the data.
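
The sketch below illustrates the first point on a hypothetical retweet network with no mutual retweets. It measures reciprocity (the fraction of edges whose reverse edge also exists) and then applies a few unnormalized power-iteration steps, the core of an eigenvector centrality computation. The network, function names, and step count are assumptions chosen for illustration.

```python
# Hypothetical retweet edges: (retweeter, original author).
edges = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "b")]


def reciprocity(edge_list):
    """Fraction of directed edges whose reverse edge is also present."""
    edge_set = set(edge_list)
    if not edge_set:
        return 0.0
    mutual = sum(1 for source, target in edge_set if (target, source) in edge_set)
    return mutual / len(edge_set)


def power_iteration_steps(edge_list, steps):
    """Pass scores along edges for a fixed number of steps, without normalizing."""
    nodes = {node for edge in edge_list for node in edge}
    scores = {node: 1.0 for node in nodes}
    for _ in range(steps):
        updated = {node: 0.0 for node in nodes}
        for source, target in edge_list:
            updated[target] += scores[source]  # each node inherits from its retweeters
        scores = updated
    return scores


edge_reciprocity = reciprocity(edges)
scores_after_three = power_iteration_steps(edges, 3)

print(f"Reciprocity: {edge_reciprocity:.2f}")
print(scores_after_three)
```

With no cycles to feed scores back, every score drains to zero after three steps, so the iteration has no meaningful dominant vector to converge to. Adding a single mutual edge such as ("a", "b") raises reciprocity to 0.4 and gives the iteration a cycle to sustain nonzero values.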

Verified actors have a plus sign (+) appended to their name. News sources have an asterisk (*) appended to their name.
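
When processing actor labels exported from the report, the markers may need to be separated from the names. The helper below is a minimal sketch: the function name parse_actor_label and the sample labels are hypothetical, and it assumes the markers always appear at the end of the label and may be combined in either order. Because Twitter screen names contain only letters, digits, and underscores, a trailing + or * cannot be part of the name itself.

```python
def parse_actor_label(label):
    """Split a report label into (name, is_verified, is_news_source)."""
    name = label
    is_verified = False
    is_news_source = False
    # Peel markers off the end one at a time, in whatever order they appear.
    while name and name[-1] in "+*":
        if name[-1] == "+":
            is_verified = True
        else:
            is_news_source = True
        name = name[:-1]
    return name, is_verified, is_news_source


# Hypothetical labels, not taken from the report.
for label in ["example_reporter+", "example_outlet*", "example_outlet+*", "plain_user"]:
    print(parse_actor_label(label))
```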

Produced by ORA-NetScenes, a joint product of the CASOS center at Carnegie Mellon University and Netanomics
