[localhost:~]$ cat * > /dev/ragfield

Thursday, May 21, 2009

Twitter usage patterns

What kind of Twitterer am I? I'm on a roll with this data mining of Twitter, so I'll take a brief look at my own Twitter habits.
<<Twitter`
session = TwitterSessionOpen["User"->"ragfield"]
TwitterSession[<ragfield>]
Length[tweets = TwitterUserAllTimeline[session]]
1487
DateDifference[TwitterStatusDate@Last@tweets, TwitterStatusDate@First@tweets, {"Year", "Month", "Day"}]
{{1, Year}, {3, Month}, {20.83855324074074`, Day}}
Length[tweets] / DateDifference[TwitterStatusDate@Last@tweets, TwitterStatusDate@First@tweets, {"Day"}][[1, 1]]
3.125009501379522`
1487 total tweets in the last 1 year, 3 months, and 21 days. On average that's 3.125 tweets per day.
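The same average can be computed without the Twitter` package. This is a minimal sketch, assuming a hypothetical list of timestamps ordered newest first (as the code above implies for the timeline), and using AbsoluteTime in place of DateDifference. The name sampleDates and its values are made up for illustration.

```mathematica
(* Hypothetical sample: date lists for four tweets, newest first *)
sampleDates = {
   {2009, 5, 21, 9, 0, 0},
   {2009, 5, 20, 18, 30, 0},
   {2009, 5, 19, 7, 15, 0},
   {2009, 5, 17, 9, 0, 0}};

(* Elapsed days between the oldest and the newest tweet *)
elapsedDays = (AbsoluteTime[First[sampleDates]] - AbsoluteTime[Last[sampleDates]])/86400;

(* Average tweets per day over that span *)
N[Length[sampleDates]/elapsedDays]
```

The four sample tweets span exactly four days, so the average works out to one tweet per day.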
DateListPlot[Tally[{#[[1]], #[[2]], 1, 0, 0, 0.}&/@(TwitterStatusDate/@tweets)], Joined->True, Filling->Axis, FrameLabel->{"Month", "Tweets per month"}, PlotLabel->"Ragfield Twitter usage", DateTicksFormat->{"MonthNameShort", " ", "Year"}]
[Plot: Ragfield Twitter usage, tweets per month]
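The plot relies on a small binning trick: every timestamp is snapped to the first instant of its month, and Tally then counts how many tweets landed in each month. This sketch shows the step in isolation on hypothetical data. The names sampleDates and monthStart are illustrative and not part of the Twitter` package.

```mathematica
(* Hypothetical sample timestamps: two in May 2009, one in April 2009 *)
sampleDates = {
   {2009, 5, 21, 9, 0, 0.},
   {2009, 5, 2, 18, 30, 0.},
   {2009, 4, 19, 7, 15, 0.}};

(* Snap a date list to midnight on the first day of its month *)
monthStart[d_List] := {d[[1]], d[[2]], 1, 0, 0, 0.};

(* Each result has the form {monthStartDate, count}, ready for DateListPlot *)
Tally[monthStart /@ sampleDates]
```

Here the May bin gets a count of 2 and the April bin a count of 1.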
Total[StringLength[TwitterStatusText[#]]&/@tweets]
107442
N[% / Length[tweets]]
72.2542030934768`
107,442 total characters typed. On average that's 72 characters per tweet, roughly half the allotted space.
First@SortBy[TwitterStatusText/@tweets, StringLength]
bed
My shortest tweet was simply "bed".
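The length statistics need only a list of strings, so they are easy to try on any text. This is a minimal sketch on hypothetical sample tweets, assuming the 140-character limit Twitter had at the time. The names sampleTexts and limit are illustrative.

```mathematica
(* Hypothetical sample tweets *)
sampleTexts = {"bed", "Rode 35 miles this morning", "Long day at work"};

(* Twitter's character limit at the time of writing *)
limit = 140;

lengths = StringLength /@ sampleTexts;

Total[lengths]                       (* total characters typed *)
N[Mean[lengths]]                     (* average characters per tweet *)
N[Mean[lengths]/limit]               (* average fraction of the limit used *)
First[SortBy[sampleTexts, StringLength]]  (* shortest tweet *)
```

Applied to the figures above, 72.25/140 is about 0.52, which is where "roughly half" comes from.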
Length[allWords = StringSplit[StringJoin[Riffle[TwitterStatusText/@tweets, " "]], Except[WordCharacter|"'"]..]]
19188
19,188 words typed in total (counting every occurrence, not just unique words).
Length[allWords = DeleteCases[allWords, x_/;StringMatchQ[x, DigitCharacter..]]]
18528
18,528 if you don't count numbers.
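The word-splitting steps can be tried on a few strings to see the difference between the number of words typed and the number of distinct words. This is a sketch on hypothetical sample text. The lowercasing step is my addition and is not something the code above does.

```mathematica
(* Hypothetical sample tweets *)
sampleTexts = {"Bike ride before work", "Another bike ride, 35 miles", "Can't wait for bed"};

(* Split on runs of anything that is not a word character or an apostrophe *)
sampleWords = StringSplit[StringJoin[Riffle[sampleTexts, " "]], Except[WordCharacter | "'"] ..];

(* Drop tokens made up entirely of digits *)
sampleWords = DeleteCases[sampleWords, x_ /; StringMatchQ[x, DigitCharacter ..]];

Length[sampleWords]                               (* every occurrence counts *)
Length[Union[ToLowerCase /@ sampleWords]]         (* distinct words, ignoring case *)
Reverse[SortBy[Tally[ToLowerCase /@ sampleWords], Last]]  (* frequency table *)
```

Without the ToLowerCase step, "Bike" and "bike" would be tallied as two different words.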
First@Reverse@SortBy[Tally[allWords], Part[#, 2]&]
{the, 604}
The most common word I've typed is "the". That's not terribly useful. Let's take a look at just nouns to see what kind of topics I mention most frequently.
Length[nouns = Cases[allWords, x_/;MemberQ[WordData[x], "Noun", ∞]]]
7883
Grid[Take[Reverse@SortBy[Tally[nouns], Part[#, 2]&], 30], Alignment->{{Right, Left}}]
I        431
a        419
in       259
at       126
have     77
bike     62
work     54
time     52
out      51
are      48
ride     45
think    43
so       43
run      43
now      42
like     42
one      38
d        38
morning  36
last     36
can      36
race     35
miles    35
mile     35
home     35
first    35
way      34
still    34
year     31
good     31
Most common topics: bike, work, time, ride, think, run, like, morning, race, mile(s), home.
We can do the same thing with verbs to see what kind of actions I describe most frequently.
Length[verbs = Cases[allWords, x_/;MemberQ[WordData[x], "Verb", ∞]]]
4300
Grid[Take[Reverse@SortBy[Tally[verbs], Part[#, 2]&], 30], Alignment->{{Right, Left}}]
is       164
was      108
be       80
have     77
bike     62
up       59
work     54
time     52
out      51
ride     45
been     44
think    43
run      43
has      43
like     42
last     36
can      36
race     35
home     35
still    34
had      32
got      32
get      30
do       30
back     28
long     27
will     26
see      26
did      25
know     24
Most common actions: bike, work, ride, think, run, race, see, know. I guess there's a lot of overlap between the nouns and the verbs.
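The overlap arises because the filters keep any word that can be a noun (or a verb) in some sense, not words that are used that way in a particular tweet. This sketch makes that visible, assuming the WordData "PartsOfSpeech" property is available (WordData downloads its data on first use). The word list is a hypothetical sample.

```mathematica
(* Hypothetical sample of words from the lists above *)
sampleWords = {"bike", "ride", "morning", "think"};

(* Pair each word with the parts of speech WordData reports for it *)
tagged = {#, WordData[#, "PartsOfSpeech"]} & /@ sampleWords;

(* Keep only the words listed as both a noun and a verb *)
Select[tagged, MemberQ[#[[2]], "Noun"] && MemberQ[#[[2]], "Verb"] &]
```

Deciding whether "bike" is a noun or a verb in a given tweet would need real part-of-speech tagging of the sentence, which a dictionary lookup like this cannot do.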
It's also pretty easy to determine the other users I mention most frequently.
Grid[Take[Reverse@SortBy[Tally[StringCases[StringJoin[Riffle[TwitterStatusText/@tweets, " "]], "@"~~a:Except[WhitespaceCharacter]..:>Hyperlink[ToLowerCase[a], "http://twitter.com/"<>ToLowerCase[a]]]], Part[#, 2]&], 10]]
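The mention-counting idea can be tested on a few strings. This is a sketch on hypothetical sample tweets. It uses a slightly stricter pattern than the one above (my variation), matching only letters, digits, and underscores after the "@" so that trailing punctuation such as ":" or "!" does not become part of the user name.

```mathematica
(* Hypothetical sample tweets containing mentions *)
sampleTexts = {"@Alice nice ride today", "Thanks @bob!", "Racing with @alice: and @carol_99"};

(* Capture the run of word characters or underscores following each "@" *)
mentions = StringCases[
   StringJoin[Riffle[sampleTexts, " "]],
   "@" ~~ (a : (WordCharacter | "_") ..) :> ToLowerCase[a]];

(* Most-mentioned users first *)
Reverse[SortBy[Tally[mentions], Last]]
```

In this sample "alice" is counted twice, because lowercasing merges "@Alice" and "@alice:" into one entry.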
Downloads
Download WebUtils.m (required by Twitter.m).

1 comment:

dangrsmind said...

Rob, I think these packages no longer work due to the move to OAuth? Any hope of an OAuth version in the near future? Will Wolfram support OAuth?