What kind of Twitterer am I? I'm on a roll with this Twitter data mining, so I'll take a brief look at my own Twitter habits.
<<Twitter`
session = TwitterSessionOpen["User"->"ragfield"]
TwitterSession[<ragfield>]
Length[tweets = TwitterUserAllTimeline[session]]
1487
DateDifference[TwitterStatusDate@Last@tweets, TwitterStatusDate@First@tweets, {"Year", "Month", "Day"}]
{{1, Year}, {3, Month}, {20.83855324074074`, Day}}
Length[tweets] / DateDifference[TwitterStatusDate@Last@tweets, TwitterStatusDate@First@tweets, {"Day"}][[1, 1]]
3.125009501379522`
1,487 total tweets in the last 1 year, 3 months, and 21 days. On average that's 3.125 tweets per day.
DateListPlot[Tally[{#[[1]], #[[2]], 1, 0, 0, 0.}&/@(TwitterStatusDate/@tweets)], Joined->True, Filling->Axis, FrameLabel->{"Month", "Tweets per month"}, PlotLabel->"Ragfield Twitter usage", DateTicksFormat->{"MonthNameShort", " ", "Year"}]
Total[StringLength[TwitterStatusText[#]]&/@tweets]
107442
N[% / Length[tweets]]
72.2542030934768`
107,442 total characters typed. On average that's 72 characters per tweet, roughly half the allotted space.
First@SortBy[TwitterStatusText/@tweets, StringLength]
bed
My shortest tweet was simply "bed".
Length[allWords = StringSplit[StringJoin[Riffle[TwitterStatusText/@tweets, " "]], Except[WordCharacter|"'"]..]]
19188
19,188 total words typed (repeats included).
Length[allWords = DeleteCases[allWords, x_/;StringMatchQ[x, DigitCharacter..]]]
18528
18,528 if you don't count numbers.
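Note that allWords keeps every occurrence of each word, so these counts include repeats. A minimal sketch of counting distinct words instead, using a made-up sample sentence in place of the real tweets (sampleWords is a hypothetical name, and folding case with ToLowerCase is an assumption about what should count as the same word):

```mathematica
(* Hypothetical sample text standing in for the joined tweet text *)
sampleWords = StringSplit["The bike ride was long but the ride was good",
   Except[WordCharacter | "'"] ..];

(* Total words, repeats included *)
Length[sampleWords]
(* 10 *)

(* Distinct words, ignoring case *)
Length[DeleteDuplicates[ToLowerCase /@ sampleWords]]
(* 7 *)
```

The same DeleteDuplicates step applied to allWords would give the distinct-word count for the real data.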
First@Reverse@SortBy[Tally[allWords], Part[#, 2]&]
{the, 604}
The most common word I've typed is "the". That's not terribly useful. Let's take a look at just nouns to see what kind of topics I mention most frequently.
Length[nouns = Cases[allWords, x_/;MemberQ[WordData[x], "Noun", Infinity]]]
7883
Grid[Take[Reverse@SortBy[Tally[nouns], Part[#, 2]&], 30], Alignment->{{Right, Left}}]
| I | 431 |
| a | 419 |
| in | 259 |
| at | 126 |
| have | 77 |
| bike | 62 |
| work | 54 |
| time | 52 |
| out | 51 |
| are | 48 |
| ride | 45 |
| think | 43 |
| so | 43 |
| run | 43 |
| now | 42 |
| like | 42 |
| one | 38 |
| d | 38 |
| morning | 36 |
| last | 36 |
| can | 36 |
| race | 35 |
| miles | 35 |
| mile | 35 |
| home | 35 |
| first | 35 |
| way | 34 |
| still | 34 |
| year | 31 |
| good | 31 |
Most common topics: bike, work, time, ride, think, run, like, morning, race, mile(s), home.
We can do the same thing with verbs to see what kind of actions I describe most frequently.
Length[verbs = Cases[allWords, x_/;MemberQ[WordData[x], "Verb", Infinity]]]
4300
Grid[Take[Reverse@SortBy[Tally[verbs], Part[#, 2]&], 30], Alignment->{{Right, Left}}]
| is | 164 |
| was | 108 |
| be | 80 |
| have | 77 |
| bike | 62 |
| up | 59 |
| work | 54 |
| time | 52 |
| out | 51 |
| ride | 45 |
| been | 44 |
| think | 43 |
| run | 43 |
| has | 43 |
| like | 42 |
| last | 36 |
| can | 36 |
| race | 35 |
| home | 35 |
| still | 34 |
| had | 32 |
| got | 32 |
| get | 30 |
| do | 30 |
| back | 28 |
| long | 27 |
| will | 26 |
| see | 26 |
| did | 25 |
| know | 24 |
Most common actions: bike, work, ride, think, run, race, see, know. I guess there's a lot of overlap between the nouns and the verbs.
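The overlap is not surprising: each filter keeps a word if "Noun" (or "Verb") appears anywhere in its WordData result, and plenty of English words can be used as both. A quick sketch of listing the shared words with Intersection, using short hand-picked samples from the two tables rather than the full nouns and verbs lists (sampleNouns and sampleVerbs are hypothetical names):

```mathematica
(* Hand-picked samples from the two tables, not the full lists *)
sampleNouns = {"bike", "work", "time", "ride", "run", "morning"};
sampleVerbs = {"is", "was", "bike", "work", "ride", "run", "see"};

(* Words that pass both filters; Intersection returns them sorted *)
Intersection[sampleNouns, sampleVerbs]
(* {"bike", "ride", "run", "work"} *)
```

Running Intersection[nouns, verbs] on the real lists should show the full set of words counted in both tables.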
It's also pretty easy to determine the other users I mention most frequently.
Grid[Take[Reverse@SortBy[Tally[StringCases[StringJoin[Riffle[TwitterStatusText/@tweets, " "]], "@"~~a:Except[WhitespaceCharacter]..:>Hyperlink[ToLowerCase[a], "http://twitter.com/"<>ToLowerCase[a]]]], Part[#, 2]&], 10]]
| melissa_raguet | 84 |
| spoonshake | 69 |
| esmithrunner | 32 |
| erik_d | 28 |
| dbfulton | 18 |
| gutzville | 14 |
| ultrashea | 11 |
| erik__d | 9 |
| adamengst | 9 |
| chockenberry | 6 |
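One caveat about the pattern used for mentions: Except[WhitespaceCharacter].. matches any run of non-whitespace characters, so a mention followed directly by punctuation (for example "@erik_d:") would be tallied separately from the bare name. A minimal sketch of a stricter pattern, assuming user names consist only of letters, digits, and underscores; the sample text is made up:

```mathematica
(* Made-up sample text standing in for the joined tweets *)
sampleText = "@Erik_D: nice race! Thanks @spoonshake, see you and @erik_d tomorrow.";

(* Capture only letters, digits, and underscores after the @ sign *)
Tally[StringCases[sampleText,
  "@" ~~ (a : (WordCharacter | "_") ..) :> ToLowerCase[a]]]
(* {{"erik_d", 2}, {"spoonshake", 1}} *)
```

Whether this changes the table above depends on how often mentions in the real tweets were followed by punctuation.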
Downloads
Download WebUtils.m (required by Twitter.m).

1 comment:
Rob, I think these packages no longer work due to the move to OAuth? Any hope of an OAuth version in the near future? Will Wolfram support OAuth?