
This blog is created and maintained by the technical team at Hook in an effort to preserve and share the insights and experience gained during the research and testing phases of our development process. Often, much of this information is lost or hidden once a project is completed. These articles aim to revisit, expand and/or review the concepts that seem worth exploring further. The site also serves as a platform for releasing tools developed internally to help streamline ad development.


Hook is a digital production company that develops interactive content for industry leading agencies and their brands. For more information visit www.byhook.com.


Social Sentiment: Real Time artwork with Twitter

Posted on March 21st, 2011 by Derek

Real time awesomeness
Real-time graphics have been an obsession of mine since the day my father came home with an Atari 800XL. I spent hours copying lines of BASIC code out of magazines to play games like “Space Junk”, only for my mother to come in and turn the Atari off while I was at school, flushing my hard work.

Thanks to the hard work of others and huge leaps in processing power, exploring the world of real-time graphics is much easier now. I’d wanted to build something with Processing ( http://processing.org ) for quite some time. It has a huge following, tons of libraries, and was born at MIT. Also, it’s much more difficult for my mother to unplug the internet and destroy my work.

Twitter Data for real-time artwork
The Twitter streaming server is referred to as the ‘Firehose’ for good reason. It’s continuously spewing tweets and all of their accompanying metadata. This provides a great look into real-time data from around the world, and is a great source for generative graphics. I find this approach interesting as it really starts to show some patterns in the chaotic reality that is human social behavior. The stream is in JSON format and is easily consumed using Java or, in this case, the twitter4j library ( http://twitter4j.org ). I’m sure other social media platforms have, or will have, similar streams. I look forward to using them in future works for something even more visually abstract.

Wrapping it all up
Based on the streaming data from Twitter’s “firehose” and the very cool OpenGL capabilities of Processing, I decided to map each tweet onto a 3D globe. Using natural language processing, we color each tweet’s point and text according to the sentiment of the tweet: pinkish is negative, blue is positive, and white is neutral. Not all tweets from the stream are created equal, because not all users tweet from a geo-enabled device. If a tweet does not include geographical location data, we simply do not plot it on the globe. Every tweet is still analyzed by the NL engine, though, and is used in the data set for the radial graph around the globe.
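
To make the globe mapping concrete, here is a minimal sketch of how a geotagged tweet could be placed on a sphere and given a sentiment color. It is not taken from the original sketch: the class name GlobeMapper, both method names, the axis orientation, and the exact RGB shades are all assumptions made for illustration.

```java
// Sketch only: GlobeMapper, toCartesian and colorFor are hypothetical names,
// not part of the original Processing sketch.
class GlobeMapper {

    /** Converts latitude/longitude in degrees to x, y, z on a sphere of the given radius. */
    static double[] toCartesian(double latDeg, double lonDeg, double radius) {
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        double x = radius * Math.cos(lat) * Math.cos(lon);
        double y = radius * Math.sin(lat); // assumption: +y passes through the north pole
        double z = radius * Math.cos(lat) * Math.sin(lon);
        return new double[] { x, y, z };
    }

    /** Returns an RGB triple for a sentiment label; the shades are assumed, not the author's. */
    static int[] colorFor(String sentiment) {
        switch (sentiment) {
            case "positive": return new int[] { 80, 140, 255 };  // blue
            case "negative": return new int[] { 255, 120, 170 }; // pinkish
            default:         return new int[] { 255, 255, 255 }; // neutral: white
        }
    }

    public static void main(String[] args) {
        double[] p = toCartesian(33.75, -84.39, 200.0);
        int[] c = colorFor("positive");
        System.out.printf("x=%.1f y=%.1f z=%.1f rgb=(%d,%d,%d)%n",
                p[0], p[1], p[2], c[0], c[1], c[2]);
    }
}
```

In a Processing sketch the screen's y axis points down, so the sign of y would likely need flipping before drawing.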

Using the great toxi libs, I was able to track the tweets floating down to the surface as solids in a voxel grid. They are then converted to a triangular mesh, which I draw to the screen in the render loop. If you watch closely you will see that the mesh uses the color of the tweet it represents and mixes colors with nearby tweets, creating a colorful mesh of voxely awesomeness. I realize this is nothing more than visual noise, but the intention here was to create something visually entertaining as well as (semi-)informative.

Making sense of tweets
Have you seen the mindless jabber on Twitter? It’s a bunch of “Got my #verizon #iphone #today! Can’t wait to download hipstermatic so I can be a photographer!”, “#Superbowl ads are the lolz!” or “#bacon for lunch again!” What’s the use in any of that? Well, using the very cool LingPipe natural language engine we can make an attempt to extract the sentiment of tweets. While it’s not 100% accurate (the authors claim somewhere in the 80% range), it is very useful for driving visual cues in the visualization. It’s also great at identifying language, and it lets us filter stuff out, such as people who are just tweeting links to porn sites or lolcats.

Computational linguistics is a science in itself. Let’s break down the basics used here to classify the tweets.

In a new Processing sketch we include LingPipe and connect to Twitter to monitor the stream and build a Language Model (LM). That model will later be loaded into the main visualization sketch to classify the tweets from the stream.

Pseudo code:

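import com.aliasi.util.*;
import com.aliasi.classify.*;
import com.aliasi.lm.NGramProcessLM;

// Categories the classifier will learn to tell apart.
String[] mCategories = { "positive", "negative", "neutral", "url" };
// Maximum character n-gram length.
int nGram = 8;
DynamicLMClassifier<NGramProcessLM> mClassifier
  = DynamicLMClassifier.createNGramProcess(mCategories, nGram);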

We initialize an instance of DynamicLMClassifier using a factory method, passing in the array of categories we will be using and an n-gram size. In this case we used an 8-gram, as suggested by the authors of LingPipe. Later, when we send in a sequence of characters (a tweet), the classifier will break it down into n-grams. This allows LingPipe to efficiently compare other sequences of characters when we ask it to classify a tweet from the stream.
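
For readers new to the term, the following sketch shows what breaking a tweet into character n-grams means. It illustrates the idea only and is not how LingPipe implements its language models internally; the class name CharNGrams and its extract method are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: CharNGrams is a hypothetical helper, not a LingPipe class.
class CharNGrams {

    /** Returns every contiguous substring of length n, in order of appearance. */
    static List<String> extract(String text, int n) {
        List<String> grams = new ArrayList<String>();
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Print each 8-character window of a sample tweet, bracketed so spaces stay visible.
        for (String gram : extract("#bacon for lunch", 8)) {
            System.out.println("[" + gram + "]");
        }
    }
}
```

Because the windows are made of characters rather than words, hashtags, misspellings and slang all contribute usable signal without any tokenizing step.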

Now, as tweets come in from the stream, I can save each one out to a text file, sorted into folders by category, to build the language model and save it for later use. This is just a quick and dirty way to keep Twitter data around after closing a Processing sketch. After quite a few long hours of classifying each tweet (and slowly losing a little faith in humanity after reading that many random tweets), I can generate a reusable data model to be used in the main sketch:

Pseudo code:

// Train the classifier on every saved tweet, one folder per category.
for (int i = 0; i < mCategories.length; ++i) {
  String category = mCategories[i];
  Classification classification = new Classification(category);
  // polarityFolder holds one sub-folder of tweet files per category.
  File file = new File(polarityFolder, category);
  File[] trainFiles = file.listFiles();
  for (int j = 0; j < trainFiles.length; ++j) {
    File trainFile = trainFiles[j];
    if (isTrainingFile(trainFile)) {
      ++numTrainingCases;
      String tweet = Files.readFromFile(trainFile, "ISO-8859-1");
      numTrainingChars += tweet.length();
      // Pair the tweet text with its hand-assigned category and train on it.
      Classified<CharSequence> classified
        = new Classified<CharSequence>(tweet, classification);
      mClassifier.handle(classified);
    }
  }
}
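
The loop above reads one file per tweet from a folder named after each category. As a rough sketch of how such a folder layout could be produced, the helper below writes a tweet to root/category/id.txt. The class TrainingCorpusWriter, its methods, and the .txt naming scheme are assumptions for illustration; whether the sketch's own isTrainingFile check accepts such names is not known from the post.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Sketch only: TrainingCorpusWriter and its methods are hypothetical names.
class TrainingCorpusWriter {

    /** Writes one tweet to <root>/<category>/<id>.txt and returns the file. */
    static File save(File root, String category, long id, String tweet) {
        File dir = new File(root, category);
        if (!dir.isDirectory() && !dir.mkdirs()) {
            throw new UncheckedIOException(new IOException("Cannot create " + dir));
        }
        File out = new File(dir, id + ".txt");
        try {
            // ISO-8859-1 matches the encoding the training loop reads with.
            Files.write(out.toPath(), tweet.getBytes(StandardCharsets.ISO_8859_1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    /** Reads a saved tweet back using the same encoding. */
    static String load(File file) {
        try {
            return new String(Files.readAllBytes(file.toPath()), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File root = new File(System.getProperty("java.io.tmpdir"), "polarity-demo");
        File saved = save(root, "positive", 1L, "#bacon for lunch again!");
        System.out.println(saved.getPath() + " -> " + load(saved));
    }
}
```

Note that ISO-8859-1 cannot represent every character found in tweets, so anything outside that range would be replaced on write.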

Then simply save the instance of the classifier out and we have a model baked and ready to go.

Pseudo code:

// Compile the trained classifier into the sketch's data folder.
FileOutputStream fileOut = new FileOutputStream(dataPath("polarity.model"));
ObjectOutputStream objOut = new ObjectOutputStream(fileOut);
mClassifier.compileTo(objOut);
objOut.close();

You can download this utility sketch here: LingPipePolarityGenerator.zip
(Download the lingpipe.jar from http://alias-i.com/lingpipe/ and drop it into the code folder to run the sketch)

Back in the main sketch it’s trivial to load the model back in and ask it to classify the tweets as they come in from the stream.

Pseudo code:

// Load the compiled model from the sketch's data folder.
InputStream fileIn = createInput("polarity.model");
ObjectInputStream objIn = new ObjectInputStream(fileIn);
loadedModelClassifier = (LMClassifier) objIn.readObject();
Streams.closeInputStream(objIn);

We now have our model instance ( loadedModelClassifier ) ready to classify tweets like this:

Pseudo code:

Classification classification = loadedModelClassifier.classify(tweetFromStreamString);
// Print the winning category name, e.g. one of the entries in mCategories.
println(classification.bestCategory());

The call to the bestCategory method yields a string naming the best match, based on the sample data we supplied to the model in the utility sketch. Language detection is done in the same fashion, but it uses a model supplied with LingPipe and is reasonably accurate as well.

LingPipe manages to encapsulate all of the complex methods of computational linguistics and yields some very useful results. Find more information on it here: http://alias-i.com/lingpipe/

The Results
Once things started coming together I began to notice some patterns immediately. The most apparent is the percentage of positive, negative, and neutral tweets. To my surprise, they tend to stick to about 39%, 13%, and 48% respectively. Looks like people on Twitter aren’t nearly as negative as I had originally thought them to be!
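
Percentages like these come from a running count per category. The sketch below shows one way such a tally could be kept as classified tweets arrive; SentimentTally is a hypothetical class, not the one used in the visualization, and the labels fed to it in main are made-up sample input rather than real results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: SentimentTally is a hypothetical class for illustration.
class SentimentTally {
    private final Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
    private int total = 0;

    /** Records one classified tweet under the given category. */
    void add(String category) {
        Integer current = counts.get(category);
        counts.put(category, current == null ? 1 : current + 1);
        total++;
    }

    /** Returns the share of tweets in the category as a percentage of all tweets seen. */
    double percent(String category) {
        if (total == 0) {
            return 0.0;
        }
        Integer count = counts.get(category);
        return count == null ? 0.0 : 100.0 * count / total;
    }

    public static void main(String[] args) {
        SentimentTally tally = new SentimentTally();
        // Made-up labels standing in for classifier output.
        String[] sample = { "positive", "neutral", "negative", "neutral" };
        for (String label : sample) {
            tally.add(label);
        }
        System.out.println("neutral share: " + tally.percent("neutral") + "%");
    }
}
```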

PROTIP: Global Illumination renders with Sunflow library
I’m no Java evangelist, but some of the open source libraries out there for the Java platform are very cool. In this case, Sunflow ( http://sunflow.sourceforge.net/ ). I rewrote all the geometry routines in the Processing sketch to spit out data that Sunflow could load in and render. The results are pretty fun, and this was definitely a rewarding endeavor. Hey, real-time is awesome, but so are global illumination and ambient occlusion shaders. Some day I’m sure the two will converge!

Have a go at it
Click here to run the sketch in your browser. It will not run in the browser on MacOS, so if that is your platform of choice, grab the standalone version linked below.

Standalone for Windows
Standalone for MacOS

You can download the processing sketch source here: socialSentiment_src.zip

8 Responses to “Social Sentiment: Real Time artwork with Twitter”
  1. Luciana says:

    Hi! I’ve found this entry quite interesting. I have an idea for a project in processing and I was wondering how to do sentiment analysis online without having to use another software.
    I’ve downloaded the zip file to execute the .pde file but when I decompress it, I don’t know which of all the files to open in the processing sketch.
    Could you tell me so? thanks!!

  2. zenmaker says:

    HI – I’ve tried the standalone on mac, and it says login failed. Has anyone had it working yet?

    thanks,
    Josh

  3. @metacowboy says:

    These is far one of the top 7 Twitter visualisation i have seen till now great work thanks a lot

  4. Derek says:

    Hey Gerald, what version of Mac OS are you running? And what version of Java? It’s working for me on Snow Leopard 10.6.7 boxes. It should run, the classes it’s looking for are built into the jar files inside the standalone app.

  5. Gerald says:

    Hi, do you have instructions on how to run the standalone mac version?

    It spewed me the following errors:

    No library found for com.aliasi.util
    No library found for com.aliasi.classify
    No library found for codeanticode.glgraphics
    No library found for com.aliasi.classify
    No library found for com.aliasi.classify
    No library found for com.aliasi.classify
    No library found for com.aliasi.classify
    No library found for com.aliasi.util
    No library found for com.aliasi.util
    No library found for com.aliasi.util
    No library found for peasy
    No library found for codeanticode.glgraphics
    No library found for twitter4j
    No library found for toxi.geom
    No library found for toxi.geom
    No library found for toxi.geom.mesh
    No library found for toxi.math
    No library found for toxi.volume
    No library found for toxi.processing
    Note that release 1.0, libraries must be installed in a folder named ‘libraries’ inside the ‘sketchbook’ folder.

    Thanks.

    • Derek says:

      Hey Gerald!

      I’ll look at it on a real mac asap. Sometimes processing for windows has a hard time building the standalone versions correctly.

      • David says:

        Derek,

        Thank you very much for sharing the code for this. However, I am unable to run it in windows… The windows standalone version fails when trying to login into Twitter (maybe because of twitter authentication policy changes) so I tried to download the source code and run it with Processing 1.5. However, I get the following output:

        GLGraphics VERSION: 0.9.9
        GL_ARB_geometry_shader4 extension not available
        Sentiment Class initalized.
        PeasyCam v0.91
        2012/01/04 14:59:21 toxi.volume.VolumetricSpace
        情報: new space of 42x42x42 cells: 74088
        ControlP5 0.5.4 infos, comments, questions at http://www.sojamo.de/libraries/controlP5
        Hit low frame rate warning
        Exception in thread “Animation Thread” java.lang.NoSuchMethodError: processing.core.PImage.getCache(Ljava/lang/Object;)Ljava/lang/Object;
        at codeanticode.glgraphics.GLGraphics.renderTriangles(GLGraphics.java:1088)
        at processing.core.PGraphics3D.endShape(Unknown Source)
        at processing.core.PGraphics.endShape(Unknown Source)
        at processing.core.PGraphics.imageImpl(Unknown Source)
        at processing.core.PGraphics.textCharModelImpl(Unknown Source)
        at processing.core.PGraphics.textCharImpl(Unknown Source)
        at processing.opengl.PGraphicsOpenGL.textCharImpl(Unknown Source)
        at processing.core.PGraphics.textLineImpl(Unknown Source)
        at processing.core.PGraphics.textLineAlignImpl(Unknown Source)
        at processing.core.PGraphics.text(Unknown Source)
        at processing.core.PApplet.text(Unknown Source)
        at Twitterverisualization$RadialDisplay.arcText(Twitterverisualization.java:1227)
        at Twitterverisualization$RadialDisplay.draw(Twitterverisualization.java:1152)
        at Twitterverisualization.draw(Twitterverisualization.java:386)
        at processing.core.PApplet.handleDraw(Unknown Source)
        at processing.core.PApplet.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:662

        Could you please show me some steps on how to make it run in windows using processing ?

        Thanks…

