hwangmoretime

Bar Chart Composed of Images ("In One Chart")

What This Shows

A bar chart of Twitter images from Tweets that contain the phrase "in one [chart|graph]". The images themselves make up the areas of the bars. Because the Twitter Search API only indexes tweets from the previous seven days, the data is relatively uninteresting for a bar chart. Thus, this visualization is more a proof of concept and a conversation piece about the increased prevalence of the phrase "in one chart". I wrote a short bit on this visualization here.

What's Under the Hood

For this visualization, I wanted to group unique charts/graphs by the day they were first published. I used the following pipeline to determine which tweets first published each unique chart/graph:

  1. Get all tweets that contained the phrase "in one chart" or "in one graph". (6788 tweets to start)
  2. Then, get tweets that had a picture attached. (4716 tweets left)
  3. Then, group tweets by the image url. Find the earliest tweet from those groups. (204 tweets left)
  4. Then, compare tweets to one another to determine if any of the images were simply re-uploads of existing images. (153 tweets to end)
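
Steps 2 and 3 above can be sketched roughly as follows. This is a minimal illustration rather than the actual pipeline code: it assumes each tweet has already been reduced to a plain dict, and the field names `id`, `created_at`, and `image_url` are hypothetical.

```python
def earliest_per_image(tweets):
    """Keep tweets that have an image, then the earliest tweet per image URL.

    Each tweet is assumed to be a dict with hypothetical keys
    'id', 'created_at' (any sortable timestamp) and 'image_url'.
    """
    earliest = {}
    for tweet in tweets:
        url = tweet.get("image_url")
        if not url:
            # Step 2: skip tweets without an attached picture.
            continue
        best = earliest.get(url)
        # Step 3: within each image-URL group, remember the earliest tweet.
        if best is None or tweet["created_at"] < best["created_at"]:
            earliest[url] = tweet
    # Return the survivors in chronological order.
    return sorted(earliest.values(), key=lambda t: t["created_at"])
```

Grouping with a dict keyed on the image URL keeps this a single pass over the tweets, which is why it is so much cheaper than the pairwise image comparison in step 4.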

I did not expect step 3 to yield such a reduction in the search space. I learned there is good reason for serious tweeters to not simply click retweet, but instead copy-paste the tweet of interest and fire it off themselves (i.e., manually retweet). Good thing, too, else the somewhat costly image comparison in step 4 might have bared its fangs.

To avoid recomputation, each tweet that has gone through the pipeline is marked to state whether or not it is a unique-earliest-tweet. Thus, on successive runs of the pipeline, steps 1-3 only have to run on the new tweets. Step 4 gets cheaper as well, but not to the same degree, because each new image must still be compared to all of the older images.
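
One way the incremental version of step 4 might look is sketched below. It is an assumption-laden illustration, not the code behind this visualization: `update_unique`, the `(tweet_id, histogram)` pairs, the `distance` callable, and the `threshold` value are all hypothetical names introduced here.

```python
def update_unique(known_unique, new_candidates, distance, threshold):
    """Mark each new candidate as unique or as a re-upload.

    known_unique:   list of (tweet_id, histogram) pairs already marked unique
                    on earlier runs; it is extended in place.
    new_candidates: list of (tweet_id, histogram) pairs, oldest first.
    distance:       callable returning a dissimilarity between two histograms.
    threshold:      distances at or below this count as "same image"
                    (a hypothetical tuning parameter).
    Returns a dict mapping tweet_id -> True if the tweet is unique.
    """
    marks = {}
    for tweet_id, hist in new_candidates:
        # Only new images are compared; old-versus-old pairs are never redone.
        is_reupload = any(
            distance(hist, old_hist) <= threshold
            for _, old_hist in known_unique
        )
        marks[tweet_id] = not is_reupload
        if not is_reupload:
            known_unique.append((tweet_id, hist))
    return marks
```

Each run costs roughly (new images) x (known unique images) comparisons, which is why step 4 benefits less from caching than steps 1-3 do.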

What Remains

For image comparison, I use a root-mean-square metric between the two images' PIL.Image.histogram() outputs. Histogram comparison, at least in my implementation, is not robust to cropping or small annotations, so a cropped or lightly annotated copy can be counted as a new image even though those types of edits rarely justify uniqueness.
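
A minimal sketch of such a root-mean-square histogram difference is shown below. It is not necessarily the exact formula used here; the function name `rms_histogram_diff` is hypothetical, and it takes plain lists of bin counts so that it does not depend on how the images are loaded.

```python
import math


def rms_histogram_diff(hist_a, hist_b):
    """Root-mean-square difference between two equal-length histograms.

    The inputs are flat lists of bin counts, such as the lists returned
    by PIL's Image.histogram(). A result of 0 means identical histograms.
    """
    if len(hist_a) != len(hist_b):
        # Histograms of images in different modes have different lengths.
        raise ValueError("histograms must have the same number of bins")
    squared_sum = sum((a - b) ** 2 for a, b in zip(hist_a, hist_b))
    return math.sqrt(squared_sum / len(hist_a))
```

With Pillow installed, the inputs could come from something like `Image.open(path).convert("RGB").histogram()`; converting both images to the same mode first keeps the two histograms the same length. Because the raw bin counts scale with image size, resizing both images to common dimensions before comparing is probably also worthwhile.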

The slow loading of images is ugly. A loading screen would be nicer.

Interactivity. I've tried my hand at embedding the corresponding Tweets upon mouseover() for the images, with some success. Uncomment the .on(mouseover) line and you will get that interactivity, with the embedded Tweet off to the side. Unfortunately, you can only view the tweet, not interact with it, because once you move your cursor you inevitably mouseover() some other tweet along the way. A better way to embed the tweet is with a tooltip. I struggled in attempting that, in large part due to the asynchronous nature of embedding tweets. Current d3-tooltip libraries are unsuited for that. jQuery lacks support for SVGs for this purpose, and I would much prefer a d3-native solution over other external libraries.