- If you can’t choose wisely, pick at random
Randomness can hedge against decisions made for bad reasons
- Small multiple goodness: What would you do next?
On the merits of using a small grid of images to convey information
- Not disruptive, and proud of it
A critique of the current infatuation with disruption
We’ve got social psychology, robotics and AI, personal uses of data algorithms, and a historical call for voluntary self-restraint:
- Latitudes of Acceptance
Matthew Lieberman talks about finding common ground between people who don’t share the same beliefs (The text is a transcript of the video)
- ‘Boris’ the robot could load up a dishwasher
A robot that can see unfamiliar objects and decide how to pick them up using a human-like hand
- How Twitter’s new “BotMaker” filter flushes spam out of timelines
Twitter is working on ways to fight spam and still let people use social media aggregation tools
- Spotify Knows Me Better Than I Know Myself
A look at Spotify’s listening pattern analysis tool
- Matchmaker, Matchmaker, Make Me A Spreadsheet
A book review of Dataclysm, a book by the CEO of OKCupid, who was into experimenting on users long before Facebook was
- A World Split Apart
Alexander Solzhenitsyn delivered a commencement address at Harvard University in 1978 in which he critiques the press of his day in a way that is every bit as relevant today
The best time to post is generally between 12pm and 5pm EST (UTC–5). Other times also work well, but most of the successful posts on /r/dataisbeautiful were posted in that time range.
I thought I’d test that assertion. So I wrote a simple tool in Clojure and gnuplot that queries the Reddit API for a particular subreddit, groups recent submissions by the submission hour (UTC), and creates a chart displaying the percentage of high-scoring submissions per hour. Here’s the current chart for DataIsBeautiful:
As you can see, the guide, highlighted in orange, was wrong! The best time to post, at least according to recent data, is actually between 8am and 12pm EDT (UTC–4), which is highlighted in green.
You can run the same analysis yourself on any subreddit by using the tool I wrote. Pull requests welcome!
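The core of the analysis — bucketing submissions by UTC hour and computing the share that scored highly — is straightforward. My tool is written in Clojure, but the same grouping logic can be sketched in Python (the `threshold` for what counts as "high-scoring" and the field names, which follow Reddit's public JSON listing format, are illustrative):

```python
from collections import Counter
from datetime import datetime, timezone

def high_score_rate_by_hour(submissions, threshold=100):
    """For each UTC hour of submission, compute the percentage of
    submissions that reached at least `threshold` points.

    Each submission is a dict with 'created_utc' (Unix timestamp)
    and 'score', as returned by Reddit's JSON listing endpoints.
    """
    totals, highs = Counter(), Counter()
    for sub in submissions:
        hour = datetime.fromtimestamp(sub["created_utc"], tz=timezone.utc).hour
        totals[hour] += 1
        if sub["score"] >= threshold:
            highs[hour] += 1
    # Percentage of high-scoring submissions per hour with any data
    return {h: 100.0 * highs[h] / totals[h] for h in totals}
```

Feeding this the recent submissions for a subreddit (fetched from, e.g., `https://www.reddit.com/r/<name>/new.json`) yields the per-hour percentages that the chart plots.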
Update 2014-09-16 2:07 PM
The DataIsBeautiful subreddit has updated its guide to reflect the new data. Cheers!
Among other things we’ve been reading about neuroscience, society, financial market prediction, and statistical research results this week:
- How to see into the future
By keeping score and trying to improve your predictions based on past errors
- Play the long game
Conversion is only one part of marketing and sales
- Please Block the Way: Campaigning Against Courtesy
Not all important quantities in a system are necessarily quantifiable
- Millennials’ Political Views Don’t Make Any Sense
The results of multiple polls show that Millennials aren’t a homogeneous group and hold conflicting views, unlike all those nice generations before
- On Facebook, Nobody Knows You’re a Voter. Well, Almost Nobody.
Facebook is trying out some tools that will make political advertising much easier (for politicians)
- Do we want Minority Report policing?
Exploring the limits of what should be allowed when it comes to predictive crime prevention
A critique of Silicon Valley and Big Data culture
- How extreme isolation warps the mind
“…we are, as a rule, considerably diminished when disengaged from others”
- The Fizz Buzz from Outer Space
Programmer interview antics
We’ve been hard at work on a new tool, which we are proud to announce: the Altometrics Dynamic Thresholding Tool.
Many data sets have periodic patterns. For example, in network management, the network usage (in bytes per second) varies greatly by the time of day. When, say, everybody is at work, the usage goes up. When everybody leaves, usage goes down.
Trying to create notification thresholds for this sort of data is frustrating. Set a static threshold low enough to catch nighttime anomalies, and it will trigger false alarms every morning as traffic naturally climbs; set it high enough to survive the daily peak, and even a serious anomaly in the middle of the night, when there is typically little traffic, will pass unnoticed.
Clearly, we need thresholds that vary by the time of day. But we shouldn’t burden the network managers with creating such complex thresholds. Instead, we can do that automatically.
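One simple way to derive time-varying thresholds automatically is to bucket historical measurements by hour of day and set each hour's threshold a few standard deviations above that hour's mean. This is an illustrative sketch of the general idea, not a description of the tool's actual algorithm:

```python
import statistics
from collections import defaultdict

def hourly_thresholds(samples, k=3.0):
    """Compute a per-hour-of-day threshold of mean + k * stdev.

    samples: iterable of (hour_of_day, value) pairs drawn from
    historical data, e.g. network usage in bytes per second.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {
        hour: statistics.mean(values) + k * statistics.pstdev(values)
        for hour, values in by_hour.items()
    }

def is_anomalous(hour, value, thresholds):
    """Flag a measurement that exceeds its hour's learned threshold."""
    return value > thresholds.get(hour, float("inf"))
```

Because each hour gets its own baseline, a modest nighttime spike can trip its (low) nighttime threshold while the same absolute value during the morning peak stays comfortably under the (high) daytime one.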
…there is data lurking around every corner. There are a few longer reads this week. This is what we have been reading and thinking about in the realm of technology, history, business, and data:
- The Rise and Fall and Rise of Virtual Reality
The Verge’s nine-page celebration of the history and near future of VR
- What Happened to Motorola
A look back at Motorola from its founding in 1928 through its recent difficulties
- The Quirks of Smallness
Big is not always better
- Lessons Learned from 13 Failed Software Projects
Thirteen small developers each reflect on a failed project
- The Thermodynamic Theory of Ecology
Simple models are often more accurate than complex models
- show the full picture!
Adjusting a chart from the news to better reflect the facts
- How to Stimulate the Open Web
A simple idea to keep the open web relevant