Reddit, Stumbleupon, Del.icio.us ... Algorithms Exposed!
seomoz.org — Exposing how popular social media websites use algorithms to utilize user data.
- 1183 diggs
- pingpants, on 07/03/2008, -2/+45if the algorithm's taking into account the digger's geo location why are there so many popular submissions from new jersey
- menwuur, on 07/03/2008, -3/+13i lol'ed
- topcat31, on 07/03/2008, -3/+92Digg also has this line in the algorithm:
if(submission=xkcd) then position = frontpage + snarky comments about how it always gets there
- DarthDaddy, on 07/03/2008, -5/+43** Visual Basic Syntax **
If Digg.Submitter = "MrBabyMan" Then
blnFrontPage = True
Else
MsgBox "Haha Sucker, better luck next time!"
blnFrontPage = False
End If
- hoodmonkey, on 07/03/2008, -1/+16if(perferredLanguage == "VB"){
careerStatus = "toilet";
}
- adidos, on 07/03/2008, -8/+6Pssstt...In most languages = is assignment and == is equality...better double check your if condition :)
- nreynolds, on 07/03/2008, -1/+10not in VB. I learned to program in java with = and == and it's WAY better than VB's = and =...
- adidos, on 07/03/2008, -2/+8and VB sucks...
- nreynolds, on 07/03/2008, -1/+1I wouldn't say that it sucks. I started using VB.NET after one year of high-school java and layout-wise it's almost identical. I didn't even have to look at a book or online or anywhere and I already knew how to program in it.
Of course, I've never taken classes for VB.NET so I don't know much about any of the technical stuff, but for the web-app work I do (I helped develop - and am currently upgrading - a web-app that helps teachers submit progress reports and various things like that for my high school - I just graduated this year and need dollar bills yo for a laptop), it seems to work fine. (other than = and <> instead of == and != - but that's really just preference)
I'm at work right now kinda slacking off because I'm waiting for some dll's to dl (haha, i'm clever... *cry*) so if anyone wants to teach me why VB sucks, go ahead.
- aComa, on 07/03/2008, -1/+0position = frontpage | snarkyComments;
- liuite, on 07/03/2008, -4/+3data mining uses algorithm...it's a form of intelligence gathering
- jpaul6, on 07/03/2008, -1/+2Is it just me, or does the reddit algorithm make no sense at all - the (verbal) description seems to indicate a per-voter score, which is absent from the actual mathematical description
- HillerMylife, on 07/24/2008, -1/+7The StumbleUpon algorithm is more complicated than any math class I've ever taken.
- twiztidsinz, on 07/03/2008, -4/+8Wow.... All that just to bash digg at the end?
- PullingTeeth, on 07/03/2008, -2/+11K. Rose will be pleased
- cowsgonemadd3, on 07/03/2008, -0/+2For 5 minutes somebody might know the algorithm then it gets posted on the site they are trying to beat so all can see and then it gets changed. Genius!
- Pillage, on 07/03/2008, -0/+43Isn't Reddit open source? How much "exposing" did he have to do?
- Canute, on 07/03/2008, -0/+2It sure is. http://code.reddit.com/
- Megane, on 07/03/2008, -4/+3if digg source of submission doesn't belong to the set {cnn, bbc, wired, usatoday, reuters, livescience and similar} then no frontpage.
- dcmcderm, on 07/03/2008, -3/+2dude you gotta write that ***** in pseudocode or diggers won't understand
if (submission.source IN {cnn, bbc, wired, usatoday, reuters, livescience})
then
submission.location = frontpage
else
submission.location = null
submission.status = buried
endif
- greeniemeani, on 07/03/2008, -0/+3What the hell language has an "IN" construct?
- tacojohn48, on 07/04/2008, -0/+1greeniemeani - I think I have seen it in some sql, maybe he picked it up from that. Also might be matlab or something obscure like that.
- FearMoth, on 07/04/2008, -0/+1greeniemeani: Pascal?
if i in [1..100] then ...
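For what it's worth, Python also has an `in` operator that covers both cases in this thread: membership in a set (the pseudocode above) and membership in a range (the Pascal example). A minimal sketch; the function names are made up for illustration, and the source list is just the one from the joke, not anything Digg is known to use:

```python
# Sources named in the comment above; the set literal plays the role of {...} in the pseudocode.
TRUSTED_SOURCES = {"cnn", "bbc", "wired", "usatoday", "reuters", "livescience"}


def place_submission(source):
    """Hypothetical helper mirroring the joke pseudocode; not Digg's real logic."""
    if source in TRUSTED_SOURCES:
        return {"location": "frontpage", "status": None}
    return {"location": None, "status": "buried"}


def in_one_to_hundred(i):
    """Python counterpart of Pascal's `i in [1..100]` (the range end is exclusive)."""
    return i in range(1, 101)


print(place_submission("bbc"))
print(place_submission("seomoz"))
print(in_one_to_hundred(100), in_one_to_hundred(101))
```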
- lordewoks, on 07/03/2008, -0/+2lets not forget XKCD, huffingtonpost and divinecaroline
- tigerglebe, on 07/03/2008, -0/+1Digg has a lot of gaming going on but at least you get a chance to hit the frontpage if unknown. On Fark you don't hit the frontpage unless you are a paid Total Fark. Period.
- greeniemeani, on 07/03/2008, -0/+8I honestly wasn't expecting any logarithms...
- andreshb, on 07/03/2008, -1/+4As I understood from Peter Norvig from Google Research - the larger the data set, the simpler the algorithm can be - in other words, it's all about the data set, so I would not be surprised if someone reverse-engineers the algorithm (up to that point) of Digg and finds it rather simple.
More info on algorithms and data sets:
http://www.omnisio.com/startupschool08/peter-norvi ...
- sfrench, on 07/03/2008, -0/+2Large data sets make outliers in your data set contribute less overall, and reduce spikes. Think about the two following averaging exercises and how the one outlier (1000) affects the overall result based on the sample size and how the average really relates to representing the true nature of the series being averaged.
Avg of 1 2 3 1000 = 1006/4 = 251.5
Avg of 1, 2, 3, ... , 98, 99, 100, 1000 = 6050/101 = 59.9
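The two averages above are easy to check. A quick sketch in Python, where `mean_with_outlier` is a made-up helper name and the data is just the two series from the comment:

```python
from statistics import mean


def mean_with_outlier(values, outlier):
    """Return (mean without the outlier, mean with the outlier appended)."""
    return mean(values), mean(list(values) + [outlier])


small = [1, 2, 3]
large = list(range(1, 101))  # 1, 2, ..., 100

small_clean, small_skewed = mean_with_outlier(small, 1000)
large_clean, large_skewed = mean_with_outlier(large, 1000)

# The same outlier moves the small sample's mean far more than the large one's.
print(small_clean, small_skewed)            # 2 251.5
print(large_clean, round(large_skewed, 1))  # 50.5 59.9
```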
- amrush4th, on 07/03/2008, -2/+2Much of life works the same way. Make people feel like they have a level of input and control, but actually have most everything planned out and working in the exact fashion you desire.
- paintist, on 07/03/2008, -0/+4I wonder if he'll have to change his own suspected digg algorithm now that his own story became popular.
- ddover, on 07/03/2008, -0/+1Nope, I still think it applies.
- SlapAyoda, on 07/03/2008, -1/+1Neat.
- TheDeepFriar, on 07/03/2008, -1/+2Digg Effect.............mirror anyone?
- seanieb, on 07/03/2008, -1/+1Google Cache got it:
http://216.239.59.104/search?q=cache:CAt5E0kb7rcJ: ...
- seanieb, on 07/03/2008, -0/+4I think he has gotten the Digg algorithm wrong, some of those "factors" clearly don't have any effect.
- fattehboi, on 07/03/2008, -0/+2its back up.
digg is more than just a 10,000 pound gorilla
- shondell, on 07/03/2008, -2/+4I'm pretty sure Digg's algorithm can't be explained period.
- zadadka, on 07/03/2008, -0/+5Eh?
Al Gore what?
- D14BL0, on 07/03/2008, -0/+3Rhythm.
- zadadka, on 07/03/2008, -0/+1Pah!
Sounds like a no-hoper jazz quartet.....
;)
- D14BL0, on 07/03/2008, -11/+2So it DOESN'T expose the Digg algorithm like the title says? Buried as inaccurate.
- theutopian, on 07/03/2008, -1/+7It doesn't say that in the title.... buried as retarded.
- D14BL0, on 07/04/2008, -1/+1How does it NOT say that in the title?
"Digg.... Algorithms Exposed!"
One would kind of hope they had SOMETHING to back up their article.
- theutopian, on 07/04/2008, -0/+2Sigh... apparently the retardation is contagious. Since you're incapable of scrolling up and reading. I paste the title here for you.
"Reddit, Stumbleupon, Del.icio.us ... Algorithms Exposed!"
Do you see Digg there? No. That's what I thought.
- badjoke, on 07/04/2008, -0/+2That's the page title...not the article title.
CNN - Man breaks out of jail != CNN Man breaks out of jail!
- mlbwebdesign, on 07/03/2008, -1/+2Boy! Delicious really has a complex algorithm there.
- billbugger, on 07/03/2008, -0/+1it works
- mlbwebdesign, on 07/03/2008, -0/+1How long did they take to figure it out tho?
$count++;
- yanwg, on 07/03/2008, -8/+0Buried because ...
an algorithm is the only way you can get a computer to do anything more than just suck on electrons. What, are Oompa-Loompas supposed to be ranking the content on these sites?
p.s.
Almost dug because my image text for this post is: SLUTY
- hakz, on 07/03/2008, -0/+3what???
- hakz, on 07/03/2008, -1/+8funny how everything BUT digg is exposed
- arjie, on 07/03/2008, -1/+1Wait, let me get this right, let's say that at a certain time after submitting an article, 10 more negative votes than positive votes are made. Then:
ts = A - B < 0, because the time now B, is after the time of submission A. Hence B is greater than A.
x = U - D = -10 < 0, because there are more negatives than positives.
y = -1, by definition of y it is -1 when x < 0.
z = 10 > 0, z = absolute value of x
rating f = 1 + y*ts = 1 + (-1)*ts = 1 - (negative number of increasing magnitude) = 1 + (positive number of increasing magnitude)
Rating increases over time for stories with negative votes?
The reason I didn't think he meant absolute difference when he said difference is because if he did then x would never be < 0.
- Rojahon, on 07/03/2008, -0/+2No, the time B is constant to that specific time they listed.
- arjie, on 07/04/2008, -1/+2Yeesh, I'm an idiot. Yeah, it's written right there, I thought it was some miscellaneous info and left it out.
Dugg you up, thanks.
- Lolosway, on 07/03/2008, -0/+2Really amazed to see the digger's geographical location along with a few other factors taken into consideration as a possibility. With all the additional variables considered, I am curious now as to how much each one contributes to the equation, and at what percent; or whether it's just speculation, as the sources for Digg were not included.
- Rojahon, on 07/03/2008, -0/+2FTA: 45000 is the amount of seconds in 12.5 hours. This constant is used in combination with yts to "water down" votes as they are made farther and farther from the time the article was submitted.
As near as I can tell, y*ts / 45000 has nothing to do with the number of votes that are received, assuming that the balance doesn't switch between up and down votes. It doesn't "water down" votes at all. It just provides a way of putting newer and positively voted articles above older articles. Although, somewhat strangely, it puts newer, negatively voted articles below older, negatively voted articles.
To be honest, that algorithm seems kind of strange to me. The number of votes is essentially meaningless, and becomes more meaningless as time passes from that fixed date of December 8th, 2005. Why they even bother taking the log is beyond me. The magnitude of y*ts /45000 given 3 years since that date will be over 2000, while the log(10000) -- that's counting 10000 votes -- is only 4...
- lancehomer, on 08/18/2008, -0/+1"Although, somewhat strangely, it puts newer, negatively voted articles below older, negatively voted articles." I've come to the same conclusion about this equation and therefore question how accurate it really is.
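Here is the formula as this thread describes it, sketched in Python so the numbers can be checked. This is a reconstruction from the comments (x = U - D, y = sign of x, z = |x| with a floor of 1, ts = submission time minus the fixed date, score = log10(z) + y*ts/45000), not code taken from Reddit. The exact time of day on December 8th, 2005 is an assumption (midnight UTC here), and `hot_score` is a made-up name:

```python
from datetime import datetime, timezone
from math import log10

# Fixed reference date from the thread; midnight UTC is an assumption.
EPOCH = datetime(2005, 12, 8, tzinfo=timezone.utc)
DECAY_SECONDS = 45000  # 12.5 hours, the constant quoted from the article


def hot_score(ups, downs, submitted):
    """Score as described in the comments: log10(z) + y * ts / 45000."""
    x = ups - downs
    y = (x > 0) - (x < 0)      # +1, 0 or -1
    z = max(abs(x), 1)         # floor of 1 so log10 is defined (assumed)
    ts = (submitted - EPOCH).total_seconds()
    return log10(z) + y * ts / DECAY_SECONDS


old = datetime(2008, 7, 1, tzinfo=timezone.utc)
new = datetime(2008, 7, 4, tzinfo=timezone.utc)

# Three years of seconds divided by 45000 dwarfs log10(10000) = 4.
print(3 * 365 * 24 * 3600 / DECAY_SECONDS, log10(10000))

# A story with 10 net upvotes posted three days later outranks one with 10000.
print(hot_score(10, 0, new) > hot_score(10000, 0, old))

# With the same net downvotes, the newer story ranks below the older one.
print(hot_score(0, 10, new) < hot_score(0, 10, old))
```

Under these assumptions both comparisons come out True, which matches the two observations above: the time term swamps the vote term, and newer negatively voted stories sink below older ones.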
- spud311, on 07/04/2008, -0/+1I think that while the formula might be good for a few tips on tightening up your profile and posts / content, unless you can get 500 users to Digg an item there is not much value.
- zman14321, on 07/04/2008, -0/+1I'm pretty sure reddit is now open source.
- andrewglover87, on 07/04/2008, -0/+3Strange how in the list of factors which affect Digg it doesn't mention amount of diggs...
- Inquisition, on 07/04/2008, -0/+1Personally, I think that the "Submitter's authority" and "Submitter's friends and fans" should be discarded altogether. That would put a stop once and for all to the gaming that still happens on Digg. I haven't seen many babyman submissions that I enjoyed that weren't already on Reddit, and I don't see why something should make the front page because the submitter has spent waaaaay too much time adding friends to his list that aren't really interested in what they post, and probably don't take too much time trying to figure out how to keep this guy from shouting every new submission to them.
Then, I think about the tantrum that the "Digg elites" (as they call themselves) threw when Digg announced the change in the algorithm. They boycotted for about a day before they came back and figured out that they could still game Digg. They don't game Reddit or Fark simply because the algorithms are simple and transparent. It is a lot harder to game a simple system, because there are fewer variables that can be exploited. Don't believe me? I dare the "Digg elite" to try to become Reddit elite.
- broalexinfo, on 07/22/2008, -0/+1I don't know much about algorithms but the same guys always drive stories to the homepage. That is not gaming, it's simply because they have a lot of mutuals and fans that will digg and share their story, pretty much legal if we consider Digg's TOS. Otherwise they wouldn't have implemented the shouting feature from the start, agree? Now add me to your friend list if you really want to share interesting stories and drive them to the homepage.. ;)