If you're reading this, you're most likely a programmer. And, like any programmer, you have had to search for programming questions online. I'm sure you noticed something interesting: in the last few years, when we search for a programming question online, a link to StackOverflow is usually among the first 3 results from Google.
This is no coincidence: StackOverflow has entered the lives of programmers, slowly but surely. We use it practically every day, but I noticed that most programmers don't know much about how this site was born, what principles it works on and why it's so successful.
StackOverflow is just one of the 119 sites of the StackExchange network; the two are not the same thing. In this article, we'll discuss the philosophy on which this network is built and we'll take a quick look at how its mechanics allow it to run largely on its own.
I hope the details presented here will make you more comfortable using this network, and I also hope that we'll see stronger participation from Romanian programmers.
Before StackOverflow, it was very hard to find a (correct) solution to a programming problem, except for the relatively common ones. The reasons for this are:
To fix these problems and to make solutions much more available online, Joel Spolsky and Jeff Atwood decided in January 2008 to launch a Q&A website called StackOverflow. Development of the site started in April 2008 and it was launched in August 2008 as a private beta. After 4 weeks, in September 2008, StackOverflow became public.
Joel's blog was joelonsoftware.com and Jeff had his own blog too: codinghorror.com. These blogs were fairly popular and they contributed to StackOverflow's success because, through them, Jeff and Joel spread their idea. This was important because they wanted new visitors to feel welcome and to actually find useful content when they reached the site.
StackOverflow wants to be a combination of a forum, a blog, a wiki page and a news aggregator. The basic idea is for people to ask questions and receive answers, not just to add useless comments. It's a place where quality is voted up and promoted and where useless content is pushed down and disappears.
StackOverflow wants to collect as much knowledge and as many programming solutions as possible. The community evaluates them through voting. As the votes accumulate, experts and trustworthy people will surface and the community will trust them more and more.
It was an instant success and this convinced the founders to launch ServerFault in April 2009, a site for system administrators based on the same philosophy as StackOverflow. SuperUser followed in July 2009, a site for computer enthusiasts and power users.
The success of these sites has laid the foundation for the StackExchange network, which now includes a variety of sites, all following the same structure and philosophy that StackOverflow was built on.
Editing the content and keeping it up to date is crucial on StackExchange sites. Content accessibility is also very important: strong SEO techniques are applied on the sites.
StackOverflow's popularity and the fact that most programmers hang out online have an interesting consequence: when a new technology or programming language is launched, support sites or forums are no longer created for it; instead, users are redirected to the relevant tags on StackOverflow.
As mentioned in the introduction, the StackExchange network currently has 119 sites. This number fluctuates (see the "Area51" chapter).
Each site works on the same basic principles: they're Q&A sites, they use the same platform and their target audience is people who work in a professional capacity in a certain field.
The difference is that each site has its own community that drives and administers it, completely independently of the other sites. In fact, the only collaboration happens when questions migrate from one site to another; this is possible since they all rely on the same platform.
Each site has its own subject, and the variety mentioned in this chapter's title is an understatement. The most popular subjects revolve around programmers and technologies, but there is great diversity: from math, computer games, poker, sports, politics and photography up to financial management, chess, graphic design, parental advice, history, religion and linguistics. There's a very low chance for someone not to find at least one or two hobbies or interests among all of StackExchange's site subjects.
As mentioned above, every site's purpose is to gather as much information from that site's experts as possible. When someone is looking for something, the result must be a professional, objective and complete answer. That is StackExchange's ideal.
The philosophy mentioned in the previous chapter, to constantly edit and improve the content, makes this ideal possible. In many cases, it is achieved.
The network's philosophy includes the concept of making all the information completely public. Any question or answer that is posted on the network is automatically subject to the Creative Commons Attribution-ShareAlike license. This means that each author receives the appropriate credit for their contribution, that the content must stay 100% public and that anyone can use and modify it, even for commercial purposes (on condition that the modified version remains subject to the same license).
Making the content available under this license allows using the data in many ways. See the chapter "Big Data" for more details.
The system is very simple: a person asks a question and others post answers. Each question and each answer can be upvoted or downvoted. Depending on these votes, the author receives reputation points. As a person accumulates more points, he or she will unlock privileges and will gain more trust from the community.
An upvote brings the author +5 reputation if it's a question and +10 reputation if it's an answer. The difference exists because answers are the ones that provide the highest quality content, so they are more valuable.
A downvote reduces the author's reputation by 2 points, no matter if it's a question or an answer. However, when you downvote an answer, your own reputation will also go down by 1 point. This decision was made to encourage improving the answers as opposed to just marking them as low quality.
Every question has to have tags: at least 1, at most 5. These tags help categorize the questions so they'll be easier to sort and find. For example, a question about how to apply a look-and-feel in Java will probably have the "java", "swing" and "look-and-feel" tags.
The question's author is encouraged to pick the answer that they consider to be the best. In this case, the answer's author receives +15 reputation and the question's author receives +2 reputation.
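The reputation arithmetic described above can be sketched in a few lines of Python. This is only an illustration that uses the point values quoted in this article; the function and constant names are hypothetical, and any caps or floors the real sites may apply are ignored.

```python
# Point values are the ones quoted in this article; treat them as illustrative.
UPVOTE_POINTS = {"question": 5, "answer": 10}
DOWNVOTE_PENALTY = 2   # lost by the author of a downvoted post
DOWNVOTER_COST = 1     # lost by the voter when downvoting an answer
ACCEPTED_BONUS = 15    # gained by the author of the accepted answer
ACCEPTER_BONUS = 2     # gained by the asker who accepts an answer


def author_reputation(post_type, upvotes, downvotes, accepted=False):
    """Reputation change for the author of a single post (hypothetical helper)."""
    if post_type not in UPVOTE_POINTS:
        raise ValueError(f"unknown post type: {post_type!r}")
    total = upvotes * UPVOTE_POINTS[post_type] - downvotes * DOWNVOTE_PENALTY
    # Only answers can be accepted, so the bonus is ignored for questions.
    if accepted and post_type == "answer":
        total += ACCEPTED_BONUS
    return total


def voter_cost(post_type, downvotes_cast):
    """Reputation a voter loses for downvoting posts of one type."""
    return downvotes_cast * DOWNVOTER_COST if post_type == "answer" else 0


if __name__ == "__main__":
    # An accepted answer with 3 upvotes and 1 downvote: 3 * 10 - 2 + 15
    print(author_reputation("answer", upvotes=3, downvotes=1, accepted=True))
```

With these numbers, an accepted answer that has 3 upvotes and 1 downvote brings its author 43 points, while the asker who accepted it gains ACCEPTER_BONUS, that is 2 points.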
The privileges obtained as the reputation grows are diverse: from creating bounties, moderator flags and chat rooms, up to voting to close a question, increasingly advanced editing possibilities and content protection.
Since StackExchange is based on a social network, moderation is done a little differently. It is divided into 3 levels:
The network is very wiki-like. Users are encouraged to constantly edit and improve the content on the sites. This principle is so strong that users are even encouraged to add their knowledge in the form of questions and answers, basically to answer their own questions. This way, a question and its answers are considered to be like a wiki page on a certain topic. To get an idea: 39% of questions and 19% of answers are modified at least once after they're posted [1].
There are situations when a question is so complex that it requires a long list or needs a lot of research and a very long answer or even a great number of authors for it to be answered. For example: "What are the best programming books?".
In this case, the question will receive a lot of answers, which will be edited by a great number of people. This, together with the popularity and the huge number of upvotes such answers will receive, raises an interesting question: if so many people contribute to that content, is it fair for the original author to receive all that reputation?
To fix this problem, such questions are marked as Community Wiki. This means that the reputation generated by the content won't be attributed to anyone. The original authors will no longer be listed; instead, the members that contribute the most to the question or answer are displayed. Also, editing such questions will be much more accessible, since a member won't need 2000 reputation to do it, as they normally would. Instead, only 100 reputation is needed.
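The two editing thresholds mentioned above can be expressed as a small check. This is a minimal sketch that assumes the 2000 and 100 values quoted in this article; the names are hypothetical and do not come from the StackExchange platform itself.

```python
# Thresholds as quoted in the article; both names are hypothetical.
EDIT_THRESHOLD = 2000
COMMUNITY_WIKI_EDIT_THRESHOLD = 100


def can_edit(reputation, is_community_wiki):
    """Return True if a member has enough reputation to edit a post."""
    if is_community_wiki:
        required = COMMUNITY_WIKI_EDIT_THRESHOLD
    else:
        required = EDIT_THRESHOLD
    return reputation >= required


if __name__ == "__main__":
    # A member with 150 reputation can edit Community Wiki posts only.
    print(can_edit(150, is_community_wiki=True))
    print(can_edit(150, is_community_wiki=False))
```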
In the network's early days, you could only add questions and answers. Members, though, needed a place to discuss the rules of the sites, the various exceptional situations that occurred and the overall content quality.
So a system was introduced that allowed commenting on both questions and answers. These comments can receive upvotes from others but they won't generate any reputation.
Another similar system is the chat, where members can talk about anything and everything. The chat is divided into rooms, each with its own discussion. Generally, each site has its own room, but new rooms can be created by members that have sufficient privileges to do so. There are also rooms that are moderator-exclusive, so moderators can discuss moderation issues without regular members interfering.
Every site on the network has an associated Meta site. These are separate, but they work on the same principles. The only thing they have in common with their parent site is their moderators. Here, the discussions are only about the parent site: adding or removing rules, posting announcements and much more. Meta sites are created when their parent sites reach the private beta stage (see "Area51").
As I mentioned in the previous chapter, the network has a lot of sites, each with its own subject. But how are these sites born? And who decides which sites are launched and what subjects they'll have? The answer is: you, the regular member. The network is social at its core, so the community decides which new sites are launched and what their rules will be.
This all takes place in Area51, a special place where new sites are defined and launched by following these steps:
StackOverflow currently has 3 million members and 6.6 million daily visits, is the most popular site of the network and generates over 80% of the network's content and traffic [2]. Having such a huge audience of programmers opens the door to a unique opportunity: jobs and careers. This is how Careers.StackOverflow was born, a kind of LinkedIn only for programmers and IT professionals.
Every member can create an account, an electronic résumé. On it, you can add all kinds of information: from job history, known technologies and authored articles up to projects you were involved in and books you've read. For optimal functionality, the site integrates APIs from very popular 3rd-party sites like LinkedIn, GitHub, BitBucket, Amazon, SourceForge and many more. Various StackExchange profiles can also be included here, together with the best and highest voted answers.
Companies are not neglected on this site either: they can create their own pages where they can include a company description, a map with their location, currently available jobs, pictures, accounts of key employees, benefits, technologies used in their projects and much more.
All accounts, questions, answers, comments etc. added to the StackExchange network are publicly available through a series of special sites and APIs. This is possible because of the license; see the "The content's license" chapter.
This is a special site that allows access to the network's content. Members can write SQL queries in a big text area, execute those queries and see the results in real time. To help members write these queries, the site allows viewing the complete structure of the database tables that hold the content.
Because StackExchange's architecture uses SQL Server databases, the queries must follow this vendor's syntax.
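To give an idea of what such a query might look like, here is a minimal Python sketch that only builds the query text; it does not connect to any database. The table and column names (Posts, Id, Title, Score, PostTypeId) and the meaning of PostTypeId = 1 are assumptions about the public schema, so check them against the table structure shown on the site before relying on them. The TOP clause is an example of SQL Server specific syntax.

```python
def top_questions_query(limit=10):
    """Build a SQL Server style query for the highest scored questions.

    Table and column names are assumptions about the public schema.
    """
    if not isinstance(limit, int) or isinstance(limit, bool) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        f"SELECT TOP {limit} Id, Title, Score\n"
        "FROM Posts\n"
        "WHERE PostTypeId = 1  -- assumed to mean 'question'\n"
        "ORDER BY Score DESC;"
    )


if __name__ == "__main__":
    # Paste the printed text into the query area of the site.
    print(top_questions_query(10))
```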
The databases used on this site are not the same as the ones used by the live StackExchange sites; instead, they're just a copy. This means that the data is not entirely up to date. An update of Data.StackExchange's databases usually happens once a month.
Members that log in on this site can save their queries and then come back to change and improve them.
Given the public nature of the content, some information is not available on this site, such as people's email addresses.
Another way to access the network"s content is through the StackExchange API, a REST webservice that returns data in JSON or JSONP (padded JSON) format.
This API can be used in 3rd-party applications that rely on the StackExchange network. There are a lot of such applications already published, especially for mobile devices.
There's a lot of content that can be accessed through this API only with authentication. To do this, the application must be registered in StackApps, at which point you'll receive an authentication key. With this key, the application will also have a much higher allowed traffic limit.
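As a rough illustration of how a 3rd-party application might talk to the API, the sketch below builds a request URL and extracts question titles from a JSON response. It makes no network call. The API root, the version number and the parameter names (site, tagged, order, sort, key) reflect my understanding of the public API and should be treated as assumptions; the helper names and the sample response are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed API root and version; verify against the official documentation.
API_ROOT = "https://api.stackexchange.com/2.3"


def build_questions_url(site, tag, key=None):
    """Build a URL that lists the top voted questions for one tag."""
    params = {"site": site, "tagged": tag, "order": "desc", "sort": "votes"}
    if key is not None:
        # The key received after registering the application on StackApps.
        params["key"] = key
    return f"{API_ROOT}/questions?{urlencode(params)}"


def titles_from_response(body):
    """Extract question titles from a JSON response body.

    Assumes the results are wrapped in an 'items' list.
    """
    payload = json.loads(body)
    return [item["title"] for item in payload.get("items", [])]


if __name__ == "__main__":
    print(build_questions_url("stackoverflow", "java"))
    # Hypothetical, hand-written response used only for illustration.
    sample = '{"items": [{"title": "Example question", "score": 1}]}'
    print(titles_from_response(sample))
```

A real client would fetch the URL with an HTTP library and pass the decoded response body to titles_from_response.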
StackApps is, like I said above, a site where the applications that use the API can be registered. Authors present their applications here, together with installation instructions. Discussions about the API and how to use it are also present here.
The format and philosophy on which the StackExchange network was built are a real success; its popularity cannot be questioned. The sites that it includes have made work much easier for millions of people of all professions. Personally, I save many hours by using these sites, hours which I would otherwise spend digging through the Internet's far away corners trying to find answers to my questions. It's practically impossible to calculate how much money StackExchange saves, but it's pretty clear we're talking about many billions of dollars [3].
[1] Christoph Treude, Ohad Barzilay, Margaret-Anne Storey. How do programmers ask and answer questions on the web? In ACM, 21-28 May 2011.