Reciprocity, social curation and the emergence of blogging: A study in community formation


Little empiricism has been brought to bear on the original network of weblogs, and as a consequence the early ‘history of blogging’ has remained anecdotal to a very large degree. The present study works from a dataset of the earliest networked weblogs and uses social network analysis to account for the dynamics that brought them together. The study assumes that direct or indirect reciprocity is an indispensable precondition of community formation; in a longitudinal analysis it finds that the introduction and widespread adoption of link crediting, a form of direct reciprocity, precipitated the emergence of the original ‘weblog community’ in 1998 as an exchange network engaged in the social curation of the web.

1. Introduction

Early anecdotal testimonies suggest that Jorn Barger of Robot Wisdom Weblog and Dave Winer of Scripting News were the two leading actors in the formation of the original ‘weblog community’ – the collectivity of bloggers that preceded the ‘blogosphere’. Testimonies to Barger’s prominence as a community builder rather than mere inventor of the term ‘weblog’ include the assertion that ‘he coined the term, started the trend’ (Wallace, 1999); one journalist also maintained that Barger ‘inspired the Web Log community’ (Gatlin, 1999), a claim that was backed by one blogger’s assertion that Barger ‘more or less created weblogging and then defined it by his excellent example’ (Graham, 1999). Similarly, Winer has been identified as ‘the protoblogger’ (Mitchell, 2006), the ‘Johnny Appleseed (if not the Gutenberg) of weblogs’ (Kottke, 2002), the ‘blogfather’ (Sifry, 2002), and ‘one of the founders and leading practitioners of the Weblogging movement’ (Udell, 2002). One member of the original weblog community gave both men equal credit, naming Barger and Winer as ‘the two that kind of started it all’ (Krahn, 1999a).

Dave Winer, a software developer specialising in software tools for writing and a prolific writer himself, launched Scripting News in February 1997 as a complement to the DaveNet column he had been self-publishing electronically since late 1994 (Rosenberg, 2009, p. 47 – 69). This novel feature of his site, which he called a ‘news page’, consisted of a date-stamped and reverse-chronologically ordered stream of briefly annotated links to newsworthy items from all over the web; its guiding editorial principle appeared to be Winer’s preoccupations as a software developer and a citizen. In January 1997, immediately preceding the Scripting News launch, Winer also released NewsPage, the software that created such pages, as part of the 4.2 release of his content management system Frontier, which was distributed free of charge at the time (Winer, 1997a). Winer’s example and his software inspired a handful of people to start news pages that were knit to the same pattern; some of them were using Winer’s software, others were re-implementing its information architecture in scripting environments other than Winer’s, or even in hand-coded HTML (Ammann, 2009a).

Winer had no intention of networking these news pages, however. The earliest hint at such a network came from NewsPage adopter Chris Gulker, who suggested in May 1997 that news pages ‘proliferate […] and cross refer’ (Gulker, 1997a). In October 1997, Gulker also added a hyperlinked list of other news pages to his site’s sidebar, dubbing it the ‘NewsPage Network’ (Gulker, 1997b).

Jorn Barger, a programmer and avid reader with a work history in cognitive science, launched his own news page in December 1997, the Robot Wisdom Weblog (Rosenberg, 2009, p. 70 – 90). Thrilled at his discovery in January 1998 of Gulker’s list of news pages – or weblogs, as he preferred to call them – Barger envisaged a ‘new network of web-surfers’ (Barger, 1998a) and set himself up as ‘a leader among the growing network of weblogs and news-pages’ (Barger, 1998b), recruiting new practitioners of the form as he went along (Ammann, 2009a).

When launching Robot Wisdom Weblog in December 1997, Barger enjoined his readers to visit the site ‘every day or so for new discoveries’ (Barger, 1997). He promised to offer ‘daily commentary on new discoveries all around the web’ (Barger, 1998c) and ventured the prediction that ‘there’ll be hundreds of people maintaining pages like this, and that this will allow good URLs to spread much more quickly’ (Barger, 1997). Barger anticipated that weblogs would challenge the mass media; his ‘growing network of freelance editors’ (Barger, 1998d) would form a ‘community of non-corporate truth-tellers’ (Barger, 1999a) that would engage in the practice of ‘linking to the best articles from every possible source, accompanied by honest summaries’ (Barger, 1998d); as each of these editors would ‘re-filter’ (Barger, 1998e) the work of a dozen of their peers, they would cause relevant news to ‘propagate thousands of times more efficiently’ (Barger, 1998e) and thus irreversibly shift ‘the seat of power from well-financed publishers to essentially unfinanced editors’ (Barger, 1998d). Consequently, a ‘merry band of linkers’ (Bogart, 1998a) sprang up who aimed to be ‘a useful filter for the vast amount of news and information on the Web’ (Bogart, 1997). This collectivity sought to counter the market-based attention economy of the mass media with a social process ‘for topically related and interest-based clusters to form a peer-reviewed system of filtering, accreditation, and salience generation’ (Benkler, 2006, p. 252). Put differently, the original network of bloggers was engaged in the social curation (Liu, 2010) of the web.

2. Reciprocity

How did the ‘movement’ (Humphries, 1998) of the early bloggers come together as the ‘weblog community’ (Barger, 1999b)? Viewing the collectivity as an exchange network (Cook & Emerson, 1978), one can start looking for the structure of reciprocity (Molm, 2010) as an indicator of social relations within the network, and one can look out for signs of its routinisation, of reciprocity turning into a shared norm.

Reciprocity as a pattern of mutual obligation in social exchange has been discussed since the classical period of sociology (Komter, 2005, p. 108 ‒ 113), when it came to be understood as ‘the pattern of exchange through which the mutual dependence of people, brought about by the division of labor, is realized’ (Gouldner, 1960, p. 169 ‒ 70). As such, it has long been recognised as a ‘starting mechanism’ (Gouldner, 1960, p. 177) that allows social relations to be taken up.

More recently, reciprocity has been identified as a dominant motivator of user contribution in online communities generally (Wellman & Gulia, 1999; Wasko & Faraj, 2000), but also specifically in a large corporate e-mail network (Constant, Sproull, & Kiesler, 1996), in file-sharing networks (Giesler, 2006; Gu, Huang, Duan, & Whinston, 2009) and in blogging networks (Gaudeul & Peroni, 2010).

Reciprocity has been called ‘an underappreciated and highly important aspect of creating social capital through networks’ (Molm, 2010, p. 126) and has been described as indispensable: ‘networks must include either direct or generalized reciprocity’ (Molm, 2010, p. 126) because reciprocated exchange allows actors in an emerging collectivity to ‘develop the trust and affective bonds that promote productive exchange relations’ (Molm, 2010, p. 127). Reciprocity, therefore, is ‘both a defining feature of social exchange and a source of societal cooperation and solidarity’ (Molm, 2010, p. 129).

As an emerging collectivity is marked by the onset of ‘routinized social relations’ (Lin, 2001, p. 136) as soon as ‘social relations and sharing of resources are established and maintained’ (Lin, 2001, p. 137), it is the routinisation of reciprocity that needs to be detected in the data.

In the present study, reciprocity is examined under static and dynamic aspects. Under the static aspect, the network data is checked for reciprocal dyads, which are then related to conventional network measures such as degree centrality and to the ad-hoc measures of outlink count and peer discovery that qualify the reciprocated dyad count. To shed light on the longitudinal dynamics of community formation, the incidence of link types over time is also plotted and the distribution of the inherently reciprocal link type of attribution is compared against a qualitative reading of the archival data.

3. Data

The data set that underpins this study attempts to be a complete record of the links that connected the earliest network of weblogs.1 The reported period ranges from January 1997, when Winer’s NewsPage software was first released (Winer, 1997b), to 31 December 1998, by which time weblogs had come to be hailed as a ‘movement’ (Humphries, 1998) following the launch of the first weblog directory in November 1998 (Carter, 1998a).

Few of the first weblogs are extant in their original locations, so the data could not be gathered using straightforward spidering and needed to be collected using a patient, quasi-archaeological process of discovery. Much of the data is sourced from the Internet Archive (About, n.d.) and from archives maintained by the bloggers themselves, sometimes in locations other than the original ones. Gathered from such widely distributed sources, the data was compiled into a list manually, one link at a time, using a variety of search strategies.

As the data was gathered from widely distributed locations and is more than a decade old, its preservation is inevitably less than perfect, which somewhat diminishes the integrity of the data set. Contrary to what might be expected, however, it is fairly robust: archives that are wholly missing tend to belong to sites located on the periphery of the network,2 whereas the archives of central sites are well preserved. Overall, it can be asserted that the missing data affects the results only marginally.

To identify the sites that were part of the original weblog network, three weblog lists were chosen that are roughly contemporaneous with the reported period: Chris Gulker’s ‘Newspage Network’ list (1998) in its final version of February 1998, Michal Wallace’s list (1998a) of December 1998 and Cameron Barrett’s list (1999a) of January 1999.3 The high degree of overlap between these lists suggests a strong genealogical continuity from the NewsPage network to the earliest weblog network (Ammann, 2009a).4

Starting from these three lists as seeds, the archives of all the sites referenced were searched, adding other sites thus found if there was a minimal amount of similarity and cross-linking with the other sites in the network, and removing sites if there was no such cross-linking. Following this iterative process, the network was delimited through the application of two criteria: a site had to display a basic family resemblance with the other sites in the network, and a site had to have an eventual degree centrality of 2 or greater, meaning that a node, to be considered part of the network, needed to have two or more edges attached to it. The first criterion of family resemblance prioritises formal characteristics such as reverse-chronological sorting of entries consisting primarily of annotated links. It is intended to allow some leeway for variation in site design, to eliminate false positives and to allow for sites unmentioned on any of the seed lists.5

The second criterion, the degree centrality threshold of 2, is a safeguard against the need to account for the extensive penumbra beyond the network’s periphery. This criterion eliminates the news pages in Gulker’s list that chose not to link to their fellow NewsPage users or were ignored by them for failing to post links that matched their interests. The threshold also counteracts Barger’s tendency to endorse and credit sites that were highly obscure and did not get referenced by other members of the network, even if Barger explicitly called them ‘weblogs’ in some cases. Thus, if a site didn’t link or credit back and wasn’t linked or credited by other network members, it was excluded from the data set.
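The degree threshold can be illustrated with a minimal sketch. It rests on assumptions that the text does not spell out: an edge is taken to be a distinct directed tie between two sites, a reciprocated pair counts as two edges, and the filter is re-applied until no site falls below the threshold, so that every remaining site ends up with a degree of 2 or greater. The function name `prune_low_degree` and the toy site labels are hypothetical.

```python
from collections import defaultdict

def prune_low_degree(edges, min_degree=2):
    """Repeatedly drop sites with fewer than `min_degree` attached edges."""
    # Keep distinct directed ties only and ignore self-links (assumption).
    edges = {(s, t) for s, t in edges if s != t}
    while True:
        degree = defaultdict(int)
        for source, target in edges:
            degree[source] += 1
            degree[target] += 1
        weak = {site for site, d in degree.items() if d < min_degree}
        if not weak:
            return edges
        # Removing a weak site may push a neighbour below the threshold,
        # which is why the check is repeated.
        edges = {(s, t) for s, t in edges if s not in weak and t not in weak}

# Hypothetical toy network, not taken from the data set.
toy_edges = [
    ("A", "B"), ("B", "A"),  # A and B link to each other
    ("B", "C"), ("C", "A"),  # C is tied into the network twice
    ("A", "D"),              # D is linked once and never links back
]
kept = prune_low_degree(toy_edges)
sites = sorted({site for edge in kept for site in edge})
print(sites)
```

In this toy network, site D has a single attached edge and is dropped, while A, B and C remain.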

The links in the data set are of three types: endorsement, list and link attribution. An endorsement6 is a link in the body of a weblog post that points to another weblog. Granular addressability of individual posts, known as permalinks, did not come to be implemented until the year 2000 (Coates, 2003), so endorsement links in this early weblog network were necessarily links to a weblog as such rather than to anything in particular that had been posted to it. In the data set, only links to the weblog are included, not to any other parts of a site.

A list link is part of a compilation of other weblogs, known as a ‘blogroll’ since 2001. List links are situated outside the stream of weblog postings, often but not necessarily in a sidebar. In the data set, they include the links in Chris Gulker’s ‘Newspage Network’ list of October 1997 as well as more recent lists, such as the one placed at the top of John Wilson’s Untitled Weblog (Wilson, 1998) or at the bottom of Steve Bogart’s News, Pointers & Commentary (Bogart, 1998c), or indeed on a separate ‘sources page’ (Barger, 1999c), as in Barger’s Robot Wisdom Weblog.7

A link attribution is a credit for a ‘borrowed’ link to its source, often another weblog. Such attribution need not involve an HTML anchor tag. Barger’s original style of link attributions, for instance, did not offer a direct hyperlink to the source, but used a non-hyperlinked citation key referencing a ‘sources page’. This style was adopted and used throughout 1998 by some weblogs. In May 1998 Raphael Carter introduced the more familiar credit link style that uses a direct hyperlink to the source site (Carter, 1998b), often introduced with the preposition ‘via’.8 In the data set, both of these attribution styles are treated as equivalent despite the fact that Barger’s attributions were not links in a narrow, technical definition of the term. It bears pointing out that link attributions are innately reciprocal: they are a commendatory acknowledgement of a fellow blogger from whose site a link was taken.

4. Static measures

The network of early weblogs yields a number of static measures that shed light on the community formation process. For one thing, the network satisfies the criterion that a ‘dense subgraph is a signature of a blog community’ (Kumar, Novak, Raghavan, & Tomkins, 2003): according to the bowtie model of the web (Broder et al., 2000), it has a strongly connected component9 of 28 out of 37 nodes, equalling 76 percent of the network, with an ‘out’ component of 8 nodes, accounting for 22 percent. These numbers strongly contradict the claim that in early 1999 there was ‘no strongly connected component of more than a few nodes’ (Kumar et al., 2003), and they challenge anecdotal histories that arbitrarily locate the emergence of the weblog community in early 1999 (Blood, 2000, 2002).
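The bowtie decomposition referred to above can be sketched as follows. The study itself used Pajek for this measure; the code below is only an illustration of the concept on a hypothetical five-node graph. It finds the largest strongly connected component by intersecting forward and backward reachability, then derives the ‘in’ and ‘out’ components from it. All function names and node labels are assumptions.

```python
from collections import defaultdict

def reachable(start, adjacency):
    """Return every node reachable from `start`, including `start` itself."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen

def bowtie(edges):
    """Return (core, in_part, out_part) for a directed edge list."""
    forward, backward = defaultdict(set), defaultdict(set)
    nodes = set()
    for source, target in edges:
        forward[source].add(target)
        backward[target].add(source)
        nodes.update((source, target))
    if not nodes:
        return set(), set(), set()
    core = set()
    for node in nodes:
        if node in core:
            continue
        # A node's strongly connected component is everything it can
        # reach that can also reach it.
        component = reachable(node, forward) & reachable(node, backward)
        if len(component) > len(core):
            core = component
    seed = next(iter(core))
    out_part = reachable(seed, forward) - core   # linked from the core only
    in_part = reachable(seed, backward) - core   # link into the core only
    return core, in_part, out_part

# Hypothetical toy graph: A, B, C form a cycle; D is only linked to; E only links in.
toy = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "A")]
core, in_part, out_part = bowtie(toy)
print(sorted(core), sorted(in_part), sorted(out_part))

# Shares reported in the text: 28 and 8 out of 37 nodes.
print(round(100 * 28 / 37), round(100 * 8 / 37))
```

The last line merely re-derives the rounded percentages given in the text from the node counts it reports.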

Degree centralities offer further insights. Dave Winer’s indegree centrality is the highest in the graph, which readily supports the generalisation that the original network of weblogs coalesced around the NewsPage model and the example that Winer set in Scripting News. Jorn Barger has the highest overall degree centrality in the graph, however, with Winer coming second. Barger also managed to form twelve reciprocal dyads within the graph, which is twice as many as the second-ranked Winer (see Table 1); thus, Barger and Winer are indubitably the two most central actors in the network, but Barger’s reciprocity is twice as high as Winer’s.

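| Measure | Barger | Winer | Barrett | Gulker | Wallace | Bogart | Affleck |
|---|---|---|---|---|---|---|---|
| Indegree centrality | 0.39 | 0.50 | 0.22 | 0.14 | 0.11 | 0.11 | 0.08 |
| Outdegree centrality | 0.56 | 0.31 | 0.42 | 0.25 | 0.39 | 0.31 | 0.06 |
| Degree centrality | 0.47 | 0.40 | 0.32 | 0.19 | 0.25 | 0.21 | 0.07 |
| Peer discoveries | 10 | 8 | 3 | 6 | 0 | 0 | 0 |
| Reciprocal dyads | 12 | 6 | 5 | 4 | 4 | 3 | 2 |
| Outlinks | 307 | 12 | 36 | 18 | 65 | 44 | 4 |
| Link credits | 267 | 0 | 8 | 2 | 30 | 13 | 0 |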

Table 1: Network measures of some key actors (centralities normalised)
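How measures of this kind can be derived from an edge list is sketched below. The normalisation is an assumption: directed centralities are divided by n − 1, and the overall degree centrality by 2(n − 1), which would make it the mean of the two directed values. The overall figures in Table 1 lie at or close to that mean, but the exact procedure applied by Pajek is not stated in the text. A reciprocal dyad is counted whenever two sites link to each other. The function name and toy labels are hypothetical.

```python
def network_measures(edges):
    """Per-site normalised degree centralities and reciprocal dyad counts."""
    # Distinct directed ties only; repeated links between the same pair
    # of sites count once for the purpose of degree.
    edges = {(s, t) for s, t in edges if s != t}
    nodes = {site for edge in edges for site in edge}
    scale = len(nodes) - 1  # a site can have at most n - 1 distinct neighbours
    measures = {}
    for site in nodes:
        targets = {t for s, t in edges if s == site}
        sources = {s for s, t in edges if t == site}
        measures[site] = {
            "indegree": len(sources) / scale,
            "outdegree": len(targets) / scale,
            "degree": (len(sources) + len(targets)) / (2 * scale),
            # Neighbours that both link to the site and are linked by it.
            "reciprocal_dyads": len(sources & targets),
        }
    return measures

# Hypothetical toy network of four sites.
toy = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C"), ("D", "A")]
m = network_measures(toy)
for site in sorted(m):
    print(site, m[site])
```

In the toy network, site A is linked by all three other sites and reciprocates two of those ties, so it has an indegree centrality of 1.0 and two reciprocal dyads.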

Reciprocal dyads can usefully be related in this graph to the ad-hoc measure of peer discovery count, the number of previously undiscovered new network peers a blogger managed to introduce to the network by linking to them before another established peer in the network did. Winer linked to newly established news pages as he became aware of them, simply by way of identifying users of his software; Gulker maintained a metaphorical ‘radar’ (1998), looking out for new additions to this network; and Barger continued Winer and Gulker’s practice, highlighting new weblogs among his ‘discoveries’ (Barger, 1997, 1998c) whenever he found a new one that met his criteria for inclusion. Barger has the highest peer discovery count in the graph, followed by Winer and Gulker (see Table 1). Gulker’s relatively high count of peer discoveries contrasts with his comparatively low count of reciprocal dyads; this contrast is a measure of how little social engagement his ‘Newspage Network’ of late 1997 and early 1998 managed to sustain.
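The peer discovery count defined above can be approximated with a short sketch. It simplifies the definition in two ways that should be read as assumptions: the earliest recorded inbound link to a site is treated as its discovery regardless of whether the linking site was itself established, and links dated the same day are ordered alphabetically by source. The record layout `(iso_date, source, target)` and the site labels are hypothetical.

```python
from collections import Counter

def peer_discoveries(links):
    """Credit each site's first recorded inbound link to the linking site."""
    discovered = set()
    credit = Counter()
    # ISO dates sort chronologically as plain strings.
    for date, source, target in sorted(links):
        if target not in discovered:
            discovered.add(target)
            credit[source] += 1
    return credit

# Hypothetical toy records, not taken from the data set.
toy_links = [
    ("1997-03-01", "a", "b"),
    ("1997-04-01", "b", "c"),
    ("1997-05-01", "a", "c"),  # c was already discovered by b
    ("1997-06-01", "c", "d"),
    ("1997-06-15", "a", "d"),  # d was already discovered by c
    ("1997-07-01", "a", "e"),
]
credit = peer_discoveries(toy_links)
print(dict(credit))
```

Here site a is credited with two discoveries (b and e), while b and c are credited with one each.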

By contrast, Barger combines the largest number of reciprocal dyads, the highest overall degree centrality and the highest peer discovery count, which underscores the most striking of the network’s quantifiable characteristics: the great disparity in outlink count between Barger and all other actors. Barger’s outlink count, by far the highest in the graph, exceeds that of the second-ranked actor in this measure, Michal Wallace of Manifestation.com, by a factor of more than four. Barger’s outlinks also account for nearly half the links in the network, Wallace’s for less than 10% (see Table 1).

Winer’s outlinks, by contrast, amount to less than 2% of the graph’s total outlinks. Winer was the beneficiary of large quantities of inlinks but, unlike Barger, did not offer a corresponding number of outlinks, which suggests a comparatively low participatory involvement with the emerging network on Winer’s part. Of Barger’s copious outlinks, 87% were of the reciprocal credit link type. Offering these to new entrants from his central and highly reciprocal position in the network amounted to ‘accreditation’ (Benkler, 2006, p. 79) of new actors emerging on the network’s periphery, and to ‘recognition and legitimation of relations’ (Lin, 2001, p. 137). Through his abundant outlinking, Barger was actively promoting the growth of the network.
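The proportions quoted in this section can be re-derived from the outlink and link credit rows of Table 1. The sketch below uses only those published figures; the variable names are hypothetical.

```python
# Outlinks and link credits as reported in Table 1.
outlinks = {"Barger": 307, "Winer": 12, "Barrett": 36, "Gulker": 18,
            "Wallace": 65, "Bogart": 44, "Affleck": 4}
link_credits = {"Barger": 267, "Winer": 0, "Barrett": 8, "Gulker": 2,
                "Wallace": 30, "Bogart": 13, "Affleck": 0}

# Share of Barger's outlinks that were link credits.
credit_share = link_credits["Barger"] / outlinks["Barger"]

# Ratio of Barger's outlink count to that of the second-ranked Wallace.
ratio = outlinks["Barger"] / outlinks["Wallace"]

print(f"credit share: {credit_share:.0%}")
print(f"Barger to Wallace outlink ratio: {ratio:.1f}")
```

The credit share rounds to 87%, and Barger's outlink count is roughly 4.7 times Wallace's.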

5. Longitudinal analysis

To account fully for the dynamics of community formation, it is necessary to examine the data diachronically and focus on the link types of endorsement, link attribution and list.

Figure 1. Endorsements, link attributions and list links per month, 1997 – 1998

A time series of the three link types shows sporadic links of all three types in 1997 (Figure 1). Endorsement links are the oldest type, represented by the dashed line, with links recommending other news pages emerging in the first half of 1997. A few link attributions, represented by the solid line, crop up in 1997 as well, and then, from October 1997 onwards, there are list links, represented by the dotted line, starting with Gulker’s original NewsPage Network (Gulker, 1997c) list. While endorsement links temporarily peaked in early 1998, attribution links began to increase dramatically a short while later, followed by a renewed and sustained increase in both endorsement links and list links. Link attributions became numerous before endorsement links did, and for most of 1998 they were the most numerous link type, accounting altogether for 61% of the edges in the graph.

As network activity suddenly proliferated in early 1998, Gulker’s dormant NewsPage Network sprang to life. The innately reciprocal attribution link is the link type that increased soonest, climbed highest, and remained the most numerous throughout most of the year. This strongly suggests that attribution links precipitated the emergence of ‘routinized social relations’ (Lin, 2001, p. 136), marking the emergence of the weblog community in the process.
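A time series of this kind can be tabulated from dated link records along the following lines. This is a minimal sketch under assumptions: each record is a hypothetical `(iso_date, link_type)` pair, the type labels are illustrative, and the toy records are invented rather than taken from the data set.

```python
from collections import Counter

def monthly_counts(records):
    """Return {link_type: {"YYYY-MM": count}} for dated link records."""
    # The first seven characters of an ISO date identify the month.
    counts = Counter((link_type, date[:7]) for date, link_type in records)
    series = {}
    for (link_type, month), n in sorted(counts.items()):
        series.setdefault(link_type, {})[month] = n
    return series

def type_share(records, link_type):
    """Fraction of all records that belong to `link_type`."""
    records = list(records)
    return sum(1 for _, t in records if t == link_type) / len(records)

# Hypothetical toy records.
toy = [
    ("1998-01-05", "endorsement"),
    ("1998-01-20", "attribution"),
    ("1998-02-03", "attribution"),
    ("1998-02-17", "attribution"),
    ("1998-02-21", "list"),
]
series = monthly_counts(toy)
print(series)
print(type_share(toy, "attribution"))
```

Plotting one line per key of `series` against the month labels would yield a chart of the same shape as Figure 1.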

The earliest massed link attributions were introduced and championed by Jorn Barger. When Barger, shortly after launching Robot Wisdom Weblog in December 1997, first discovered Gulker’s NewsPage Network in early January 1998, he returned from his exploration of the listed sites with a handful of ‘cribbed links’ (Barger, 1998g) that he posted to Robot Wisdom without attribution. A few weeks after becoming aware of the NewsPage Network, Barger noticed Gulker’s discovery of Steve Bogart’s news page (Gulker, 1998), and he endorsed Bogart’s site instantly, on 13 February 1998, as ‘another fine weblog’ (Barger, 1998h), while crediting Gulker for the find. Bogart responded the same day and linked back to both sites with a thank-you note (Bogart, 1998d). A few days later, on 16 February, Barger re-posted a link from another member of the NewsPage Network, Daniel Berlinger. This time, he not only credited his source, he introduced an entire scheme that was designed to credit all his link sources, using citation keys in square brackets that referenced a separate ‘sources page’ (Barger, 1998h). The innovation of giving credit for borrowed links had originally been used in two isolated cases by Gulker in the summer of 1997 (1997c, 1997d), but when Barger re-introduced it in February 1998, he made it a systematic, meticulously observed part of his posting routine.

It took a few weeks before Barger managed to find imitators of the practice. On 24 April 1998, Steve Bogart adopted link attribution and promised that whenever he was going to borrow a link from another weblog he would ‘credit it’ (Bogart, 1998e). As a consequence, Bogart and Barger became the first two bloggers to trade links routinely and acknowledge their borrowings from each other.

Adoption spread from there. In late July 1998, Barger maintained that ‘crediting links borrowed from other weblogs is good etiquette’ (Barger, 1998i) and Michal Wallace praised link crediting as ‘cross-pollination’ (1998b) between weblogs; by that time, link crediting had been adopted by 11 peers in the network. By the end of the year 1998, it had been adopted by 18 out of the 28 peers in the network’s strongly connected component, which amounts to an adoption rate of 64%. Thus, the norm that bloggers should ‘make an effort to acknowledge sources’ (Krahn, 1999b) was firmly established at the outset of 1999, and the suggestion that links that had already been shared on several other weblogs should be excepted from the norm (Barrett, 1999b) underscores the sense of an established practice.

Bogart originally adopted link attribution from Barger out of consideration for his peers and a desire to practise social curation in a way that respected the contributions of others: he was unwilling to forgo the privilege of propagating links that had already featured on other weblogs, yet wished to avoid giving umbrage by ‘stealing’ such links, and so concluded that link attributions were ‘the best choice as far as balancing my peace of mind with not wanting to be limited in what I could write about’ (Bogart, 2010). As discovering links of interest required an effort that shouldn’t go unrewarded, credit links answered the need to reward proficiency in discovering relevant links, the skill that the network valued most in its peers. The direct reciprocity of link attribution thus emerged as a facilitator of exchange relations within a peer network.

Barger for his part intended social curation to ‘make the web as a whole more transparent, via a sort of “mesh network,” where each weblog amplifies just those signals (or links) its author likes best’ (Barger, 2007). He advised that bloggers ‘ought to give enough credit that readers can check out that source for themselves’ (Barger, 1999d) because he understood that the direct reciprocity of link attribution was a facilitator of exchange relations: ‘We vacuum the Net for stories that the major outlets haven’t noticed yet, and pass along our sources so we can all get more and more efficient at this vacuuming’ (Barger, 1998b). As an attributed link amounted to a ‘vote of confidence’ (Barger, 1999e) in the source being credited, the norm of reciprocity facilitated exchange relations through the establishment of mutual trust.

The connection between link attribution and community formation was most succinctly stated by Brad Graham of BradLands, who noted: ‘by crediting a weblog where a borrowed link was originally found, we introduce a means of scalability to our expanding community’ (2000). In Graham’s analysis, link attribution had one simple purpose: ‘we’re recruiting’ (2000). The direct reciprocity inherent in link attribution would therefore ‘knit together a new social Web’ (Rosenberg, 2009, p. 83), driving community formation and growth.

6. Conclusion

The original network of weblogs coalesced around Dave Winer’s NewsPage model, adopting and evolving the example that Winer set in Scripting News. It was Chris Gulker and Jorn Barger, however, who articulated the need for such sites to be networked to allow for the social curation of the web, and it was Barger and Steve Bogart who instigated and forged this network into a collectivity by introducing and establishing link attribution as a form of direct reciprocity, a shared norm that precipitated the routinisation of the network’s social relations.

Further research will need to provide a complementary account of how the disintegration of the original weblog community in early 2000 gave rise to blogging in today’s sense of the term.

Legal Note

© Elsevier, 2011. This is the author’s pre-print of a paper presented at ASNA 10, the 7th Conference on Applications of Social Network Analysis. The paper is published in the conference proceedings.

References

Footnotes

1 For human readability and ease of access to the archival sources, the data set is maintained as a chronologically ordered list on a web page (Ammann, 2009b). List items consist of a datestamp followed by a plain English Subject-Verb-Object sentence. For every item, the datestamp is hyperlinked to the URL of the respective link being reported, the subject and object of the sentence are hyperlinked to the source and the target of the link, respectively. The verb in any given line, i.e. list, credit or endorse, corresponds to one of the three link types identified in the data set.
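A list of this kind lends itself to mechanical parsing. The sketch below assumes a plain-text rendering of each item with an ISO datestamp and the verbs in the third person singular (‘lists’, ‘credits’, ‘endorses’); the actual markup and date format of the web page are not described here, so the pattern, the type labels and the sample lines are all hypothetical.

```python
import re

# Assumed line shape: "<YYYY-MM-DD> <source> <verb> <target>."
LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<source>.+?)\s+"
    r"(?P<verb>lists|credits|endorses)\s+"
    r"(?P<target>.+?)\.?$"
)

# Map each verb to one of the three link types named in the paper.
LINK_TYPE = {"lists": "list", "credits": "attribution", "endorses": "endorsement"}

def parse_item(line):
    """Return (date, source, link_type, target), or None if the line does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    return (match["date"], match["source"],
            LINK_TYPE[match["verb"]], match["target"])

# Hypothetical sample lines.
print(parse_item("1998-01-05 Site A credits Site B."))
print(parse_item("1998-01-06 Site B endorses Site C"))
print(parse_item("not a data line"))
```

Lines that do not follow the assumed shape are returned as `None` rather than guessed at.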

2 The sites that are wholly or partly missing are listed below, in order of increasing degree centrality:

Psyberspace (degree centrality: 2) In March 1998, Andy Edmonds promised Barger ‘some nice crosslink action’ (Edmonds, 1998a) from Psyberspace. It cannot be ascertained if this ever happened: only one isolated weblog page has been preserved from around the same time (Edmonds, 1998b). The bulk of Edmonds’ early weblog is, unfortunately, not retrievable, as the current registrant of Psyberspace.net has the Internet Archive’s holdings blocked by a robots.txt query exclusion. Psyberspace has an overall inlink count of 4.

Phil Suh (degree centrality: 2) Phil Suh’s news page archives from 1997 and 1998 have disappeared without a trace. The Internet Archive has no matches. The site’s overall inlink count of only 4 does not suggest that the missing archives contain a large number of relevant outlinks.

Ragged Castle (degree centrality: 3) Andy Affleck (né Williams) states in his archives that the posts from May 1998 to July 2000 are missing due to a ‘hard drive incident and a lack of backup problem’ (Affleck, 2001). Ragged Castle’s inlink count is 7.

Drudge (degree centrality: 4) The Drudge Report never maintained archives, but the Internet Archive has a few pages. The absence of links from Drudge into this network is unproven but strongly presumed on the basis of Matt Drudge’s known resistance to being identified as a blogger. Drudge’s inlink count is 20.

Obscure Store (degree centrality: 6) Jim Romenesko’s Obscure Store & Reading Room was first noted on 2 July 1998 as a ‘very promising, professional-looking weblog-like page’ (Barger, 1998f), but the Internet Archive has pages only for 2, 5 and 11 Dec 1998 (Romenesko, 1998a, 1998b, 1998c). It is strongly presumed that, except for the list link to Robot Wisdom, Obscure Store did not link into this network. Obscure Store’s inlink count is 67.

The Obvious (degree centrality: 10) Michael Sippey started the web zine Stating the Obvious in 1995. In May 1997, Sippey added a new feature to his site, the Obvious Filter, which he placed one directory deep in his site. The Obvious Filter adopted the news page model, a fact that Dave Winer immediately recognised and celebrated in a brief note (Winer, 1997c). The Obvious Filter ran until mid-September 1997 and appears to be fully preserved by the Internet Archive in these instalments: 28 – 31 May 1997 (Sippey, 1997a), 1 – 13 Jun 1997 (Sippey, 1997b), 16 – 30 Jun 1997 (Sippey, 1997c), 1 – 15 Jul 1997 (Sippey, 1997d), 16 – 31 Jul 1997 (Sippey, 1997e), 1 – 15 Aug 1997 (Sippey, 1997f), 15 – 31 Aug 1997 (Sippey, 1997g), 2 – 15 Sep 1997 (Sippey, 1997h), 16 Sep – 13 Oct 1997 (Sippey, 1997i).

In mid-October, Sippey chose to shut down his site for a while (Anuff, 1997; Hudson, 1997) and relaunched in late December (Sippey, 1997j), offering a new implementation of the ‘Filter’ as a weekly feature under the name ‘Filtered for Purity.’ The archives of that feature run until late April 1998 (Sippey, 1998a). In another redesign of the site in early May, Sippey promoted the ‘Filtered for Purity’ feature to the front page and restored it to its original daily publication schedule (Merholz, 1998), which allowed him to drive, according to one testimony, a ‘huge amount’ (Eisenberg, 1998) of traffic. The ‘Filtered for Purity’ feature remained on Stating the Obvious until the end of the year but fell victim to Sippey’s new-year resolution for 1999, in which he forswore ‘the self-induced stress of producing daily content, even if that content wasn’t really content at all, but merely meta-content – links to and smartass commentary on other people’s content’ (Sippey, 1999). The Internet Archive has preserved a sampling of Filtered for Purity in its waning days (Sippey, 1998b), but its contents from May to late October 1998 are unaccounted for. As the extant Filter material has an overall outlink count of only 4, the missing months are unlikely to contain a large number of outlinks to the rest of the network. The Obvious’ inlink count is 13.

3 In some of the literature on early weblogs, Barrett’s list of January 1999 is discussed as a founding document of the ‘weblog community’ (Ammann, 2009a). None of that literature ever managed to produce the actual list, however. Having found it in an unexpected wrinkle of the Internet Archive, I noticed its high degree of overlap both with Wallace’s earlier list and Gulker’s original compilation, neither of which had previously been identified and discussed as descriptions of the weblog network.

Jesse James Garrett’s Ye olde skool list on his ‘Page of only weblogs’ (1999a) has been a standby in the discourse on early weblogs ever since Rebecca Blood mentioned it in her first essay on the history of weblogs (2000). Dennis Jerz has called it ‘canonical’ (2007), and it forms the basis of press reports that put the number of weblogs in 1998 at twenty-three (‘Weblogs rack up a decade of posts,’ 2007). Garrett’s list is left out of this study, however. Having been compiled between April (Merholz, 1999) and October 1999 (Garrett, 1999b), the list is not contemporaneous with the reported period. Moreover, it contains only one site that isn’t featured in at least one of the three seed lists: the LTSeek news page, which ran from March 1998 (Rakestraw, 2001) to January 2002 (Rakestraw, 2002) and served as a current awareness resource on educational matters. However, there do not appear to be any links between LTSeek and any of the sites in the weblog network, so its degree centrality of zero disqualifies the site from inclusion in the network. Garrett’s list, therefore, does not add anything new to the older lists.

4 No list on its own offers a reliable delimitation of this network, as each is skewed by its own bias. Gulker’s list of the ‘NewsPage Network’ is limited to users of Winer’s NewsPage software. As such, it includes sites whose creators did not share any affinity with, or interest in, other news pages. Usage of Winer’s NewsPage software as the sole criterion of inclusion makes Gulker’s list too inclusive. By contrast, Wallace’s and Barrett’s respective lists offer two virtually identical views of an evolved network some 10 to 12 months later, but either of these views is only a statement of arbitrary personal preference; Wallace’s and Barrett’s respective lists are too exclusive.

5 The criterion of family resemblance also applies within sites that were included in the network. It bears remembering that the earliest weblogs operated under a much narrower definition of the form than is generally accepted today: they held the provision and brief annotation of links to be the defining feature, indeed the primary purpose, of a weblog (Blood, 2000). Yet the bloggers of the period often maintained essay pages that were separate from their weblogs, e.g. Dave Winer’s DaveNet (Winer, 2004), Steve Bogart’s Scribbles (Bogart, 1998b) or Cameron Barrett’s Rants (Barrett, 2001). In keeping with the contemporaneous definition of what a weblog was, a network site’s non-weblog parts were excluded from the data set.

6 The term ‘endorsement’ is suggested by Kleinberg (1999, p. 617).

7 Barger’s list is not preserved in a copy of 1998, the earliest extant version dating from April 1999. In the data set, a new list link is inferred to have been added to the Robot Wisdom Weblog sources page whenever Barger credited a weblog for the first time.

8 In the data set, credit links with a direct hyperlink are marked with an asterisk. Link attributions, although by far the most numerous type throughout the data, are under-reported, as multiple credits per day, especially on Robot Wisdom, were counted as a single credit only.
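The counting rule for credits can be expressed as a simple de-duplication step. The sketch assumes that ‘a single credit’ means one credit per source and target pair per day, which is one plausible reading of the rule; the record layout and the site labels are hypothetical.

```python
def daily_credits(credits):
    """Collapse repeated (iso_date, source, target) credits to one per day."""
    return sorted(set(credits))

# Hypothetical toy records: site a credits site b three times on one day.
toy = [
    ("1998-03-02", "a", "b"),
    ("1998-03-02", "a", "b"),
    ("1998-03-02", "a", "b"),
    ("1998-03-02", "a", "c"),
    ("1998-03-03", "a", "b"),
]
counted = daily_credits(toy)
print(len(toy), "raw credits ->", len(counted), "counted credits")
```

Under this rule, the five raw credits in the toy records reduce to three counted credits.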

9 The size of the strongly connected component and the normalised degree centrality measures were calculated in the Pajek network analysis application (Batagelj & Mrvar, 2008). All other measures were calculated using the grep utility (Josey, Cragun, Stoughton, Brown, & Hughes, 2004).