(Almost) Ten Years of Steve/Tony in Numbers


Over some ten years of fandom activity, Steve/Tony fanfiction has been posted on many different sites. The very first stories were probably posted on LiveJournal in 2007. Nowadays, one of the main platforms for Steve/Tony fic is Archive of Our Own, commonly known as AO3. In this post, I'm going to take a look at the stories posted about our favorite Avengers on the archive.

For this analysis, I collected the details of all the complete, publicly available stories posted on AO3 where Steve/Tony is tagged as the first pairing. If you are interested in the technical details, further Materials and Methods can be found at the end of the text.

Steve/Tony Times X

The total number of stories in my data set is 12238. This is slightly lower than the number you see if you check the Steve/Tony tag on AO3, because I did my best to cut out stories where they're a background pairing. The first stories are from 2007, but since AO3 only went into open beta in November 2009, these must be stories that were added later and backdated to match their publication date on other sites.

Looking at the number of stories posted per year, there was a major upsurge in 2012, when The Avengers (the movie) brought Steve/Tony to the wider attention of non-comic fans. Promisingly, 2016 saw the highest number of stories posted in a single year so far, with over 2500 new Steve/Tony fics. Our ship keeps on sailing!

стив роджерс/тони старк (Steve Rogers/Tony Stark)

The majority (89.2%) of Steve/Tony stories posted on AO3 are in English. What I find more interesting, though, is the distribution of non-English language stories.

In total, there are stories written in 19 languages other than English, with the top 10 shown above. The most common language after English turns out to be Russian, followed by Chinese and Spanish. Some of these are undoubtedly translations; I didn't try to quantify how many.

Author Chose Not To…

Moving on with the information that's available for every story, we arrive at ratings and warnings.

The most common rating is Teen And Up, followed by General Audiences and then Explicit; the least used is Not Rated. As for warnings, there's a large number of possible combinations, but most stories are marked either "Choose Not To Use Archive Warnings" or "No Archive Warnings Apply". Funnily enough, the third most common set of tags is both, even though they kind of should be mutually exclusive.

Size Matters

As an author myself, I've taken part in quite a few conversations about the merits of shorter vs longer fic.

Most Steve/Tony stories posted on AO3 (74.8%, to be exact) are 5000 words or fewer. Out of the remaining 25.2%, most fall in the 5-10k range, and only some rare outliers are in the over 100k category. Most stories have one chapter.

The Numbers We Love and Hate

Next, let's take a quick look at a series of plots concerning the stats that I, at least, have spent far too much time thinking about: the numbers created by the readers, which certainly say something about popularity, and may or may not have anything to do with quality.

The hit and kudos counts seem to follow a roughly similar pattern, suggesting that there's definitely some kind of correlation going on there. The question of how these measures relate to one another and to other variables is a very interesting one, but I won't be attempting further analyses of it here. Maybe at some later date! The patterns for comments and bookmarks look somewhat different: most stories have 0-5 comments (including authors' replies, since these are the total numbers of comments, not the numbers of comment threads), and there are generally more bookmarks than comments.

Across the Multiverse

The basic numbers that we looked at above probably follow similar patterns for a lot of fandoms, but one thing specific to Marvel fandom is that our heroes exist in a variety of canons.

The original, of course, is Marvel 616, without which none of the others would exist. By total number of stories, it's currently in second place, since Steve and Tony's rise to wider fame with the Marvel Cinematic Universe brought in a massive number of new authors, making MCU by far the largest canon. The spike in MCU stories in 2012 certainly looks dramatic!

With the canon sorting I used here, a lot of stories ended up in the "Ambiguous" category, which mostly includes stories that were tagged either "ambiguous" or "all media types". More interesting are the stories in the "Other Canons" category:

Currently, the largest "minor canon" by total number of stories is Marvel Adventures: Avengers, which seems to have had a surge in popularity in 2016. The second is the Ultimates 'verse. Unfortunately, it hasn't seen that many new stories posted recently. Avengers Assemble, which has been gaining popularity every year since it started airing, may soon take over second place, unless Avengers Academy gets there first! The game only launched last year, but spurred a lot of stories right away.

I Don't Know How To Tag

The part of this data exploration that was simultaneously the trickiest and the most fun was handling the tags. Overall, the ten most common tags used in stories were the following:

1. Fluff 2180
2. Angst 1469
3. Established Relationship 856
4. Hurt/Comfort 729
5. Humor 604
6. Alternate Universe 551
7. Romance 539
8. Stony - Freeform 511
9. Pre-Slash 425
10. Getting Together 408

These are based on entries in a plain text tag list, so the count for "Fluff" only includes instances where the exact tag that says "Fluff" was used. This, of course, isn't the whole truth. To get a better look at the genres that the tags suggest, I decided to categorize the stories based on the genre tags used at the Cap-IM livejournal community.

This is probably the most complicated figure in this entire post. As a technical explanation, in this comparison, a story can count as multiple entries: for example, a fic tagged both Angst and Crack will count towards the totals of both. Multiple tags that fall into the same category, such as a fic tagged "Angst", "So Much Angst" and "All the Angst", only add 1 to the total sum of angst tag use. Percentages stand for the number of uses of a specific tag divided by the total number of tags used.
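For the technically curious, the counting rule can be sketched in a few lines of Python. The analysis itself was done in R, so this is only an illustration. The `GENRE_TAGS` lists and the function names are made up for the example, and the reading of "total number of tags used" as "all genre uses counted" is an assumption.

```python
from collections import Counter

# Hypothetical hand-picked lists: which freeform tags count towards which genre.
GENRE_TAGS = {
    "Angst": {"angst", "so much angst", "all the angst"},
    "Fluff": {"fluff", "tooth-rotting fluff"},
    "Crack": {"crack"},
}


def genres_of(story_tags):
    """Return the set of genres a story falls into (each genre at most once)."""
    lowered = {tag.lower() for tag in story_tags}
    return {genre for genre, variants in GENRE_TAGS.items() if lowered & variants}


def genre_shares(stories):
    """Count genre uses across stories and express each as a share of all uses."""
    counts = Counter()
    for tags in stories:
        counts.update(genres_of(tags))
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()} if total else {}


stories = [
    ["Angst", "So Much Angst", "All the Angst"],  # adds 1 to Angst, not 3
    ["Angst", "Crack"],                           # adds 1 to Angst and 1 to Crack
    ["Fluff"],
]
print(genre_shares(stories))
```

Using a set per story is what keeps three angst variants on one fic from counting three times.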

The distribution of genre tag use seems to vary by 'verse in ways that aren't entirely surprising: for example, the Action/adventure tag is more common in Noir than in any other 'verse, and 1872 has the largest percentage of Angst and Fix-Its. 3490, which is a canonical genderswap alternate universe, is often tagged both AU and Rule 63. Fluff is particularly popular in the happy and sunny Marvel Adventures: Avengers (MA:A), Avengers Assemble (AA) and Avengers Academy (AvAc) 'verses.

I did a few additional fun things with the tags: first, I figured that since alternate universes are quite popular, I'd like to check which ones are the most common. Here's the top 10:

1. Canon Divergence 271
2. No Powers 210
3. High School 170
4. College/University 139
5. Modern Setting 105
6. Soulmates 95
7. Fantasy 56
8. Gender Changes 53
9. Fusion 45
10. Historical 39

Another interesting set of tags are the tags used in relation to our two heroes, so I searched specifically for tags that mention Steve or Tony by name, and made tag clouds of them.

Clearly, we like FEELS, since by far the biggest tag in each word frequency table was name + Feels. Tony also really, really needs all the hugs: the "Tony Stark Needs a Hug" tag gave me trouble when creating the cloud, since it's long and also very common, and threatened to make everything else too small to read. Steve also needs hugs, but apparently not quite as often, and seems to spend more of his time being protective.

The Supporting Cast

Our two main characters are often the only ones to appear in a story (around 32% of stories have only Steve and Tony or Steve and Natasha Stark in them), but one must not forget their teammates and other close friends. The median number of other characters in a story is one; the maximum is a cast of Steve, Tony and 32 others. The most common secondary characters vary across the different canons, with the top five for each as follows:

As for secondary relationships, most stories don't have any. Actually, only 616, MCU and AvAc had more than 10 uses of a secondary pairing tag in addition to Steve/Tony. The top 3 for each of these:

What's In A Name

Last but not least, since I had a list of the titles of all stories, I decided to figure out what words are most common in titles. To do this, I trimmed the list down a bit, removing punctuation and cutting out common, non-informative words (or "stop words", for those who are familiar with the term). Here's what the results look like as a word cloud:
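As a rough illustration of that trimming step (the real thing was done in R), here's a minimal Python sketch. The stop-word list is a tiny made-up sample, the titles are invented, and the regular expression only keeps unaccented Latin letters, so non-English titles would need different handling.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "to", "is", "you", "i", "for"}


def title_word_counts(titles):
    """Count words across titles after stripping punctuation and stop words."""
    counts = Counter()
    for title in titles:
        # Lowercase, then keep runs of letters (apostrophes allowed inside words).
        words = re.findall(r"[a-z]+(?:'[a-z]+)*", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Invented example titles, not taken from the data set.
titles = ["The Love of Tony", "Steve, Tony, and the Toaster", "Love Is in the Air"]
print(title_word_counts(titles).most_common(3))
```

The resulting counts are what a word cloud tool would take as input, with bigger counts drawn as bigger words.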

The Median Steve/Tony Story

To conclude this statistical tour of Steve/Tony stories, I've put together what the statistics say is a median Steve/Tony story. (I'm using the median and not the mean/average, because the mean tends to get skewed by outliers, and this data has a bunch, such as stories with unusually high counts of words or kudos.)
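To see why the choice matters, here's a small Python example with made-up kudos counts (not values from the data set): nine ordinary stories and one runaway hit.

```python
from statistics import mean, median

# Made-up kudos counts: nine ordinary stories and one runaway hit.
kudos = [40, 75, 90, 110, 128, 130, 150, 200, 260, 9000]

print(mean(kudos))    # dragged far upwards by the single outlier
print(median(kudos))  # stays among the typical values
```

With these numbers the mean comes out at about 1018, which is higher than nine of the ten stories, while the median is 129.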

The Median Steve/Tony Story is a Teen-rated fic with no archive warnings, and obviously in the M/M category, since its only relationship tag is Steve Rogers/Tony Stark. The fic was posted on a Sunday in May (the most common day and month of posting), and is written in English. It's set in the Marvel Cinematic Universe. The cast includes, in addition to Steve and Tony, one other character, and that other character is Natasha Romanova. The story has five other tags (which is the median number of tags): Fluff, Angst, Established Relationship, Hurt/Comfort and Humor. The story is 2107 words long, with one chapter, and it has so far gained 10 comments, 128 kudos, 16 bookmarks, and 2769 hits.

Since "Median Steve/Tony Story" is a rather dull title, I also used the numbers to come up with a title. The median number of words in titles is three, and the three most common words in titles are Tony, Steve and Love, to which I added some appropriate punctuation!

Materials and Methods

I collected the data on March 13th, 2017 from the AO3 search results for all works tagged Steve Rogers/Tony Stark that are complete, can be seen without logging in to AO3, have at least one kudos, and at least 100 words (as a crude way to trim out art & podfic posts). As Steve Rogers/Natasha Stark is not synned to Steve/Tony, I ran a separate search for that tag, keeping the other parameters the same. I then further trimmed the data to include only stories where Steve/Tony (or Steve/Natasha Stark) is the first pairing. I did the data scraping & all analyses in R, using custom scripts and the following packages: rvest for harvesting data, tidyr and reshape2 for tidying it up and rearranging it, ggplot2 for plotting, and wordcloud2 for word clouds.
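The selection criteria above boil down to a simple filter. Here's a minimal Python sketch of it, not the actual R script: the `Work` record and its field names are hypothetical, the sample works are invented, and the "visible without logging in" criterion is assumed to be taken care of by scraping while logged out.

```python
from dataclasses import dataclass

MAIN_PAIRINGS = {"Steve Rogers/Tony Stark", "Steve Rogers/Natasha Stark"}


@dataclass
class Work:
    """Hypothetical record for one scraped work."""
    title: str
    relationships: list  # relationship tags in the order they appear on the work
    words: int
    kudos: int
    complete: bool


def keep(work):
    """Apply the selection criteria described in the text to one work."""
    return (
        work.complete
        and work.kudos >= 1
        and work.words >= 100
        and bool(work.relationships)
        and work.relationships[0] in MAIN_PAIRINGS
    )


# Invented sample works, one per reason for being kept or dropped.
works = [
    Work("Main pairing", ["Steve Rogers/Tony Stark"], 2107, 128, True),
    Work("Background pairing",
         ["Clint Barton/Phil Coulson", "Steve Rogers/Tony Stark"], 5000, 50, True),
    Work("Fanart post", ["Steve Rogers/Tony Stark"], 12, 300, True),
    Work("Work in progress", ["Steve Rogers/Tony Stark"], 8000, 10, False),
]
kept = [w.title for w in works if keep(w)]
print(kept)
```

Only the first sample work passes: the others fail on pairing order, word count and completeness respectively.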

Since I was handling the data as plain text extracted from the html source code, tag sorting (for 'verses, characters, genres and other freeform tags) didn't take into account AO3's tag system (synning/metatagging etc). This meant that I needed to bin stories into specific canons/genres using hand-picked lists of tags, and the choice of what tags went into each category (e.g. what fandoms count as "MCU" or which tags are considered "fluff") was based on personal judgement. For characters, I sorted out some of the most common problematic cases (such as the various names used for Natasha Romanova/Romanoff/Romanov, and characters tagged both "name" and "name (Marvel)"). Of course, this isn't a very exact process (but neither is AO3's tagging). If you want to know what tags went into any of the specific categories/lists, feel free to ask.
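To give an idea of what sorting out the problematic character names involves, here's a small Python sketch. It isn't the actual R code, and the alias table is a made-up sample that only covers the Natasha variants mentioned above.

```python
import re

# Hypothetical alias table: spelling variants mapped to one canonical name.
ALIASES = {
    "natasha romanoff": "Natasha Romanova",
    "natasha romanov": "Natasha Romanova",
    "natasha romanova": "Natasha Romanova",
}


def normalize_character(tag):
    """Strip a trailing "(Marvel)" disambiguator and merge known name variants."""
    name = re.sub(r"\s*\(Marvel\)\s*$", "", tag.strip())
    return ALIASES.get(name.lower(), name)


def unique_characters(tags):
    """Normalize a work's character tags, dropping duplicates but keeping order."""
    return list(dict.fromkeys(normalize_character(t) for t in tags))


tags = ["Natasha Romanov", "Natasha Romanoff (Marvel)",
        "Clint Barton", "Clint Barton (Marvel)"]
print(unique_characters(tags))
```

Anything not covered by the alias table passes through unchanged, which is why this kind of cleanup only catches the cases someone has thought to list.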

Finally, if anyone is interested in the raw data and/or my messy R scripts, I'll be happy to share them; I can be reached at veldeia (at)