You have a unique and rare name: How rare and unique is my name really?

by Dominik Freunberger



      Dominik, you have a unique and rare name…


I’ve heard that sentence once too often. So I got the data from Statistik Austria to see if there’s something to it.

More specifically, I tried to answer the following questions:

I also made a little app in which you can check your own name’s popularity!

* Please note that Statistik Austria uses a binary classification for the names and the associated data, and there is criticism from many fields regarding this approach.

library(tidyverse)
library(knitr)
library(lubridate)
library(readxl)
library(ggrepel)
library(ggpubr)
library(hrbrthemes)
library(shiny)
library(bslib)

my_red = "#DC2F1E"
sonic_blue = "#C3DADC"

Statistik Austria provides the top 60 names per sex for the years 1984 through 2020. I was born in 1986, so the data covers my birth year but not much before that. Note that these are etymological names (as Statistik Austria calls them), meaning that different spellings (Dominic, Dominique, Domenik, etc.) are collapsed into one form.

file = "statistik_der_60_haeufigsten_vornamen_1984-2020_in_oesterreich_-_etymologi.xlsx"
names = read_xlsx(file, skip = 3)

# the sheet holds the top 60 names per sex for each year from 1984 to 2020
# Important: Some years have multiple rank 60s.

# plain assignment here: %<>% would additionally need library(magrittr)
names = names %>%
  rename(boys_rank = "Rang...1",
         boys_name = "Vorname...2" ,
         boys_absolute = "Absolut...3" ,
         boys_percent = "in %...4",
         boys_cumulative = "% kumulativ...5",
         girls_rank = "Rang...6",
         girls_name = "Vorname...7",
         girls_absolute = "Absolut...8",
         girls_percent = "in %...9",
         girls_cumulative = "% kumulativ...10") %>%
  # values above 60 in the rank column are years; copy them down to the name rows
  mutate(boys_rank = as.numeric(boys_rank),
         year = boys_rank,
         year = ifelse(year < 61, NA, year)) %>%
  fill(year) %>% 
  # at the bottom we have summaries for 1984-2010 and 2010-2020; drop them
  slice_head(n = 2273) %>% 
  filter(boys_rank < 61)

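# rename_with() replaces the deprecated rename_all(funs(...))
boys = names %>% 
  select(11, 1:5) %>%   # year plus the five boys_* columns
  rename_with(~ stringr::str_replace_all(.x, "boys_", "")) %>% 
  mutate(sex = "boys")

girls = names %>% 
  select(11, 6:10) %>%  # year plus the five girls_* columns
  rename_with(~ stringr::str_replace_all(.x, "girls_", "")) %>% 
  mutate(sex = "girls")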

names = rbind(girls, boys) %>%
  mutate(percent = round(percent, 2),
         cumulative = round(cumulative, 2))

# now we have a tidy df, yay!

head(names) %>% kable()
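
| year | rank | name   | absolute | percent | cumulative | sex   |
|-----:|-----:|:-------|---------:|--------:|-----------:|:------|
| 2020 |    1 | Anna   |     1925 |    4.73 |       4.73 | girls |
| 2020 |    2 | Marie  |     1304 |    3.21 |       7.94 | girls |
| 2020 |    3 | Sophie |     1287 |    3.17 |      11.11 | girls |
| 2020 |    4 | Emilia |     1105 |    2.72 |      13.82 | girls |
| 2020 |    5 | Elena  |     1013 |    2.49 |      16.31 | girls |
| 2020 |    6 | Lena   |      777 |    1.91 |      18.23 | girls |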
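
The `year` column comes from a small trick in the cleaning code: judging by that code, the spreadsheet stores each year in the rank column, as a header row above that year's names. Here is a minimal sketch of the idea on toy data. The layout and the name "Emma" are assumptions made for illustration, not the real file.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the raw sheet (assumed layout): a row with the year in
# the rank column, followed by that year's ranked names
raw = data.frame(
  rank = c(2020, 1, 2, 2019, 1, 2),
  name = c(NA, "Anna", "Marie", NA, "Anna", "Emma")
)

tidy = raw %>%
  # anything above 60 in the rank column must be a year, not a rank
  mutate(year = ifelse(rank < 61, NA, rank)) %>%
  # carry each year down to the name rows below it
  fill(year) %>%
  # drop the year header rows themselves
  filter(rank < 61)

tidy
```

After `fill()`, every name row knows which year it belongs to, and the header rows can be filtered out.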

Is “Dominik” even part of this list?

Since we only get the 60 most popular names per year and sex, my name has to rank at least 60th in a given year to appear in this list (which is already quite popular, given that in 2020 babies in Austria got 1772 different first names). Let’s see.

names %>% 
  filter(name == "Dominik") %>% 
  kable()

| year | rank | name    | absolute | percent | cumulative | sex  |
|-----:|-----:|:--------|---------:|--------:|-----------:|:-----|
| 2020 |   45 | Dominik |      266 |    0.62 |      54.65 | boys |
| 2019 |   48 | Dominik |      261 |    0.60 |      56.58 | boys |
| 2018 |   39 | Dominik |      307 |    0.70 |      50.53 | boys |
| 2017 |   40 | Dominik |      325 |    0.72 |      52.50 | boys |
| 2016 |   39 | Dominik |      319 |    0.71 |      51.64 | boys |
| 2015 |   35 | Dominik |      358 |    0.82 |      49.96 | boys |
| 2014 |   35 | Dominik |      326 |    0.77 |      50.37 | boys |
| 2013 |   32 | Dominik |      336 |    0.82 |      48.58 | boys |
| 2012 |   32 | Dominik |      362 |    0.89 |      50.02 | boys |
| 2011 |   30 | Dominik |      384 |    0.95 |      48.90 | boys |
| 2010 |   29 | Dominik |      371 |    0.92 |      48.11 | boys |
| 2009 |   27 | Dominik |      412 |    1.23 |      49.94 | boys |
| 2008 |   27 | Dominik |      413 |    1.19 |      50.20 | boys |
| 2007 |   27 | Dominik |      401 |    1.18 |      50.49 | boys |
| 2006 |   28 | Dominik |      417 |    1.19 |      52.01 | boys |
| 2005 |   23 | Dominik |      506 |    1.44 |      46.14 | boys |
| 2004 |   24 | Dominik |      515 |    1.44 |      48.99 | boys |
| 2003 |   19 | Dominik |      581 |    1.67 |      42.57 | boys |
| 2002 |   18 | Dominik |      657 |    1.88 |      42.49 | boys |
| 2001 |   17 | Dominik |      650 |    1.94 |      42.71 | boys |
| 2000 |   10 | Dominik |      767 |    2.22 |      29.23 | boys |
| 1999 |   10 | Dominik |      813 |    2.35 |      32.81 | boys |
| 1998 |   10 | Dominik |      945 |    2.63 |      32.27 | boys |
| 1997 |    9 | Dominik |     1105 |    2.97 |      30.90 | boys |
| 1996 |    5 | Dominik |     1295 |    3.30 |      18.98 | boys |
| 1995 |    6 | Dominik |     1381 |    3.52 |      23.83 | boys |
| 1994 |    4 | Dominik |     1561 |    3.83 |      16.37 | boys |
| 1993 |    8 | Dominik |     1518 |    3.61 |      32.06 | boys |
| 1992 |    8 | Dominik |     1459 |    3.39 |      32.38 | boys |
| 1991 |    8 | Dominik |     1415 |    3.22 |      33.55 | boys |
| 1990 |   10 | Dominik |     1277 |    2.97 |      40.97 | boys |
| 1989 |   12 | Dominik |     1179 |    2.77 |      47.09 | boys |
| 1988 |   16 | Dominik |      927 |    2.17 |      57.41 | boys |
| 1987 |   20 | Dominik |      691 |    1.65 |      65.31 | boys |
| 1986 |   20 | Dominik |      702 |    1.66 |      64.84 | boys |
| 1985 |   22 | Dominik |      598 |    1.40 |      67.00 | boys |
| 1984 |   25 | Dominik |      432 |    1.01 |      70.64 | boys |

Not only is it in the data, it shows up in every single year, which already makes it consistently popular. Let’s see how this developed over time.
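
Rather than counting table rows by eye, the "every single year" claim can be checked in code. The sketch below uses a hypothetical helper, `coverage()`, on a toy subset of rows copied from the tables in this post. On the real data you would pass the full `names` data frame instead.

```r
library(dplyr)

# Toy subset copied from the tables in this post (not the full data)
toy = data.frame(
  year = c(2020, 2020, 1994, 1986, 1986),
  rank = c(45, 1, 4, 20, 20),
  name = c("Dominik", "Anna", "Dominik", "Dominik", "Bettina"),
  sex  = c("boys", "girls", "boys", "boys", "girls")
)

# Hypothetical helper: in how many of the covered years is a name listed?
coverage = function(df, target) {
  df %>%
    summarise(years_listed = n_distinct(year[name == target]),
              years_total  = n_distinct(year))
}

coverage(toy, "Dominik")
coverage(toy, "Bettina")
```

If `years_listed` equals `years_total`, the name made the top 60 in every year covered by the data.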

Popularity of the name Dominik from 1984 to 2020: Rise and fall

Below you see the proportion of boys named Dominik from 1984 to 2020, with my birth year and the most and least popular years highlighted and labeled.

names %>%
  filter(name == "Dominik") %>% 
  ggplot(aes(x = year, y = percent)) +
  geom_line(size = .5, alpha = .5) +
  geom_point(shape = 1, size = 4, color = my_red, alpha = 2/3) +
  geom_point(data = . %>% filter(percent == max(percent)), color = my_red, size = 5) +
  geom_point(data = . %>% filter(percent == min(percent)), color = my_red, size = 5) +
  geom_point(data = . %>% filter(year == 1986), color = my_red, size = 5) +
  geom_text(data = . %>% 
              filter(percent == min(percent)), 
              aes(label = paste0(as.character(year), ":")), hjust = 1.1, vjust = -1.6) +
  geom_text(data = . %>% 
              filter(percent == min(percent)), 
              aes(label = paste0(as.character(percent), "%")), hjust = 0, vjust = -1.6) +
  geom_text(data = . %>% 
              filter(percent == max(percent)), 
              aes(label = paste0(as.character(year), ":")), hjust = -0.2) +
  geom_text(data = . %>% 
              filter(percent == max(percent)), 
              aes(label = paste0(as.character(percent), "%")), hjust = -1.1) +
  geom_text(data = . %>% 
              filter(year == 1986), 
              aes(label = paste0(as.character(year), ":")), hjust = -0.9) +
  geom_text(data = . %>% 
              filter(year == 1986), 
              aes(label = paste0(as.character(percent), "%")), hjust = -1.8) +
  labs(title = "How frequent is the name Dominik?", 
       subtitle = "Proportion of boys in Austria named Dominik by year",
       y = "Proportion in %", 
       x = "Year") +
  theme_ft_rc()

Wait, what? While I suspected that the name Dominik was more popular than it seemed in the remote mountain village I grew up in, this pattern is quite a surprise: a steady rise up to 1994 and a steady fall after it. I tried to find external events, such as scandals involving Dominiks, that could have triggered the decline after 1994, but I couldn’t find anything involving a somewhat popular Dominik.

Now that we know the percentage of Dominiks each year, let’s see how Dominik ranked over the years.
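
A rank plot along these lines could be sketched as follows. The ranks are copied from the Dominik table above. The styling and the `best` helper are assumptions made for illustration, not necessarily the original figure.

```r
library(ggplot2)

# Ranks of Dominik among boys, copied from the table above (2020 back to 1984)
dominik = data.frame(
  year = 2020:1984,
  rank = c(45, 48, 39, 40, 39, 35, 35, 32, 32, 30, 29, 27, 27, 27, 28,
           23, 24, 19, 18, 17, 10, 10, 10, 9, 5, 6, 4, 8, 8, 8,
           10, 12, 16, 20, 20, 22, 25)
)

# Best (numerically lowest) rank, to highlight in the plot
best = dominik[which.min(dominik$rank), ]

rank_plot = ggplot(dominik, aes(x = year, y = rank)) +
  geom_line(alpha = .5) +
  geom_point(shape = 1, size = 3) +
  geom_point(data = best, size = 4) +
  geom_text(data = best,
            aes(label = paste0(year, ": Rank ", rank)),
            hjust = -0.1) +
  scale_y_reverse() +  # put rank 1 at the top of the axis
  labs(title = "How does the name Dominik rank?",
       subtitle = "Rank among boys' names in Austria by year",
       x = "Year", y = "Rank")

rank_plot
```

Reversing the y axis keeps the visual intuition that "higher is more popular", the same choice made for the Bettina plot further down.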

What would be my girl name based on popularity?

What if my parents had chosen my name purely based on its popularity? Maybe they thought “the baby’s name shouldn’t be too popular but also not too unpopular, something like rank 20”. Well, well, here I am, Dominik. If I had been born a girl, what would my name be?

names %>% 
  filter(year == 1986 & rank == 20 & sex == "girls") %>% 
  kable()
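
| year | rank | name    | absolute | percent | cumulative | sex   |
|-----:|-----:|:--------|---------:|--------:|-----------:|:------|
| 1986 |   20 | Bettina |      596 |    1.48 |      45.33 | girls |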

Bettina! Let’s see how this name did over the years.

names %>%
  drop_na() %>% 
  ggplot(aes(x = year, y = rank)) +
  geom_line(data = . %>% filter(name == "Bettina"),size = .5, color = my_red) +
  geom_point(data = . %>% filter(name == "Bettina" & year == 1986), color = my_red, size = 5) +
  geom_point(data = . %>% filter(year == 1997 & name == "Bettina"), color = my_red, size = 5) +
  geom_text(data = . %>% 
              filter(year == 1997 & name == "Bettina"), 
              aes(label = paste0(as.character(year), ": Rank ", as.character(rank))), hjust = -0.1, vjust = 0) +
  geom_text(data = . %>% 
              filter(year == 1986 & name == "Bettina"), 
              aes(label = paste0(as.character(year), ": Rank ", as.character(rank))), hjust = 0, vjust = 2.2) +
  labs(title = "How does the female 1986 rank equivalent to Dominik\ndevelop over the years?", 
       subtitle = "Rank of the name Bettina by year",
       y = "Rank", 
       x = "Year") +
  xlim(1984, 2019) +
  scale_y_reverse()+
  theme_ft_rc()

Oh. A short rise after 1986 and then a steep decline, with the name dropping out of the top 60 in the late 90s.

So, we’ve seen some names’ popularity over the years, but a pressing question is of course:

Conclusion: Not so rare after all

We’ve seen quite a few different trends over time: the rise and fall of Dominik, the rise of Tobias and the fall of Stefan, the steep decline of Bettina, and the climb to the top of Anna and Lukas. But back to the initial question of whether Dominik really is as rare a name as I’ve been told throughout my childhood: Nes. Yo. While it was only somewhat popular when I was born (rank 20), it then rose steeply, peaking as the 4th most popular name for boys in Austria in 1994. Now, 27 years later, the name is struggling to stay in the top 60.

Check your own name’s popularity!

If you wanna look at your own or other names and how they developed between 1984 and 2020 in Austria, have a look at the little app I made.

If you have any questions or feedback, don’t hesitate to contact me.