Sunday, November 30, 2025

Visual summaries of population in Pacific islands


This is the first of a series of posts where I share some code and visualisations of population issues in the Pacific. The analysis and visualisations are fairly simple. Between them, they’ll show how to make (with publicly available data) all the statistical images used in a presentation I recently gave in Wellington on migration and mobility in the Pacific.

This was for a side event before the Pacific “Heads of Planning and Statistics” meeting, which takes place every two years and is the biggest event my team at the Pacific Community (SPC) organises. All the papers and presentations considered at the meeting are available online, which is certainly transparency in action.

It was fun at this side event to have the chance for once to talk about the substantive issues the data shows, rather than (as is the usual focus of my meetings) how to improve the data, improve its use, and generally strategise and prioritise to improve statistics. These things are important and (arguably) fun too, but it’s good to put them aside and talk about some actual development issues from time to time. My talk was followed by a great panel discussion with speakers from academia, a UN organisation, Stats NZ and a Pacific island national planner.

Today’s post is pretty simple and is just about producing two statistical charts (one of them with both a “bare” and a “highlighted” version), setting the scene for population in the Pacific.

Downloading data

First, I download and tidy up the data. Everything I need for these charts is already in the Pacific Data Hub, making this pretty simple. The thing that takes a bit of fiddling is converting the country codes to user-friendly country names, and classifying each country into one of Melanesia, Polynesia or Micronesia.

# This script produces a couple of general use plots on population growth in the Pacific
# for use in presentations on data issues

library(tidyverse)
library(rsdmx)
library(scales)
library(janitor)
library(ISOcodes)
library(glue)
library(spcstyle)
library(extrafont)
library(Cairo)
library(ggrepel)

# general use caption and font:
the_caption <- "Source: UN World Population Prospects, via the Pacific Data Hub"
the_font <- "Roboto"

# Download all the mid year population estimates from PDH.stat
d <- readSDMX("https://stats-sdmx-disseminate.pacificdata.org/rest/data/SPC,DF_POP_PROJ,3.0/A.AS+CK+FJ+PF+GU+KI+MH+FM+NR+NC+NU+MP+PW+PG+PN+WS+SB+TK+TO+TV+VU+WF+_T+MEL+MIC+POL+_TXPNG+MELXPNG.MIDYEARPOPEST._T._T?startPeriod=1950&endPeriod=2050&dimensionAtObservation=AllDimensions") |> 
  as_tibble() |> 
  clean_names() |> 
  mutate(time_period = as.numeric(time_period))

# Some subregional classifications.
mel <- c("Melanesia", "Papua New Guinea", "Fiji", "Solomon Islands", "Vanuatu", "New Caledonia")
pol <- c("Polynesia", "Tonga", "Samoa", "Cook Islands", "Tuvalu", "American Samoa", "Pitcairn", "Wallis and Futuna", "French Polynesia", "Niue", "Tokelau")

# lookup table with country codes, names, and which subregion they are in
# (ISO_3166_1 comes from the ISOcodes package and has Alpha_2 and Name columns)
pict_names <- tribble(~Alpha_2, ~Name,
                      "_T", "All PICTs",
                      "MEL", "Melanesia",
                      "_TXPNG", "Complete excluding PNG",
                      "POL", "Polynesia",
                      "MIC", "Micronesia")  |> 
  bind_rows(select(ISO_3166_1, Alpha_2, Name)) |> 
  rename(geo_pict = Alpha_2,
         pict = Name) |> 
  mutate(area = case_when(
    pict %in% mel ~ "Melanesia",
    pict %in% pol ~ "Polynesia",
    grepl("^_T", geo_pict) ~ "Complete",
    TRUE ~ "Micronesia"
  ))

# Dataset that combines the original PDH.stat data with the country names and regional classifications
d2 <- d |> 
  mutate(period = ifelse(time_period <= 2025, "Past", "Future")) |> 
  inner_join(pict_names, by = "geo_pict") |> 
  mutate(pict = gsub("Federated States of", "Fed. States", pict)) |> 
  # Order country names from smallest to largest population in 2050:
  mutate(pict = fct_reorder(pict, obs_value, .fun = last))
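
A note on the ordering step above: fct_reorder() with last as the summary function uses whatever value comes last in row order for each level, so it gives a 2050 ordering only when the rows are sorted by year within each country. The following is a minimal sketch of that behaviour, not part of the original script. The tibble `toy` and all its numbers are made up for illustration, and the explicit arrange() is an assumed safeguard rather than something the downloaded data is known to need:

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: three made-up places, two years each,
# with the rows deliberately out of year order
toy <- tibble(
  pict        = c("A", "B", "C", "A", "B", "C"),
  time_period = c(2050, 1950, 2050, 1950, 2050, 1950),
  obs_value   = c(500, 10, 50, 100, 900, 200)
)

toy_ordered <- toy |>
  # sort by year first so the last value per place really is the 2050 one
  arrange(time_period) |>
  # reorder the factor levels by that last (2050) value, smallest first
  mutate(pict = fct_reorder(pict, obs_value, .fun = last))

levels(toy_ordered$pict)
```

With these made-up values the levels come out ordered by the 2050 figures (50, 500, 900) rather than by whichever row happened to be last in the unsorted data.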

Line plot

This puts us in a position to just draw our first plot:

It’s very intuitive, and I think a crucial introduction to all the countries and territories we’re talking about. When we first made a version of this plot I thought it would never be neat enough to use in a presentation, but in fact it works ok on a big conference screen, so long as we exclude (as I have) the various regional and sub-regional totals.

All the hard work to produce this plot had been done earlier in the data management, so producing the plot is just a single chunk of code:

#----------------------time series line plot-------------

# This version just has 21 individual PICTs, no subregional totals. 21 fits
# ok on the screen in 3 rows of 7:
d2 |> 
  # remove subregional and regional totals, so only actual countries
  filter(!(pict %in% c("Micronesia", "Polynesia", "Melanesia") | 
             grepl("complete", pict, ignore.case = TRUE) | 
             pict %in% c("All PICTs", "Pitcairn"))) |> 
  ggplot(aes(x = time_period, y = obs_value, color = period)) +
  facet_wrap(~pict, scales = "free_y", ncol = 7) +
  geom_line() +
  theme(legend.position = "none",
        panel.grid.minor = element_blank(),
        strip.text = element_text(face = 'plain'),
        plot.caption = element_text(color = "grey50")) +
  scale_y_continuous(labels = comma) +
  scale_colour_manual(values = spc_cols(c(4, 2))) +
  # force all y axes to go to zero (but because of free_y in the facet_wrap call, 
  # they will be on different scales for readability):
  expand_limits(y = 0) +
  labs(x = "", y = "",
       title = "Population in the Pacific, 1950 to 2050",
       subtitle = "Countries listed in sequence of projected population in 2050",
       caption = the_caption)

Scatter plot

The line plot’s a nice introduction to population and, most importantly, it’s easily understood. But unless people look carefully at the vertical axis labels it gives no sense of the absolute size of the different countries, and only a very rough visual sense of the differing growth rates.

In looking for a single image that would summarise these two things I came up with this chart:

This is something we’d prepared well before this talk and not yet needed to use, but it was for exactly this sort of use case: a single slide summary of Pacific island countries and territories’ absolute size and growth rates.

It takes a little bit of explanation for, and concentration from, an audience; in particular, explaining why the negative growth area is shaded and what that means. The logarithmic scale for population size means people probably won’t realise just how overwhelmingly large Papua New Guinea is compared to the rest of the Pacific; to show that properly, we really need a different chart. But overall, this is simple enough for people to understand.

What I like about this plot is that it makes clear the two broad categories of Pacific island countries and territories in population terms: relatively large (meaning >100,000 people!) and growing, which is all of Melanesia and a few others; and small and shrinking, comprising most of Polynesia and parts of Micronesia. Tonga, population estimated around 104,000, is the borderline case: all the countries larger than Tonga are growing in population terms, and nearly all those smaller than it are shrinking.

There are two territories I dropped from this plot because the UN 2024 population projections, which is the data used, are materially out of date for them and I didn’t want to get side-tracked into explaining why in the talk. We’ll be able to include them in future versions of the plot, hopefully soon.

Again, it was fairly simple to create the plot with the data we’ve already got. Here’s the R code to do that:

#----------------scatter plot comparing growth to totals---------------
# Summary data as one row per country for use in scatter plot
d3 <- d2 |> 
  group_by(pict, area) |> 
  summarise(pop2025 = obs_value[time_period == 2025],
            pop2020 = obs_value[time_period == 2020]) |> 
  mutate(cagr = (pop2025 / pop2020) ^ (1/5) - 1) |> 
  mutate(point_type = if_else(pict %in% c("Micronesia", "Polynesia", "Melanesia") | area == "Complete", "total_like", "nation"),
         # font type has to use identity scale, no scale to map it
         # (4 is bold italic for the totals, 1 is plain for the countries)
         font_type = ifelse(point_type == "total_like", 4, 1),
         # couldn't get Melanesia in the right spot with ggrepel so have to make a specific adjustment for it:
         adjusted_x = ifelse(pict == "Melanesia", pop2025 * 1.35, pop2025))

# For a presentation used at HOPS7, I want
# 1) scatter plot but without the region and subregions, to avoid clutter
# 2) as 1 but with the shared sovereign countries highlighted eg with a circle around them.
#
# I also excluded two countries that had conspicuously out-of-date data that I didn't
# want visually prominent.

d4 <- d3 |> 
  filter(point_type == "nation") |> 
  # two countries/territories have materially wrong estimates that
  # are distracting, better to just drop them from the chart
  filter(!pict %in% c("Tokelau", "Micronesia, Fed. States"))

p2b <- d4 |> 
  ggplot(aes(x = pop2025, y = cagr, color = area)) +
  # Draw a pale (transparent, alpha) background rectangle for the negative growth countries:
  annotate("rect", xmin = 30, xmax = Inf, ymin = 0, ymax = -Inf, alpha = 0.1, fill = "purple") +
  # Largish points for each country:
  geom_point(size = 2.5) +
  # labels for each country:
  # geom_label_repel(aes(label = pict), seed = 7, family = the_font, size = 2.7, label.size = 0, fill = "transparent") +
  geom_text_repel(aes(x = adjusted_x, label = pict, fontface = font_type), seed = 6, family = the_font, size = 2.7) +
  # For the smaller countries, use actual populations as the points for markers on the axis.
  # For larger than 10,000, there are too many countries and it would be cluttered, so use 3, 10, 30, 100, etc.
  scale_x_log10(labels = comma, 
                breaks = signif(c(sort(unique(d3$pop2025))[c(1:4, 8, 9, 12, 23:25)], 3e5), 3)) +
  scale_y_continuous(labels = percent) +
  # Use SPC colours for the 4 subregion types:
  scale_colour_manual(values = c("Micronesia" = spc_cols(1), "Polynesia" = spc_cols(3), "Melanesia" = spc_cols(4), "Complete" = "grey50")) +
  # Readable x axis tick marks (at an angle); and not too many vertical gridlines:
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        panel.grid.minor = element_blank(),
        plot.caption = element_text(color = "grey50")) +
  # labels for the axes, plot title, legend:
  labs(x = "Population in 2025 (logarithmic scale)",
       y = "Compound annual population growth rate 2020 to 2025",
       color = "",
       title = "Current population and recent growth in the Pacific",
       subtitle = "Populations of the Pacific Island country and territory members of the Pacific Community (SPC). 
",
       caption = the_caption)
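
The vertical axis in this chart is a compound annual growth rate: if a population goes from P_start to P_end over n years, the annual rate r is the one that satisfies P_end = P_start * (1 + r)^n. As a minimal stand-alone sketch of that arithmetic (the helper name `cagr_rate` and the population figures are made up for illustration and are not part of the script above):

```r
# Compound annual growth rate between two population counts `years` apart.
# cagr_rate() is a hypothetical helper, written only to illustrate the formula.
cagr_rate <- function(pop_start, pop_end, years) {
  (pop_end / pop_start) ^ (1 / years) - 1
}

# Made-up example: 100,000 people growing to 110,408 over five years
r <- cagr_rate(100000, 110408, years = 5)
round(r, 4)

# Compounding that rate for five years gets back to the end population
100000 * (1 + r) ^ 5
```

In this made-up case the rate works out at very close to 2% a year, and a shrinking population would give a negative rate, which is what puts a country in the shaded part of the chart.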

There are a few tricks used here, the most important of which is probably the way I’ve used the actual population sizes as horizontal axis labels. This is something that works well with a small number of points, and which I learned from a Tufte book.
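
To show the axis-label idea in isolation, here is a minimal sketch that places the breaks of a log-scaled axis at the observed values themselves, rounded to three significant figures. The data frame `toy` and its numbers are invented for illustration, and this simplified version uses every observed value rather than the hand-picked subset in the code above:

```r
library(ggplot2)
library(scales)

# Hypothetical toy data, not the real population estimates
toy <- data.frame(
  place  = c("P", "Q", "R", "S"),
  pop    = c(1234, 18765, 104321, 987654),
  growth = c(-0.012, -0.004, 0.001, 0.019)
)

# Round each observed value to three significant figures and use those as breaks
data_breaks <- signif(sort(unique(toy$pop)), 3)

p <- ggplot(toy, aes(x = pop, y = growth)) +
  geom_point() +
  # tick marks sit at the data values rather than at round powers of ten
  scale_x_log10(labels = comma, breaks = data_breaks) +
  scale_y_continuous(labels = percent)

p
```

With more than a handful of points the labels start to collide, which is presumably why the real chart keeps only some of the observed values and adds a round number among the larger ones.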

Scatter plot with highlights

Finally for today, I wanted a version of the same plot that highlighted the countries that have easy mobility to a larger and richer country; that is, France (three territories), the USA (three territories and three self-governing countries), New Zealand (three members of the “Realm of New Zealand”) or the UK (Pitcairn). One of the themes of my talk was the way that in countries where people can move, a certain number of them often do. This is a very politically and culturally sensitive point, and it’s not one I’m going to try to explore the reasons for here, but we can certainly note it as a dominant fact of importance for understanding the demographic dynamics of the Pacific. It’s one of two or three critical big picture points that explain many of the differences between Kiribati (very densely populated on Tarawa and relatively poor) and Marshall Islands (less obvious extreme population density, higher standard of living), for example.

My plot with the highlights (which are just oversized point geoms using shape number 1, a hollow circle) shows this nicely, I believe:

And right here is the code for that plot:

easy_mobility <- c("Pitcairn", 
                   "Niue", "Tokelau", "Cook Islands", 
                   "Wallis and Futuna", "New Caledonia", "French Polynesia",
                   "Guam", "Northern Mariana Islands", "American Samoa",
                   "Marshall Islands", "Palau", "Micronesia, Fed. States")

# check all are in data apart from the two we deliberately dropped
stopifnot(sum(!easy_mobility %in% d4$pict) == 2)

d4 |> 
  ggplot(aes(x = pop2025, y = cagr, color = area)) +
  # Draw a pale (transparent, alpha) background rectangle for the negative growth countries:
  annotate("rect", xmin = 30, xmax = Inf, ymin = 0, ymax = -Inf, alpha = 0.1, fill = "purple") +
  # Largish points for each country:
  geom_point(size = 2.5, alpha = 0.5) +
  # labels for each country:
  # geom_label_repel(aes(label = pict), seed = 7, family = the_font, size = 2.7, label.size = 0, fill = "transparent") +
  geom_text_repel(aes(x = adjusted_x, label = pict, fontface = font_type), seed = 6, family = the_font, size = 2.7) +
  # Highlight the easy-mobility countries with a large hollow black circle (shape 1):
  geom_point(data = filter(d4, pict %in% easy_mobility), size = 6, shape = 1, color = "black") +
  # For the smaller countries, use actual populations as the points for markers on the axis.
  # For larger than 10,000, there are too many countries and it would be cluttered, so use 3, 10, 30, 100, etc.
  scale_x_log10(labels = comma, 
                breaks = signif(c(sort(unique(d3$pop2025))[c(1:4, 8, 9, 12, 23:25)], 3e5), 3)) +
  scale_y_continuous(labels = percent) +
  # Use SPC colours for the 4 subregion types:
  scale_colour_manual(values = c("Micronesia" = spc_cols(1), "Polynesia" = spc_cols(3), "Melanesia" = spc_cols(4), "Complete" = "grey50")) +
  theme_minimal(base_family = the_font) +
  # Readable x axis tick marks (at an angle); and not too many vertical gridlines:
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        panel.grid.minor = element_blank(),
        plot.caption = element_text(color = "grey50")) +
  # labels for the axes, plot title, legend:
  labs(x = "Population in 2025 (logarithmic scale)",
       y = "Compound annual population growth rate 2020 to 2025",
       color = "",
       title = "Current population and recent growth in the Pacific",
       subtitle = "Populations of the Pacific Island country and territory members of the Pacific Community (SPC). 
Countries and territories with easy migration access to a larger country are highlighted.",
       caption = the_caption)
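
One small caveat on the stopifnot() check in the code above: it only counts how many names are missing from the data, so it would still pass if one name were mistyped while one of the deliberately dropped places happened to be present. A slightly stricter variant, sketched here with made-up stand-in vectors rather than the real objects, compares the actual set of missing names using setdiff():

```r
# Hypothetical stand-ins for the vector of names to highlight, the names
# present in the plotting data, and the names we expect to be absent
highlight_names  <- c("Niue", "Tokelau", "Guam", "Palau")
names_in_data    <- c("Niue", "Guam", "Palau", "Fiji")
expected_missing <- c("Tokelau")

# Which of the names to highlight are not in the data?
missing_names <- setdiff(highlight_names, names_in_data)

# Fails with an error unless the missing names are exactly the expected ones
stopifnot(setequal(missing_names, expected_missing))
```

The same pattern should carry over to the real vectors by substituting them for the stand-ins.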

That’s all for today. In subsequent posts I’ll show how I drew the other charts in the original presentation, with net migration, diaspora sizes, Pacific Islander populations in various world cities, and remittances.


