| Title: | Classifications for Statistics Norway |
|---|---|
| Description: | Functions to search, retrieve, apply and update classification standards and code lists using Statistics Norway's API <https://www.ssb.no/klass> from the system 'KLASS'. Retrieves classifications by date with options to choose language, hierarchical level and formatting. |
| Authors: | Susie Jentoft [aut], Diana-Cristina Iancu [aut], Lisa Li [aut], Øyvind I. Berntsen [aut, cre], Statistics Norway [cph] |
| Maintainer: | Øyvind I. Berntsen <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.6 |
| Built: | 2026-05-28 11:05:58 UTC |
| Source: | https://github.com/statisticsnorway/ssb-klassr |
Match and convert a classification
apply_klass(x, classification, date = NULL, variant = NULL, correspond = NULL, language = "nb", output_level = NULL, output = "name", format = TRUE)
ApplyKlass(x, klass, date = NULL, variant = NULL, correspond = NULL, language = "nb", output_level = NULL, output = "name", format = TRUE)
x |
Input vector of classification codes. Vector must match "code" column from a call to get_klass(). |
classification |
Classification number |
date |
String for the required date of the classification. Format must be "yyyy-mm-dd". For an interval, provide two dates as a vector. If blank, will default to today's date. |
variant |
The classification variant to fetch (if a variant is wanted). |
correspond |
ID number for target in correspondence table. For correspondence between two dates within the same classification, use correspond = TRUE. |
language |
Default "nb" for Norwegian (Bokmål). Also accepts "nn" (Nynorsk) and "en" (English; available for some classifications). |
output_level |
Desired output level |
output |
String describing output. May be "name" (default), "code" or "both". |
format |
Logical for whether to run formatting of input vector x (Default = TRUE). Important for checking that the formatting is in one level. |
klass |
Deprecated; use |
A vector or data frame is returned with names and/or code of the desired output level.
data(klassdata)
kommune_names <- apply_klass(
  x = klassdata$kommune,
  classification = 131,
  language = "en",
  format = FALSE
)
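The input vector must match the "code" column, so numeric codes that have lost their leading zeros need to be restored to character codes first. The following is a minimal base R sketch of that preparation step, assuming 4-digit municipality codes; the vector `kommune_numeric` is made up for illustration, and this is not necessarily what `format = TRUE` does internally.

```r
# Hypothetical numeric municipality codes with dropped leading zeros
kommune_numeric <- c(301, 1103, NA, 5001)

# Pad to four digits while keeping missing values as NA
kommune_codes <- ifelse(
  is.na(kommune_numeric),
  NA_character_,
  sprintf("%04d", as.integer(kommune_numeric))
)

print(kommune_codes)
```

The padded character vector could then be passed as `x` to `apply_klass()`.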
Correspondence list
Print a list of correspondence tables for a given classification with source and target IDs
correspond_list(classification, date = NULL)
CorrespondList(klass, date = NULL)
classification |
Classification number |
date |
Date for classification (format = "YYYY-mm-dd"). Default is current date |
klass |
Deprecated; use |
Data frame with list of correspondence tables, source ID and target ID.
correspond_list("7")
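Several functions in this package expect dates as strings in "yyyy-mm-dd" form. As a rough sketch, a date string can be checked in base R before it is sent to the API; the helper name `is_klass_date` is hypothetical and not part of klassR.

```r
# Hypothetical helper: TRUE for strings that are real dates in yyyy-mm-dd form
is_klass_date <- function(x) {
  well_formed <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  parsed <- as.Date(x, format = "%Y-%m-%d")
  well_formed & !is.na(parsed)
}

checked <- is_klass_date(c("2020-01-01", "01.01.2020", "2020-02-30"))
print(checked)
```

Only the first string passes: the second uses a different layout and the third is not a real calendar date.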
Find which codes should be combined to reconstruct older or newer codes across multiple dates. Creates groups dynamically to fit the provided date range, with flexible labelling functionality.
find_equivalent_codes(classification, dates, labels = TRUE, graph = klass_graph(classification), date_format = "%Y")
find_equivalents(classification, dates, labels = TRUE, graph = klass_graph(classification), date_format = "%Y")
classification |
The Klass classification to be used |
dates |
The dates that equivalent sets of codes should be found for. |
labels |
This parameter controls whether or not to add group labels to groups of
equivalent sets. If "1508 Ålesund, 1580 Haram" If This parameter also accepts a named list of labelling functions. The resulting dataset will contain a label column for each of the supplied functions. The names of the label columns are specified using the names of the list of functions. The functions provided in this parameter can accept any of the following
parameters: The following list, when supplied to this parameter, creates two label
columns: one containing the codes and names ( list(
label1 = function(code, name, date, ...) {
label_codes <- date == max(date)
paste(code[label_codes], name[label_codes], collapse = ", ")
},
label2 = function(code, date, ...) {
label_codes <- date == max(date)
paste(code[label_codes], collapse = ", ")
}
)
|
graph |
Optional. Generating the graph using |
date_format |
Optional. Passed directly to format, this is used to
specify the output format for the |
This function provides a solution to the problem of split or
combined codes in Klass classifications. When using update_klass to ask
"what is this code in this version of the classification in this other
version of the classification?", the answer is sometimes that the code has
been split into two or more codes (or combined from two or more codes, if
trying to back-date a code), and therefore that the code cannot be updated.
The solution provided by find_equivalent_codes is answering the question: "in
these versions of the classification, which codes were equivalent to this
code in this other version of the classification?".
Consider the following example of two codes combining into one. Here, "a"
and "b" are valid at t1, and are combined into "c" at t2.
t1      t2
a ──┰─> c
    ┃
b ──┚
update_klass would inform us that "a" can be updated to
"c" at t2, unless we specified combine = FALSE, in which case the
result would be NA. find_equivalent_codes() would inform us that the
equivalent of the codes "a" and "b" in t1 at t2 is "c".
We can also consider a code splitting into two. In this example, "a" is
valid at t1, and splits into "b" and "c" at t2.
t1        t2
a ├─────> b
  └─────> c
update_klass is unable to provide an updated code due to the
split, and would return NA. find_equivalent_codes would inform us that the
equivalent codes of "a" at t1 are "b" and "c" at t2.
find_equivalent_codes can handle more than two dates. In the following
example, "a" splits into "b" and "c" at t2, and "b" and "c"
combine into "d" at t3. find_equivalent_codes can inform us that "a" is
equivalent to "b" and "c" at t2, and "d" at t3.
t1        t2    t3
a ├─────> b ┐
  └─────> c ┴─> d
find_equivalent_codes will only search in the time range we specify. As a
consequence, generating sets of equivalent codes over longer time spans
will generally create larger sets than using shorter time spans.
To illustrate this behavior, we can add a new code "e" to the previous
example, and have "d" and "e" combine into "f" at t4.
t1        t2    t3     t4
a ├─────> b ┐
  └─────> c ┴─> d ┐
e ────────────────┴──> f
Finding the
equivalents of "a" in t1 at t2 and t3 returns the same sets as
before:
t1: "a"
t2: "b" and "c"
t3: "d"
However, if we also wanted to know the equivalent set for t4, the result would be:
t1: "a" and "e"
t2: "b", "c" and "e"
t3: "d" and "e"
t4: "f"
find_equivalents is a legacy alias for find_equivalent_codes.
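The effect of the time range on the size of the equivalent sets can be reproduced with a small base R sketch. This is only an illustration on the toy codes "a" to "f" from the examples above, not the package's implementation (which works on a graph built by klass_graph). The tables `codes` and `changes`, the integer time steps and the function `equivalent_sets` are all assumptions made for this sketch.

```r
# Toy validity periods, with t1..t4 encoded as the integers 1..4
codes <- data.frame(
  code = c("a", "b", "c", "d", "e", "f"),
  from = c(1, 2, 2, 3, 1, 4),
  to   = c(1, 2, 2, 3, 3, 4),
  stringsAsFactors = FALSE
)

# Each row is one change from an old code to a new code at time `at`
changes <- data.frame(
  old = c("a", "a", "b", "c", "d", "e"),
  new = c("b", "c", "d", "d", "f", "f"),
  at  = c(2, 2, 3, 3, 4, 4),
  stringsAsFactors = FALSE
)

equivalent_sets <- function(code, dates, codes, changes) {
  # Only follow changes that take effect inside the requested range
  in_range <- changes[changes$at > min(dates) & changes$at <= max(dates), ]
  group <- code
  repeat {
    # Grow the group with every code linked to it by a change
    linked <- in_range$old %in% group | in_range$new %in% group
    grown <- union(group, c(in_range$old[linked], in_range$new[linked]))
    if (length(grown) == length(group)) break
    group <- grown
  }
  # For each date, keep the group members that are valid on that date
  sets <- lapply(dates, function(d) {
    sort(codes$code[codes$code %in% group & codes$from <= d & codes$to >= d])
  })
  names(sets) <- paste0("t", dates)
  sets
}

short_range <- equivalent_sets("a", 1:3, codes, changes)
long_range <- equivalent_sets("a", 1:4, codes, changes)
str(long_range)
```

With dates t1 to t3 the code "e" stays outside the group, while extending the range to t4 pulls it in through the combination into "f".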
A data.frame with columns:
date containing the input dates
code containing the set of equivalent codes in each date
name containing the names of each code
validFrom and validTo values for each code returned
By default, labels, giving a unique group label for each group of equivalent sets.
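To see what a labelling function produces, the `label1` function from the labels argument above can be applied by hand to one group. This sketch assumes that a labelling function receives the `code`, `name` and `date` vectors of a single group; the data frame `group` and its values are made up for illustration.

```r
# Labelling function as given in the description of the labels argument
label1 <- function(code, name, date, ...) {
  label_codes <- date == max(date)
  paste(code[label_codes], name[label_codes], collapse = ", ")
}

# Hypothetical group: "a" at the first date splits into "b" and "c"
group <- data.frame(
  code = c("a", "b", "c"),
  name = c("Name A", "Name B", "Name C"),
  date = c("2019", "2020", "2020"),
  stringsAsFactors = FALSE
)

group_label <- label1(code = group$code, name = group$name, date = group$date)
print(group_label)
```

The label is built only from the codes at the latest date in the group.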
Find the equivalent sets of a node at various dates
find_equivalent_nodes(node, dates, graph)
node |
The node that we're finding the equivalent sets of |
dates |
The dates that we want to find the equivalent sets in |
graph |
The graph that the nodes come from |
A named list of length(dates) containing the equivalent nodes for
each date. The names of the returned list are the dates provided in dates,
coerced to character.
Identify corresponding family from a classification number
get_family(classification)
GetFamily(klass)
classification |
Classification number |
klass |
Deprecated; use |
Family number
get_family(classification = 7)
Fetch Statistics Norway classification data using API
get_klass(classification, date = NULL, correspond = NULL, correspondID = NULL, variant = NULL, output_level = NULL, language = "nb", output_style = "normal", notes = FALSE, quiet = TRUE)
GetKlass(klass, date = NULL, correspond = NULL, correspondID = NULL, variant = NULL, output_level = NULL, language = "nb", output_style = "normal", notes = FALSE, quiet = TRUE)
classification |
Number/string of the classification ID/number (use list_klass() to find this). |
date |
String for the required date of the classification. Format must be "yyyy-mm-dd". For an interval, provide two dates as a vector. If blank, will default to today's date. |
correspond |
Number/string of the target classification for correspondence table (if a correspondence table is requested). |
correspondID |
ID number of the correspondence table to retrieve. Use as an alternative to correspond. |
variant |
The classification variant to fetch (if a variant is wanted). |
output_level |
Number/string specifying the requested hierarchy level (optional). |
language |
Two letter string for the requested language output. Default is Bokmål ("nb"). Nynorsk ("nn") and English ("en") are also available for some classifications. |
output_style |
String variable for the output type. Default is "normal". Specify "wide" for a wide formatted table output. |
notes |
Logical for if notes should be returned as a column. Default FALSE |
quiet |
Logical for whether to suppress the printing of the API address. Default TRUE. |
klass |
Deprecated; use |
The function returns a data frame of the specified classification/correspondence table. Output variables include: code, parentCode, level, and name for standard lists. For correspondence tables, variables include: sourceCode, sourceName, targetCode and targetName. For date correspondence tables, variables include: oldCode, oldName, newCode and newName. For "wide" output, code and name columns with level suffixes are returned. For date ranges, validFromInRequestedRange and validToInRequestedRange give the dates for the classification. Variable ChangeOccured gives the effective date for classification change in classification change tables.
# Get classification for occupation classifications
head(get_klass(classification = "7"))
# Get classification for occupation classifications in English
head(get_klass(classification = "7", language = "en"))
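For standard lists, the code and parentCode columns describe the hierarchy, so a parent name can be looked up for each lower-level code. The sketch below uses a small hand-made data frame with the same column names as the documented output. The values in `klass_like` are invented, and storing level as character is an assumption.

```r
# Hand-made stand-in for a two-level classification table
klass_like <- data.frame(
  code       = c("1", "11", "12", "2", "21"),
  parentCode = c(NA, "1", "1", NA, "2"),
  level      = c("1", "2", "2", "1", "2"),
  name       = c("Group one", "Sub 11", "Sub 12", "Group two", "Sub 21"),
  stringsAsFactors = FALSE
)

# Keep the level-2 rows and attach the name of each parent code
children <- klass_like[klass_like$level == "2", ]
children$parentName <- klass_like$name[match(children$parentCode, klass_like$code)]

print(children[, c("code", "name", "parentName")])
```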
Get the name of a classification version
get_name(version)
GetName(version)
version |
Version number |
string or vector of strings with name of version
get_name("33")
Get version number of a class given a date
get_version(classification = NULL, date = NULL, family = NULL, klassNr = FALSE)
GetVersion(klass = NULL, date = NULL, family = NULL, klassNr = FALSE)
classification |
Classification number |
date |
Date for version to be valid |
family |
Family ID number if a list of version numbers for all classes is desired |
klassNr |
True/False for whether to output classification numbers. Default = FALSE |
klass |
Deprecated; use |
Number, vector or data frame with version numbers and classification numbers if specified.
get_version(7)
A nested list of graph data for use in testing
klass_131_1964_graph
An object of class igraph of length 2000.
A nested list of graph data for use in testing
klass_131_2020_graph
An object of class igraph of length 2000.
A nested list of graph data for use in testing
klass_131_graph
An object of class igraph of length 2000.
Build a directed graph of code changes based on a Klass classification
klass_graph(classification, date = NULL)
classification |
The ID of the desired classification. |
date |
The date which the edges of the graph should be directed towards. Defaults to the current year plus one, which ensures the graph is directed to the most recent codes. |
An igraph object with the vertices representing codes, and
edges representing changes between codes. The direction of the edges
represents changes towards the date specified in date.
library(klassR)
# Build a graph directed towards the most recent codes
## Not run:
klass_131 <- klass_graph(131)
## End(Not run)
# Build a graph directed towards valid codes in 2020.
## Not run:
klass_131_2020 <- klass_graph(131, "2020-01-01")
## End(Not run)
Given a Klass graph, find the node corresponding to a code and (optionally) a date.
klass_node(graph, x, date = NA)
graph |
A graph generated by |
x |
The code to search for. |
date |
Optional. The specific date the supplied code is valid in. |
The node in the graph corresponding to the supplied code. If date is
not provided, the node with the most recent code is returned. If date is
provided, the code with date between validFrom and validTo is
returned.
## Not run:
# Build a graph directed towards the most recent codes.
library(klassR)
klass_131 <- klass_graph(131)
# Find the most recent node in the graph representing the code "0101" (Halden,
# valid to 2020.)
halden_node <- klass_node(klass_131, "0101")
## End(Not run)
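The selection rule described above (most recent node when no date is given, otherwise the node whose validity period contains the date) can be sketched on a plain data frame. This is not how klass_node works on an igraph object. The table `nodes`, the code "9999", its validity periods and the helper `find_node` are hypothetical, and treating validTo as inclusive is an assumption.

```r
# Hypothetical code that has been in use during two separate periods
nodes <- data.frame(
  code      = c("9999", "9999"),
  validFrom = as.Date(c("1964-01-01", "2000-01-01")),
  validTo   = as.Date(c("1999-12-31", "2019-12-31")),
  stringsAsFactors = FALSE
)

find_node <- function(nodes, x, date = NA) {
  candidates <- nodes[nodes$code == x, ]
  if (is.na(date)) {
    # No date given: return the most recent occurrence of the code
    return(candidates[which.max(candidates$validFrom), ])
  }
  date <- as.Date(date)
  # Date given: return the occurrence whose validity period contains it
  candidates[candidates$validFrom <= date & candidates$validTo >= date, ]
}

latest <- find_node(nodes, "9999")
earlier <- find_node(nodes, "9999", date = "1970-06-01")
print(latest)
print(earlier)
```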
A dataset containing variables for testing of Statistics Norway's classification API with the klassR package. Some observations are missing or incorrect for testing and demonstrations.
klassdata
A data frame containing 100 rows and 7 variables:
Identification number
1/2 variable for sex
4-digit number for education standard ISCED97 (level and subject area) NUS (klass = 66) 2015.01.01
4-digit code for Norwegian municipality (klass = 131). Based on 2015.01.01
Numeric variable for Norwegian municipality with dropped leading zeros for testing (klass = 131). Based on 2015.01.01
5-digit code for industry (NACE). Based on 01.01.2015 standard industry codes (klass = 6)
4-digit occupation codes using standard for STYRK-08 (klass = 7) 2015.01.01
Classification family list
Print a list of all families and the number of classifications in each
list_family(family = NULL, codelists = FALSE, language = "nb")
ListFamily(family = NULL, codelists = FALSE, language = "nb")
family |
Input family ID number to get a list of classifications in that family |
codelists |
True/False for whether to include codelists. Default = FALSE |
language |
Two letter string for the requested language output. Default is Bokmål ("nb"). Nynorsk ("nn") and English ("en") are also available. |
dataset containing a list of families
list_family(family = 1)
Classification list
Get a full list of all classifications and codelists
list_klass(codelists = FALSE, language = "nb")
ListKlass(codelists = FALSE, language = "nb")
codelists |
True/False for whether to include codelists. Default = FALSE |
language |
Two letter string for the requested language output. Default is Bokmål ("nb"). Nynorsk ("nn") and English ("en") are also available. |
A data frame containing a full list of classifications. The data frame includes the classification name, number, family and type.
head(list_klass(codelists = TRUE))
Search Klass
search_klass(query, codelists = FALSE, size = 20)
SearchKlass(query, codelists = FALSE, size = 20)
query |
String with key word to search for |
codelists |
True/False for whether to include codelists. Default = FALSE |
size |
The number of results to show. Default = 20. |
Data frame of possible classifications that match the query
search_klass("occupation")
Update multiple Klass codes to a desired date.
update_klass(codes, dates = NA, classification = NULL, date = NULL, graph = klass_graph(classification, date), output = "code", report = FALSE, combine = TRUE)
codes |
Codes to be updated. |
dates |
Optional. Can be used to specify what date each of the codes was
valid in. Supply a character vector of either length 1 to specify the same
valid date for all codes, or of the same length as |
classification |
The ID of the desired classification. |
date |
Optional. Can be used to specify the date the codes should be
updated to, e.g. if you have codes that are valid in year |
graph |
Optional. A graph object generated by |
output |
Either a character vector, containing one or more of the items
in the list below, or
|
report |
|
combine |
|
If output = "code", a vector of length length(codes)
containing either a code if the update is successful or NA if the
code has been split. If combine = FALSE, a code being combined with
another code will also return NA.
If output == TRUE, a list of length length(codes) containing
data.frames detailing the codes visited through the node search. The
tables have the following columns.
If report == TRUE and length(output) > 1 | TRUE, the result
will be a list of data.frames with number of rows equal to the
number of codes in the sequence of changes between the input codes and
output codes. The columns in the data.frames are specified with
output.
If report == TRUE and length(output) == 1, the result will be
a list of character vectors with length equal to the number of codes in the
sequence of changes between the input code and output code. The contents of
the character vectors is specified with output.
If report == FALSE and length(output) > 1 | TRUE the result
will be a list of data.frames with one row representing the last
code in the change sequence and columns specified by output. If a
code has been split, the result will be NA. If combine ==
FALSE and a code is the result of a combination of codes, the result will
be NA.
If report == FALSE and length(output) == 1, the result will
be a character vector containing information about the updated codes
specified by output. If a code has been split, the result will be
NA. If combine == FALSE and a code is the result of a
combination of codes, the result will be NA.
library(klassR)
codes <- get_klass(131, date = "2020-01-01")[["code"]]
## Not run:
updated_codes <- update_klass(codes,
  dates = "2020-01-01",
  classification = 131
)
## End(Not run)
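The rules for when an update yields NA (a split, or a combination when combine = FALSE) can be illustrated on a toy change table in base R. This sketch only mirrors the behaviour described above; it is not the package's graph-based implementation. The table `changes` and the function `update_code` are hypothetical.

```r
# Toy changes: "a" and "b" combine into "c", "c" becomes "d", "x" splits
changes <- data.frame(
  old = c("a", "b", "c", "x", "x"),
  new = c("c", "c", "d", "y", "z"),
  stringsAsFactors = FALSE
)

update_code <- function(code, changes, combine = TRUE) {
  repeat {
    successors <- changes$new[changes$old == code]
    if (length(successors) == 0) return(code)          # no further changes
    if (length(successors) > 1) return(NA_character_)  # the code has been split
    # The successor is a combination if more than one old code leads to it
    merged <- sum(changes$new == successors) > 1
    if (merged && !combine) return(NA_character_)
    code <- successors
  }
}

with_combine <- vapply(c("a", "x", "c"), update_code, character(1),
                       changes = changes, USE.NAMES = FALSE)
without_combine <- vapply(c("a", "x", "c"), update_code, character(1),
                          changes = changes, combine = FALSE, USE.NAMES = FALSE)
print(with_combine)
print(without_combine)
```

Here "a" is carried through the combination to "d" by default, but becomes NA when combine = FALSE, while the split code "x" is NA in both cases.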
Given a node and a graph, find the node at the end of a sequence of changes.
update_klass_node(graph, node)
graph |
A graph generated by |
node |
A node as returned by |
A sequence of vertices, starting with node and ending with the
last visited node.
## Not run:
# Build a graph directed towards the most recent codes.
library(klassR)
klass_131 <- klass_graph(131)
# Find the most recent node in the graph representing the code "0101" (Halden,
# valid to 2020.)
halden_node <- klass_node(klass_131, "0101")
# Find the most recent code corresponding to 0101 Halden
halden_node_updated <- update_klass_node(klass_131, halden_node)
## End(Not run)