Unused arguments in ... are now checked using ellipsis::check_dots_used(), so that misspelled or irrelevant arguments are not silently ignored.
PLSrounding():
action_unused_dots controls how unused arguments are handled.
allowed_unused_dots specifies argument names to ignore in the unused-argument check.
SmallCountRounding.action_unused_dots and SmallCountRounding.allowed_unused_dots.
action_unused_dots is "inform" as a cautious starting point. This may change to "warn" in a future release.
preAggregate in PLSrounding().
When preAggregate = NA (new default), the function now decides automatically: aggregation is applied unless freqVar is present and the data contain no duplicated rows with respect to the relevant variables.
rounded solution and a different output for inner. In particular, this ensures that the output of type inner does not contain duplicate rows, which is often the desired behavior. However, be aware that it is no longer guaranteed that the inner output matches the input data. Specify preAggregate = FALSE if this behavior is desired.
See ?SmallCountRounding::reexports.
tables_by_formulas(), which is reexported, is demonstrated in a PLSrounding() example.
Extend0fromModelMatrixInput() is now used in data pre-processing.
As a result, hierarchical_extend0 is now a possible parameter, as illustrated in a PLSrounding() example.
dimVar, hierarchies, or formula is specified. dimVar was automatically generated from the remaining columns.
tibble and data.table input (parameter data).
as.data.frame() where necessary to ensure consistent behavior.
When preAggregate is TRUE and aggregatePackage is "data.table", the use of as.data.frame() is skipped to avoid unnecessary back-and-forth conversion of data.table objects, preserving efficiency.
PLSrounding() and its wrappers.
get_klass() in the klassR package or hier_create() in the sdcHierarchies package can now be used directly as input. Example of usage:
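library(klassR)              # provides get_klass()
library(sdcHierarchies)      # provides hier_create()
library(SmallCountRounding)  # provides PLSroundingPublish()

a <- get_klass(classification = "24")
b <- hier_create(root = "Total", nodes = LETTERS[1:5])
mydata <- data.frame(tree = sample(a$code[nchar(a$code) > 1], 200, replace = TRUE),
                     letter = LETTERS[1:5])
PLSroundingPublish(mydata, roundBase = 5, hierarchies = list(tree = a, letter = b))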
map_hierarchies_to_data() function.
Formula2ModelMatrix() parameter avoidHierarchical = TRUE, thanks to the new total_collapse() function which can be applied to output.
FormulaSelection() now works with the output from PLSrounding().
extend0 is a new parameter to PLSrounding(), enabling data to be automatically extended by zero frequency rows.
zeroCandidates = TRUE.
The PLSroundingFits() parameter has been renamed from extend0 to extend0Fits. Code that used the old parameter will now behave differently.
extend0 and extend0Fits can now be specified in more advanced ways beyond just TRUE/FALSE.
step parameter, which can be passed to PLSrounding() and is documented in the underlying function RoundViaDummy():
step has been fixed.
The step parameter can now be specified as a vector for greater control.
The step parameter can significantly impact performance on large datasets. For example, using step = list(100) may be a useful approach.
NAomit to SSBtools::Formula2ModelMatrix():
When TRUE, NAs in the grouping variables are omitted in output and not included as a separate category.
PLSrounding() and its wrappers.
aggregateNA is a new parameter to PLSrounding():
Set to TRUE (default) to utilize the above NAomit parameter.
Set aggregatePackage to "data.table" to utilize this possibility.
aggregatePackage is a parameter to PLSrounding() and its wrappers.
aggregateBaseOrder.
R versions where the isFALSE function is not defined.
identifyNew parameter when the maxRound parameter is used.
identifyNew parameter:
When TRUE, new cells may be identified after initial rounding to ensure that all rounded publishable cells equal to or less than maxRound are roundBase multiples. Use NA for a less conservative behavior (the old behavior); in that case it is ensured that no nonzero rounded publishable cells are smaller than roundBase. When maxRound is at its default, there is no difference between TRUE and NA.
PLSroundingLoop: PLSrounding on portions of data at a time.
preDifference)
zeroCandidates, forceInner, preRounded and plsWeights can now be specified as functions.
PLSroundingLoop.
allSmall.
<= maxRound) are rounded. A simplified alternative to specifying forceInner.
PLSroundingFits, for post-processing to expected frequencies
plsWeights is a new parameter to RoundViaDummy (and PLSrounding)
freqVar in input.
preAggregate: When TRUE, the data will be aggregated beforehand within the function by the dimensional variables.
avoidHierarchical to Formula2ModelMatrix in the SSBtools package.
rndSeed, a new parameter to RoundViaDummy (and PLSrounding).
rndSeed = 123. This means that repeated runs with equal input will result in equal output.
rndSeed to NULL.
"inner" or "publish".
output, a new parameter to PLSrounding.
PLSroundingInner and PLSroundingPublish.
dimVar is a new parameter to RoundViaDummy and PLSrounding
preRounded is a new parameter to RoundViaDummy (and PLSrounding)
HierarchiesAndFormula2ModelMatrix in the SSBtools package
leverageCheck and easyCheck are new parameters to RoundViaDummy
Reduce0exact in the SSBtools package is utilised
printInc is a new parameter to PLSrounding and RoundViaDummy
removeEmpty=TRUE to omit empty combinations
Hierarchies2ModelMatrix and HierarchiesAndFormula2ModelMatrix in the SSBtools package
inputInOutput is also mentioned in the RoundViaDummy documentation
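
The rndSeed entry above says that repeated runs with equal input give equal output. A minimal sketch of how that could be checked, assuming the package is installed and that the bundled example data set SmallCountData("e6") has the columns geo, eu, year and freq (the data set and the formula are chosen here for illustration only and do not come from this changelog):

```r
# Assumption: SmallCountRounding is installed; the example data and the
# formula below are illustrative choices, not taken from the changelog.
library(SmallCountRounding)

# Small example data set (assumed columns: geo, eu, year, freq)
z <- SmallCountData("e6")

# Two runs with identical input, both relying on the default rndSeed
run1 <- PLSrounding(z, "freq", roundBase = 5, formula = ~geo + eu + year)
run2 <- PLSrounding(z, "freq", roundBase = 5, formula = ~geo + eu + year)

# Compare the published tables from the two runs
same_result <- identical(run1$publish, run2$publish)
print(same_result)
```

Comparing the publish element is enough for a quick check; the inner element could be compared in the same way.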