Content
- We discussed `microbenchmark`, which helps to test and compare execution times.
- Example calculating the number of weeks between two variables (cast as dates vs. strings) using `difftime`
- Example calculating the mean by a grouping column (base R `aggregate` vs. dplyr `group_by` and `summarise_at`)
- Example comparing vector initialisation (`x <- c()` vs. `x <- vector("integer", n)`) when computing a cumulative sum
- Example calculating the mean of a data frame column (`mean(dt[dt$b > .5, ]$a)` vs. `mean(dt$a[dt$b > .5])`)
- Example to compare 1:n and seq(n)
- Example comparing the old pipe (`%>%`) and the new native pipe (`|>`)
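
Two of the comparisons above can be sketched as follows. This is a minimal illustration rather than the code used in the session: the helper names `cumsum_grow` and `cumsum_prealloc`, the sizes, the `times` value and the synthetic data frame `dt` are all assumptions made for the example.

```r
library(microbenchmark)

# Hypothetical helper: grow the result one element at a time from an empty vector
cumsum_grow <- function(n) {
  x <- c()
  total <- 0L
  for (i in seq_len(n)) {
    total <- total + i
    x[i] <- total
  }
  x
}

# Hypothetical helper: preallocate the integer vector, then fill it in place
cumsum_prealloc <- function(n) {
  x <- vector("integer", n)
  total <- 0L
  for (i in seq_len(n)) {
    total <- total + i
    x[i] <- total
  }
  x
}

n <- 1e4  # arbitrary size chosen for the sketch

# Compare the two initialisation strategies
init_timing <- microbenchmark(
  grow     = cumsum_grow(n),
  prealloc = cumsum_prealloc(n),
  times    = 20
)
print(init_timing)

# Synthetic data frame standing in for the one used in the session
set.seed(1)
dt <- data.frame(a = runif(1e5), b = runif(1e5))

# Compare subsetting whole rows first vs. subsetting only the column
subset_timing <- microbenchmark(
  rows_then_column = mean(dt[dt$b > .5, ]$a),
  column_only      = mean(dt$a[dt$b > .5]),
  times            = 20
)
print(subset_timing)
```

Both pairs of expressions return the same values, so any difference in the printed timings reflects only how the work is done. Actual timings depend on the machine and on `n`.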
- We discussed `data.table`, which speeds up data manipulation.
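
As a rough sketch of what `data.table` syntax looks like, the grouped mean from the earlier example can be written with the `DT[i, j, by]` form. The small data frame `df` and its column names are made up for illustration.

```r
library(data.table)

# Made-up data for illustration
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Base R: grouped mean with aggregate()
base_means <- aggregate(value ~ group, data = df, FUN = mean)

# data.table: the same grouped mean, computed in j and grouped with by
dt <- as.data.table(df)
dt_means <- dt[, .(value = mean(value)), by = group]

print(base_means)
print(dt_means)
```

On data this small the two approaches are indistinguishable in speed; any benefit would only show up on much larger tables, which can be checked with `microbenchmark`.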
- We discussed different file formats (‘csv’, ‘RDS’ and ‘Parquet’) and their compatibility, vulnerabilities and storage compression.
- We discussed the Parquet compression types available in the ‘arrow’ package: ‘gzip’, ‘snappy’ and ‘uncompressed’.
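
One way to explore the storage trade-offs is to write the same data frame in each format and compare file sizes. The sketch below assumes the ‘arrow’ package is installed with the relevant codecs; the synthetic data frame `df` and the helper `written_size` are hypothetical and exist only for this example.

```r
library(arrow)

# Synthetic data for illustration
set.seed(42)
df <- data.frame(
  id    = seq_len(1e5),
  group = sample(letters[1:5], 1e5, replace = TRUE),
  value = runif(1e5)
)

# Hypothetical helper: write to a temporary file, return its size in bytes
written_size <- function(write_fun, ext) {
  path <- tempfile(fileext = ext)
  on.exit(unlink(path))
  write_fun(path)
  file.size(path)
}

# Keep only the codecs that this arrow build supports
codecs <- Filter(codec_is_available, c("uncompressed", "snappy", "gzip"))

# Parquet file size for each compression type
parquet_sizes <- vapply(codecs, function(codec) {
  written_size(function(p) write_parquet(df, p, compression = codec), ".parquet")
}, numeric(1))

# csv and RDS sizes for comparison
other_sizes <- c(
  csv = written_size(function(p) write.csv(df, p, row.names = FALSE), ".csv"),
  rds = written_size(function(p) saveRDS(df, p), ".rds")
)

print(c(parquet_sizes, other_sizes))
```

File size is only one side of the comparison; read and write times for each format can be measured separately, and the results will vary with the shape and content of the data.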