Every day, more data is generated on digital media than all the information stored in the history of humanity up to 1970. If all the bytes currently in existence (the byte is the common unit of storage, made up of eight binary digits, 0 or 1, known as bits) were stored on stacked compact discs, the tower would reach beyond the Moon (more than 384,400 kilometers away), according to a study published this year in Science. And that work only analyzes what was stored between 1986 and 2007. The pandemic increased the use of digital technology by 400%, and the forecast for the end of the decade is that this imaginary stack of discs would reach Mars with the data generated in a single year alone: 10²⁴ bytes. It is not only a physical problem but also a scientific one. Since 1964, the names of the units of measurement have had to be updated with new prefixes to accommodate these unimaginable figures, at both the large and the small end of the scale. The latest are ronna (10²⁷, symbol R), ronto (10⁻²⁷, r), quetta (10³⁰, Q) and quecto (10⁻³⁰, q).

Every time a frontier of the microscopic, physical, biological or mathematical world is crossed, the problem arises of naming the figures needed for its study, dissemination or application. It is not only a matter of the amount of data, which is the clearest example for grasping these new magnitudes, but also of distances in the universe or, at the other extreme, the mass of subatomic particles. Martin Hilbert, author of the Science study and professor at the University of Southern California (USA), explains that “human DNA in a single body can contain around 300 times more information than all technological devices store.”

The speed of discovery, or the crossing of known limits, leads to the adoption of informal terms. Many web pages refer to the hellabyte (10²⁷ bytes) or the brontobyte, unofficial terms and symbols (h and b) that can add confusion to research, since h is already used for hecto (10²) and H for the henry, the unit of inductance, while b stands for the barn (10⁻²⁸ m²) and B for the bel, a unit of sound intensity and other physical quantities.

To tackle this conflict, “representatives of governments from around the world, gathered in the General Conference on Weights and Measures (CGPM)”, according to the Paris-based institution, agreed last week “to introduce four new prefixes to the International System of Units (SI) with immediate effect”. They are the already mentioned ronna, quetta, ronto and quecto. Thus, the mass of the Earth is approximately six ronnagrams (about 5.975 × 10²¹ metric tons) and that of an electron, about one rontogram.
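The arithmetic behind these mass figures can be checked with a short sketch. It assumes the Earth's mass is roughly 5.975 × 10²¹ metric tons and takes the electron rest mass as about 9.109 × 10⁻³¹ kg, a commonly cited value that does not come from the article. The function name `in_units_of` is hypothetical. Under these assumptions the Earth comes out at about six ronnagrams, while the electron is close to 0.9 rontograms, or roughly 911 quectograms.

```python
# Powers of ten for the four prefixes adopted by the CGPM (values from the article).
NEW_PREFIXES = {"ronna": 27, "quetta": 30, "ronto": -27, "quecto": -30}


def in_units_of(grams: float, prefix: str) -> float:
    """Convert a mass in grams to the gram unit carrying the given prefix."""
    return grams / 10.0 ** NEW_PREFIXES[prefix]


# Earth: about 5.975 x 10^21 metric tons, at 10^6 grams per metric ton.
EARTH_GRAMS = 5.975e21 * 1e6

# Electron rest mass: about 9.109 x 10^-31 kg (assumed value, not from the article).
ELECTRON_GRAMS = 9.109e-31 * 1e3

print(f"Earth:    {in_units_of(EARTH_GRAMS, 'ronna'):.3f} ronnagrams")
print(f"Electron: {in_units_of(ELECTRON_GRAMS, 'ronto'):.3f} rontograms")
print(f"Electron: {in_units_of(ELECTRON_GRAMS, 'quecto'):.1f} quectograms")
```

Dividing by the power of ten rather than storing precomputed factors keeps the table limited to the exponents the article actually lists.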

The Conference itself justifies the decision by citing “the essential role of the International System of Units to provide confidence in the accuracy and global comparability of measurements”, which is fundamental for industry, commerce, health and security. It also acknowledges that “scientific communities depend on measurements that are not covered by the current range”, and gives as examples the amounts of digital information that already require magnitudes greater than 10²⁴, as well as the proliferation of “unofficial” terms.

Adding prefixes is common practice in the measurement system. The CGPM adopted peta and exa in 1975, and zetta (10²¹), zepto (10⁻²¹), yotta (10²⁴) and yocto (10⁻²⁴) were added years later. But the same organization acknowledges that the main trigger for the new names has been “the growing requirements of data science and digital storage, which already uses prefixes at the top of the existing range [yottabyte and zettabyte] to express large amounts of digital information”.

Along the same lines, Richard Brown, who championed the new terms and is chief metrologist at the UK National Physical Laboratory in Teddington, explains that “the prefix system has expanded over the years in response to advances in science and technology that require access to a larger range of orders of magnitude related to measurement.” With this argument, Brown presented the proposal to the CGPM on November 17, after five years of studying options and tracking unofficial names.

The metrologist, according to Nature, sought names and symbols for his prefixes that were not already in use for units and that followed the tradition of ending in the letter a for multiples, such as mega (as in megabyte, 1,000,000 bytes), familiar from phone plan offers, and in the letter o for submultiples, such as micro(gram) or nano(meter).

Brown agrees with the CGPM in considering the measure adopted after his initiative “essential” due to “the demands of data science, with constant growth accelerated by widespread digitization and the arrival of new technologies, such as quantum computing.” “These new prefixes,” he argues, “will allow clear and unambiguous communication of these measurements for many years to come.”

The problem will be finding new prefixes and symbols for magnitudes larger or smaller than those just approved. The usual approach will be to fall back on the numerical expression, with a larger positive or negative exponent, or on a compound prefix such as kiloquetta or kiloronna.
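As a rough illustration of these two fallbacks, the sketch below adds the exponents of a compound such as kiloquetta and, where no name is available, writes the magnitude as a plain power of ten. The function names are hypothetical, and compound prefixes are not permitted under current SI rules, so the names produced here are purely illustrative.

```python
# Exponents of a few existing prefixes (kilo plus the top of the current range).
EXPONENTS = {"kilo": 3, "zetta": 21, "yotta": 24, "ronna": 27, "quetta": 30}
NAME_BY_EXPONENT = {exp: name for name, exp in EXPONENTS.items()}
LARGEST = max(NAME_BY_EXPONENT)  # 30, i.e. quetta


def compound_exponent(outer: str, inner: str) -> int:
    """Power of ten denoted by a compound such as 'kilo' + 'quetta'."""
    return EXPONENTS[outer] + EXPONENTS[inner]


def name_power_of_ten(exponent: int) -> str:
    """Name 10**exponent with one prefix, a kilo- compound, or a bare exponent."""
    if exponent in NAME_BY_EXPONENT:
        return NAME_BY_EXPONENT[exponent]
    inner = NAME_BY_EXPONENT.get(exponent - EXPONENTS["kilo"])
    if exponent > LARGEST and inner is not None:
        return "kilo" + inner
    return f"10^{exponent}"


print(compound_exponent("kilo", "quetta"))
print(compound_exponent("kilo", "ronna"))
for exp in (30, 33, 36):
    print(exp, name_power_of_ten(exp))
```

One detail the sketch makes visible is that kiloronna denotes the same power of ten as quetta, so only kiloquetta actually extends the range.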

For Brown, this problem is still some way off. However, the speed of computing could shorten the timeframe. According to Hilbert, “the fastest growing area of information processing is computing, which has increased by 58% in computing capacity in two decades.” Along these lines, according to an as-yet-unreviewed article by researchers at Epoch, an AI forecasting organization, as more powerful and capable computer models are built, there is a shortage of adequate data to train them. This is the case for language model researchers who, according to Teven Le Scao of the artificial intelligence company Hugging Face, quoted in MIT Technology Review, are “increasingly worried about running out of the data they need.” As a result, ever more information will be needed in order to pick out the data that is relevant and suitable for machine learning.

